 Filter design
 =============

 This document explains guidelines that should be observed (or ignored with
 good reason) when writing filters for libavfilter.

 In this document, the word “frame” indicates either a video frame or a group
 of audio samples, as stored in an AVFilterBuffer structure.


 Format negotiation
 ==================

   The query_formats method should set, for each input and each output link,
   the list of supported formats.

   For video links, that means pixel format. For audio links, that means
   channel layout, sample format (the sample packing is implied by the sample
   format) and sample rate.

   The lists are not just lists, they are references to shared objects. When
   the negotiation mechanism computes the intersection of the formats
   supported at each end of a link, all references to both lists are replaced
   with a reference to the intersection. And when a single format is
   eventually chosen for a link amongst the remaining list, again, all
   references to the list are updated.

   That means that if a filter requires that its input and output have the
   same format amongst a supported list, all it has to do is use a reference
   to the same list of formats.
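
   The sharing described above can be sketched as follows. This is only an
   illustration under stated assumptions, not libavfilter's actual data
   structures: FormatList, format_list_ref and format_list_merge are
   hypothetical names, formats are plain integers, and lists are assumed to
   contain no duplicates.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FMTS 8
#define MAX_REFS 8

/* Hypothetical stand-in for a shared list of supported formats. */
typedef struct FormatList {
    int nb_formats;
    int formats[MAX_FMTS];
    int nb_refs;
    struct FormatList **refs[MAX_REFS]; /* every pointer referring to this list */
} FormatList;

/* Make *slot refer to list, and remember the slot so it can be redirected. */
static void format_list_ref(FormatList *list, FormatList **slot)
{
    assert(list->nb_refs < MAX_REFS);
    *slot = list;
    list->refs[list->nb_refs++] = slot;
}

/* Intersect a and b into a, then redirect every reference to b towards a.
 * Returns the number of formats left in the intersection. */
static int format_list_merge(FormatList *a, FormatList *b)
{
    int i, j, kept = 0;

    if (a == b)                 /* already shared: nothing to do */
        return a->nb_formats;

    for (i = 0; i < a->nb_formats; i++)
        for (j = 0; j < b->nb_formats; j++)
            if (a->formats[i] == b->formats[j]) {
                a->formats[kept++] = a->formats[i];
                break;
            }
    a->nb_formats = kept;

    for (i = 0; i < b->nb_refs; i++) {
        assert(a->nb_refs < MAX_REFS);
        *b->refs[i] = a;
        a->refs[a->nb_refs++] = b->refs[i];
    }
    b->nb_refs = 0;
    return kept;
}
```

   A filter that points both its input and its output at the same list will
   see both pointers follow the merge, so the two links end up with the same
   format.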

   query_formats can leave some formats unset and return AVERROR(EAGAIN) to
   cause the negotiation mechanism to try again later. That can be used by
   filters with complex requirements to use the format negotiated on one link
   to set the formats supported on another.
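
   As a rough sketch of the retry pattern, consider a toy filter whose output
   format depends on the format chosen for its input. ToyLink,
   toy_query_formats and TOY_AVERROR are hypothetical, and the way the filter
   detects that its input is settled is an assumption made for the example;
   the real negotiation code is more involved.

```c
#include <assert.h>
#include <errno.h>

#define TOY_AVERROR(e) (-(e))   /* negative-errno convention, for illustration */

/* Hypothetical link: nb_formats == 0 means "formats not set yet". */
typedef struct ToyLink {
    int nb_formats;
    int formats[4];
} ToyLink;

/* Toy query_formats: the output format is derived from the input format. */
static int toy_query_formats(ToyLink *in, ToyLink *out)
{
    if (in->nb_formats == 0) {
        /* First pass: advertise what the input accepts. */
        in->nb_formats = 2;
        in->formats[0] = 1;
        in->formats[1] = 2;
    }
    if (in->nb_formats != 1)
        /* Input not settled yet: leave the output unset and ask to be
         * called again later. */
        return TOY_AVERROR(EAGAIN);

    out->nb_formats = 1;
    out->formats[0] = in->formats[0] + 100; /* arbitrary dependent format */
    return 0;
}
```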


 Buffer references ownership and permissions
 ===========================================

   Principle
   ---------

     Audio and video data are voluminous; the buffer and buffer reference
     mechanism is intended to avoid, as much as possible, expensive copies of
     that data while still allowing the filters to produce correct results.

     The data is stored in buffers represented by AVFilterBuffer structures.
     They must not be accessed directly, but through references stored in
     AVFilterBufferRef structures. Several references can point to the
     same buffer; the buffer is automatically deallocated once all
     corresponding references have been destroyed.

     The characteristics of the data (resolution, sample rate, etc.) are
     stored in the reference; different references for the same buffer can
     show different characteristics. In particular, a video reference can
     point to only a part of a video buffer.

     A reference is usually obtained as input to the start_frame or
     filter_frame method, or requested using the ff_get_video_buffer or
     ff_get_audio_buffer functions. A new reference on an existing buffer can
     be created with the avfilter_ref_buffer function. A reference is
     destroyed using the avfilter_unref_bufferp function.
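
     The buffer/reference split can be illustrated with a toy
     reference-counted buffer. ToyBuffer, ToyBufferRef and the toy_*
     functions are hypothetical stand-ins written for this example; they
     only mimic the behaviour described above and are not the libavfilter
     implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for AVFilterBuffer and AVFilterBufferRef. */
typedef struct ToyBuffer {
    unsigned char *data;
    int refcount;
} ToyBuffer;

typedef struct ToyBufferRef {
    ToyBuffer *buf;
    int w, h;                 /* characteristics live in the reference */
} ToyBufferRef;

static int toy_live_buffers;  /* demo-only counter of allocated buffers */

/* Allocate a buffer and return the first reference to it (NULL on failure). */
static ToyBufferRef *toy_get_buffer(int w, int h)
{
    ToyBufferRef *ref;
    ToyBuffer *buf;

    assert(w > 0 && h > 0);
    ref = calloc(1, sizeof(*ref));
    buf = calloc(1, sizeof(*buf));
    if (ref && buf)
        buf->data = calloc((size_t)w * h, 1);
    if (!ref || !buf || !buf->data) {
        free(buf);
        free(ref);
        return NULL;
    }
    buf->refcount = 1;
    ref->buf = buf;
    ref->w = w;
    ref->h = h;
    toy_live_buffers++;
    return ref;
}

/* Create a new reference to the same buffer; the data is not copied. */
static ToyBufferRef *toy_ref(const ToyBufferRef *src)
{
    ToyBufferRef *ref = malloc(sizeof(*ref));

    if (!ref)
        return NULL;
    *ref = *src;
    ref->buf->refcount++;
    return ref;
}

/* Destroy a reference and reset the caller's pointer; the buffer itself is
 * freed together with its last reference. */
static void toy_unrefp(ToyBufferRef **refp)
{
    ToyBufferRef *ref = *refp;

    if (!ref)
        return;
    if (--ref->buf->refcount == 0) {
        free(ref->buf->data);
        free(ref->buf);
        toy_live_buffers--;
    }
    free(ref);
    *refp = NULL;
}
```

     Because the characteristics are stored per reference, changing w or h
     on one reference leaves the other references to the same buffer
     untouched.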

   Reference ownership
   -------------------

     At any time, a reference “belongs” to a particular piece of code,
     usually a filter. With a few caveats that will be explained below, only
     that piece of code is allowed to access it. It is also responsible for
     destroying it, although this is sometimes done automatically (see the
     section on link reference fields).

     Here are the (fairly obvious) rules for reference ownership:

     * A reference received by the filter_frame method (or its deprecated
       start_frame version) belongs to the corresponding filter.

       Special exception for video references: the reference may be used
       internally for automatic copying and must not be destroyed before
       end_frame; it can be given away to ff_start_frame.

     * A reference passed to ff_filter_frame (or the deprecated
       ff_start_frame) is given away and must no longer be used.

     * A reference created with avfilter_ref_buffer belongs to the code that
       created it.

     * A reference obtained with ff_get_video_buffer or ff_get_audio_buffer
       belongs to the code that requested it.

     * A reference given as return value by the get_video_buffer or
       get_audio_buffer method is given away and must no longer be used.

   Link reference fields
   ---------------------

     The AVFilterLink structure has a few AVFilterBufferRef fields. The
     cur_buf and out_buf fields were used with the deprecated
     start_frame/draw_slice/end_frame API and should no longer be used.
     The src_buf, cur_buf_copy and partial_buf fields are used by
     libavfilter internally and must not be accessed by filters.

   Reference permissions
   ---------------------

     The AVFilterBufferRef structure has a perms field that describes what
     the code that owns the reference is allowed to do to the buffer data.
     Different references for the same buffer can have different permissions.

     For video filters that implement the deprecated
     start_frame/draw_slice/end_frame API, the permissions only apply to the
     parts of the buffer that have already been covered by the draw_slice
     method.

     The value is a binary OR of the following constants:

     * AV_PERM_READ: the owner can read the buffer data; this is essentially
       always true and is there for self-documentation.

     * AV_PERM_WRITE: the owner can modify the buffer data.

     * AV_PERM_PRESERVE: the owner can rely on the fact that the buffer data
       will not be modified by previous filters.

     * AV_PERM_REUSE: the owner can output the buffer several times, without
       modifying the data in between.

     * AV_PERM_REUSE2: the owner can output the buffer several times and
       modify the data in between (useless without the WRITE permission).

     * AV_PERM_ALIGN: the owner can access the data using fast operations
       that require data alignment.

     The READ, WRITE and PRESERVE permissions are about sharing the same
     buffer between several filters to avoid expensive copies without them
     doing conflicting changes on the data.

     The REUSE and REUSE2 permissions are about special memory for direct
     rendering. For example, a buffer directly allocated in video memory must
     not be modified once it is displayed on screen, or it will cause
     tearing; it will therefore not have the REUSE2 permission.

     The ALIGN permission is about extracting part of the buffer, for
     copy-less padding or cropping for example.


     References received on input pads are guaranteed to have all the
     permissions stated in the min_perms field and none of the permissions
     stated in the rej_perms field.

     References obtained by ff_get_video_buffer and ff_get_audio_buffer are
     guaranteed to have at least all the permissions requested as argument.

     References created by avfilter_ref_buffer have the same permissions as
     the original reference minus the ones explicitly masked; the mask is
     usually ~0 to keep the same permissions.

     Filters should remove permissions on the references they give to output
     whenever necessary. This can be done automatically by setting the
     rej_perms field on the output pad.
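
     The permission rules above are plain bit operations. The sketch below
     uses hypothetical TOY_PERM_* constants with illustrative values (they
     are not copied from the libavfilter headers) and two hypothetical
     helpers: one checks a reference against a pad's min_perms and
     rej_perms, the other computes the permissions of a new reference
     created with a mask.

```c
#include <assert.h>

/* Illustrative flag values, assumed for this example only. */
enum {
    TOY_PERM_READ     = 0x01,
    TOY_PERM_WRITE    = 0x02,
    TOY_PERM_PRESERVE = 0x04,
    TOY_PERM_REUSE    = 0x08,
    TOY_PERM_REUSE2   = 0x10,
    TOY_PERM_ALIGN    = 0x40
};

/* Does a reference with permissions perms satisfy a pad that requires
 * min_perms and rejects rej_perms? */
static int toy_perms_ok(int perms, int min_perms, int rej_perms)
{
    return (perms & min_perms) == min_perms && !(perms & rej_perms);
}

/* Permissions of a new reference created from one that has perms: only the
 * bits present in mask are kept, so ~0 keeps everything. */
static int toy_ref_perms(int perms, int mask)
{
    return perms & mask;
}
```

     For instance, a filter that keeps a reference and hands a second one
     downstream would pass ~TOY_PERM_WRITE as the mask, so that the next
     filter cannot modify data the first one still relies on.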

     Here are a few guidelines corresponding to common situations:

     * Filters that modify and forward their frame (like drawtext) need the
       WRITE permission.

     * Filters that read their input to produce a new frame on output (like
       scale) need the READ permission on input and must request a buffer
       with the WRITE permission.

     * Filters that intend to keep a reference after the filtering process
       is finished (after filter_frame returns) must have the PRESERVE
       permission on it and remove the WRITE permission if they create a new
       reference to give it away.

     * Filters that intend to modify a reference they have kept after the end
       of the filtering process need the REUSE2 permission and must remove
       the PRESERVE permission if they create a new reference to give it
       away.


 Frame scheduling
 ================

   The purpose of these rules is to ensure that frames flow in the filter
   graph without getting stuck and accumulating somewhere.

   Simple filters that output one frame for each input frame should not have
   to worry about it.

   filter_frame
   ------------

     This method is called when a frame is pushed to the filter's input. It
     can be called at any time except in a reentrant way.

     If the input frame is enough to produce output, then the filter should
     push the output frames on the output link immediately.

     As an exception to the previous rule, if the input frame is enough to
     produce several output frames, then the filter only needs to output at
     least one frame per link. The additional frames can be left buffered in
     the filter; these buffered frames must be flushed immediately if a new
     input produces new output.

     (Example: frame rate-doubling filter: filter_frame must (1) flush the
     second copy of the previous frame, if it is still there, (2) push the
     first copy of the incoming frame, (3) keep the second copy for later.)
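
     The three steps of the rate-doubling example can be sketched with a
     toy model in which frames are plain integers and the output link is an
     array. ToySink, ToyDoubler and the toy_* functions are hypothetical
     names invented for this illustration.

```c
#include <assert.h>

#define MAX_OUT 16

/* Hypothetical output link: records the frames pushed to it. */
typedef struct ToySink {
    int frames[MAX_OUT];
    int nb_frames;
} ToySink;

/* Hypothetical filter state: at most one buffered second copy. */
typedef struct ToyDoubler {
    int has_pending;
    int pending;
    ToySink *out;
} ToyDoubler;

static void toy_push(ToySink *s, int frame)
{
    assert(s->nb_frames < MAX_OUT);
    s->frames[s->nb_frames++] = frame;
}

/* filter_frame for the toy frame rate-doubling filter. */
static int toy_doubler_filter_frame(ToyDoubler *d, int frame)
{
    if (d->has_pending) {           /* (1) flush the previous second copy */
        toy_push(d->out, d->pending);
        d->has_pending = 0;
    }
    toy_push(d->out, frame);        /* (2) push the first copy now */
    d->pending = frame;             /* (3) keep the second copy for later */
    d->has_pending = 1;
    return 0;
}

/* What a request on the output would do first: push the buffered copy if
 * there is one. Returns 1 if a frame was pushed, 0 otherwise. */
static int toy_doubler_flush(ToyDoubler *d)
{
    if (!d->has_pending)
        return 0;
    toy_push(d->out, d->pending);
    d->has_pending = 0;
    return 1;
}
```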

     If the input frame is not enough to produce output, the filter must not
     call request_frame to get more. It must just process the frame or queue
     it. The task of requesting more frames is left to the filter's
     request_frame method or the application.

     If a filter has several inputs, the filter must be ready for frames
     arriving randomly on any input. Therefore, any filter with several inputs
     will most likely require some kind of queuing mechanism. It is perfectly
     acceptable to have a limited queue and to drop frames when the inputs
     are too unbalanced.

   request_frame
   -------------

     This method is called when a frame is wanted on an output.

     For an input, it should directly call filter_frame on the corresponding
     output.

     For a filter, if there are queued frames already ready, one of these
     frames should be pushed. If not, the filter should request a frame on
     one of its inputs, repeatedly until at least one frame has been pushed.

     Return values:
     if request_frame could produce a frame, it should return 0;
     if it could not for temporary reasons, it should return AVERROR(EAGAIN);
     if it could not because there are no more frames, it should return
     AVERROR_EOF.

     The typical implementation of request_frame for a filter with several
     inputs looks like this:

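         /* Frames are already queued: push one and stop. */
         if (frames_queued) {
             push_one_frame();
             return 0;
         }
         /* Otherwise request input until filter_frame has pushed a frame. */
         while (!frame_pushed) {
             input = input_where_a_frame_is_most_needed();
             ret = ff_request_frame(input);
             if (ret == AVERROR_EOF) {
                 /* This input is finished; handle it and keep going. */
                 process_eof_on_input();
             } else if (ret < 0) {
                 return ret;
             }
         }
         return 0;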

     Note that, except for filters that can have queued frames, request_frame
     does not push frames: it requests them from its input, and as a
     reaction, the filter_frame method will be called and do the work.
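
     The interaction between request_frame and filter_frame can be tried
     out with a small self-contained simulation. It models a hypothetical
     two-input filter that pushes one output frame whenever it holds a frame
     from each input. ToyPair, TOY_EOF and the toy_* functions are invented
     for this sketch; toy_request_input stands in for ff_request_frame and
     TOY_EOF for AVERROR_EOF. Since this toy filter pushes as soon as it
     can, it never holds ready frames and has no "frames already queued"
     branch.

```c
#include <assert.h>

#define TOY_EOF (-1)    /* stand-in for AVERROR_EOF */

typedef struct ToyPair {
    int remaining[2];   /* frames each upstream source can still deliver */
    int queued[2];      /* frames waiting in the filter, per input */
    int pushed;         /* total frames pushed on the output */
} ToyPair;

/* filter_frame: queue the frame, then push output as soon as both inputs
 * have one frame available. */
static void toy_filter_frame(ToyPair *p, int input)
{
    p->queued[input]++;
    if (p->queued[0] > 0 && p->queued[1] > 0) {
        p->queued[0]--;
        p->queued[1]--;
        p->pushed++;
    }
}

/* Request a frame on one input: the source reacts by calling filter_frame,
 * or reports EOF when it has nothing left. */
static int toy_request_input(ToyPair *p, int input)
{
    assert(input == 0 || input == 1);
    if (p->remaining[input] == 0)
        return TOY_EOF;
    p->remaining[input]--;
    toy_filter_frame(p, input);
    return 0;
}

/* request_frame: keep requesting input until filter_frame has pushed. */
static int toy_request_frame(ToyPair *p)
{
    int before = p->pushed;

    while (p->pushed == before) {
        /* The input with the shorter queue is where a frame is most needed. */
        int input = p->queued[0] <= p->queued[1] ? 0 : 1;
        int ret = toy_request_input(p, input);

        if (ret < 0)    /* no pair can be completed any more */
            return ret;
    }
    return 0;
}
```

     Note that toy_request_frame never pushes a frame itself: the push
     happens inside toy_filter_frame, which is called as a reaction to the
     request.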

 Legacy API
 ==========

   Until libavfilter 3.23, the filter_frame method was split:

   - for video filters, it was made of start_frame, draw_slice (that could be
     called several times on distinct parts of the frame) and end_frame;

   - for audio filters, it was called filter_samples.