#lang scribble/manual

@title{FFmpeg Audio Backend}
@author{@author+email["Hans Dijkema" "hans@dijkewijk.nl"]}

@section{Overview}

The FFmpeg audio backend is a small C++ wrapper with a plain C ABI. It hides the FFmpeg data structures from the caller and exposes a simple audio-only decoder interface.

The caller does not handle FFmpeg streams, packets, frames, codec contexts or resampler objects. A file is opened, the best audio stream is selected, and decoding is performed by repeatedly calling @tt{fmpg_decode_next}.

The output format is fixed: signed 32-bit integer PCM, interleaved, in native byte order.

A sample frame is one sample for each channel at a single point in time. For stereo S32, one sample frame contains two @tt{int32_t} values and therefore takes 8 bytes.

@section{Opaque Instance}

@verbatim|{
typedef struct fmpg_instance fmpg_instance;
}|

The decoder instance is opaque. The caller only receives and passes around a pointer to this type. All FFmpeg state is stored internally.

@section{Lifecycle}

@verbatim|{
fmpg_instance *fmpg_init(void);
}|

Creates a new decoder instance.

Before allocating the instance, the backend checks whether the FFmpeg major versions used at compile time match the FFmpeg major versions available at runtime. If they do not match, @tt{NULL} is returned.

Returns a pointer to a new @tt{fmpg_instance}, or @tt{NULL} on failure.

@verbatim|{
void fmpg_free(fmpg_instance *instance);
}|

Frees the decoder instance. If the instance still has an open input, it is closed as part of destruction.

@verbatim|{
int fmpg_open_file(fmpg_instance *instance, const char *filename);
}|

Opens a media file, selects the best audio stream, initializes the decoder and initializes the resampler. After a successful call, stream information, duration and metadata can be read using the getter functions.

Returns @tt{1} on success and @tt{0} on failure.
The call fails if the instance is @tt{NULL}, if a file is already open, if @tt{filename} is @tt{NULL}, if no usable audio stream is found, or if FFmpeg cannot open or initialize the file.

@verbatim|{
void fmpg_close(fmpg_instance *instance);
}|

Closes the current file and releases all FFmpeg state owned by the instance. The instance itself remains valid and may be reused.

@verbatim|{
int fmpg_is_open(fmpg_instance *instance);
}|

Returns @tt{1} if the instance is open and ready to decode. Otherwise returns @tt{0}.

@section{Audio Information}

@verbatim|{
int fmpg_audio_stream_count(fmpg_instance *instance);
int fmpg_audio_sample_rate(fmpg_instance *instance);
int fmpg_audio_channels(fmpg_instance *instance);
int fmpg_audio_bits_per_sample(fmpg_instance *instance);
int fmpg_audio_bytes_per_sample(fmpg_instance *instance);
int64_t fmpg_duration_ms(fmpg_instance *instance);
int64_t fmpg_duration_samples(fmpg_instance *instance);
}|

These functions return information about the selected audio stream.
@itemlist[
 #:style 'compact
 @item{@tt{fmpg_audio_stream_count} returns the number of audio streams found in the opened file, or @tt{0}.}
 @item{@tt{fmpg_audio_sample_rate} returns the selected stream's sample rate, or @tt{0}.}
 @item{@tt{fmpg_audio_channels} returns the selected stream's channel count, or @tt{0}.}
 @item{@tt{fmpg_audio_bits_per_sample} always returns @tt{32}.}
 @item{@tt{fmpg_audio_bytes_per_sample} always returns @tt{4}.}
 @item{@tt{fmpg_duration_ms} returns the duration in milliseconds, or @tt{-1}.}
 @item{@tt{fmpg_duration_samples} returns the duration in output sample frames, or @tt{-1}.}
]

@section{Metadata}

@verbatim|{
const char *fmpg_file_title(fmpg_instance *instance);
const char *fmpg_file_author(fmpg_instance *instance);
const char *fmpg_file_album(fmpg_instance *instance);
const char *fmpg_file_genre(fmpg_instance *instance);
const char *fmpg_file_comment(fmpg_instance *instance);
const char *fmpg_file_copyright(fmpg_instance *instance);
int fmpg_file_year(fmpg_instance *instance);
int fmpg_file_track(fmpg_instance *instance);
int64_t fmpg_file_bitrate(fmpg_instance *instance);
}|

The metadata getters return values read from the container metadata. A missing string value is returned as an empty string. A missing numeric value is returned as @tt{-1}.

@tt{fmpg_file_author} returns the @tt{artist} metadata field.

@section{Decoding}

@verbatim|{
int fmpg_decode_next(fmpg_instance *instance);
}|

Decodes the next block of audio.

Internally, the backend reads packets from the selected audio stream, feeds them to the FFmpeg decoder, receives all available decoded frames, converts them to signed 32-bit interleaved PCM, and concatenates the result in the instance output buffer. Packets from non-selected streams are skipped internally.

Returns @tt{1} if decoded PCM data is available through @tt{fmpg_buffer} and @tt{fmpg_buffer_size}. Returns @tt{0} at EOF or on error.
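
Once @tt{fmpg_decode_next} has returned @tt{1}, the caller holds a block of interleaved S32 samples. The following is a minimal sketch of one way to walk such a block, here to find the peak magnitude of each channel. The helper name @tt{peak_per_channel} is hypothetical and not part of the backend API. The sketch assumes the layout described in this manual (native byte order, one @tt{int32_t} per channel per sample frame) and copies each sample out with @tt{memcpy} because the buffer is exposed as bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the backend API.
 * Scans one block of interleaved signed 32-bit PCM and stores the peak
 * absolute value of each channel in peaks[0 .. channels-1].
 * Returns the number of complete sample frames scanned, or -1 on bad
 * arguments. */
int64_t peak_per_channel(const uint8_t *buf, int size_bytes, int channels,
                         int64_t *peaks)
{
    if (buf == NULL || peaks == NULL || size_bytes < 0 || channels <= 0)
        return -1;

    /* One sample frame holds one int32_t per channel. */
    const size_t frame_bytes = (size_t)channels * sizeof(int32_t);
    const int64_t frames = (int64_t)((size_t)size_bytes / frame_bytes);

    for (int ch = 0; ch < channels; ch++)
        peaks[ch] = 0;

    for (int64_t f = 0; f < frames; f++) {
        for (int ch = 0; ch < channels; ch++) {
            const size_t index = (size_t)f * (size_t)channels + (size_t)ch;
            int32_t s;
            /* memcpy avoids alignment assumptions about the byte buffer. */
            memcpy(&s, buf + index * sizeof(int32_t), sizeof s);
            /* Widen before negating so INT32_MIN does not overflow. */
            const int64_t mag = s < 0 ? -(int64_t)s : (int64_t)s;
            if (mag > peaks[ch])
                peaks[ch] = mag;
        }
    }
    return frames;
}
```

In a real caller the arguments would presumably come from @tt{fmpg_buffer}, @tt{fmpg_buffer_size} and @tt{fmpg_audio_channels}, read after each successful @tt{fmpg_decode_next} and before the next call that decodes, seeks, closes or frees the instance.
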
@verbatim|{
int fmpg_seek_ms(fmpg_instance *instance, int64_t target_pos_ms);
}|

Seeks to an absolute position in milliseconds.

FFmpeg may seek to a packet before the requested timestamp. When timestamps are available, the backend discards the decoded pre-roll samples after a seek until the requested output sample position is reached.

Returns @tt{1} on success and @tt{0} on failure.

@section{Output Buffer and Sample Positions}

@verbatim|{
const uint8_t *fmpg_buffer(fmpg_instance *instance);
int fmpg_buffer_size(fmpg_instance *instance);
int64_t fmpg_buffer_samples(fmpg_instance *instance);
int64_t fmpg_buffer_start_sample(fmpg_instance *instance);
int64_t fmpg_buffer_end_sample(fmpg_instance *instance);
int64_t fmpg_sample_position(fmpg_instance *instance);
double fmpg_timecode(fmpg_instance *instance);
}|

@tt{fmpg_buffer} returns a pointer to the current decoded PCM buffer, or @tt{NULL} if there is no current buffer. The pointer remains valid only until the next API call that decodes, seeks, closes or frees the instance.

@tt{fmpg_buffer_size} returns the size of the current buffer in bytes.

@tt{fmpg_buffer_samples} returns the number of sample frames in the current buffer.

@tt{fmpg_buffer_start_sample} returns the absolute sample-frame index of the first sample frame in the buffer, and @tt{fmpg_buffer_end_sample} returns the absolute sample-frame index just after the current buffer.

@tt{fmpg_sample_position} returns the current absolute sample position in the audio stream. After a successful @tt{fmpg_decode_next}, this is the same value as @tt{fmpg_buffer_end_sample}.

@tt{fmpg_timecode} returns the approximate start time of the current decoded block in seconds.

@section{FFmpeg Version Checks}

@verbatim|{
const char *fmpg_ffmpeg_version(void);
const char *fmpg_int_version2string(unsigned version);
int fmpg_compatible_ffmpeg(void);
}|

@tt{fmpg_ffmpeg_version} returns a string describing the FFmpeg versions used when the backend was compiled.
The string covers avformat, avcodec, swresample and avutil.

@tt{fmpg_int_version2string} converts an FFmpeg integer version value to a string of the form @tt{major.minor.micro}.

@tt{fmpg_compatible_ffmpeg} checks whether the FFmpeg major versions used at compile time match the FFmpeg major versions available at runtime. It returns @tt{1} when the versions are compatible and @tt{0} otherwise.

@section{Decoder Model}

The backend uses the modern FFmpeg send/receive decoding model. Packets are sent with @tt{avcodec_send_packet}, decoded frames are received with @tt{avcodec_receive_frame}, and conversion to the fixed output format is done with libswresample.

The public API intentionally avoids exposing these details. From the caller's perspective, decoding is a sequence of calls to @tt{fmpg_decode_next}, each followed by reading the current output buffer and its sample-position metadata.
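
The sample-position getters count sample frames, so a caller that wants to display elapsed time or size its own buffers has to convert between frames, milliseconds and bytes. The following is a minimal sketch of those conversions, assuming the rate and channel count are the values reported by @tt{fmpg_audio_sample_rate} and @tt{fmpg_audio_channels}. The helper names are hypothetical and not part of the backend API. The sketch truncates toward zero and returns @tt{-1} on invalid input as a convention of its own; how the backend itself rounds @tt{fmpg_duration_ms} or @tt{fmpg_timecode} is not specified in this manual.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the backend API. */

/* Converts a sample-frame count to milliseconds, truncating.
 * Whole seconds and the remainder are handled separately so that
 * large frame counts do not overflow when multiplied by 1000. */
int64_t frames_to_ms(int64_t frames, int sample_rate)
{
    if (frames < 0 || sample_rate <= 0)
        return -1;
    const int64_t rate = sample_rate;
    return (frames / rate) * 1000 + ((frames % rate) * 1000) / rate;
}

/* Converts milliseconds to a sample-frame count, truncating. */
int64_t ms_to_frames(int64_t ms, int sample_rate)
{
    if (ms < 0 || sample_rate <= 0)
        return -1;
    const int64_t rate = sample_rate;
    return (ms / 1000) * rate + ((ms % 1000) * rate) / 1000;
}

/* Size in bytes of a run of interleaved S32 sample frames:
 * one int32_t (4 bytes) per channel per frame. */
int64_t frames_to_bytes(int64_t frames, int channels)
{
    if (frames < 0 || channels <= 0)
        return -1;
    return frames * (int64_t)channels * (int64_t)sizeof(int32_t);
}
```

For example, @tt{frames_to_bytes(1, 2)} gives the 8 bytes per stereo sample frame mentioned in the overview, and @tt{frames_to_ms} applied to @tt{fmpg_buffer_start_sample} gives an approximate start time of the current block in milliseconds.
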