#lang scribble/manual
@(require racket/base
          (for-label racket/base
                     racket/contract
                     racket/path
                     "../ffmpeg-decoder.rkt"))
@title{FFmpeg Decoder}
@author{@author+email["Hans Dijkema" "hans@dijkewijk.nl"]}
@defmodule[racket-audio/ffmpeg-decoder]
This module provides an audio decoder based on the FFmpeg audio shim. It
uses the lower-level @racketmodname[racket-sound/ffmpeg-ffi] module and presents a
callback-based decoder interface comparable to the other audio decoders.

The native FFmpeg layer decodes audio to signed 32-bit interleaved PCM.
The decoder therefore reports 32 bits per sample and 4 bytes per sample
when no more specific information is available.
@defproc[(ffmpeg-valid? [audio-file any/c]) boolean?]{
Always returns @racket[#t], whatever the value of @racket[audio-file].

This predicate is deliberately weak. Existence and extension checks are
expected to be performed by the generic audio-decoder layer; actual file
validation happens when the FFmpeg shim opens the file.
}
@defproc[(ffmpeg-open [audio-file (or/c path? string?)]
                      [cb-stream-info procedure?]
                      [cb-audio procedure?])
         (or/c any/c #f)]{
Opens @racket[audio-file] and returns an opaque decoder handle, or
@racket[#f] if the file does not exist.

If @racket[audio-file] is a path, it is converted to a string before it
is passed to the native layer.

The @racket[cb-stream-info] callback is called with a mutable hash that
describes the stream. The @racket[cb-audio] callback is called with the
same kind of hash, a PCM buffer pointer and the buffer size in bytes.
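
As a minimal sketch of how the two callbacks might look, assuming the hash
keys listed under Stream Information: the callback names
@racket[on-stream-info] and @racket[on-audio] and the file name
@filepath{song.flac} are placeholders, not part of this module.

@codeblock|{
#lang racket/base
(require racket-audio/ffmpeg-decoder)

;; Called with the mutable hash that describes the stream.
(define (on-stream-info info)
  (printf "~a Hz, ~a channels\n"
          (hash-ref info 'sample-rate #f)
          (hash-ref info 'channels #f)))

;; Called with the info hash, a PCM buffer pointer and the size in bytes.
(define (on-audio info buffer size)
  (printf "received ~a bytes\n" size))

;; "song.flac" is a placeholder path.
(define handle (ffmpeg-open "song.flac" on-stream-info on-audio))

(unless handle
  (eprintf "file does not exist\n"))
}|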
}
@defproc[(ffmpeg-read [handle any/c]) any/c]{
Starts reading and decoding audio from @racket[handle].

This function loops until decoding reaches the end of the stream or
until @racket[ffmpeg-stop] requests termination. During the read loop,
pending seek requests made with @racket[ffmpeg-seek] are applied before
the next native read.

The stream-info callback is called when format information becomes
available. The audio callback is called as:

@racketblock[
(cb-audio info buffer size)
]

where @racket[info] is a mutable hash, @racket[buffer] is a pointer to
interleaved signed 32-bit PCM data, and @racket[size] is the size of the
buffer in bytes.

When reading stops, the native FFmpeg instance is closed and deleted.
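
The following sketch runs the read loop in its own Racket thread and copies
each PCM block out of native memory inside the audio callback. It assumes
that @racket[buffer] is an ordinary FFI pointer and that it may not remain
valid after the callback returns; neither point is stated by this module.
The file name is a placeholder.

@codeblock|{
#lang racket/base
(require racket-audio/ffmpeg-decoder
         ffi/unsafe)

(define total-bytes 0)

(define (on-stream-info info)
  (void))

;; Copy the block into a Racket byte string before using it further.
;; Treating the pointer as short-lived is an assumption.
(define (on-audio info buffer size)
  (define pcm (make-bytes size))
  (memcpy pcm buffer size)
  (set! total-bytes (+ total-bytes (bytes-length pcm))))

;; "song.flac" is a placeholder path.
(define handle (ffmpeg-open "song.flac" on-stream-info on-audio))

(when handle
  ;; ffmpeg-read loops until the end of the stream or ffmpeg-stop,
  ;; so it is given a thread of its own here.
  (define reader (thread (lambda () (ffmpeg-read handle))))
  (thread-wait reader)
  (printf "decoded ~a bytes\n" total-bytes))
}|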
}
@defproc[(ffmpeg-seek [handle any/c]
                      [percentage real?])
         void?]{
Requests a seek operation.

The @racket[percentage] argument is interpreted as a percentage of the
total number of samples in the stream. Fractional percentages are
allowed. The actual seek is performed by @racket[ffmpeg-read] before the
next native read call.

If the total sample count is unknown or invalid, no seek request is made.
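
The mapping from a percentage to a target sample can be sketched as below.
The helper @racket[percentage->sample] is hypothetical, and rounding down
is an assumption; the module may round differently.

@codeblock|{
#lang racket/base
(require racket/math)

;; Hypothetical helper: map a percentage to a target sample index.
;; Returns #f when the total is unknown or invalid, which corresponds
;; to the case where no seek request is made.
(define (percentage->sample percentage total-samples)
  (and (exact-integer? total-samples)
       (> total-samples 0)
       (exact-floor (* (/ percentage 100) total-samples))))

(percentage->sample 50 1000000)   ; 500000
(percentage->sample 12.5 1000000) ; 125000
(percentage->sample 50 #f)        ; #f, total unknown
}|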
}
@defproc[(ffmpeg-stop [handle any/c]) void?]{
Requests the read loop to stop.

This function waits until @racket[ffmpeg-read] has left its read loop.
It polls the internal reading flag with a short sleep interval.
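
The polling behaviour can be illustrated with plain Racket threads. This is
a stand-alone sketch and not the module's code: the two flags, the worker
thread and the 10 ms interval are all assumptions.

@codeblock|{
#lang racket/base

;; Illustrative state; the real handle's fields are not documented here.
(define stop-requested? #f)
(define reading? #t)

;; Stand-in for the read loop: runs until a stop is requested,
;; then clears the reading flag.
(define worker
  (thread
   (lambda ()
     (let loop ()
       (unless stop-requested?
         (sleep 0.001)
         (loop)))
     (set! reading? #f))))

;; Request a stop, then poll until the loop has finished.
(define (stop-and-wait)
  (set! stop-requested? #t)
  (let loop ()
    (when reading?
      (sleep 0.01)
      (loop))))

(stop-and-wait)
}|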
}
@section{Stream Information}
The stream-info and audio callbacks receive a mutable hash. The decoder
stores at least the following keys:
@itemlist[
#:style 'compact
@item{@racket['sample-rate]}
@item{@racket['channels]}
@item{@racket['bits-per-sample]}
@item{@racket['bytes-per-sample]}
@item{@racket['total-samples]}
@item{@racket['duration]}
]
For audio callbacks, the hash is also updated with:
@itemlist[
#:style 'compact
@item{@racket['sample], the current sample position}
@item{@racket['current-time], the current time in seconds}
]
If the native layer omits format values, the decoder fills in the most
recent known values. Initial defaults are 44100 Hz, 2 channels, 32 bits
per sample and 4 bytes per sample.
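
As a small sketch of how a callback might use these keys, the helpers below
derive the number of sample frames and the duration of one PCM block from
its size in bytes. The helper names are hypothetical, and the fallback
values passed to @racket[hash-ref] are the initial defaults named above.

@codeblock|{
#lang racket/base

;; Number of sample frames in a PCM block of `size` bytes.
;; A frame holds one sample for every channel.
(define (block-frames info size)
  (define channels (hash-ref info 'channels 2))
  (define bytes-per-sample (hash-ref info 'bytes-per-sample 4))
  (quotient size (* channels bytes-per-sample)))

;; Duration of the block in seconds, as an exact rational.
(define (block-seconds info size)
  (/ (block-frames info size)
     (hash-ref info 'sample-rate 44100)))

;; A hand-built hash standing in for the one passed to the callbacks.
(define info (make-hasheq))
(hash-set! info 'sample-rate 44100)
(hash-set! info 'channels 2)
(hash-set! info 'bytes-per-sample 4)

(block-frames info 35280)   ; 4410
(block-seconds info 35280)  ; 1/10
}|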
@section{Decoding Model}
The decoder wraps the native FFmpeg instance in a small Racket handle.
The handle stores the callbacks, the stop and seek state, the current
reading state and the current format hash.

Seeking is asynchronous with respect to @racket[ffmpeg-seek]: the
function only records the requested target sample. The read loop applies
the pending seek request before decoding the next block.
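
The model can be sketched with a simulated stream in place of the native
layer. The struct layout, field names and block size below are
illustrative assumptions; they are not taken from the module's source.

@codeblock|{
#lang racket/base

;; Hypothetical handle layout.
(struct decoder (position pending-seek stop?) #:mutable)

;; Only records the target; the read loop applies it later.
(define (request-seek! d sample)
  (set-decoder-pending-seek! d sample))

;; Simulated read loop over a stream of `total` samples that
;; "decodes" `block` samples per iteration.
(define (read-loop! d total block)
  (let loop ()
    (unless (or (decoder-stop? d)
                (>= (decoder-position d) total))
      ;; Apply a pending seek before the next read.
      (define target (decoder-pending-seek d))
      (when target
        (set-decoder-position! d target)
        (set-decoder-pending-seek! d #f))
      (set-decoder-position! d (min total (+ (decoder-position d) block)))
      (loop))))

(define d (decoder 0 #f #f))
(request-seek! d 900)
(read-loop! d 1000 64)
(decoder-position d) ; 1000, reached after two blocks from sample 900
}|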
@section{Notes}
The FFmpeg shim output is expected to be signed 32-bit interleaved PCM.
This keeps the decoder interface suitable for a playback pipeline that
feeds decoded audio to libao.
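
For code that needs the sample values themselves, the sketch below decodes
interleaved signed 32-bit PCM from a byte string into frames. The helper
@racket[pcm->frames] is hypothetical, and native byte order is an
assumption about the shim's output.

@codeblock|{
#lang racket/base

;; Decode interleaved signed 32-bit PCM into a list of frames, each
;; frame being a list with one integer sample per channel.
;; Trailing bytes that do not fill a whole frame are ignored.
(define (pcm->frames pcm channels)
  (define frame-bytes (* 4 channels))
  (define usable (* frame-bytes (quotient (bytes-length pcm) frame-bytes)))
  (for/list ([start (in-range 0 usable frame-bytes)])
    (for/list ([ch (in-range channels)])
      (define offset (+ start (* 4 ch)))
      (integer-bytes->integer pcm #t (system-big-endian?)
                              offset (+ offset 4)))))

;; Two stereo frames built by hand for illustration.
(define pcm
  (bytes-append (integer->integer-bytes 1 4 #t)
                (integer->integer-bytes -1 4 #t)
                (integer->integer-bytes 2147483647 4 #t)
                (integer->integer-bytes -2147483648 4 #t)))

(pcm->frames pcm 2) ; '((1 -1) (2147483647 -2147483648))
}|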