initial import from racket-sound -> racket-audio

2026-05-04 12:07:45 +02:00
parent f500f1711b
commit 87980f508a
28 changed files with 6282 additions and 16 deletions
+259
@@ -0,0 +1,259 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/path
"../audio-decoder.rkt"))
@title{audio-decoder}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]
@defmodule[(file "../audio-decoder.rkt")]
This module provides a small abstraction layer over concrete audio
decoders. A backend is selected from the filename extension and is then
used through a uniform interface for opening, reading, seeking, and
stopping.
The module includes built-in readers for FLAC and MP3, and it allows
additional backends to be registered with
@racket[audio-register-reader!].
@section{Reader registration}
A reader descriptor stores the extensions handled by a backend together
with the procedures used to validate, open, read, seek, and stop that
backend, plus an audio-output type.
@defproc[(make-audio-reader [exts (listof string?)]
[valid? procedure?]
[open procedure?]
[reader procedure?]
[seeker procedure?]
[stopper procedure?]
[ao-type symbol?])
struct?]{
Creates a reader descriptor.
The @racket[exts] list contains the filename extensions handled by the
reader, without a leading dot. Matching is case-insensitive.
The procedures are used as follows:
@itemlist[#:style 'compact
@item{@racket[valid?] checks whether a file is valid for this reader;}
@item{@racket[open] opens a decoder for a file;}
@item{@racket[reader] reads or continues decoding;}
@item{@racket[seeker] seeks within the audio stream;}
@item{@racket[stopper] stops an active decode loop.}]
The @racket[ao-type] value describes the buffer format exposed to the
audio output layer. The source comments mention values such as
@racket['flac] and @racket['ao]. The value @racket['ao] means that the
buffer can be used directly by the audio-output backend.
}
@defproc[(audio-register-reader! [type symbol?]
[reader struct?])
void?]{
Registers @racket[reader] under @racket[type].
The extensions declared in @racket[reader] are appended to the list
returned by @racket[audio-known-exts?], and the reader becomes
available to @racket[audio-open].
This procedure is the extension point for custom audio decoders.
}
@section{Audio handles}
@defproc[(audio-handle? [v any/c]) boolean?]{
Returns @racket[#t] if @racket[v] is an audio handle, and @racket[#f]
otherwise.
}
@defproc[(audio-kind [handle audio-handle?]) symbol?]{
Returns the reader type stored in @racket[handle].
For the built-in readers this is either @racket['flac] or
@racket['mp3].
}
@section{Known extensions and validation}
@defproc[(audio-known-exts?) (listof string?)]{
Returns the list of known filename extensions.
The initial list contains @racket["flac"] and @racket["mp3"].
Additional extensions are added when readers are registered with
@racket[audio-register-reader!].
}
@defproc[(audio-valid-ext? [ext any/c]) boolean?]{
Returns @racket[#t] if @racket[ext] denotes a known filename
extension, and @racket[#f] otherwise.
The argument is converted to a string. If it starts with a dot, that
dot is removed. Matching is case-insensitive.
}
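
The normalization described for @racket[audio-valid-ext?] can be
sketched as follows. This is only an illustration under stated
assumptions, not the module's implementation: @racket[normalize-ext],
@racket[known-ext?] and the fixed extension list are hypothetical names
introduced for this example.

@codeblock|{
#lang racket/base
(require racket/string)

;; Hypothetical helper: turn any value into a lowercase extension
;; string without a leading dot.
(define (normalize-ext ext)
  (define s (format "~a" ext))
  (string-downcase
   (if (string-prefix? s ".") (substring s 1) s)))

;; Stand-in for the list returned by audio-known-exts?.
(define known-exts '("flac" "mp3"))

;; Case-insensitive membership test on the normalized extension.
(define (known-ext? ext)
  (and (member (normalize-ext ext) known-exts) #t))
}|

Normalizing once, before the lookup, keeps the list of known extensions
in a single canonical form.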
@defproc[(audio-file-valid? [file (or/c string? path?)]) boolean?]{
Returns @racket[#t] if @racket[file] has a known extension and the
matching registered reader reports the file as valid.
This procedure first derives the filename extension and checks it with
@racket[audio-valid-ext?]. If the extension is known, it then looks up
the matching reader and calls that reader's validity procedure.
}
@section{Opening and callbacks}
@defproc[(audio-open [audio-file (or/c string? path?)]
[cb-stream-info procedure?]
[cb-audio procedure?])
audio-handle?]{
Opens an audio decoder for @racket[audio-file].
If @racket[audio-file] is a path, it is converted to a string before it
is passed to the backend open procedure.
This procedure raises an exception if the file is not considered a
valid audio file, if the file does not exist, or if no registered
reader can be found for the file.
The returned handle stores the selected reader type, the two callback
procedures, the reader descriptor, and the driver-specific handle
returned by the backend open procedure.
The callback procedures are wrapped before they are passed to the
backend.
The stream-info callback is called as:
@racketblock[
(cb-stream-info audio-type ao-type handle meta)
]
where:
@itemlist[#:style 'compact
@item{@racket[audio-type] is the registered reader type, such as
@racket['flac] or @racket['mp3];}
@item{@racket[ao-type] is the audio-output type stored in the reader,
such as @racket['flac] or @racket['ao];}
@item{@racket[handle] is the generic @racket[audio-handle];}
@item{@racket[meta] is a hash table with stream metadata.}]
According to the source comments, @racket[meta] must contain at least:
@itemlist[#:style 'compact
@item{@racket['duration] --- duration of the audio in seconds, possibly
fractional;}
@item{@racket['bits-per-sample] --- number of audio bits per sample;}
@item{@racket['channels] --- number of audio channels;}
@item{@racket['sample-rate] --- number of samples per second per
channel;}
@item{@racket['total-samples] --- total number of samples in the
audio.}]
The audio callback is called as:
@racketblock[
(cb-audio audio-type ao-type handle buf-info buffer buf-size)
]
where:
@itemlist[#:style 'compact
@item{@racket[audio-type] is the registered reader type;}
@item{@racket[ao-type] is the audio-output type stored in the reader;}
@item{@racket[handle] is the generic @racket[audio-handle];}
@item{@racket[buf-info] is a hash table describing the audio buffer;}
@item{@racket[buffer] is a native buffer containing audio data;}
@item{@racket[buf-size] is the size of that buffer in bytes.}]
According to the source comments, the buffer is owned and released by
the decoder driver, so a callback that needs the data after it returns
has to copy it. The comments also note that the @tt{ao-async} backend
copies the data.
According to the source comments, @racket[buf-info] must contain at
least:
@itemlist[#:style 'compact
@item{@racket['duration] --- duration of the audio in seconds, possibly
fractional;}
@item{@racket['bits-per-sample] --- number of audio bits per sample;}
@item{@racket['channels] --- number of audio channels;}
@item{@racket['sample-rate] --- number of samples per second per
channel;}
@item{@racket['total-samples] --- total number of samples in the
audio;}
@item{@racket['sample] --- the current sample to which the audio
buffer applies.}]
}
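
The two callback shapes documented for @racket[audio-open] can be tried
out with a small sketch. The names @racket[on-stream-info] and
@racket[on-audio] are hypothetical, and the return values exist only to
make the procedures easy to exercise; whether the decoder layer uses a
callback's result is not stated here.

@codeblock|{
#lang racket/base

;; Stream-info callback with the documented four arguments.
;; Returns a one-line summary built from the required keys in meta.
(define (on-stream-info audio-type ao-type handle meta)
  (format "~a: ~a Hz, ~a channels, ~a bits, ~a s"
          audio-type
          (hash-ref meta 'sample-rate)
          (hash-ref meta 'channels)
          (hash-ref meta 'bits-per-sample)
          (hash-ref meta 'duration)))

;; Audio callback with the documented six arguments.
;; Returns playback progress as a fraction between 0 and 1, or #f
;; when the total sample count is not usable. The buffer itself is
;; not retained, because the decoder driver owns it.
(define (on-audio audio-type ao-type handle buf-info buffer buf-size)
  (define total (hash-ref buf-info 'total-samples))
  (and (real? total)
       (> total 0)
       (/ (hash-ref buf-info 'sample) total)))
}|

With the module loaded, these procedures would be passed to
@racket[audio-open] as its second and third arguments.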
@section{Reading, seeking, and stopping}
@defproc[(audio-read [handle audio-handle?]) void?]{
Calls the registered reader procedure for @racket[handle].
The concrete reader procedure receives the driver-specific handle
stored in the generic audio handle. Any result value produced by the
backend is discarded.
}
@defproc[(audio-seek [handle audio-handle?]
[percentage number?])
void?]{
Calls the registered seek procedure for @racket[handle].
The @racket[percentage] argument is passed unchanged to the backend
seek procedure.
In this abstraction layer, @racket[percentage] denotes a relative
position in the full audio stream, expressed as a percentage of its
total length. A backend registered through
@racket[audio-register-reader!] is expected to follow that
interpretation.
}
@defproc[(audio-stop [handle audio-handle?]) void?]{
Calls the registered stop procedure for @racket[handle].
The concrete stop procedure receives the driver-specific handle stored
in the generic audio handle.
}
@section{Using custom decoders}
Custom audio decoders can be integrated by constructing a reader
descriptor with @racket[make-audio-reader] and registering it with
@racket[audio-register-reader!].
A backend integrated through this interface should provide:
@itemlist[#:style 'compact
@item{a list of handled filename extensions;}
@item{a file-validity procedure;}
@item{an open procedure that accepts a file path, a stream-info
callback, and an audio callback;}
@item{a read procedure that accepts the driver-specific handle;}
@item{a seek procedure that accepts the driver-specific handle and a
numeric relative position;}
@item{a stop procedure that accepts the driver-specific handle;}
@item{an audio-output type symbol describing the kind of buffers the
backend produces.}]
Once registered, files with matching extensions can be opened through
@racket[audio-open] in the same way as the built-in FLAC and MP3
backends.
+173
@@ -0,0 +1,173 @@
#lang scribble/manual
@(require (for-label racket/base
racket/contract
"../audio-sniffer.rkt"))
@title{audio-sniffer}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]
@defmodule[(file "../audio-sniffer.rkt")]
This module provides functionality to detect audio file formats based on
file contents (signature sniffing) and, optionally, file extensions.
The sniffer prefers binary inspection over extensions and only falls back
to extensions when detection is inconclusive.
@section{Overview}
The detection strategy is as follows:
@itemlist[
#:style 'compact
@item{Read a prefix of the file (default 4096 bytes)}
@item{Match known binary signatures ("magic numbers")}
@item{Apply format-specific heuristics (e.g. MP3 frame sync, AAC ADTS)}
@item{For ISO-BMFF (MP4/M4A), scan both head and tail for codec markers}
@item{If still unknown, optionally fall back to file extension}
]
The result is always a symbol describing the detected format or a status.
@section{Formats}
Known audio formats:
@racketblock[
'(mp3 flac ogg vorbis opus wav aiff
mp4 aac alac encrypted-audio
ac3 ape wavpack wma matroska)
]
Status values:
@racketblock[
'(unknown file-not-found file-not-readable not-a-file)
]
@section{API}
@defproc[(audio-format? [v any/c]) boolean?]{
Returns @racket[#t] if @racket[v] is a known audio format or status symbol.
}
@defproc[(audio-sniff-format [file path-string?]) audio-format?]{
Detects the audio format of @racket[file] using binary inspection only.
Returns one of:
@itemlist[
#:style 'compact
@item{A format symbol such as @racket['mp3], @racket['flac], etc.}
@item{A status symbol such as @racket['file-not-found]}
]
This function does not use the file extension.
}
@defproc[(audio-sniff-format/extension [file path-string?]) audio-format?]{
Like @racket[audio-sniff-format], but falls back to the file extension
if content-based detection returns @racket['unknown].
This is typically the preferred entry point in user-facing code.
}
@defproc[(audio-sniff-extension [file path-string?]) (or/c string? #f)]{
Returns the lowercase file extension (without dot), or @racket[#f]
if no extension is present.
}
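
The extension handling and the fallback rule can be sketched as follows.
This is an illustration under stated assumptions rather than the
module's code: @racket[file-extension],
@racket[with-extension-fallback] and the small @racket[ext->format]
table are hypothetical, and the real module may map more extensions or
map them differently.

@codeblock|{
#lang racket/base
(require racket/path)

;; Lowercase extension without the dot, or #f if there is none.
(define (file-extension file)
  (define ext (path-get-extension file)) ; bytes such as #".MP3", or #f
  (and ext
       (string-downcase
        (bytes->string/utf-8 (subbytes ext 1) #\?))))

;; Hypothetical extension table used only by this sketch.
(define ext->format
  (hash "mp3" 'mp3 "flac" 'flac "wav" 'wav "ogg" 'ogg))

;; Keep the content-based result unless it is 'unknown.
(define (with-extension-fallback sniffed file)
  (if (eq? sniffed 'unknown)
      (hash-ref ext->format (or (file-extension file) "") 'unknown)
      sniffed))
}|

Status symbols such as @racket['file-not-found] pass through unchanged,
because only @racket['unknown] triggers the fallback.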
@defproc[(audio-format-known? [fmt symbol?]) boolean?]{
Returns @racket[#t] if @racket[fmt] is a known audio format
(excludes status symbols).
}
@defproc[(audio-format-matches? [file path-string?]
[formats (listof symbol?)])
boolean?]{
Returns @racket[#t] if the detected format of @racket[file] matches
one of @racket[formats].
Detection uses @racket[audio-sniff-format/extension].
}
@section{Architecture}
The sniffer is structured as a layered pipeline:
@itemlist[
#:style 'compact
@item{@bold{I/O layer} -- reads byte ranges from the file (head and tail)}
@item{@bold{Signature layer} -- matches fixed binary identifiers}
@item{@bold{Heuristic layer} -- validates formats without fixed headers}
@item{@bold{Container layer} -- inspects structured containers (MP4, Ogg)}
@item{@bold{Fallback layer} -- maps file extensions to formats}
]
Detection proceeds from cheap and deterministic checks to more
expensive or heuristic ones.
MP4/M4A detection is handled separately because codec identifiers may
appear outside the initial header. For this reason both the beginning
and the end of the file are scanned.
The sniffer is deliberately stateless; each call operates only on the
given file and does not cache results.
@section{Detection Details}
Binary signatures are used where possible:
@itemlist[
#:style 'compact
@item{@bold{FLAC}: @"fLaC"}
@item{@bold{Ogg}: @"OggS" + subtype detection (Opus/Vorbis/FLAC)}
@item{@bold{WAV}: RIFF/WAVE}
@item{@bold{AIFF}: FORM/AIFF or AIFC}
@item{@bold{ASF/WMA}: GUID header}
@item{@bold{Matroska}: EBML header}
@item{@bold{AC3}: 0x0B77 sync word}
@item{@bold{APE}: @"MAC "}
@item{@bold{WavPack}: @"wvpk"}
]
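
A few of the fixed signatures listed above can be matched with a short
sketch. It assumes every signature sits at the offsets shown, starting
at the beginning of the prefix, and it leaves out Ogg subtype detection,
ASF and Matroska. The names @racket[sig-at?] and @racket[sniff-prefix]
are hypothetical and are not taken from the module.

@codeblock|{
#lang racket/base

;; Does bs contain the byte string sig at offset?
(define (sig-at? bs offset sig)
  (define end (+ offset (bytes-length sig)))
  (and (<= end (bytes-length bs))
       (bytes=? (subbytes bs offset end) sig)))

;; Map a file prefix to a format symbol, or 'unknown.
(define (sniff-prefix bs)
  (cond
    [(sig-at? bs 0 #"fLaC") 'flac]
    [(sig-at? bs 0 #"OggS") 'ogg]
    [(and (sig-at? bs 0 #"RIFF") (sig-at? bs 8 #"WAVE")) 'wav]
    [(and (sig-at? bs 0 #"FORM")
          (or (sig-at? bs 8 #"AIFF") (sig-at? bs 8 #"AIFC")))
     'aiff]
    [(sig-at? bs 0 #"wvpk") 'wavpack]
    [(sig-at? bs 0 #"MAC ") 'ape]
    [(sig-at? bs 0 (bytes #x0B #x77)) 'ac3]
    [else 'unknown]))
}|

The length check in @racket[sig-at?] makes the matcher safe for files
that are shorter than a signature.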
Heuristics are applied for:
@itemlist[
#:style 'compact
@item{MP3 (ID3 header or frame sync validation)}
@item{AAC (ADTS sync pattern)}
]
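
The sync patterns behind these heuristics can be sketched as simple bit
tests on the first two bytes. This is a simplified illustration: a
robust validator would also check fields such as the bitrate and
sample-rate indices, or look for a second frame. The procedure names are
hypothetical.

@codeblock|{
#lang racket/base

;; An ID3v2 tag starts with the ASCII bytes "ID3".
(define (id3v2-tag? bs)
  (and (>= (bytes-length bs) 3)
       (bytes=? (subbytes bs 0 3) #"ID3")))

;; ADTS: 12 sync bits set and the two layer bits zero.
(define (adts-sync? bs)
  (and (>= (bytes-length bs) 2)
       (= (bytes-ref bs 0) #xFF)
       (= (bitwise-and (bytes-ref bs 1) #xF6) #xF0)))

;; MPEG audio frame: 11 sync bits set and a non-reserved layer.
(define (mpeg-frame-sync? bs)
  (and (>= (bytes-length bs) 2)
       (= (bytes-ref bs 0) #xFF)
       (= (bitwise-and (bytes-ref bs 1) #xE0) #xE0)
       (not (zero? (bitwise-and (bytes-ref bs 1) #x06)))))
}|

Testing the layer bits is what separates the two cases here: ADTS
headers always carry layer bits of zero, which is a reserved value for
MPEG audio frames.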
MP4/M4A detection:
@itemlist[
#:style 'compact
@item{Detect ISO-BMFF via @"ftyp"}
@item{Scan for codec markers: @"mp4a", @"alac", @"enca"}
@item{Perform additional scanning near the end of the file}
]
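
The MP4/M4A steps above can be sketched as follows. The sketch takes the
head and tail byte ranges as arguments instead of reading a file. The
precedence of the markers and the mapping from @tt{mp4a} to
@racket['aac] are assumptions of this example, and all names are
hypothetical.

@codeblock|{
#lang racket/base

;; ISO-BMFF files carry the box type "ftyp" at byte offset 4.
(define (iso-bmff? head)
  (and (>= (bytes-length head) 8)
       (bytes=? (subbytes head 4 8) #"ftyp")))

;; Look for a four-character marker in the head or the tail.
(define (marker? rx head tail)
  (or (regexp-match? rx head)
      (regexp-match? rx tail)))

;; Returns a format symbol, or #f when head is not ISO-BMFF.
;; The precedence and the mp4a -> 'aac mapping are assumptions.
(define (sniff-mp4 head tail)
  (and (iso-bmff? head)
       (cond
         [(marker? #rx#"enca" head tail) 'encrypted-audio]
         [(marker? #rx#"alac" head tail) 'alac]
         [(marker? #rx#"mp4a" head tail) 'aac]
         [else 'mp4])))
}|

Scanning both ranges matters because the box that names the codec may be
stored near the end of the file.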
@section{Why not use FFmpeg?}
The primary reason for implementing a custom sniffer is performance.
Format detection in this module is intentionally lightweight: it reads
only small portions of the file and applies simple, deterministic checks.
In most cases, detection completes after inspecting just a few kilobytes.
Using a library such as FFmpeg would significantly increase the cost of
this operation:
@itemlist[
#:style 'compact
@item{@bold{Startup overhead} -- initialization of codec infrastructure}
@item{@bold{I/O overhead} -- more data is typically read than necessary}
@item{@bold{Processing overhead} -- partial parsing of streams or containers}
]
+218
@@ -0,0 +1,218 @@
#lang scribble/manual
@title{FFmpeg Audio Backend}
@author{@author+email["Hans Dijkema" "hans@dijkewijk.nl"]}
@section{Overview}
The FFmpeg audio backend is a small C++ wrapper with a plain C ABI. It hides
the FFmpeg data structures from the caller and exposes a simple
audio-only decoder interface.
The caller does not handle FFmpeg streams, packets, frames, codec
contexts or resampler objects. A file is opened, the best audio stream is
selected, and decoding is performed by repeatedly calling
@tt{fmpg_decode_next}.
The output format is fixed: signed 32-bit integer PCM, interleaved, in
native endian format.
A sample frame is the set of samples for all channels at one point in
time. For stereo S32, one sample frame contains two @tt{int32_t} values
and therefore takes 8 bytes.
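
The sample-frame arithmetic can be written out as a small sketch. The
names are hypothetical helpers for a caller of the backend, not part of
the C API.

@codeblock|{
#lang racket/base

;; The backend always emits 4-byte (S32) samples.
(define bytes-per-sample 4)

;; Size of one sample frame: one sample for every channel.
(define (bytes-per-frame channels)
  (* channels bytes-per-sample))

;; Number of whole sample frames in a buffer of size bytes.
(define (buffer-frames size channels)
  (quotient size (bytes-per-frame channels)))
}|

For stereo output, a 4096-byte buffer therefore holds 512 sample frames.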
@section{Opaque Instance}
@verbatim|{
typedef struct fmpg_instance fmpg_instance;
}|
The decoder instance is opaque. The caller only receives and passes
around a pointer to this type. All FFmpeg state is stored internally.
@section{Lifecycle}
@verbatim|{
fmpg_instance *fmpg_init(void);
}|
Creates a new decoder instance.
Before allocating the instance, the backend checks whether the FFmpeg major
versions used at compile time match the FFmpeg major versions available
at runtime. If they do not match, @tt{NULL} is returned.
Returns a pointer to a new @tt{fmpg_instance}, or @tt{NULL} on failure.
@verbatim|{
void fmpg_free(fmpg_instance *instance);
}|
Frees the decoder instance. If the instance still has an open input, it
is closed as part of destruction.
@verbatim|{
int fmpg_open_file(fmpg_instance *instance, const char *filename);
}|
Opens a media file, selects the best audio stream, initializes the
decoder and initializes the resampler.
After a successful call, stream information, duration and metadata can be
read using the getter functions.
Returns @tt{1} on success and @tt{0} on failure. The call fails if the
instance is @tt{NULL}, if a file is already open, if @tt{filename} is
@tt{NULL}, if no usable audio stream is found, or if FFmpeg cannot open
or initialize the file.
@verbatim|{
void fmpg_close(fmpg_instance *instance);
}|
Closes the current file and releases all FFmpeg state owned by the
instance. The instance itself remains valid and may be reused.
@verbatim|{
int fmpg_is_open(fmpg_instance *instance);
}|
Returns @tt{1} if the instance is open and ready to decode. Otherwise
returns @tt{0}.
@section{Audio Information}
@verbatim|{
int fmpg_audio_stream_count(fmpg_instance *instance);
int fmpg_audio_sample_rate(fmpg_instance *instance);
int fmpg_audio_channels(fmpg_instance *instance);
int fmpg_audio_bits_per_sample(fmpg_instance *instance);
int fmpg_audio_bytes_per_sample(fmpg_instance *instance);
int64_t fmpg_duration_ms(fmpg_instance *instance);
int64_t fmpg_duration_samples(fmpg_instance *instance);
}|
These functions return information about the opened file and its
selected audio stream.
@itemlist[
#:style 'compact
@item{@tt{fmpg_audio_stream_count} returns the number of audio streams found in the opened file, or @tt{0}.}
@item{@tt{fmpg_audio_sample_rate} returns the selected stream's sample rate, or @tt{0}.}
@item{@tt{fmpg_audio_channels} returns the selected stream's channel count, or @tt{0}.}
@item{@tt{fmpg_audio_bits_per_sample} always returns @tt{32}.}
@item{@tt{fmpg_audio_bytes_per_sample} always returns @tt{4}.}
@item{@tt{fmpg_duration_ms} returns the duration in milliseconds, or @tt{-1}.}
@item{@tt{fmpg_duration_samples} returns the duration in output sample frames, or @tt{-1}.}
]
@section{Metadata}
@verbatim|{
const char *fmpg_file_title(fmpg_instance *instance);
const char *fmpg_file_author(fmpg_instance *instance);
const char *fmpg_file_album(fmpg_instance *instance);
const char *fmpg_file_genre(fmpg_instance *instance);
const char *fmpg_file_comment(fmpg_instance *instance);
const char *fmpg_file_copyright(fmpg_instance *instance);
int fmpg_file_year(fmpg_instance *instance);
int fmpg_file_track(fmpg_instance *instance);
int64_t fmpg_file_bitrate(fmpg_instance *instance);
}|
The metadata getters return values read from the container metadata. A
missing string value is returned as an empty string. A missing numeric
value is returned as @tt{-1}. @tt{fmpg_file_author} returns the
@tt{artist} metadata field.
@section{Decoding}
@verbatim|{
int fmpg_decode_next(fmpg_instance *instance);
}|
Decodes the next block of audio.
Internally, the backend reads packets from the selected audio stream, feeds
them to the FFmpeg decoder, receives all available decoded frames,
converts them to signed 32-bit interleaved PCM, and concatenates the
result in the instance output buffer.
Packets from non-selected streams are skipped internally.
Returns @tt{1} if decoded PCM data is available through
@tt{fmpg_buffer} and @tt{fmpg_buffer_size}. Returns @tt{0} at EOF or on
error.
@verbatim|{
int fmpg_seek_ms(fmpg_instance *instance, int64_t target_pos_ms);
}|
Seeks to an absolute position in milliseconds.
FFmpeg may seek to a packet before the requested timestamp. When
timestamps are available, the backend compensates after seeking by
discarding decoded pre-roll samples until the requested output sample
position is reached.
Returns @tt{1} on success and @tt{0} on failure.
@section{Output Buffer and Sample Positions}
@verbatim|{
const uint8_t *fmpg_buffer(fmpg_instance *instance);
int fmpg_buffer_size(fmpg_instance *instance);
int64_t fmpg_buffer_samples(fmpg_instance *instance);
int64_t fmpg_buffer_start_sample(fmpg_instance *instance);
int64_t fmpg_buffer_end_sample(fmpg_instance *instance);
int64_t fmpg_sample_position(fmpg_instance *instance);
double fmpg_timecode(fmpg_instance *instance);
}|
@tt{fmpg_buffer} returns a pointer to the current decoded PCM buffer, or
@tt{NULL} if there is no current buffer. The pointer remains valid only
until the next API call that decodes, seeks, closes or frees the
instance.
@tt{fmpg_buffer_size} returns the size of the current buffer in bytes.
@tt{fmpg_buffer_samples} returns the number of sample frames in the
current buffer. @tt{fmpg_buffer_start_sample} returns the absolute
sample-frame index of the first sample frame in the buffer, and
@tt{fmpg_buffer_end_sample} returns the absolute sample-frame index just
after the current buffer.
@tt{fmpg_sample_position} returns the current absolute sample position in
the audio stream. After a successful @tt{fmpg_decode_next}, this is the
same value as @tt{fmpg_buffer_end_sample}.
@tt{fmpg_timecode} returns the approximate start time of the current
decoded block in seconds.
@section{FFmpeg Version Checks}
@verbatim|{
const char *fmpg_ffmpeg_version(void);
const char *fmpg_int_version2string(unsigned version);
int fmpg_compatible_ffmpeg(void);
}|
@tt{fmpg_ffmpeg_version} returns a string describing the FFmpeg versions
used when the backend was compiled. The string includes avformat, avcodec,
swresample and avutil.
@tt{fmpg_int_version2string} converts an FFmpeg integer version value to
a string of the form @tt{major.minor.micro}.
@tt{fmpg_compatible_ffmpeg} checks whether the FFmpeg major versions used
at compile time match the FFmpeg major versions available at runtime.
It returns @tt{1} when the versions are compatible and @tt{0} otherwise.
@section{Decoder Model}
The backend uses the modern FFmpeg send/receive decoding model. Packets are
sent with @tt{avcodec_send_packet}, decoded frames are received with
@tt{avcodec_receive_frame}, and conversion to the fixed output format is
done with libswresample.
The public API intentionally avoids exposing these details. From the
caller's perspective, decoding is a sequence of calls to
@tt{fmpg_decode_next}, each followed by reading the current output
buffer and its sample-position metadata.
+128
@@ -0,0 +1,128 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/contract
racket/path
"../ffmpeg-decoder.rkt"))
@title{FFmpeg Decoder}
@author{@author+email["Hans Dijkema" "hans@dijkewijk.nl"]}
@defmodule[(file "../ffmpeg-decoder.rkt")]
This module provides an audio decoder based on the FFmpeg audio shim. It
uses the lower-level @racketmodname[racket-sound/ffmpeg-ffi] module and presents a
callback-based decoder interface comparable to the other audio decoders.
The native FFmpeg layer decodes audio to signed 32-bit interleaved PCM.
The decoder therefore reports 32 bits per sample and 4 bytes per sample
when no more specific information is available.
@defproc[(ffmpeg-valid? [audio-file any/c]) boolean?]{
Returns @racket[#t].
This predicate is deliberately weak. Existence and extension checks are
expected to be performed by the generic audio-decoder layer. Actual file
validation is done when the FFmpeg shim opens the file.
}
@defproc[(ffmpeg-open [audio-file (or/c path? string?)]
[cb-stream-info procedure?]
[cb-audio procedure?])
(or/c any/c #f)]{
Opens @racket[audio-file] and returns an opaque decoder handle, or
@racket[#f] if the file does not exist.
If @racket[audio-file] is a path, it is converted to a string before it
is passed to the native layer.
The @racket[cb-stream-info] callback is called with a mutable hash that
describes the stream. The @racket[cb-audio] callback is called with the
same kind of hash, a PCM buffer pointer and the buffer size in bytes.
}
@defproc[(ffmpeg-read [handle any/c]) any/c]{
Starts reading and decoding audio from @racket[handle].
This function loops until decoding reaches the end of the stream or
until @racket[ffmpeg-stop] requests termination. During the read loop,
pending seek requests made with @racket[ffmpeg-seek] are applied before
the next native read.
The stream-info callback is called when format information becomes
available. The audio callback is called as:
@racketblock[
(cb-audio info buffer size)
]
where @racket[info] is a mutable hash, @racket[buffer] is a pointer to
interleaved signed 32-bit PCM data, and @racket[size] is the size of the
buffer in bytes.
When reading stops, the native FFmpeg instance is closed and deleted.
}
@defproc[(ffmpeg-seek [handle any/c]
[percentage real?])
void?]{
Requests a seek operation.
The @racket[percentage] argument is interpreted as a percentage of the
total number of samples in the stream. Fractional percentages are
allowed. The actual seek is performed by @racket[ffmpeg-read] before the
next native read call.
If the total sample count is unknown or invalid, no seek request is made.
}
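
The conversion from a percentage to a target sample can be sketched as
follows. The name @racket[percentage->sample] is hypothetical, and both
the clamping to the range 0 to 100 and the rounding down are assumptions
of this sketch rather than documented behavior.

@codeblock|{
#lang racket/base

;; Convert a percentage of the stream into an absolute sample
;; position. Returns #f when the total is unknown or invalid, in
;; which case no seek should be requested. Clamping to 0..100 is an
;; assumption of this sketch.
(define (percentage->sample percentage total-samples)
  (and (exact-integer? total-samples)
       (> total-samples 0)
       (let ([p (max 0 (min 100 percentage))])
         (inexact->exact
          (floor (* total-samples (/ p 100)))))))
}|

Returning @racket[#f] for an unusable total mirrors the rule that no
seek request is made in that case.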
@defproc[(ffmpeg-stop [handle any/c]) void?]{
Requests the read loop to stop.
This function waits until @racket[ffmpeg-read] has left its read loop.
It polls the internal reading flag with a short sleep interval.
}
@section{Stream Information}
The stream-info and audio callbacks receive a mutable hash. The decoder
stores at least the following keys:
@itemlist[
#:style 'compact
@item{@racket['sample-rate]}
@item{@racket['channels]}
@item{@racket['bits-per-sample]}
@item{@racket['bytes-per-sample]}
@item{@racket['total-samples]}
@item{@racket['duration]}
]
For audio callbacks, the hash is also updated with:
@itemlist[
#:style 'compact
@item{@racket['sample], the current sample position}
@item{@racket['current-time], the current time in seconds}
]
If the native layer omits format values, the decoder fills in the most
recent known values. Initial defaults are 44100 Hz, 2 channels, 32 bits
per sample and 4 bytes per sample.
@section{Decoding Model}
The decoder keeps a small Racket handle around the native FFmpeg handler.
The handle stores the callbacks, stop and seek state, the current reading
state and the current format hash.
Seeking is asynchronous with respect to @racket[ffmpeg-seek]: the
function only records the requested target sample. The read loop applies
the pending seek request before decoding the next block.
@section{Notes}
The FFmpeg shim output is expected to be signed 32-bit interleaved PCM.
This keeps the decoder interface suitable for a playback pipeline that
feeds decoded audio to libao.
+166
@@ -0,0 +1,166 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/contract
racket/path
"../ffmpeg-ffi.rkt"))
@title{FFmpeg FFI}
@author{@author+email["Hans Dijkema" "hans@dijkewijk.nl"]}
@defmodule[(file "../ffmpeg-ffi.rkt")]
This module provides the low-level Racket FFI binding for the native
FFmpeg audio shim. The native shim exposes an opaque FFmpeg instance and
keeps all decoder state inside that instance.
The output format of the native shim is signed 32-bit interleaved PCM.
The buffer returned by the native layer is copied into Racket-managed
memory before it is passed to higher layers.
@defproc[(fmpg-ffi-decoder-handler) procedure?]{
Creates a new FFmpeg decoder command handler.
The returned procedure manages one native FFmpeg instance. Commands are
sent as a symbol followed by command-specific arguments.
@itemlist[
#:style 'compact
@item{@racket['new] creates the native FFmpeg instance and returns @racket[#t].}
@item{@racket['delete] frees the native FFmpeg instance and returns @racket[#t].}
@item{@racket['init] opens a file and fetches stream and metadata information.}
@item{@racket['close] closes the currently opened file.}
@item{@racket['format] calls a format callback with the current stream format.}
@item{@racket['info] writes stream information to the sound logger.}
@item{@racket['read] decodes the next audio block.}
@item{@racket['seek] seeks to an absolute PCM sample position.}
@item{@racket['tell] returns the current PCM sample position.}
@item{@racket['file] returns the currently opened filename.}
@item{@racket['metadata] returns a hash with file metadata.}
]
}
@section{Command Interface}
The command handler is used as follows:
@racketblock[
(define h (fmpg-ffi-decoder-handler))
(h 'new)
(h 'init filename)
(h 'read audio-callback format-callback)
(h 'close)
(h 'delete)
]
The @racket['new] command must be called before @racket['init]. A
handler owns at most one native FFmpeg instance. Calling @racket['new]
twice without @racket['delete] raises an error.
@section{Format Callback}
The @racket['format] command and the first @racket['read] call report
the stream format by calling the supplied callback as follows:
@racketblock[
(format-callback pcm-pos
sample-rate
channels
bits-per-sample
bytes-per-sample
pcm-length)
]
The @racket[pcm-pos] argument is the current PCM sample position.
The @racket[pcm-length] argument is the total number of PCM samples, or
@racket[-1] when this is not known.
@section{Reading Audio}
The @racket['read] command decodes one audio block. It expects an audio
callback and a format callback:
@racketblock[
(h 'read audio-callback format-callback)
]
On the first read, the format callback is called before audio data is
returned. If decoding produces data, the audio callback is called as:
@racketblock[
(audio-callback 'data pcm-pos buffer size)
]
The @racket[pcm-pos] argument is the absolute sample position of the
first sample frame in the buffer. The @racket[buffer] argument points to
a copied PCM buffer, and @racket[size] is the buffer size in bytes.
When the stream ends, the callback is called as:
@racketblock[
(audio-callback 'done -1 #f 0)
]
The command returns @racket[#t].
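
Because @racket['read] always returns @racket[#t], a caller has to watch
for the @racket['done] callback to know when to stop. The sketch below
shows one way to write such a loop. To stay self-contained it uses
@racket[make-fake-handler], a hypothetical stand-in that understands
only @racket['read] and replays byte strings as stereo S32 data; it is
not part of the module. @racket[decode-all] is likewise a hypothetical
name.

@codeblock|{
#lang racket/base

;; Hypothetical stand-in for a handler from fmpg-ffi-decoder-handler.
;; It understands only 'read and replays the given byte strings,
;; assuming stereo S32 output (8 bytes per sample frame).
(define (make-fake-handler blocks)
  (define remaining blocks)
  (define pos 0)
  (define first-read? #t)
  (lambda (cmd audio-cb format-cb)
    (unless (eq? cmd 'read)
      (error 'fake-handler "unknown command: ~a" cmd))
    ;; The first read reports the format before any audio data.
    (when first-read?
      (set! first-read? #f)
      (format-cb pos 44100 2 32 4 -1))
    (cond
      [(null? remaining) (audio-cb 'done -1 #f 0)]
      [else
       (let ([buf (car remaining)])
         (set! remaining (cdr remaining))
         (audio-cb 'data pos buf (bytes-length buf))
         (set! pos (+ pos (quotient (bytes-length buf) 8))))])
    #t))

;; Calls 'read until the audio callback has seen 'done and returns
;; the total number of PCM bytes delivered.
(define (decode-all h)
  (define done? #f)
  (define total-bytes 0)
  (define (audio-cb kind pcm-pos buffer size)
    (case kind
      [(data) (set! total-bytes (+ total-bytes size))]
      [(done) (set! done? #t)]
      [else (void)]))
  (define (format-cb pcm-pos rate channels bits bytes-per-sample pcm-length)
    (void))
  (let loop ()
    (unless done?
      (h 'read audio-cb format-cb)
      (loop)))
  total-bytes)
}|

With a real handler, the same loop would sit between the
@racket['init] command and the @racket['close] and @racket['delete]
commands.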
@section{Seeking}
The @racket['seek] command takes an absolute PCM sample position:
@racketblock[
(h 'seek pcm-pos)
]
The sample position is converted to milliseconds using the current
sample rate and is then passed to the native FFmpeg shim. After seeking,
the current PCM position is updated from the native decoder.
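
The conversion from a sample position to milliseconds amounts to one
multiplication and one division. In the sketch below,
@racket[sample->ms] is a hypothetical name, and truncating to whole
milliseconds is an assumption; the module may round differently.

@codeblock|{
#lang racket/base

;; Convert an absolute sample-frame position to whole milliseconds.
;; Truncation is an assumption; the module may round differently.
(define (sample->ms pcm-pos sample-rate)
  (quotient (* pcm-pos 1000) sample-rate))
}|

Multiplying before dividing keeps the calculation in exact integers and
avoids losing precision for positions that are not a whole number of
seconds.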
@section{Metadata}
The @racket['metadata] command returns a mutable hash with the following
keys:
@itemlist[
#:style 'compact
@item{@racket['title]}
@item{@racket['author]}
@item{@racket['album]}
@item{@racket['genre]}
@item{@racket['comment]}
@item{@racket['copyright]}
@item{@racket['year]}
@item{@racket['track]}
@item{@racket['bitrate]}
@item{@racket['duration-ms]}
@item{@racket['audio-streams]}
]
Missing string fields are returned as empty strings. Missing numeric
fields are returned as @racket[-1].
@section{Native Library}
The module loads a shared library named @racket["ffmpeg_audio"] or
@racket["libffmpeg_audio"] using @racket[get-lib].
The native layer is expected to provide an instance-only FFmpeg API.
The relevant C-side properties are:
@itemlist[
#:style 'compact
@item{decoder state is stored in an opaque @tt{fmpg_instance};}
@item{output is signed 32-bit interleaved PCM;}
@item{the native buffer remains valid only until the next decode, seek,
close or free call;}
@item{Racket copies the buffer before passing it upward.}
]
@section{Errors}
Native failures are reported as Racket errors. Examples include failure
to allocate the native instance, failure to open a file and failure to
seek to a requested sample position.
Unknown commands also raise an error.
+155
@@ -0,0 +1,155 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/path
"../flac-decoder.rkt"
"../flac-definitions.rkt"))
@title{flac-decoder}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]
@defmodule[(file "../flac-decoder.rkt")]
This module provides a small decoder interface on top of the FLAC
FFI layer. It opens a decoder for a file, reads stream metadata,
reads audio frames, exposes the current decoder state, and allows
an active read loop to be stopped. It also re-exports the bindings
from @racketmodname["flac-definitions.rkt"].
A decoder handle stores the native decoder handler together with
optional callbacks for stream metadata and decoded audio.
@section{Procedures}
@defproc[(flac-open [flac-file* (or/c path? string?)]
[cb-stream-info (or/c procedure? #f)]
[cb-audio (or/c procedure? #f)])
(or/c flac-handle? #f)]{
Opens a FLAC decoder for @racket[flac-file*]. If a path is given,
it is converted with @racket[path->string]. If the file does not
exist, the result is @racket[#f].
Otherwise a native decoder handler is created with
@racket[flac-ffi-decoder-handler], initialized with the file, and
wrapped in a @racket[flac-handle]. The given callbacks are stored
in the handle.
When metadata of type @racket['streaminfo] is processed and
@racket[cb-stream-info] is a procedure, it is called with a
@racket[flac-stream-info] value.
When decoded audio data is processed and @racket[cb-audio] is a
procedure, it is called as
@racket[(cb-audio header buffers)], where @racket[header] is a
mutable hash containing the frame header fields plus
@racket['duration], and @racket[buffers] is the decoded channel
data returned by the FFI layer.
}
@defproc[(flac-stream-state [handle flac-handle?])
(or/c 'search-for-metadata
'read-metadata
'search-for-frame-sync
'read-frames
'end-of-stream
'ogg-error
'seek-error
'aborted
'memory-allocation-error
'uninitialized
'end-of-link)]{
Returns the current decoder state reported by the native decoder
handler.
}
@defproc[(flac-read [handle flac-handle?])
(or/c 'stopped-reading
'end-of-stream)]{
Reads the stream by repeatedly calling the native decoder with
@racket['process-single].
Before reading starts, the handle fields @racket[stop-reading]
and @racket[reading] are set to @racket[#f] and @racket[#t],
respectively. If a stop is requested with @racket[flac-stop] while
the loop is running, reading ends with @racket['stopped-reading]
and @racket[reading] is reset to @racket[#f].
Whenever pending metadata is available, it is processed with
@racket[process-meta]. For metadata of type
@racket['streaminfo], a @racket[flac-stream-info] value is
constructed, stored in the handle, and passed to the
stream-info callback.
Whenever pending frame data is available, it is processed with
@racket[process-frame]. The frame header is converted to a
mutable hash, extended with a @racket['duration] entry taken
from @racket[flac-duration], and passed together with the
decoded buffers to the audio callback.
For each processed frame, the module also updates
@racket[last-buffer], @racket[last-buf-len], and @racket[kinds].
The procedure prints diagnostic messages for state changes,
metadata, stream errors, and stop handling.
}
@defproc[(flac-read-meta [handle flac-handle?])
(or/c flac-stream-info? #f)]{
Advances the decoder until the state becomes one of
@racket['read-metadata], @racket['end-of-stream],
@racket['aborted], @racket['memory-allocation-error], or
@racket['uninitialized].
If the resulting state is @racket['read-metadata], pending
metadata is processed and the stored stream info is returned.
Otherwise the result is @racket[#f].
Only metadata of type @racket['streaminfo] is converted into a
@racket[flac-stream-info] value by this module.
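The "advance until a stopping state is reached" behaviour can be
sketched without a real decoder. In the sketch below the decoder is
replaced by a hypothetical thunk that returns the next state on each
call. The names @tt{advance-until-stop} and @tt{make-fake-decoder} are
assumptions, and whether the real procedure inspects the state before
its first step is not stated in this documentation.

@codeblock|{
#lang racket/base

;; States at which the documented loop stops advancing.
(define stop-states
  '(read-metadata end-of-stream aborted
    memory-allocation-error uninitialized))

;; `next-state` is a hypothetical thunk that advances a decoder by one
;; step and returns its new state.
(define (advance-until-stop next-state)
  (let loop ([state (next-state)])
    (if (memq state stop-states)
        state
        (loop (next-state)))))

;; Simulated decoder that walks through a fixed list of states.
(define (make-fake-decoder states)
  (define remaining states)
  (lambda ()
    (define s (car remaining))
    ;; Stay on the final state once the list is exhausted.
    (unless (null? (cdr remaining))
      (set! remaining (cdr remaining)))
    s))

(define final-state
  (advance-until-stop
   (make-fake-decoder '(search-for-metadata read-metadata))))
}|

Here @tt{final-state} ends up as @racket['read-metadata], which is the
one case in which the documented procedure returns stream info.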
}
@defproc[(flac-stop [handle flac-handle?]) void?]{
Requests termination of an active @racket[flac-read] loop by
setting the handle field @racket[stop-reading] to @racket[#t].
The procedure then waits until the handle field
@racket[reading] becomes @racket[#f], sleeping for 10 ms between
checks.
The procedure prints timing information before and after the
wait.
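The cooperation between a read loop and a stop request can be sketched
with two flags and a polling wait. This is not the module's code: the
boxes stand in for the handle fields @racket[stop-reading] and
@racket[reading], and @tt{fake-read} and @tt{fake-stop} are hypothetical
names. Only the flag handling and the 10 ms polling interval follow the
description above.

@codeblock|{
#lang racket/base

;; Boxes standing in for the handle fields; an assumption of this sketch.
(define stop-reading (box #f))
(define reading (box #f))

;; A stand-in read loop that runs until a stop is requested.
(define (fake-read)
  (set-box! stop-reading #f)
  (set-box! reading #t)
  (let loop ()
    (cond
      [(unbox stop-reading)
       (set-box! reading #f)
       'stopped-reading]
      [else
       (sleep 0.001) ; pretend to decode one frame
       (loop)])))

;; Request a stop, then poll every 10 ms until the loop has ended.
(define (fake-stop)
  (set-box! stop-reading #t)
  (let wait ()
    (when (unbox reading)
      (sleep 0.01)
      (wait))))

(define result-channel (make-channel))
(define worker
  (thread (lambda () (channel-put result-channel (fake-read)))))
(sleep 0.05)   ; let the loop run for a moment
(fake-stop)
(define result (channel-get result-channel))
}|

In this sketch a stop requested before the loop has started would be
lost, because the loop clears the flag on entry. The sketch therefore
waits briefly before calling @tt{fake-stop}.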
}
@section{Diagnostic bindings}
@defthing[kinds hash?]{
A mutable hash that records which kinds of frame numbering were
encountered during decoding. The keys are the values found in the
frame-header field @racket['number-type].
}
@defthing[last-buffer (or/c #f list?)]{
The most recently decoded buffer set produced by frame
processing.
}
@defthing[last-buf-len (or/c #f exact-integer?)]{
The block size of the most recently processed frame.
}
@section{Notes}
The frame-header hash passed to the audio callback is produced
by @racket[flac-ffi-frame-header]. In this module it is extended
with a @racket['duration] field before the callback is called.
All bindings from @racketmodname["flac-definitions.rkt"] are
re-exported.
+280
@@ -0,0 +1,280 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/contract
racket/path
"../libao.rkt"))
@title{libao}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]
@defmodule[(file "../libao.rkt")]
This module provides a small high-level interface to an asynchronous
audio output backend. It opens a live output device or a file output,
queues audio buffers for playback, reports playback position, supports
pause and buffer clearing, and exposes a small set of validation
predicates.
The central value is an @tt{ao-handle}, created by
@racket[ao-open-live] or @racket[ao-open-file]. An @tt{ao-handle}
stores the requested playback configuration together with a native
asynchronous player handle. It also records the real bit depth accepted
by the selected libao output device.
@section{Audio handles}
@defproc[(ao-handle? [v any/c]) boolean?]{
Returns @racket[#t] if @racket[v] is an @tt{ao-handle} value, and
@racket[#f] otherwise.
}
@defproc[(ao-valid? [handle ao-handle?]) boolean?]{
Returns @racket[#t] if @racket[handle] still has a native asynchronous
player, and @racket[#f] otherwise.
A handle becomes invalid after @racket[ao-close], or when opening the
native player failed.
}
@defproc[(ao-device-bits [handle ao-handle?]) integer?]{
Returns the real bit depth of the opened output device.
This can differ from the bit depth requested with @racket[ao-open-live]
or @racket[ao-open-file]. For example, when 32-bit output is requested
but the libao driver only accepts 24-bit output, this function returns
@racket[24].
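To make the difference between requested and real bit depth concrete,
the sketch below rescales a signed integer sample by shifting. This is
one common technique and is shown purely as an assumption. This
documentation does not say how the native backend converts samples, and
@tt{rescale-sample} is a hypothetical name.

@codeblock|{
#lang racket/base

;; Rescale one signed integer PCM sample from `from-bits` to `to-bits`
;; with an arithmetic shift. Assumption: the real backend may use a
;; different method, for example one that applies dithering.
(define (rescale-sample sample from-bits to-bits)
  (arithmetic-shift sample (- to-bits from-bits)))

;; Largest positive 32-bit sample, reduced to 24 bits.
(define reduced-max (rescale-sample #x7FFFFFFF 32 24))

;; Most negative 32-bit sample, reduced to 16 bits.
(define reduced-min (rescale-sample (- (expt 2 31)) 32 16))
}|

With these inputs, @tt{reduced-max} is the largest 24-bit sample value
and @tt{reduced-min} is the most negative 16-bit sample value.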
}
@section{Validation predicates}
@defproc[(ao-valid-bits? [bits any/c]) boolean?]{
Returns @racket[#t] if @racket[bits] is one of @racket[8],
@racket[16], @racket[24], or @racket[32], and @racket[#f] otherwise.
}
@defproc[(ao-valid-rate? [rate any/c]) boolean?]{
Returns @racket[#t] if @racket[rate] is one of the sample rates
accepted by this module, and @racket[#f] otherwise.
The accepted rates are:
@itemlist[
#:style 'compact
@item{@racket[8000], @racket[11025], @racket[16000], @racket[22050]}
@item{@racket[44100], @racket[48000], @racket[88200], @racket[96000]}
@item{@racket[176400], @racket[192000], @racket[352800], @racket[384000]}
]
}
@defproc[(ao-valid-channels? [channels any/c]) boolean?]{
Returns @racket[#t] if @racket[channels] is an integer greater than or
equal to @racket[1], and @racket[#f] otherwise.
}
@defproc[(ao-valid-format? [format any/c]) boolean?]{
Returns @racket[#t] if @racket[format] is one of
@racket['little-endian], @racket['big-endian], or
@racket['native-endian], and @racket[#f] otherwise.
}
@defproc[(ao-supported-music-format? [format any/c]) boolean?]{
Returns @racket[#t] if @racket[format] is one of @racket['ao] or
@racket['flac], and @racket[#f] otherwise.
The symbol does not describe an encoded audio format. It describes the
in-memory layout of the PCM buffer passed to @racket[ao-play].
@racket['ao] means interleaved PCM samples. @racket['flac] means
channel-oriented PCM samples, as produced by the FLAC decoder, which
must be converted to interleaved PCM before playback.
}
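The rules stated for the predicates in this section can be written down
directly. The sketch below is a reimplementation from the documented
rules, not the module's source. The @tt{my-} prefixed names are
hypothetical and chosen to avoid confusion with the real bindings.

@codeblock|{
#lang racket/base

;; Sketch predicates derived from the documented rules.
(define (my-valid-bits? v)
  (and (memv v '(8 16 24 32)) #t))

(define accepted-rates
  '(8000 11025 16000 22050 44100 48000
    88200 96000 176400 192000 352800 384000))

(define (my-valid-rate? v)
  (and (memv v accepted-rates) #t))

(define (my-valid-channels? v)
  (and (integer? v) (>= v 1)))

(define (my-valid-format? v)
  (and (memq v '(little-endian big-endian native-endian)) #t))

(define (my-supported-music-format? v)
  (and (memq v '(ao flac)) #t))
}|

Each sketch predicate accepts any value and returns a boolean, which
matches the @racket[any/c] to @racket[boolean?] contracts shown above.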
@section{Opening and closing}
@defproc[(ao-open-live [bits ao-valid-bits?]
[rate ao-valid-rate?]
[channels ao-valid-channels?]
[byte-format ao-valid-format?])
ao-handle?]{
Creates an audio output handle for live playback.
This is equivalent to calling @racket[ao-open-file] with
@racket[#f] as the filename.
The handle stores the requested sample size, sample rate, channel count,
and byte format. The native backend first tries to open the device with
the requested bit depth. If that fails, it may fall back to a lower bit
depth accepted by the selected libao driver.
The requested bit depth describes the buffers supplied by the Racket
side. The real device bit depth describes the format accepted by libao
and can be inspected with @racket[ao-device-bits].
If the native player is created successfully, the returned handle is
valid. If player creation fails, the function still returns an
@tt{ao-handle}, but that handle is marked closed and is not valid
for playback.
A finalizer is registered for the handle and calls @racket[ao-close]
when the handle is reclaimed.
}
@defproc[(ao-open-file [bits ao-valid-bits?]
[rate ao-valid-rate?]
[channels ao-valid-channels?]
[byte-format ao-valid-format?]
[filename (or/c path? string? #f)])
ao-handle?]{
Creates an audio output handle.
If @racket[filename] is @racket[#f], the default live libao output
device is opened. Otherwise the native backend opens a file output
target using the given filename.
The requested bit depth is stored in the handle and describes the input
buffers that will be queued with @racket[ao-play]. The native backend
also records the real bit depth accepted by the output device or file
backend. Use @racket[ao-device-bits] to inspect that value.
}
@defproc[(ao-close [handle ao-handle?]) void?]{
Stops playback for @racket[handle] and releases the native player
reference stored in the handle.
If the handle already has no native player, this procedure has no
effect.
}
@section{Playback}
@defproc[(ao-play [handle ao-handle?]
[music-id integer?]
[at-time-in-s number?]
[music-duration-s number?]
[buffer any/c]
[buf-len integer?]
[buf-type ao-supported-music-format?])
void?]{
Queues audio data for asynchronous playback.
The @racket[music-id] argument identifies the music stream associated
with the buffer. The arguments @racket[at-time-in-s] and
@racket[music-duration-s] describe the position and duration, in
seconds, associated with the buffer. The arguments @racket[buffer] and
@racket[buf-len] provide the audio data and its length. The
@racket[buf-type] argument specifies the in-memory PCM layout.
The buffer description passed to the native layer is completed with the
requested sample size, sample rate, channel count, and byte format
stored in @racket[handle].
Two buffer layouts are supported:
@itemlist[
#:style 'compact
@item{@racket['ao]: interleaved PCM samples, for example @tt{L0 R0 L1 R1}.}
@item{@racket['flac]: channel-oriented PCM samples, for example one channel buffer for left samples and one channel buffer for right samples.}
]
The native backend converts @racket['flac] buffers to interleaved PCM
before playback. It also converts between the requested bit depth and the
real device bit depth when needed. This makes it possible to keep decoder
output at 32-bit signed integer PCM while still playing on devices that
only accept 24-bit or 16-bit integer samples.
The queued buffer is copied by the native backend, so the caller does
not need to keep the original buffer alive after @racket[ao-play]
returns.
If @racket[handle] is not valid, this procedure raises an exception.
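The difference between the two layouts can be illustrated with a small
interleaving routine. The sketch below assumes that channel-oriented
data is a list of equally long vectors, one per channel. That
representation and the name @tt{interleave} are assumptions for
illustration, and the real conversion happens in the native backend.

@codeblock|{
#lang racket/base

;; Interleave channel-oriented samples into one flat vector.
;; `channels` is a non-empty list of equally long vectors.
(define (interleave channels)
  (define n-channels (length channels))
  (define n-frames (vector-length (car channels)))
  (define out (make-vector (* n-channels n-frames) 0))
  (for ([chan (in-list channels)]
        [c (in-naturals)])
    (for ([sample (in-vector chan)]
          [i (in-naturals)])
      ;; Sample i of channel c lands at frame i, slot c.
      (vector-set! out (+ (* i n-channels) c) sample)))
  out)

(define left  (vector 1 2 3))
(define right (vector 10 20 30))

;; Ordered as L0 R0 L1 R1 L2 R2.
(define interleaved (interleave (list left right)))
}|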
}
@defproc[(ao-pause [handle ao-handle?]
[pause boolean?])
void?]{
Pauses or resumes asynchronous playback for @racket[handle].
@racket[#t] pauses playback; @racket[#f] resumes it.
}
@defproc[(ao-clear-async [handle ao-handle?]) void?]{
Clears buffered asynchronous playback data for @racket[handle].
}
@section{Playback state}
@defproc[(ao-at-second [handle ao-handle?]) number?]{
Returns the current playback position, in seconds, as reported by the
native asynchronous player.
}
@defproc[(ao-at-music-id [handle ao-handle?]) integer?]{
Returns the music identifier currently reported by the native
asynchronous player.
}
@defproc[(ao-music-duration [handle ao-handle?]) number?]{
Returns the duration of the current music stream, in seconds, as
reported by the native asynchronous player.
}
@defproc[(ao-bufsize-async [handle ao-handle?]) integer?]{
Returns the current buffered size in bytes for the asynchronous player.
}
@section{Volume control}
@defproc[(ao-set-volume! [handle ao-handle?]
[percentage number?])
void?]{
Sets the playback volume for @racket[handle].
If @racket[percentage] is an exact integer, it is converted to an
inexact number before it is passed to the native layer.
}
@defproc[(ao-volume [handle ao-handle?]) number?]{
Returns the current playback volume as reported by the native
asynchronous player.
}
@section{Notes}
This module is a higher-level wrapper around the asynchronous FFI layer.
It stores the playback configuration in the handle, and reuses that
configuration for each call to @racket[ao-play].
The requested bit depth and the real device bit depth are deliberately
kept separate. The requested value describes the buffers supplied by the
Racket side. The real value describes the format accepted by libao.
The module does not expose the handle fields directly. The public API
is intentionally small: create a handle, queue buffers, inspect
position and buffer state, pause or clear playback, adjust volume, and
close the handle.
A typical usage pattern is to open one live handle for a given stream
format, queue decoded buffers with @racket[ao-play], and query the
playback position with @racket[ao-at-second] while playback proceeds
asynchronously.
+149
@@ -0,0 +1,149 @@
#lang scribble/manual
@(require racket/base
(for-label racket/base
racket/path
racket/contract
"../mp3-decoder.rkt"))
@title{mp3-decoder}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]
@defmodule[(file "../mp3-decoder.rkt")]
This module provides an MP3 decoder backend. It opens an MP3 file,
reports stream information through a callback, streams decoded PCM
buffers, and supports stopping and seeking.
The module is intended to be used through
@racketmodname[racket-sound/audio-decoder], but its procedures can also
be used directly.
@section{Validation}
@defproc[(mp3-valid? [mp3-file any/c]) boolean?]{
Returns @racket[#t].
The current implementation does not inspect @racket[mp3-file]. This
procedure exists to satisfy the reader interface used by
@racketmodname[racket-sound/audio-decoder].
Basic validation such as file existence and extension matching is
performed in the higher-level module. This procedure therefore acts as
an additional hook and currently accepts all inputs.
}
@section{Opening}
@defproc[(mp3-open [mp3-file* (or/c path? string?)]
[cb-stream-info procedure?]
[cb-audio procedure?])
(or/c struct? #f)]{
Opens an MP3 decoder for @racket[mp3-file*].
If @racket[mp3-file*] is a path, it is converted with
@racket[path->string]. If the file does not exist, the result is
@racket[#f].
Otherwise a decoder handle is created and initialized. During
initialization, stream information is collected and stored in a mutable
hash in the handle.
The stream-info callback is invoked once, immediately after
initialization:
@racketblock[
(cb-stream-info info)
]
where @racket[info] is a mutable hash containing at least:
@itemlist[#:style 'compact
@item{@racket['duration]}
@item{@racket['sample-rate]}
@item{@racket['channels]}
@item{@racket['bits-per-sample]}
@item{@racket['bytes-per-sample]}
@item{@racket['total-samples]}]
}
@section{Reading}
@defproc[(mp3-read [handle struct?]) any/c]{
Starts the decode loop for @racket[handle].
The loop repeatedly decodes audio chunks and invokes the audio
callback:
@racketblock[
(cb-audio info buffer size)
]
Before each callback, the @racket[info] hash is updated in place with:
@itemlist[#:style 'compact
@item{@racket['sample]}
@item{@racket['current-time]}]
The loop also checks for a pending seek request. If a seek has been
requested, the stored target sample position is forwarded to the
decoder backend and the request is cleared.
The loop terminates when either:
@itemlist[#:style 'compact
@item{the backend reports end-of-stream}
@item{a stop has been requested via @racket[mp3-stop]}]
After the loop terminates, the underlying decoder is closed and
released.
If the loop ended because of a stop request, the procedure returns
@racket['stopped-reading]. Otherwise the return value is unspecified.
}
@section{Seeking}
@defproc[(mp3-seek [handle struct?]
[percentage number?])
void?]{
Requests a seek within the stream.
The @racket[percentage] argument represents a position relative to the
full audio stream, where @racket[0] is the start and @racket[100] is the
end. The value may be fractional.
If the total number of samples is available in the handle, the
procedure computes an absolute target sample and stores it in the
handle as a pending seek request.
The actual seek operation is performed later by @racket[mp3-read] in its
decode loop.
If the total number of samples is unavailable or equal to @racket[-1],
this procedure has no effect.
}
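The deferred seek described above can be sketched in two parts: turning
a percentage into a target sample, and handing that target to the decode
loop through a stored request. The names @tt{percentage->sample},
@tt{request-seek!}, and @tt{take-pending-seek!} are hypothetical, the
box stands in for a handle field, and rounding down with @racket[floor]
is an assumption.

@codeblock|{
#lang racket/base

;; Compute an absolute target sample, or #f when the total is unknown.
;; Rounding down with `floor` is an assumption of this sketch.
(define (percentage->sample percentage total-samples)
  (and total-samples
       (not (= total-samples -1))
       (inexact->exact
        (floor (* (/ percentage 100) total-samples)))))

;; A box standing in for the pending-seek field of a handle.
(define pending-seek (box #f))

;; Store a request only when a target can be computed.
(define (request-seek! percentage total-samples)
  (define target (percentage->sample percentage total-samples))
  (when target
    (set-box! pending-seek target)))

;; What a decode loop might do once per iteration: take the request,
;; clear it, and return the target (or #f if nothing is pending).
(define (take-pending-seek!)
  (define target (unbox pending-seek))
  (set-box! pending-seek #f)
  target)

(request-seek! 25 1000000)
(define first-take (take-pending-seek!))   ; the stored target
(define second-take (take-pending-seek!))  ; #f, the request was cleared
}|

Separating the request from its execution keeps all decoder access
inside the decode loop, which is the design described for
@racket[mp3-seek] and @racket[mp3-read].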
@section{Stopping}
@defproc[(mp3-stop [handle struct?]) void?]{
Requests termination of an active @racket[mp3-read] loop.
The procedure sets an internal stop flag and waits until the read loop
has terminated, sleeping briefly between checks.
}
@section{Notes}
The stream-info hash is shared between initialization and decoding and
is updated in place during playback.
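Because the hash is updated in place, a callback that wants to remember
values from an earlier call needs its own copy. The sketch below
simulates this with a hand-made hash. The callback name @tt{on-audio}
and the simulated update are assumptions, and @racket[hash-copy] is used
to take the snapshot.

@codeblock|{
#lang racket/base

;; A stand-in for the shared stream-info hash.
(define info (make-hash '((sample-rate . 44100) (sample . 0))))

(define kept-reference #f)  ; the hash itself, as received
(define kept-snapshot #f)   ; an independent copy

;; Hypothetical audio callback: on its first call it keeps both the
;; hash itself and a copy made with hash-copy.
(define (on-audio info buffer size)
  (unless kept-reference
    (set! kept-reference info)
    (set! kept-snapshot (hash-copy info))))

;; Simulate two decode steps with an in-place update in between.
(on-audio info #f 0)
(hash-set! info 'sample 1152)
(on-audio info #f 0)

(define reference-sample (hash-ref kept-reference 'sample)) ; follows updates
(define snapshot-sample (hash-ref kept-snapshot 'sample))   ; unchanged copy
}|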
The audio buffer passed to the callback is managed by the decoder and
should be treated as transient data.
Seeking is implemented as a request stored in the handle and executed
by the decode loop, not directly by @racket[mp3-seek].