#lang scribble/manual

@(require (for-label racket/base
                     racket/contract
                     racket/path
                     "../audio-encoder.rkt"))

@title{Audio Encoding}
@author[@author+email["Hans Dijkema" "hans@dijkewijk.nl"]]

@defmodule[racket-audio/audio-encoder]

The @racketmodname[racket-audio/audio-encoder] module provides the high-level
file-to-file encoding pipeline. It reuses the existing decoder environment to
read the input file and sends the decoded PCM stream to a selected encoder
backend. The built-in backends are Opus, implemented with @tt{libopusenc}, and
FLAC, implemented with @tt{libFLAC}.

This module is intended as the public encoding API. The concrete backend
modules are small FFI backends; applications normally call
@racket[audio-encode] instead of using those modules directly.

@section{Pipeline}

Encoding is organised as a streaming pipeline:

@racketblock[
input file           ;; decoded by audio-decoder.rkt
  -> PCM buffers     ;; optional conversion for FLAC
  -> encoder backend
  -> output file]

The encoder is selected from @racket[#:encoder] or, when that argument is not
provided, from the output filename extension. The initial built-in encoders are
@racket['opus] for @filepath{.opus} and @filepath{.oga} files, and
@racket['flac] for @filepath{.flac} files.

The PCM stream is not collected in memory. Each decoded buffer is forwarded to
the selected backend. FLAC encoding may insert a PCM conversion step when the
settings request a different sample rate, channel count, or bit depth. Opus
encoding feeds floating-point PCM to @tt{libopusenc}; sample-rate conversion
for Opus is left to @tt{libopusenc}.

@section{Encoding a file}

@defproc[(audio-encode [input-file path-string?]
                       [output-file path-string?]
                       [settings hash?]
                       [#:encoder encoder (or/c symbol? #f) #f]
                       [#:copy-tags? copy-tags? boolean? #t]
                       [#:progress-callback progress-callback (or/c procedure? #f) #f])
         hash?]{
Encodes @racket[input-file] to @racket[output-file] and returns a result hash.
The @racket[settings] hash is interpreted by the selected backend. When
@racket[encoder] is @racket[#f], the backend is inferred from the output file
extension. Pass @racket['opus] or @racket['flac] to force a backend.

When @racket[copy-tags?] is true, common textual tags and an embedded picture
are copied from the source file to the destination file. Opus comments and
cover art are written through @tt{libopusenc} before encoding starts. FLAC
metadata is copied after the encoded file has been written, using the
read-write API from @racketmodname[racket-audio/taglib].

When @racket[progress-callback] is a procedure, it is called with a progress
hash during encoding. Progress is based on the number of input frames read from
the decoder, not on the number of frames written by the encoder. This matters
for resampling, because output frame counts can differ from input frame counts.}

@racketblock[
(audio-encode "input.flac" "output.opus"
              (hash 'bitrate 224000
                    'vbr? #t
                    'complexity 10)
              #:encoder 'opus)

(audio-encode "input-96k.flac" "output-48k.flac"
              (hash 'sample-rate 48000
                    'bits-per-sample 24
                    'compression-level 8)
              #:encoder 'flac)]

@section{Result hash}

The result hash contains the following keys:

@itemlist[#:style 'compact
  @item{@racket['encoder], the selected backend symbol;}
  @item{@racket['input] and @racket['output], the source and destination paths;}
  @item{@racket['input-format], the final decoded input format hash seen by the pipeline;}
  @item{@racket['output-format], the resolved backend output format hash;}
  @item{@racket['frames-read], the number of input frames consumed;}
  @item{@racket['frames-written], the number of frames accepted by the backend;}
  @item{@racket['tag-copy], a hash describing how metadata was handled.}]

The @racket['tag-copy] hash contains a @racket['method] key. For Opus the
method is @racket['libopusenc-comments], because metadata must be supplied to
@tt{libopusenc} before the encoder writes the OpusTags packet.
For FLAC the method is @racket['taglib-post-copy], because the encoded file is
tagged after encoding.

@section{Progress callback}

The progress callback receives a hash with at least these keys:

@itemlist[#:style 'compact
  @item{@racket['phase], such as @racket['format], @racket['audio], @racket['finished-encoding], or @racket['finished];}
  @item{@racket['frames-read] and @racket['frames-written];}
  @item{@racket['total-frames], when the decoder reported a known input length;}
  @item{@racket['progress], a number between @racket[0.0] and @racket[1.0] when @racket['total-frames] is known, otherwise @racket[#f];}
  @item{@racket['input-format] and, after the backend has opened, @racket['output-format].}]

A simple command-line style progress callback can print a percentage on one line:

@racketblock[
(define (show-progress h)
  (let ((p (hash-ref h 'progress #f)))
    (when (number? p)
      ;; convert to an exact integer so that 42 is printed rather than 42.0
      (printf "\rprogress: ~a%" (inexact->exact (round (* 100 p))))
      (flush-output))))]

@section{Opus settings}

The Opus backend uses @tt{libopusenc}. The input PCM is converted to
interleaved floating-point samples in the range @racket[-1.0] to @racket[1.0]
and written with @tt{ope_encoder_write_float}. The source sample rate is passed
to @tt{libopusenc}; @tt{libopusenc} performs the required internal resampling
for Opus output.

The following settings are recognised:

@itemlist[#:style 'compact
  @item{@racket['bitrate], bitrate in bits per second. The default is @racket[160000].}
  @item{@racket['vbr?], whether variable bitrate is enabled. The default is @racket[#t].}
  @item{@racket['constrained-vbr?], whether constrained VBR is enabled. The default is @racket[#f].}
  @item{@racket['complexity], encoder complexity. The default is @racket[10].}
  @item{@racket['comment-padding], Opus comment padding in bytes.
The default is @racket[512].}
  @item{@racket['signal], optionally @racket['auto], @racket['voice], or @racket['music].}
  @item{@racket['lsb-depth], optionally passed to the encoder as the source least significant bit depth.}
  @item{@racket['comments], an optional hash of Opus comment strings. When @racket[#:copy-tags?] is true, @racket[audio-encode] fills this from the source tags.}
  @item{@racket['picture], an optional picture value from @racketmodname[racket-audio/taglib]. When @racket[#:copy-tags?] is true, @racket[audio-encode] fills this from the source tags.}]

The first backend version supports mono and stereo input.

@section{FLAC settings}

The FLAC backend uses the @tt{libFLAC} stream encoder. It writes interleaved
integer PCM samples through the FLAC encoder API. When the requested output
format differs from the decoded input format,
@racketmodname[racket-audio/private/pcm-converter] uses the existing FFmpeg
@tt{swresample} layer from @racketmodname[racket-audio/ffmpeg-definitions] to
perform PCM normalisation.

The following settings are recognised:

@itemlist[#:style 'compact
  @item{@racket['compression-level], FLAC compression level. The default is @racket[5].}
  @item{@racket['verify?], whether the FLAC encoder verifies encoded output. The default is @racket[#f].}
  @item{@racket['blocksize], explicit FLAC block size. The default is @racket[0], meaning the library default.}
  @item{@racket['sample-rate] or @racket['target-sample-rate], target sample rate in Hz. Use @racket['source] or omit the key to keep the source rate.}
  @item{@racket['channels] or @racket['target-channels], target channel count. Use @racket['source] or omit the key to keep the source channel count.}
  @item{@racket['bits-per-sample] or @racket['target-bits-per-sample], target bit depth.
Use @racket['source] or omit the key to keep the source bit depth.}]

For example, a 24-bit 96 kHz FLAC file can be transcoded to 24-bit 48 kHz FLAC with:

@racketblock[
(audio-encode "input-96k.flac" "output-48k.flac"
              (hash 'sample-rate 48000
                    'bits-per-sample 24
                    'compression-level 8)
              #:encoder 'flac)]

@section{Encoder registration}

@defproc[(audio-supported-encoder-extensions) (listof string?)]{
Returns the extensions supported by the currently registered encoders. The
initial list includes @racket["flac"], @racket["opus"], and @racket["oga"].}

@defproc[(make-audio-encoder [exts (listof string?)]
                             [open procedure?]
                             [write procedure?]
                             [finish procedure?]
                             [settings procedure?])
         audio-encoder?]{
Creates an encoder descriptor. The descriptor is used by
@racket[audio-register-encoder!] to register a backend.

The @racket[open] procedure receives the output file, settings hash, and input
format hash. The @racket[write] procedure receives the backend handle, buffer
format hash, byte buffer, and byte length, and returns the number of frames
accepted by the backend. The @racket[finish] procedure finalises and releases
the backend handle. The @racket[settings] procedure resolves backend defaults
against the input format and returns the output format hash.}

@defproc[(audio-encoder? [v any/c]) boolean?]{
Returns @racket[#t] when @racket[v] is an encoder descriptor.}

@defproc[(audio-register-encoder! [type symbol?] [encoder audio-encoder?]) void?]{
Registers @racket[encoder] under @racket[type]. The encoder's extensions are
used for extension-based selection in @racket[audio-encode].}
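
The procedure roles described above can be illustrated with a minimal sketch of
a hypothetical backend that dumps the incoming PCM bytes unchanged to a
@filepath{.raw} file. The @racket['raw] type and the names
@racket[bytes-per-frame], @racket[raw-open], @racket[raw-write],
@racket[raw-finish], and @racket[raw-settings] are invented for illustration.
The sketch also assumes that format hashes carry @racket['channels] and
@racket['bits-per-sample] keys, and that the settings procedure is called with
the settings hash followed by the input format hash. Neither assumption is
confirmed by this documentation, so check the built-in backends before relying
on them.

@racketblock[
(require racket-audio/audio-encoder)

;; ASSUMPTION: format hashes expose 'channels and 'bits-per-sample.
;; The fallbacks (2 channels, 16 bits) are arbitrary illustration values.
(define (bytes-per-frame fmt)
  (* (hash-ref fmt 'channels 2)
     (quotient (hash-ref fmt 'bits-per-sample 16) 8)))

;; open: output file, settings hash, input format hash -> backend handle.
;; Here the handle is simply an output port.
(define (raw-open output-file settings input-format)
  (open-output-file output-file #:exists 'truncate))

;; write: handle, buffer format hash, byte buffer, byte length
;; -> number of frames accepted by the backend.
(define (raw-write out fmt buf len)
  (write-bytes buf out 0 len)
  (quotient len (bytes-per-frame fmt)))

;; finish: finalise and release the backend handle.
(define (raw-finish out)
  (close-output-port out))

;; settings: returns the output format hash.
;; ASSUMPTION: the arguments are the settings hash and the input format hash.
;; This backend never converts, so the output format equals the input format.
(define (raw-settings settings input-format)
  input-format)

(audio-register-encoder!
 'raw
 (make-audio-encoder (list "raw") raw-open raw-write raw-finish raw-settings))
]

If registration behaves as described, an output filename ending in
@filepath{.raw} would then select this backend through the extension-based
lookup in @racket[audio-encode], and @racket["raw"] would appear in the result
of @racket[audio-supported-encoder-extensions].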