% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}

\subsection{Overview}

This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.

The \xref{vorbis:spec:intro} provides an overview of the construction
of Vorbis audio packets.

The \href{oggstream.html}{Ogg
bitstream overview} and \href{framing.html}{Ogg logical
bitstream and framing spec} provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents. Please read them first.

\subsubsection{Restrictions}

The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

\begin{itemize}
 \item
  A meta-headerless Ogg file encapsulates the Vorbis I packets.

 \item
  The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).

 \item
  The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).

\end{itemize}

This does not mean that Vorbis cannot currently be multiplexed
with other media types into a multi-stream Ogg file.
At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a 'Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream. A compliant 'Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).

\subsubsection{MIME type}

The MIME type of Ogg files depends on the context. Specifically, complex
multimedia and applications should use \literal{application/ogg},
while visual media should use \literal{video/ogg}, and audio
\literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear
in any of those types. RTP encapsulated Vorbis should use
\literal{audio/vorbis} + \literal{audio/vorbis-config}.
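
The type selection above can be sketched as a small lookup. This is an illustration only: the function name and the content labels used as keys are hypothetical and do not come from any Ogg or Vorbis library.

```python
# Illustrative only: the names below and the "kind" labels are assumptions
# made for this sketch, not part of any Ogg or Vorbis API.
OGG_MIME_TYPES = {
    "audio": "audio/ogg",          # audio-only Ogg files
    "video": "video/ogg",          # visual media
    "complex": "application/ogg",  # complex multimedia and applications
}

# RTP-encapsulated Vorbis uses its own pair of types instead.
RTP_VORBIS_MIME_TYPES = ("audio/vorbis", "audio/vorbis-config")


def ogg_mime_type(kind):
    """Return the MIME type suggested for an Ogg file of the given kind."""
    try:
        return OGG_MIME_TYPES[kind]
    except KeyError:
        raise ValueError("unknown content kind: %r" % (kind,))
```

A plain Vorbis I audio file would normally fall under the audio entry, although the text allows Vorbis data in any of the three Ogg types.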

\subsection{Encapsulation}

Ogg encapsulation of a Vorbis packet stream is straightforward.

\begin{itemize}

\item
  The first Vorbis packet (the identification header), which
  uniquely identifies a stream as Vorbis audio, is placed alone in the
  first page of the logical Ogg stream. This results in a first Ogg
  page of exactly 58 bytes at the very beginning of the logical stream.

\item
  This first page is marked 'beginning of stream' in the page flags.

\item
  The second and third Vorbis packets (comment and setup
  headers) may span one or more pages beginning on the second page of
  the logical stream.
  However many pages they span, the third header
  packet finishes the page on which it ends. The next (first audio) packet
  must begin on a fresh page.

\item
  The granule position of these first pages containing only headers is zero.

\item
  The first audio packet of the logical stream begins a fresh Ogg page.

\item
  Packets are placed into Ogg pages in order until the end of stream.

\item
  The last page is marked 'end of stream' in the page flags.

\item
  Vorbis packets may span page boundaries.

\item
  The granule position of pages containing Vorbis audio is in units
  of PCM audio samples (per channel; a stereo stream's granule position
  does not increment at twice the speed of a mono stream).

\item
  The granule position of a page represents the end PCM sample
  position of the last packet \emph{completed} on that
  page. The 'last PCM sample' is the last complete sample returned by
  decode, not an internal sample awaiting lapping with a
  subsequent block. A page that is entirely spanned by a single
  packet (that completes on a subsequent page) has no granule
  position, and the granule position is set to '-1'.

  Note that the last decoded (fully lapped) PCM sample from a packet
  is not necessarily the middle sample from that block.
  If, e.g., the
  current Vorbis packet encodes a ``long block'' and the next Vorbis
  packet encodes a ``short block'', the last decodable sample from the
  current packet will be at position (3*long\_block\_length/4) -
  (short\_block\_length/4).

\item
  The granule (PCM) position of the first page need not indicate
  that the stream started at position zero. Although the granule
  position belongs to the last completed packet on the page and a
  valid granule position must be positive, by
  inference it may indicate that the PCM position of the beginning
  of audio is positive or negative.

  \begin{itemize}
   \item
    A positive starting value simply indicates that this stream begins at
    some positive time offset, potentially within a larger
    program. This is a common case when connecting to the middle
    of a broadcast stream.

   \item
    A negative value indicates that
    output samples preceding time zero should be discarded during
    decoding; this technique is used to allow sample-granularity
    editing of the stream start time of already-encoded Vorbis
    streams. The number of samples to be discarded must not exceed
    the overlap-add span of the first two audio packets.

  \end{itemize}

  In both of these cases in which the initial audio PCM starting
  offset is nonzero, the second finished audio packet must flush the
  page on which it appears and the third packet must begin a fresh page.
  This allows the decoder to always be able to perform PCM position
  adjustments before needing to return any PCM data from synthesis,
  resulting in correct positioning information without any additional
  seeking logic.

  \begin{note}
    Failure to do so should, at worst, cause a
    decoder implementation to return incorrect positioning information
    for seeking operations at the very beginning of the stream.
  \end{note}

\item
  A granule position on the final page in a stream that indicates
  less audio data than the final packet would normally return is used to
  end the stream on other than even frame boundaries. The difference
  between the actual available data returned and the declared amount
  indicates how many trailing samples to discard from the decoding
  process.

\end{itemize}
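
The granule position rules in the list above can be illustrated with a short bookkeeping sketch. It is not a decoder and not a reference implementation. It assumes, following the Vorbis decode process described elsewhere in the specification, that the first audio packet returns no samples and that each later packet returns one quarter of the previous block size plus one quarter of the current block size. All function names are hypothetical, and the block sizes 2048 and 256 are example values only.

```python
# Sketch of the granule position bookkeeping described in the list above.
# All function names are hypothetical; block sizes are example values.

def samples_returned(prev_blocksize, cur_blocksize):
    """Samples per channel made available by decoding one audio packet.

    Assumption: the first audio packet only primes the overlap and returns
    nothing; each later packet completes prev/4 + cur/4 samples.
    """
    if prev_blocksize is None:
        return 0
    return prev_blocksize // 4 + cur_blocksize // 4


def packet_end_positions(blocksizes, start=0):
    """End PCM position after each audio packet, for audio beginning at start."""
    positions = []
    position = start
    prev = None
    for cur in blocksizes:
        position += samples_returned(prev, cur)
        positions.append(position)
        prev = cur
    return positions


def page_granule_position(ends_completed_on_page):
    """End position of the last packet completed on the page, or -1 when
    no packet completes on the page."""
    if not ends_completed_on_page:
        return -1
    return ends_completed_on_page[-1]


def last_decodable_offset(long_blocksize, short_blocksize):
    """Offset in a long block of the last sample decodable before a short block."""
    return 3 * long_blocksize // 4 - short_blocksize // 4


def initial_offset(first_page_granule, samples_through_first_page):
    """PCM position at which audio begins, inferred from the first audio page.
    A negative result means that many leading samples are discarded."""
    return first_page_granule - samples_through_first_page


def trailing_discard(decoded_end, final_page_granule):
    """Samples to drop when the final granule position declares less audio
    than decoding actually produced."""
    return max(0, decoded_end - final_page_granule)


ends = packet_end_positions([2048, 2048, 256, 256])
print(ends)                               # [0, 1024, 1600, 1728]
print(page_granule_position(ends[1:3]))   # 1600
print(page_granule_position([]))          # -1
print(last_decodable_offset(2048, 256))   # 1472
print(initial_offset(1000, ends[1]))      # -24
print(trailing_discard(ends[-1], 1700))   # 28
```

In this example the first audio page completes two long-block packets, so 1024 samples have been decoded when its granule position is read. A declared position of 1000 implies a starting offset of -24, meaning the first 24 output samples are discarded. Likewise, a final granule position of 1700 against a decoded end of 1728 drops the last 28 samples.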