% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}

\subsection{Overview}

This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.

The \xref{vorbis:spec:intro} provides an overview of the construction
of Vorbis audio packets.

The \href{oggstream.html}{Ogg
bitstream overview} and \href{framing.html}{Ogg logical
bitstream and framing spec} provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents.  Please read them first.

\subsubsection{Restrictions}

The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

\begin{itemize}
 \item
  A meta-headerless Ogg file encapsulates the Vorbis I packets.

 \item
  The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).

 \item
  The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).

\end{itemize}


This does not mean that Vorbis cannot currently be multiplexed
with other media types into a multi-stream Ogg file.  At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a `Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream.  A compliant `Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).
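
As a rough illustration of the restriction, the following Python sketch
checks that a sequence of pages forms a chained but unmultiplexed stream.
The \literal{Page} record and the function name are hypothetical and exist
only for this example; the three fields stand for the serial number and the
two stream flags carried in each Ogg page header.  The sketch does not parse
real Ogg data and does not check that serial numbers differ between links.

```python
from typing import Iterable, NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical stand-in for the relevant fields of an Ogg page header."""
    serial: int  # bitstream serial number
    bos: bool    # 'beginning of stream' flag
    eos: bool    # 'end of stream' flag


def is_unmultiplexed(pages: Iterable[Page]) -> bool:
    """Return True if at most one logical stream is open at any time."""
    current: Optional[int] = None  # serial of the link currently open
    for page in pages:
        if current is None:
            if not page.bos:
                return False  # every link must open with a BOS page
            current = page.serial
        elif page.bos or page.serial != current:
            return False  # another stream started before this link ended
        if page.eos:
            current = None  # link closed; a new one may follow (chaining)
    return True  # a truncated final link (no EOS page) is tolerated here
```

Two links stored back to back pass this check, while a file in which a
second `beginning of stream' page appears before the first link has ended
does not.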


\subsubsection{MIME type}

The MIME type of Ogg files depends on the context.  Specifically, complex
multimedia and applications should use \literal{application/ogg},
while visual media should use \literal{video/ogg} and audio should use
\literal{audio/ogg}.  Vorbis data encapsulated in Ogg may appear
in any of those types.  RTP-encapsulated Vorbis should use
\literal{audio/vorbis} and \literal{audio/vorbis-config}.


\subsection{Encapsulation}

Ogg encapsulation of a Vorbis packet stream is straightforward.

\begin{itemize}

\item
  The first Vorbis packet (the identification header), which
  uniquely identifies a stream as Vorbis audio, is placed alone in the
  first page of the logical Ogg stream.  This results in a first Ogg
  page of exactly 58 bytes at the very beginning of the logical stream.


\item
  This first page is marked `beginning of stream' in the page flags.


\item
  The second and third Vorbis packets (comment and setup
  headers) may span one or more pages beginning on the second page of
  the logical stream.  However many pages they span, the third header
  packet finishes the page on which it ends.  The next (first audio) packet
  must begin on a fresh page.


\item
  The granule position of these first pages containing only headers is zero.


\item
  The first audio packet of the logical stream begins a fresh Ogg page.


\item
  Packets are placed into Ogg pages in order until the end of stream.


\item
  The last page is marked `end of stream' in the page flags.


\item
  Vorbis packets may span page boundaries.


\item
  The granule position of pages containing Vorbis audio is in units
  of PCM audio samples (per channel; a stereo stream's granule position
  does not increment at twice the speed of a mono stream).


\item
  The granule position of a page represents the end PCM sample
  position of the last packet \emph{completed} on that
  page.  The `last PCM sample' is the last complete sample returned by
  decode, not an internal sample awaiting lapping with a
  subsequent block.  A page that is entirely spanned by a single
  packet (that completes on a subsequent page) has no granule
  position, and the granule position is set to `-1'.


  Note that the last decoded (fully lapped) PCM sample from a packet
  is not necessarily the middle sample from that block. If, e.g., the
  current Vorbis packet encodes a ``long block'' and the next Vorbis
  packet encodes a ``short block'', the last decodable sample from the
  current packet will be at position (3*long\_block\_length/4) -
  (short\_block\_length/4).


\item
    The granule (PCM) position of the first page need not indicate
    that the stream started at position zero.  Although the granule
    position belongs to the last completed packet on the page and a
    valid granule position must be non-negative, by
    inference it may indicate that the PCM position of the beginning
    of audio is positive or negative.


  \begin{itemize}
    \item
        A positive starting value simply indicates that this stream begins at
        some positive time offset, potentially within a larger
        program. This is a common case when connecting to the middle
        of a broadcast stream.

    \item
        A negative value indicates that
        output samples preceding time zero should be discarded during
        decoding; this technique is used to allow sample-granularity
        editing of the stream start time of already-encoded Vorbis
        streams.  The number of samples to be discarded must not exceed
        the overlap-add span of the first two audio packets.

  \end{itemize}


    In both of these cases in which the initial audio PCM starting
    offset is nonzero, the second finished audio packet must flush the
    page on which it appears and the third packet must begin a fresh page.
    This allows the decoder to always be able to perform PCM position
    adjustments before needing to return any PCM data from synthesis,
    resulting in correct positioning information without any additional
    seeking logic.


  \begin{note}
    Failure to do so should, at worst, cause a
    decoder implementation to return incorrect positioning information
    for seeking operations at the very beginning of the stream.
  \end{note}


\item
  A granule position on the final page in a stream that indicates
  less audio data than the final packet would normally return is used to
  end the stream on other than even frame boundaries.  The difference
  between the actual available data returned and the declared amount
  indicates how many trailing samples to discard from the decoding
  process.

\end{itemize}
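
The page-size and granule-position rules above can be sketched in Python.
This is a minimal illustration under stated assumptions, not a reference
implementation.  The function names are hypothetical, and block sizes are
passed in as plain integers rather than decoded from packets.  Two facts are
taken from outside this section: the 27-byte fixed Ogg page header with one
lacing value per segment (Ogg framing spec), and the 30-byte identification
header together with the rule that every audio packet after the first
returns a quarter of the previous block size plus a quarter of the current
block size (Vorbis I decode description).

```python
from typing import List

OGG_PAGE_HEADER_BYTES = 27   # fixed part of an Ogg page header (assumed, framing spec)
VORBIS_ID_HEADER_BYTES = 30  # Vorbis I identification header packet (assumed, codec spec)


def first_page_size() -> int:
    """Bytes in the first page: fixed header, one lacing value, the packet."""
    lacing_values = 1  # a 30-byte packet fits in a single segment
    return OGG_PAGE_HEADER_BYTES + lacing_values + VORBIS_ID_HEADER_BYTES


def samples_per_packet(blocksizes: List[int]) -> List[int]:
    """Per-channel PCM samples each audio packet adds to the decoder output.

    The first packet only primes the overlap and returns nothing; every
    later packet returns previous/4 + current/4 samples.
    """
    counts = []
    previous = None
    for current in blocksizes:
        counts.append(0 if previous is None else previous // 4 + current // 4)
        previous = current
    return counts


def start_offset(first_page_granule: int, first_page_blocksizes: List[int]) -> int:
    """PCM position at which audio begins, inferred from the first audio page."""
    return first_page_granule - sum(samples_per_packet(first_page_blocksizes))


def trailing_discard(final_granule: int, start: int, blocksizes: List[int]) -> int:
    """Samples to drop from the end when the final page declares less audio."""
    decoded_end = start + sum(samples_per_packet(blocksizes))
    return max(0, decoded_end - final_granule)
```

With block sizes 2048, 256 and 256, the three packets return 0, 576 and 128
samples.  A first audio page that holds the first two packets and carries
granule position 476 therefore implies a start offset of $-100$, so the
first 100 decoded samples are discarded.  If the final page of that
three-packet stream (starting at zero) declares granule position 700, the
last 4 of the 704 decoded samples are dropped.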