1% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
2%!TEX root = Vorbis_I_spec.tex
3% $Id$
4\section{Residue setup and decode} \label{vorbis:spec:residue}
5
6\subsection{Overview}
7
8A residue vector represents the fine detail of the audio spectrum of
9one channel in an audio frame after the encoder subtracts the floor
10curve and performs any channel coupling.  A residue vector may
11represent spectral lines, spectral magnitude, spectral phase or
12hybrids as mixed by channel coupling.  The exact semantic content of
13the vector does not matter to the residue abstraction.
14
15Whatever the exact qualities, the Vorbis residue abstraction codes the
16residue vectors into the bitstream packet, and then reconstructs the
17vectors during decode.  Vorbis makes use of three different encoding
18variants (numbered 0, 1 and 2) of the same basic vector encoding
19abstraction.
20
21
22
23\subsection{Residue format}
24
25Residue format partitions each vector in the vector bundle into chunks,
26classifies each chunk, encodes the chunk classifications and finally
27encodes the chunks themselves using the the specific VQ arrangement
28defined for each selected classification.
29The exact interleaving and partitioning vary by residue encoding number,
30however the high-level process used to classify and encode the residue
31vector is the same in all three variants.
32
33A set of coded residue vectors are all of the same length.  High level
34coding structure, ignoring for the moment exactly how a partition is
35encoded and simply trusting that it is, is as follows:
36
37\begin{itemize}
38\item Each vector is partitioned into multiple equal sized chunks
39according to configuration specified.  If we have a vector size of
40\emph{n}, a partition size \emph{residue_partition_size}, and a total
41of \emph{ch} residue vectors, the total number of partitioned chunks
42coded is \emph{n}/\emph{residue_partition_size}*\emph{ch}.  It is
43important to note that the integer division truncates.  In the below
44example, we assume an example \emph{residue_partition_size} of 8.
45
46\item Each partition in each vector has a classification number that
47specifies which of multiple configured VQ codebook setups are used to
48decode that partition.  The classification numbers of each partition
49can be thought of as forming a vector in their own right, as in the
50illustration below.  Just as the residue vectors are coded in grouped
51partitions to increase encoding efficiency, the classification vector
52is also partitioned into chunks.  The integer elements of each scalar
53in a classification chunk are built into a single scalar that
54represents the classification numbers in that chunk.  In the below
55example, the classification codeword encodes two classification
56numbers.
57
58\item The values in a residue vector may be encoded monolithically in a
59single pass through the residue vector, but more often efficient
60codebook design dictates that each vector is encoded as the additive
61sum of several passes through the residue vector using more than one
62VQ codebook.  Thus, each residue value potentially accumulates values
63from multiple decode passes.  The classification value associated with
64a partition is the same in each pass, thus the classification codeword
65is coded only in the first pass.
66
67\end{itemize}
68
69
70\begin{center}
71\includegraphics[width=\textwidth]{residue-pack}
72\captionof{figure}{illustration of residue vector format}
73\end{center}
74
75
76
77\subsection{residue 0}
78
79Residue 0 and 1 differ only in the way the values within a residue
80partition are interleaved during partition encoding (visually treated
81as a black box--or cyan box or brown box--in the above figure).
82
83Residue encoding 0 interleaves VQ encoding according to the
84dimension of the codebook used to encode a partition in a specific
85pass.  The dimension of the codebook need not be the same in multiple
86passes, however the partition size must be an even multiple of the
87codebook dimension.
88
89As an example, assume a partition vector of size eight, to be encoded
90by residue 0 using codebook sizes of 8, 4, 2 and 1:
91
92\begin{programlisting}
93
94            original residue vector: [ 0 1 2 3 4 5 6 7 ]
95
96codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
97
98codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
99
100codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
101
102codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
103
104\end{programlisting}
105
106It is worth mentioning at this point that no configurable value in the
107residue coding setup is restricted to a power of two.
108
109
110
111\subsection{residue 1}
112
113Residue 1 does not interleave VQ encoding.  It represents partition
114vector scalars in order.  As with residue 0, however, partition length
115must be an integer multiple of the codebook dimension, although
116dimension may vary from pass to pass.
117
118As an example, assume a partition vector of size eight, to be encoded
119by residue 0 using codebook sizes of 8, 4, 2 and 1:
120
121\begin{programlisting}
122
123            original residue vector: [ 0 1 2 3 4 5 6 7 ]
124
125codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
126
127codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
128
129codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
130
131codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
132
133\end{programlisting}
134
135
136
137\subsection{residue 2}
138
139Residue type two can be thought of as a variant of residue type 1.
140Rather than encoding multiple passed-in vectors as in residue type 1,
141the \emph{ch} passed in vectors of length \emph{n} are first
142interleaved and flattened into a single vector of length
143\emph{ch}*\emph{n}.  Encoding then proceeds as in type 1. Decoding is
144as in type 1 with decode interleave reversed. If operating on a single
145vector to begin with, residue type 1 and type 2 are equivalent.
146
147\begin{center}
148\includegraphics[width=\textwidth]{residue2}
149\captionof{figure}{illustration of residue type 2}
150\end{center}
151
152
153\subsection{Residue decode}
154
155\subsubsection{header decode}
156
157Header decode for all three residue types is identical.
158\begin{programlisting}
159  1) [residue_begin] = read 24 bits as unsigned integer
160  2) [residue_end] = read 24 bits as unsigned integer
161  3) [residue_partition_size] = read 24 bits as unsigned integer and add one
162  4) [residue_classifications] = read 6 bits as unsigned integer and add one
163  5) [residue_classbook] = read 8 bits as unsigned integer
164\end{programlisting}
165
166\varname{[residue_begin]} and
167\varname{[residue_end]} select the specific sub-portion of
168each vector that is actually coded; it implements akin to a bandpass
169where, for coding purposes, the vector effectively begins at element
170\varname{[residue_begin]} and ends at
171\varname{[residue_end]}.  Preceding and following values in
172the unpacked vectors are zeroed.  Note that for residue type 2, these
173values as well as \varname{[residue_partition_size]}apply to
174the interleaved vector, not the individual vectors before interleave.
175\varname{[residue_partition_size]} is as explained above,
176\varname{[residue_classifications]} is the number of possible
177classification to which a partition can belong and
178\varname{[residue_classbook]} is the codebook number used to
179code classification codewords.  The number of dimensions in book
180\varname{[residue_classbook]} determines how many
181classification values are grouped into a single classification
182codeword.  Note that the number of entries and dimensions in book
183\varname{[residue_classbook]}, along with
184\varname{[residue_classifications]}, overdetermines to
185possible number of classification codewords.  
186If \varname{[residue_classifications]}\^{}\varname{[residue_classbook]}.dimensions
187exceeds \varname{[residue_classbook]}.entries, the
188bitstream should be regarded to be undecodable.
189
190Next we read a bitmap pattern that specifies which partition classes
191code values in which passes.
192
193\begin{programlisting}
194  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
195
196       2) [high_bits] = 0
197       3) [low_bits] = read 3 bits as unsigned integer
198       4) [bitflag] = read one bit as boolean
199       5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
200       6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
201     }
202  7) done
203\end{programlisting}
204
205Finally, we read in a list of book numbers, each corresponding to
206specific bit set in the cascade bitmap.  We loop over the possible
207codebook classifications and the maximum possible number of encoding
208stages (8 in Vorbis I, as constrained by the elements of the cascade
209bitmap being eight bits):
210
211\begin{programlisting}
212  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
213
214       2) iterate [j] over the range 0 ... 7 {
215
216            3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
217
218                 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
219
220               } else {
221
222                 5) array [residue_books] element [i][j] = unused
223
224               }
225          }
226      }
227
228  6) done
229\end{programlisting}
230
231An end-of-packet condition at any point in header decode renders the
232stream undecodable.  In addition, any codebook number greater than the
233maximum numbered codebook set up in this stream also renders the
234stream undecodable. All codebooks in array [residue_books] are
235required to have a value mapping.  The presence of codebook in array
236[residue_books] without a value mapping (maptype equals zero) renders
237the stream undecodable.
238
239
240
241\subsubsection{packet decode}
242
243Format 0 and 1 packet decode is identical except for specific
244partition interleave.  Format 2 packet decode can be built out of the
245format 1 decode process.  Thus we describe first the decode
246infrastructure identical to all three formats.
247
248In addition to configuration information, the residue decode process
249is passed the number of vectors in the submap bundle and a vector of
250flags indicating if any of the vectors are not to be decoded.  If the
251passed in number of vectors is 3 and vector number 1 is marked 'do not
252decode', decode skips vector 1 during the decode loop.  However, even
253'do not decode' vectors are allocated and zeroed.
254
255Depending on the values of \varname{[residue_begin]} and
256\varname{[residue_end]}, it is obvious that the encoded
257portion of a residue vector may be the entire possible residue vector
258or some other strict subset of the actual residue vector size with
259zero padding at either uncoded end.  However, it is also possible to
260set \varname{[residue_begin]} and
261\varname{[residue_end]} to specify a range partially or
262wholly beyond the maximum vector size.  Before beginning residue
263decode, limit \varname{[residue_begin]} and
264\varname{[residue_end]} to the maximum possible vector size
265as follows.  We assume that the number of vectors being encoded,
266\varname{[ch]} is provided by the higher level decoding
267process.
268
269\begin{programlisting}
270  1) [actual_size] = current blocksize/2;
271  2) if residue encoding is format 2
272       3) [actual_size] = [actual_size] * [ch];
273  4) [limit_residue_begin] = maximum of ([residue_begin],[actual_size]);
274  5) [limit_residue_end] = maximum of ([residue_end],[actual_size]);
275\end{programlisting}
276
277The following convenience values are conceptually useful to clarifying
278the decode process:
279
280\begin{programlisting}
281  1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
282  2) [n_to_read] = [limit_residue_end] - [limit_residue_begin]
283  3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
284\end{programlisting}
285
286Packet decode proceeds as follows, matching the description offered earlier in the document.
287\begin{programlisting}
288  1) allocate and zero all vectors that will be returned.
289  2) if ([n_to_read] is zero), stop; there is no residue to decode.
290  3) iterate [pass] over the range 0 ... 7 {
291
292       4) [partition_count] = 0
293
294       5) while [partition_count] is less than [partitions_to_read]
295
296            6) if ([pass] is zero) {
297
298                 7) iterate [j] over the range 0 .. [ch]-1 {
299
300                      8) if vector [j] is not marked 'do not decode' {
301
302                           9) [temp] = read from packet using codebook [residue_classbook] in scalar context
303                          10) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
304
305                               11) array [classifications] element [j],([i]+[partition_count]) =
306                                   [temp] integer modulo [residue_classifications]
307                               12) [temp] = [temp] / [residue_classifications] using integer division
308
309                              }
310
311                         }
312
313                    }
314
315               }
316
317           13) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
318               is also less than [partitions_to_read] {
319
320                 14) iterate [j] over the range 0 .. [ch]-1 {
321
322                      15) if vector [j] is not marked 'do not decode' {
323
324                           16) [vqclass] = array [classifications] element [j],[partition_count]
325                           17) [vqbook] = array [residue_books] element [vqclass],[pass]
326                           18) if ([vqbook] is not 'unused') {
327
328                                19) decode partition into output vector number [j], starting at scalar
329                                    offset [limit_residue_begin]+[partition_count]*[residue_partition_size] using
330                                    codebook number [vqbook] in VQ context
331                          }
332                     }
333
334                 20) increment [partition_count] by one
335
336               }
337          }
338     }
339
340 21) done
341
342\end{programlisting}
343
344An end-of-packet condition during packet decode is to be considered a
345nominal occurrence.  Decode returns the result of vector decode up to
346that point.
347
348
349
350\subsubsection{format 0 specifics}
351
352Format zero decodes partitions exactly as described earlier in the
353'Residue Format: residue 0' section.  The following pseudocode
354presents the same algorithm. Assume:
355
356\begin{itemize}
357\item  \varname{[n]} is the value in \varname{[residue_partition_size]}
358\item \varname{[v]} is the residue vector
359\item \varname{[offset]} is the beginning read offset in [v]
360\end{itemize}
361
362
363\begin{programlisting}
364 1) [step] = [n] / [codebook_dimensions]
365 2) iterate [i] over the range 0 ... [step]-1 {
366
367      3) vector [entry_temp] = read vector from packet using current codebook in VQ context
368      4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
369
370           5) vector [v] element ([offset]+[i]+[j]*[step]) =
371	        vector [v] element ([offset]+[i]+[j]*[step]) +
372                vector [entry_temp] element [j]
373
374         }
375
376    }
377
378  6) done
379
380\end{programlisting}
381
382
383
384\subsubsection{format 1 specifics}
385
386Format 1 decodes partitions exactly as described earlier in the
387'Residue Format: residue 1' section.  The following pseudocode
388presents the same algorithm. Assume:
389
390\begin{itemize}
391\item  \varname{[n]} is the value in
392\varname{[residue_partition_size]}
393\item \varname{[v]} is the residue vector
394\item \varname{[offset]} is the beginning read offset in [v]
395\end{itemize}
396
397
398\begin{programlisting}
399 1) [i] = 0
400 2) vector [entry_temp] = read vector from packet using current codebook in VQ context
401 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
402
403      4) vector [v] element ([offset]+[i]) =
404	  vector [v] element ([offset]+[i]) +
405          vector [entry_temp] element [j]
406      5) increment [i]
407
408    }
409
410  6) if ( [i] is less than [n] ) continue at step 2
411  7) done
412\end{programlisting}
413
414
415
416\subsubsection{format 2 specifics}
417
418Format 2 is reducible to format 1.  It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
419
420
421Format 2 handles 'do not decode' vectors differently than residue 0 or
4221; if all vectors are marked 'do not decode', no decode occurrs.
423However, if at least one vector is to be decoded, all the vectors are
424decoded.  We then request normal format 1 to decode a single vector
425representing all output channels, rather than a vector for each
426channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:
427
428\begin{enumerate}
429 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step.
430 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}.
431 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to:
432  \begin{programlisting}
433  1) iterate [i] over the range 0 ... [n]-1 {
434
435       2) iterate [j] over the range 0 ... [ch]-1 {
436
437            3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
438
439          }
440     }
441
442  4) done
443  \end{programlisting}
444
445\end{enumerate}
446
447
448
449
450
451
452
453