File format
===========

  <beginning_of_file>
  [data block 1]
  [data block 2]
  ...
  [data block N]
  [meta block 1]
  ...
  [meta block K]
  [metaindex block]
  [index block]
  [Footer]        (fixed size; starts at file_size - sizeof(Footer))
  <end_of_file>

The file contains internal pointers.  Each such pointer is called
a BlockHandle and contains the following information:
  offset:   varint64
  size:     varint64
See https://developers.google.com/protocol-buffers/docs/encoding#varints
for an explanation of the varint64 format.
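
As an illustration of the encoding above, here is a minimal sketch of
writing and reading a BlockHandle as two varint64 values.  The function
and struct names are hypothetical and chosen for this example only; they
are not taken from the library sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends v as a varint64: seven payload bits per byte, least significant
// group first, high bit set on every byte except the last.
inline void AppendVarint64(std::string* dst, uint64_t v) {
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Decodes one varint64 starting at data[*pos] and advances *pos past it.
// Returns false if the input is truncated or longer than 10 bytes.
inline bool ParseVarint64(const std::string& data, size_t* pos, uint64_t* v) {
  uint64_t result = 0;
  for (int shift = 0; shift <= 63 && *pos < data.size(); shift += 7) {
    const uint64_t byte = static_cast<unsigned char>(data[(*pos)++]);
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *v = result;
      return true;
    }
  }
  return false;
}

// Hypothetical in-memory form of a BlockHandle.
struct HandleSketch {
  uint64_t offset;
  uint64_t size;
};

// Encodes offset, then size, each as a varint64.
inline std::string EncodeHandle(const HandleSketch& h) {
  std::string out;
  AppendVarint64(&out, h.offset);
  AppendVarint64(&out, h.size);
  return out;
}

// Decodes a handle written by EncodeHandle.
inline bool DecodeHandle(const std::string& data, size_t* pos,
                         HandleSketch* h) {
  return ParseVarint64(data, pos, &h->offset) &&
         ParseVarint64(data, pos, &h->size);
}
```

A varint64 occupies at most 10 bytes, so an encoded handle occupies at
most 20 bytes; small offsets and sizes take far less.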

(1) The sequence of key/value pairs in the file is stored in sorted
order and partitioned into a sequence of data blocks.  These blocks
come one after another at the beginning of the file.  Each data block
is formatted according to the code in block_builder.cc, and then
optionally compressed.

(2) After the data blocks we store a sequence of meta blocks.  The
supported meta block types are described below.  More meta block types
may be added in the future.  Each meta block is again formatted using
block_builder.cc and then optionally compressed.

(3) A "metaindex" block.  It contains one entry for every other meta
block, where the key is the name of the meta block and the value is a
BlockHandle pointing to that meta block.

(4) An "index" block.  This block contains one entry per data block,
where the key is a string >= the last key in that data block and < the
first key in the next data block.  The value is the BlockHandle for
the data block.
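
The format only constrains the index key to lie between the two
neighbouring blocks.  The sketch below shows one possible way to pick a
short key in that range, assuming keys are ordered bytewise.  The
function name is hypothetical, and this is not necessarily how the
table builder chooses its index keys.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns a key k with last_key <= k < next_first_key under bytewise
// ordering, shortened where that is easy.  Falls back to last_key itself,
// which always satisfies the constraint when the input keys are sorted.
inline std::string ShortIndexKey(const std::string& last_key,
                                 const std::string& next_first_key) {
  const size_t min_len = std::min(last_key.size(), next_first_key.size());
  size_t diff = 0;
  while (diff < min_len && last_key[diff] == next_first_key[diff]) {
    ++diff;
  }
  if (diff < min_len) {
    const uint8_t a = static_cast<uint8_t>(last_key[diff]);
    const uint8_t b = static_cast<uint8_t>(next_first_key[diff]);
    // Bump the first differing byte only if the result stays below b.
    if (a < 0xff && a + 1 < b) {
      std::string key = last_key.substr(0, diff + 1);
      key[diff] = static_cast<char>(a + 1);
      return key;
    }
  }
  return last_key;  // one key is a prefix of the other, or no room to bump
}
```

For example, with a block ending at "the quick brown fox" and the next
block starting at "the who", this sketch yields "the r", which is much
shorter than either key and still separates the two blocks.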

(5) At the very end of the file is a fixed-length footer that contains
the BlockHandles of the metaindex and index blocks as well as a magic number.
       metaindex_handle: char[p];      // Block handle for metaindex
       index_handle:     char[q];      // Block handle for index
       padding:          char[40-p-q]; // zeroed bytes to make fixed length
                                       // (40==2*BlockHandle::kMaxEncodedLength)
       magic:            fixed64;      // == 0xdb4775248b80fb57 (little-endian)
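
The footer layout above implies a total size of 48 bytes (40 bytes of
handles and padding plus the 8-byte magic).  The following is a minimal
sketch of parsing those 48 bytes under that reading.  All names are
hypothetical, and reading the tail of the file is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

constexpr size_t kHandleArea = 40;               // handles plus zero padding
constexpr size_t kFooterBytes = kHandleArea + 8; // plus fixed64 magic
constexpr uint64_t kMagic = 0xdb4775248b80fb57ull;

struct FooterHandle {
  uint64_t offset;
  uint64_t size;
};

struct FooterSketch {
  FooterHandle metaindex;
  FooterHandle index;
};

// Minimal varint64 reader over [*p, limit); advances *p on success.
inline bool ReadVarint64(const unsigned char** p, const unsigned char* limit,
                         uint64_t* v) {
  uint64_t result = 0;
  for (int shift = 0; shift <= 63 && *p < limit; shift += 7) {
    const uint64_t byte = *(*p)++;
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *v = result;
      return true;
    }
  }
  return false;
}

// Parses the last kFooterBytes bytes of a table file.  Returns false on a
// wrong length, a bad magic number, or a malformed handle.
inline bool ParseFooter(const std::string& tail, FooterSketch* out) {
  if (tail.size() != kFooterBytes) return false;
  const unsigned char* base =
      reinterpret_cast<const unsigned char*>(tail.data());

  // The magic is stored little-endian: least significant byte first.
  uint64_t magic = 0;
  for (int i = 7; i >= 0; --i) {
    magic = (magic << 8) | base[kHandleArea + i];
  }
  if (magic != kMagic) return false;

  // The two handles are packed back to back; the padding is ignored.
  const unsigned char* p = base;
  const unsigned char* limit = base + kHandleArea;
  return ReadVarint64(&p, limit, &out->metaindex.offset) &&
         ReadVarint64(&p, limit, &out->metaindex.size) &&
         ReadVarint64(&p, limit, &out->index.offset) &&
         ReadVarint64(&p, limit, &out->index.size);
}
```

Checking the magic number before decoding the handles lets a reader
reject files that are not tables without interpreting arbitrary bytes
as offsets.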

"filter" Meta Block
-------------------

If a "FilterPolicy" was specified when the database was opened, a
filter block is stored in each table.  The "metaindex" block contains
an entry that maps from "filter.<N>" to the BlockHandle for the filter
block, where "<N>" is the string returned by the filter policy's
"Name()" method.

The filter block stores a sequence of filters, where filter i contains
the output of FilterPolicy::CreateFilter() on all keys that are stored
in a block whose file offset falls within the range

    [ i*base ... (i+1)*base-1 ]

Currently, "base" is 2KB.  So for example, if blocks X and Y start in
the range [ 0KB .. 2KB-1 ], all of the keys in X and Y will be
converted to a filter by calling FilterPolicy::CreateFilter(), and the
resulting filter will be stored as the first filter in the filter
block.
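
Because "base" is a power of two, the range rule above reduces to a
shift.  A small sketch, with hypothetical names and the 2KB base
hard-coded as a default:

```cpp
#include <cstdint>

// lg(base) for the 2KB base mentioned above: 2^11 == 2048.
constexpr int kBaseLg = 11;

// Index of the filter covering a data block that starts at block_offset.
// Equivalent to block_offset / base when base == 2^base_lg.
inline uint64_t FilterIndex(uint64_t block_offset, int base_lg = kBaseLg) {
  return block_offset >> base_lg;
}
```

Blocks starting at offsets 0 and 2047 both map to filter 0, while a
block starting at offset 2048 maps to filter 1.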

The filter block is formatted as follows:

     [filter 0]
     [filter 1]
     [filter 2]
     ...
     [filter N-1]

     [offset of filter 0]                  : 4 bytes
     [offset of filter 1]                  : 4 bytes
     [offset of filter 2]                  : 4 bytes
     ...
     [offset of filter N-1]                : 4 bytes

     [offset of beginning of offset array] : 4 bytes
     lg(base)                              : 1 byte

The offset array at the end of the filter block allows efficient
mapping from a data block offset to the corresponding filter.
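
The lookup this layout enables can be sketched as follows.  The names
are hypothetical, and the text above does not state the byte order of
the 4-byte fields, so little-endian is assumed here to match the footer
magic.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Reads a 4-byte integer at s[pos..pos+3].  Little-endian byte order is an
// assumption of this sketch.
inline uint32_t ReadFixed32(const std::string& s, size_t pos) {
  uint32_t v = 0;
  for (int i = 3; i >= 0; --i) {
    v = (v << 8) | static_cast<unsigned char>(s[pos + i]);
  }
  return v;
}

// Copies the filter covering the data block at block_offset into *filter.
// Returns false if the block is malformed or has no filter for that offset.
inline bool FindFilter(const std::string& block, uint64_t block_offset,
                       std::string* filter) {
  const size_t n = block.size();
  if (n < 5) return false;  // needs the 4-byte array offset and lg(base)
  const int base_lg = static_cast<unsigned char>(block[n - 1]);
  if (base_lg >= 64) return false;
  const uint32_t array_start = ReadFixed32(block, n - 5);
  if (array_start > n - 5) return false;
  const size_t num_filters = (n - 5 - array_start) / 4;

  const uint64_t index = block_offset >> base_lg;
  if (index >= num_filters) return false;

  // Filter i spans [offset[i], offset[i+1]).  For the last filter the
  // "next offset" slot is the array-start field, which is also where the
  // filter data ends, so the same read works for every i.
  const uint32_t start = ReadFixed32(block, array_start + 4 * index);
  const uint32_t limit = ReadFixed32(block, array_start + 4 * index + 4);
  if (start > limit || limit > array_start) return false;
  filter->assign(block, start, limit - start);
  return true;
}
```

An empty result is valid: a range of file offsets in which no data
block starts has a zero-length filter.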

"stats" Meta Block
------------------

This meta block contains a set of statistics.  The key is the name
of the statistic.  The value contains the statistic.
TODO(postrelease): record the following stats.
  data size
  index size
  key size (uncompressed)
  value size (uncompressed)
  number of entries
  number of data blocks