% Ncdu Binary Export File Format

This document describes the new binary file format added in ncdu 2.6. This
format offers the following advantages compared to the [JSON export file
format](/ncdu/jsonfmt):

- Support for exporting data from a multithreaded filesystem scan with minimal
  thread-local buffering and minimal synchronisation between threads.
- Support for reading the directory tree in depth-first, breadth-first and
  mixed iteration order, thus permitting interactive browsing through the tree
  without reading the entire file.
- Cumulative directory sizes are included in the exported data, allowing
  readers to display this data without walking through the entire tree.
- Built-in support for compression.

These features come at the cost of increased complexity. The JSON format is
generally easier to work with and therefore still the recommended approach for
external tooling to interact with ncdu's export/import functionality.

A binary export can be created with the `-O` option to ncdu. It is also
possible to convert to and from the JSON format:

```
ncdu -O export.ncdu /         # Scan root, write to 'export.ncdu'
ncdu -f in.json -O out.ncdu   # Convert from JSON to binary
ncdu -f in.ncdu -o out.json   # Convert from binary to JSON
```

# Format description

## File signature

An exported file starts with the following file signature (in hex):

```
bf 6e 63 64 75 45 58 31
```

Formatted as a C string, that is `"\xbfncduEX1"`.

Non-backwards compatible changes to the export format should use a different
file signature. N.B. A different compression algorithm is a non-backwards
compatible change.

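As an illustration, a reader might validate the signature before parsing any
blocks. A minimal Python sketch (the function name is my own, not part of
ncdu):

```python
# The 8-byte signature of an ncdu binary export, "\xbfncduEX1" as a C string.
NCDU_SIGNATURE = bytes.fromhex("bf6e636475455831")

def check_signature(header: bytes) -> bool:
    """Return True when the first 8 bytes match the ncdu EX1 signature."""
    return header[:8] == NCDU_SIGNATURE
```

Since an incompatible format change gets a different signature, this check
doubles as a version check.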
## Block format

The file signature is followed by one or more *blocks*. A block has the
following format:

------- ------- ---------------
TypeLen 4 bytes Big-endian type + length of this block
Content n bytes n = Length - 8
TypeLen 4 bytes Repeat of TypeLen
------- ------- ---------------

The high 4 bits of the *TypeLen* indicate the block type, the lower 28 bits
encode the length of the block, including the header and footer.

The *TypeLen* is repeated at the end of the block to allow for reading the file
in both the forward and backward directions.

The block type determines how the *Content* should be interpreted. There are
currently two block types:

Type Meaning
---- --
0    Data block
1    Index block
---- --

Parsers should ignore blocks with an unknown type.

A valid file must have at least one data block and exactly one index block. The
index block must be the last block in the file.

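The *TypeLen* packing can be sketched in a few lines of Python (helper names
are illustrative, not ncdu code):

```python
import struct

def decode_typelen(word: bytes) -> tuple[int, int]:
    """Split a 4-byte big-endian TypeLen into (block type, total block length)."""
    (v,) = struct.unpack(">I", word)
    btype = v >> 28          # high 4 bits: block type (0 = data, 1 = index)
    length = v & 0x0FFFFFFF  # low 28 bits: length incl. 8 bytes header+footer
    return btype, length

def encode_typelen(btype: int, length: int) -> bytes:
    """Pack a block type and total length into the 4-byte TypeLen word."""
    assert 0 <= btype < 16 and 0 <= length < 1 << 28
    return struct.pack(">I", (btype << 28) | length)
```

Because the same word is written as both header and footer, a reader seeking
backwards can read the trailing 4 bytes of a block and skip to the previous
block's footer.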
## Data blocks

Data blocks have the following contents:

---------------- ------- ------
Number           4 bytes Big-endian unsigned block number
Compressed\_data n bytes
---------------- ------- ------

Every data block must have a unique number, starting from zero and ideally (but
not necessarily) allocated without gaps. Data blocks may appear in a different
order than their numbering.

Data is compressed with [Zstandard](http://www.zstd.net/). Data must be
compressed in a single frame and the uncompressed size must be available
through `ZSTD_getFrameContentSize()`, so that readers can pre-allocate a
properly-sized buffer for decompression.

The total length of a data block, including block header and footer, must not
exceed 16 MiB minus one byte. The total size of the decompressed data must also
not exceed 16 MiB minus one byte.

The decompressed data consists of a stream of one or more *Items* (see below).

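Splitting a raw data block into its number and Zstandard frame is then
straightforward. A hedged sketch (my own names; actual decompression, e.g. via
the third-party `zstandard` package, is left out):

```python
import struct

def parse_data_block(block: bytes) -> tuple[int, bytes]:
    """Split one full data block (header..footer) into (number, zstd frame)."""
    if len(block) < 12:
        raise ValueError("block too short")
    (head,) = struct.unpack(">I", block[:4])
    (foot,) = struct.unpack(">I", block[-4:])
    if head != foot:
        raise ValueError("TypeLen footer does not match header")
    btype, length = head >> 28, head & 0x0FFFFFFF
    if btype != 0:
        raise ValueError("not a data block")
    if length != len(block) or length > (1 << 24) - 1:
        raise ValueError("bad block length")
    (number,) = struct.unpack(">I", block[4:8])
    return number, block[8:-4]  # content after Number is a single zstd frame
```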
## Index block

The index block provides a lookup table for data blocks and a reference to the
root item:

--------------- ----------
Block\_pointers n\*8 bytes
Root\_itemref   8 bytes
--------------- ----------

*Block\_pointers* is an array containing an 8-byte pointer for each data block
in the file. Pointers are indexed by block number, so the first pointer is for
block number 0, the second pointer for block number 1, etc. Each pointer is
interpreted as a 64-bit big-endian unsigned integer. The higher 40 bits
indicate the byte offset of the data block header, relative to the start of
the file. The lower 24 bits indicate the block length and must be equal to the
length in the *TypeLen* of the corresponding data block. An all-zero value
indicates that there is no block with this number in the file.

The last 8 bytes of the index block represent an unsigned big-endian integer
that refers to the root item of the directory tree. See *Itemref* below.

## Itemref

An *Itemref* encodes a reference to an *Item*. There are two types:

Absolute
:   An absolute *Itemref* is a 64-bit unsigned integer that encodes a block
    number in the higher 40 bits and a byte offset of the start of the item
    within the block in the lower 24 bits. Every item in the file has exactly
    one absolute *Itemref* value. The *Root\_itemref* in the index block must
    be absolute.

Relative
:   A relative *Itemref* is a negative integer that represents the byte offset
    of the referenced item relative to the start of the item containing the
    reference. Relative references can only reference a previously written item
    within the same block.

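Resolving both kinds to a (block, offset) pair might look like this sketch
(my interpretation of the two encodings above; names are illustrative):

```python
def resolve_itemref(ref: int, cur_block: int, cur_offset: int) -> tuple[int, int]:
    """Resolve an Itemref to (block number, offset within that block).

    Non-negative values are absolute; negative values are relative to the
    start of the item containing the reference, within the same block.
    """
    if ref >= 0:
        return ref >> 24, ref & 0xFFFFFF  # absolute: 40-bit block, 24-bit offset
    target = cur_offset + ref             # relative: ref is negative
    if target < 0:
        raise ValueError("relative itemref points before the block start")
    return cur_block, target
```

Note that the offset must still be bounds-checked against the decompressed
block size, as discussed under *Security considerations* below.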
## Item

An *Item* represents a file or directory entry, encoded as a
[CBOR](https://cbor.io/) map. Key/value pairs may be encoded in any order and
unknown keys are ignored. Summary of keys recognized by ncdu:

Key  Field    Value
---- -------- --------
0    type     i32
1    name     String
2    prev     Itemref
3    asize    u64
4    dsize    u64
5    dev      u64
6    rderr    bool
7    cumasize u64
8    cumdsize u64
9    shrasize u64
10   shrdsize u64
11   items    u64
12   sub      Itemref
13   ino      u64
14   nlink    u32
15   uid      u32
16   gid      u32
17   mode     u16
18   mtime    u64
---- -------- --------

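Since the Python standard library has no CBOR support, a tiny definite-length
encoder is enough to illustrate how an item is serialized. This is a sketch of
standard CBOR encoding (RFC 8949) covering only the types used here; the
helper names are mine, not ncdu code:

```python
def _cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type in the top 3 bits, argument below."""
    if n < 24:
        return bytes([major << 5 | n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([major << 5 | ai]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for CBOR")

def encode_item(fields: dict[int, object]) -> bytes:
    """Encode one Item as a definite-length CBOR map with integer keys."""
    out = _cbor_head(5, len(fields))              # major type 5: map
    for key, val in fields.items():
        out += _cbor_head(0, key)                 # keys are small unsigned ints
        if isinstance(val, bool):
            out += b"\xf5" if val else b"\xf4"    # CBOR true / false
        elif isinstance(val, int):
            out += _cbor_head(0, val) if val >= 0 else _cbor_head(1, -1 - val)
        elif isinstance(val, bytes):
            out += _cbor_head(2, len(val)) + val  # definite-length byte string
        else:
            raise TypeError("unsupported value type")
    return out

# A regular file "foo" with a 512-byte apparent size:
item = encode_item({0: 1, 1: b"foo", 3: 512})
```

Real writers may of course use any CBOR library, as long as names are encoded
with a known length (no indefinite-length strings, see *name* below).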
**Common fields for all items**

type
:   Mandatory. A negative value indicates that the item has been excluded
    from the size calculations for some reason; positive values are used for
    different item types:

    --- --
    -4  Excluded with `--exclude-kernfs`
    -3  Excluded with `-x`
    -2  Excluded by pattern match
    -1  Error while reading this entry
    0   Directory
    1   Regular file
    2   Non-regular file (symlink, device, etc.)
    3   Hardlink candidate (i.e. `stat().st_nlink > 1`)
    --- --

    Unrecognized negative values are treated as equivalent to -2, unrecognized
    positive values are treated as a non-regular file (type=2).

name
:   Mandatory. Ncdu always encodes the name as a byte string, but also accepts
    UTF-8 text strings. Ncdu does not support indefinite-length CBOR strings;
    the name must be encoded with a known length.

prev
:   Reference to the previous item in the same directory. This field must be
    absent if this is the first item in a directory. This field forms a
    singly-linked list of all items in a directory.

**Fields for type >= 0**

asize
:   Apparent size of this file/directory as reported by `stat().st_size`.
    Optional, defaults to 0.

dsize
:   Disk usage of this file/directory as reported by `stat().st_blocks`
    multiplied by the block size. Optional, defaults to 0.

**Fields for type = 0**

dev
:   Device number. Optional, defaults to the same device number as the parent
    directory, or 0 if this is the root item.

rderr
:   Whether an error occurred while reading this directory. When *true*, an
    error occurred while reading the directory list itself and the list may
    therefore be incomplete. When *false*, an error occurred while reading a
    child item. This implies that somewhere in this sub-tree there must be at
    least one item of `type=-1` or a directory with `rderr=true`.

cumasize
:   Cumulative apparent size of this directory. Optional, defaults to 0.

cumdsize
:   Cumulative disk usage of this directory. Optional, defaults to 0.

shrasize
:   Shared apparent size. Optional, defaults to 0.

shrdsize
:   Shared disk usage. Optional, defaults to 0.

items
:   Cumulative number of items in this directory. Ncdu currently caps this
    number to `2^32-1` when reading, but supports larger numbers when
    exporting. Optional, defaults to 0.

sub
:   Reference to the last item in this directory, or absent if the directory is
    empty.

**Fields for type=3**

ino
:   Inode number.

nlink
:   Number of links to this inode.

**Extended information**

These fields are only exported when the `-e` flag is passed to ncdu. They are
relevant to all items with type >= 0.

uid
:   User id.

gid
:   Group id.

mode
:   File mode.

mtime
:   Last modification time as a UNIX timestamp.

||||
# Limitations

Compressed data block size
:   16 MiB minus 1 byte. This limit comes from *Block\_pointers* in the index
    block using 24 bits to encode the block length.

Uncompressed data block size
:   16 MiB minus 1 byte. This limit comes from *Itemref* encoding the item
    offset in 24 bits.

Largest data block number
:   33,554,428. The size of the index block is limited by the 28-bit length in
    the block's *TypeLen* header, which limits the number of *Block\_pointers*
    it can hold to `((2^28 - 1) - 16) / 8` (subtract one to get the maximum
    block number because counting starts at 0).

Compressed data size
:   Excluding block overhead, the total amount of compressed data is limited to
    about 1 TiB. This is limited by *Block\_pointers* using 40 bits to encode
    the data block offset within the file.

Uncompressed data size
:   Limited by either the maximum number of data blocks or the compressed data
    size, depending on the compression ratio and the chosen data block size.
    Assuming the number of data blocks is the limit, about 512 TiB of
    uncompressed data can be stored with the maximum data block size of 16 MiB.
    Ncdu's adaptive block size selection has a limit of about 40 TiB.

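All of these numbers follow directly from the field widths; a quick arithmetic
check of the limits above:

```python
# 24-bit block length: 16 MiB minus one byte.
MAX_BLOCK_LEN = (1 << 24) - 1

# Index block content is capped by its own 28-bit TypeLen; subtract 16 bytes
# (4-byte header + 4-byte footer + 8-byte Root_itemref) for the pointer array.
MAX_POINTERS = ((1 << 28) - 1 - 16) // 8
MAX_BLOCK_NUMBER = MAX_POINTERS - 1  # numbering starts at 0

# 40-bit block offsets limit the file to about 1 TiB of compressed data.
MAX_COMPRESSED_SIZE = 1 << 40

assert MAX_BLOCK_LEN == 16 * 2**20 - 1
assert MAX_BLOCK_NUMBER == 33_554_428
assert MAX_COMPRESSED_SIZE == 2**40  # exactly 1 TiB
# With maximum-size blocks, roughly 512 TiB of uncompressed data fits:
assert MAX_POINTERS * MAX_BLOCK_LEN > 511 * 2**40
```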
The real question is how many items an export can hold with the above limits in
place. This will depend heavily on the average encoded item size and the
compression ratio, both of which can vary wildly from one directory structure
to another.

I've had one report with ~1.4 billion files resulting in a ~21 GiB file.
Extrapolating from that and assuming the compressed data size is the limiting
factor, this format could hold ~68 billion items. Increasing the compression
level and using larger data block sizes to further improve the compression
ratio, one could perhaps store about 100 billion items. On the one hand, that
sounds like an insane number nobody will ever reach. On the other hand, a
decade ago I couldn't imagine people having more than 100 million files, yet
here we are.

On the upside, all the major limitations can be attributed to the maximum size
and format of the index block. It's possible to implement an alternative index
format in the future that can be switched to automatically whenever any of the
above limits are exceeded, thus providing a seamless upgrade path without
breaking compatibility for existing exports that do fit within the limits.

# Security considerations

Directory trees can get very large and you can easily exceed available RAM when
attempting to read everything into memory. Reading only small parts of the tree
can help cut down on memory use, but it's still a good idea to implement limits
or to detect and handle the case where you're about to run out of memory.

There are several places in the format where byte offsets are used to refer to
blocks or items. These offsets must be validated to ensure that they stay
within the bounds of the respective file or block. In particular, itemref
offsets could potentially refer to memory before (in the case of a relative
itemref) or after (absolute itemref) the decompressed data, and pointers in the
index block could refer to offsets beyond the end of the file.

The CBOR encoding used for items is self-delimiting, but a badly formatted item
may not be properly terminated before the end of the decompressed block
contents. Readers should take care that this does not lead to reading past the
allocated buffer.

In a well-formed directory tree, each item is referenced exactly once by either
the *Root\_itemref* or a *prev* or *sub* field. However, it is also possible to
construct a file where this is not the case, and implementers should be aware
that itemref loops are possible.

||||
# Implementation notes

Data block size
:   It is up to the file writer to choose a suitable data block size. This is a
    compromise between compression efficiency and memory use: larger blocks
    compress better but also require more memory, both for reading and writing.
    Ncdu currently keeps 8 uncompressed blocks in memory when reading and one
    block per thread when writing. Ncdu starts with blocks of 64 KiB, but
    gradually increases the size to 2 MiB for very large directory trees, in
    order to not bloat the index size too much and to prevent running into the
    maximum data block number limit.

Testing
:   If you're implementing a custom writer for this format, make sure to check
    out the
    [ncdubinexp.pl](https://code.blicky.net/yorhel/ncdu/src/ncdubinexp.pl)
    script in the git repository. Ncdu only reads the parts of a file that it
    actually needs, so passing a file to ncdu is no guarantee that it is
    well-formed. The ncdubinexp.pl script is more thorough in validating file
    correctness but misses a few invariants that ncdu does check for, so the
    best way to verify a file is to run both:

    ```
    ncdu -f file.ncdu -o/dev/null  # Read entire tree and export to /dev/null
    ncdubinexp.pl <file.ncdu       # Read and verify the entire file
    ```