% Ncdu Binary Export File Format

This document describes the new binary file format added in ncdu 2.6. This
format offers the following advantages compared to the [JSON export file
format](/ncdu/jsonfmt):

- Support for exporting data from a multithreaded filesystem scan with minimal
  thread-local buffering and minimal synchronisation between threads.
- Support for reading the directory tree in depth-first, breadth-first and
  mixed iteration order, thus permitting interactive browsing through the tree
  without reading the entire file.
- Cumulative directory sizes are included in the exported data, allowing
  readers to display this data without walking through the entire tree.
- Built-in support for compression.

These features come at the cost of increased complexity. The JSON format is
generally easier to work with and therefore still the recommended approach for
external tooling to interact with ncdu's export/import functionality.

A binary export can be created with the `-O` option to ncdu. It is also
possible to convert to and from the JSON format:

```
ncdu -O export.ncdu /         # Scan root, write to 'export.ncdu'
ncdu -f in.json -O out.ncdu   # Convert from JSON to binary
ncdu -f in.ncdu -o out.json   # Convert from binary to JSON
```

# Format description

## File signature

An exported file starts with the following file signature (in hex):

```
bf 6e 63 64 75 45 58 31
```

Formatted as a C string, that is `"\xbfncduEX1"`.

Non-backwards compatible changes to the export format should use a different
file signature. N.B. A different compression algorithm is a non-backwards
compatible change.

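As an illustration, a reader might validate the signature before parsing any
blocks. A minimal Python sketch (the function name is my own, not part of
ncdu):

```python
# The 8-byte signature of an ncdu binary export, "\xbfncduEX1" as a C string.
NCDU_SIGNATURE = bytes.fromhex("bf6e636475455831")

def check_signature(header: bytes) -> bool:
    """Return True when the first 8 bytes match the ncdu EX1 signature."""
    return header[:8] == NCDU_SIGNATURE
```

Since an incompatible format change gets a different signature, this check
doubles as a version check.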
## Block format

The file signature is followed by one or more *blocks*. A block has the
following format:

------- ------- ---------------
TypeLen 4 bytes Big-endian type + length of this block
Content n bytes n = Length - 8
TypeLen 4 bytes Repeat of TypeLen
------- ------- ---------------

The high 4 bits of the *TypeLen* indicate the block type, the lower 28 bits
encode the length of the block, including the header and footer.

The *TypeLen* is repeated at the end of the block to allow for reading the file
in both the forward and backward directions.

The block type determines how the *Content* should be interpreted. There are
currently two block types:

Type Meaning
---- --
0    Data block
1    Index block
---- --

Parsers should ignore blocks with an unknown type.

A valid file must have at least one data block and exactly one index block. The
index block must be the last block in the file.

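The *TypeLen* packing can be sketched in a few lines of Python (helper names
are illustrative, not ncdu code):

```python
import struct

def decode_typelen(word: bytes) -> tuple[int, int]:
    """Split a 4-byte big-endian TypeLen into (block type, total block length)."""
    (v,) = struct.unpack(">I", word)
    btype = v >> 28          # high 4 bits: block type (0 = data, 1 = index)
    length = v & 0x0FFFFFFF  # low 28 bits: length incl. 8 bytes header+footer
    return btype, length

def encode_typelen(btype: int, length: int) -> bytes:
    """Pack a block type and total length into the 4-byte TypeLen word."""
    assert 0 <= btype < 16 and 0 <= length < 1 << 28
    return struct.pack(">I", (btype << 28) | length)
```

Because the same word is written as both header and footer, a reader seeking
backwards can read the trailing 4 bytes of a block and skip to the previous
block's footer.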
## Data blocks

Data blocks have the following contents:

---------------- ------- ------
Number           4 bytes Big-endian unsigned block number
Compressed\_data n bytes
---------------- ------- ------

Every data block must have a unique number, starting from zero and ideally (but
not necessarily) allocated without gaps. Data blocks may appear in a different
order than their numbering.

Data is compressed with [Zstandard](http://www.zstd.net/). Data must be
compressed in a single frame and the uncompressed size must be available
through `ZSTD_getFrameContentSize()`, so that readers can pre-allocate a
properly-sized buffer for decompression.

The total length of a data block, including block header and footer, must not
exceed 16 MiB minus one byte. The total size of the decompressed data must also
not exceed 16 MiB minus one byte.

The decompressed data consists of a stream of one or more *Items* (see below).

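Splitting a raw data block into its number and Zstandard frame is then
straightforward. A hedged sketch (my own names; actual decompression, e.g. via
the third-party `zstandard` package, is left out):

```python
import struct

def parse_data_block(block: bytes) -> tuple[int, bytes]:
    """Split one full data block (header..footer) into (number, zstd frame)."""
    if len(block) < 12:
        raise ValueError("block too short")
    (head,) = struct.unpack(">I", block[:4])
    (foot,) = struct.unpack(">I", block[-4:])
    if head != foot:
        raise ValueError("TypeLen footer does not match header")
    btype, length = head >> 28, head & 0x0FFFFFFF
    if btype != 0:
        raise ValueError("not a data block")
    if length != len(block) or length > (1 << 24) - 1:
        raise ValueError("bad block length")
    (number,) = struct.unpack(">I", block[4:8])
    return number, block[8:-4]  # content after Number is a single zstd frame
```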
## Index block

The index block provides a lookup table for data blocks and a reference to the
root item:

--------------- ----------
Block\_pointers n\*8 bytes
Root\_itemref   8 bytes
--------------- ----------

*Block\_pointers* is an array containing an 8-byte pointer for each data block
in the file. Pointers are indexed by block number, so the first pointer is for
block number 0, the second pointer for block number 1, etc. Each pointer is
interpreted as a 64-bit big-endian unsigned integer. The higher 40 bits
indicate the byte offset of the data block header, relative to the start of
the file. The lower 24 bits indicate the block length and must be equal to the
length in the *TypeLen* of the corresponding data block. An all-zero value
indicates that there is no block with this number in the file.

The last 8 bytes of the index block represent an unsigned big-endian integer
that refers to the root item of the directory tree. See *Itemref* below.

## Itemref

An *Itemref* encodes a reference to an *Item*. There are two types:

Absolute
:   An absolute *Itemref* is a 64-bit unsigned integer that encodes a block
    number in the higher 40 bits and a byte offset of the start of the item
    within the block in the lower 24 bits. Every item in the file has exactly
    one absolute *Itemref* value. The *Root\_itemref* in the index block must
    be absolute.

Relative
:   A relative *Itemref* is a negative integer that represents the byte offset
    of the referenced item relative to the start of the item containing the
    reference. Relative references can only reference a previously written item
    within the same block.

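Resolving both kinds to a (block, offset) pair might look like this sketch
(my interpretation of the two encodings above; names are illustrative):

```python
def resolve_itemref(ref: int, cur_block: int, cur_offset: int) -> tuple[int, int]:
    """Resolve an Itemref to (block number, offset within that block).

    Non-negative values are absolute; negative values are relative to the
    start of the item containing the reference, within the same block.
    """
    if ref >= 0:
        return ref >> 24, ref & 0xFFFFFF  # absolute: 40-bit block, 24-bit offset
    target = cur_offset + ref             # relative: ref is negative
    if target < 0:
        raise ValueError("relative itemref points before the block start")
    return cur_block, target
```

Note that the offset must still be bounds-checked against the decompressed
block size, as discussed under *Security considerations* below.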
## Item

An *Item* represents a file or directory entry, encoded as a
[CBOR](https://cbor.io/) map. Key/value pairs may be encoded in any order and
unknown keys are ignored. Summary of keys recognized by ncdu:

Key  Field    Value
---- -------- --------
0    type     i32
1    name     String
2    prev     Itemref
3    asize    u64
4    dsize    u64
5    dev      u64
6    rderr    bool
7    cumasize u64
8    cumdsize u64
9    shrasize u64
10   shrdsize u64
11   items    u64
12   sub      Itemref
13   ino      u64
14   nlink    u32
15   uid      u32
16   gid      u32
17   mode     u16
18   mtime    u64
---- -------- --------

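Since the Python standard library has no CBOR support, a tiny definite-length
encoder is enough to illustrate how an item is serialized. This is a sketch of
standard CBOR encoding (RFC 8949) covering only the types used here; the
helper names are mine, not ncdu code:

```python
def _cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type in the top 3 bits, argument below."""
    if n < 24:
        return bytes([major << 5 | n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([major << 5 | ai]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for CBOR")

def encode_item(fields: dict[int, object]) -> bytes:
    """Encode one Item as a definite-length CBOR map with integer keys."""
    out = _cbor_head(5, len(fields))              # major type 5: map
    for key, val in fields.items():
        out += _cbor_head(0, key)                 # keys are small unsigned ints
        if isinstance(val, bool):
            out += b"\xf5" if val else b"\xf4"    # CBOR true / false
        elif isinstance(val, int):
            out += _cbor_head(0, val) if val >= 0 else _cbor_head(1, -1 - val)
        elif isinstance(val, bytes):
            out += _cbor_head(2, len(val)) + val  # definite-length byte string
        else:
            raise TypeError("unsupported value type")
    return out

# A regular file "foo" with a 512-byte apparent size:
item = encode_item({0: 1, 1: b"foo", 3: 512})
```

Real writers may of course use any CBOR library, as long as names are encoded
with a known length (no indefinite-length strings, see *name* below).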
**Common fields for all items**

type
:   Mandatory. A negative value indicates that the item has been excluded
    from the size calculations for some reason; positive values are used for
    different item types:

    --- --
    -4  Excluded with `--exclude-kernfs`
    -3  Excluded with `-x`
    -2  Excluded by pattern match
    -1  Error while reading this entry
    0   Directory
    1   Regular file
    2   Non-regular file (symlink, device, etc.)
    3   Hardlink candidate (i.e. `stat().st_nlink > 1`)
    --- --

    Unrecognized negative values are treated as equivalent to -2, unrecognized
    positive values are treated as a non-regular file (type=2).

name
:   Mandatory. Ncdu always encodes the name as a byte string, but also accepts
    UTF-8 text strings. Ncdu does not support indefinite-length CBOR strings;
    the name must be encoded with a known length.

prev
:   Reference to the previous item in the same directory. This field must be
    absent if this is the first item in a directory. This field forms a
    singly-linked list of all items in a directory.

**Fields for type >= 0**

asize
:   Apparent size of this file/directory as reported by `stat().st_size`.
    Optional, defaults to 0.

dsize
:   Disk usage of this file/directory as reported by `stat().st_blocks`
    multiplied by the block size. Optional, defaults to 0.

**Fields for type = 0**

dev
:   Device number. Optional, defaults to the same device number as the parent
    directory, or 0 if this is the root item.

rderr
:   Whether an error occurred while reading this directory. When *true*, an
    error occurred while reading the directory list itself and the list may
    therefore be incomplete. When *false*, an error occurred while reading a
    child item. This implies that somewhere in this sub-tree there must be at
    least one item of `type=-1` or a directory with `rderr=true`.

cumasize
:   Cumulative apparent size of this directory. Optional, defaults to 0.

cumdsize
:   Cumulative disk usage of this directory. Optional, defaults to 0.

shrasize
:   Shared apparent size. Optional, defaults to 0.

shrdsize
:   Shared disk usage. Optional, defaults to 0.

items
:   Cumulative number of items in this directory. Ncdu currently caps this
    number to `2^32-1` when reading, but supports larger numbers when
    exporting. Optional, defaults to 0.

sub
:   Reference to the last item in this directory, or absent if the directory is
    empty.

**Fields for type=3**

ino
:   Inode number.

nlink
:   Number of links to this inode.

**Extended information**

These fields are only exported when the `-e` flag is passed to ncdu. They are
relevant to all items with type >= 0.

uid
:   User id.

gid
:   Group id.

mode
:   File mode.

mtime
:   Last modification time as a UNIX timestamp.

||||
# Limitations

Compressed data block size
:   16 MiB minus 1 byte. This limit comes from *Block\_pointers* in the index
    block using 24 bits to encode the block length.

Uncompressed data block size
:   16 MiB minus 1 byte. This limit comes from *Itemref* encoding the item
    offset in 24 bits.

Largest data block number
:   33,554,428. The size of the index block is limited by the 28-bit length in
    the block's *TypeLen* header, which limits the number of *Block\_pointers*
    it can hold to `((2^28 - 1) - 16) / 8` (subtract one to get the maximum
    block number because counting starts at 0).

Compressed data size
:   Excluding block overhead, the total amount of compressed data is limited to
    about 1 TiB. This is limited by *Block\_pointers* using 40 bits to encode
    the data block offset within the file.

Uncompressed data size
:   Limited by either the maximum number of data blocks or the compressed data
    size, depending on the compression ratio and the chosen data block size.
    Assuming the number of data blocks is the limit, about 512 TiB of
    uncompressed data can be stored with the maximum data block size of 16 MiB.
    Ncdu's adaptive block size selection has a limit of about 40 TiB.

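All of these numbers follow directly from the field widths; a quick arithmetic
check of the limits above:

```python
# 24-bit block length: 16 MiB minus one byte.
MAX_BLOCK_LEN = (1 << 24) - 1

# Index block content is capped by its own 28-bit TypeLen; subtract 16 bytes
# (4-byte header + 4-byte footer + 8-byte Root_itemref) for the pointer array.
MAX_POINTERS = ((1 << 28) - 1 - 16) // 8
MAX_BLOCK_NUMBER = MAX_POINTERS - 1  # numbering starts at 0

# 40-bit block offsets limit the file to about 1 TiB of compressed data.
MAX_COMPRESSED_SIZE = 1 << 40

assert MAX_BLOCK_LEN == 16 * 2**20 - 1
assert MAX_BLOCK_NUMBER == 33_554_428
assert MAX_COMPRESSED_SIZE == 2**40  # exactly 1 TiB
# With maximum-size blocks, roughly 512 TiB of uncompressed data fits:
assert MAX_POINTERS * MAX_BLOCK_LEN > 511 * 2**40
```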
The real question is how many items an export can hold with the above limits in
place. This will depend heavily on the average encoded item size and the
compression ratio, both of which can vary wildly from one directory structure
to another.

I've had one report with ~1.4 billion files resulting in a ~21 GiB file.
Extrapolating from that and assuming the compressed data size is the limiting
factor, this format could hold ~68 billion items. Increasing the compression
level and using larger data block sizes to further improve the compression
ratio, one could perhaps store about 100 billion items. On the one hand, that
sounds like an insane number nobody will ever reach. On the other hand, a
decade ago I couldn't imagine people having more than 100 million files, yet
here we are.

On the upside, all the major limitations can be attributed to the maximum size
and format of the index block. It's possible to implement an alternative index
format in the future that can be switched to automatically whenever any of the
above limits are exceeded, thus providing a seamless upgrade path without
breaking compatibility for existing exports that do fit within the limits.

# Security considerations

Directory trees can get very large and you can easily exceed available RAM when
attempting to read everything into memory. Reading only small parts of the tree
can help cut down on memory use, but it's still a good idea to implement limits
or to detect and handle the case where you're about to run out of memory.

There are several places in the format where byte offsets are used to refer to
blocks or items. These offsets must be validated to ensure that they stay
within the bounds of the respective file or block. In particular, itemref
offsets could potentially refer to memory before (in the case of a relative
itemref) or after (absolute itemref) the decompressed data, and pointers in the
index block could refer to offsets beyond the end of the file.

The CBOR encoding used for items is self-delimiting, but a badly formatted item
may not be properly terminated before the end of the decompressed block
contents. Readers should take care that this does not lead to reading past the
allocated buffer.

In a well-formed directory tree, each item is referenced exactly once by either
the *Root\_itemref* or a *prev* or *sub* field. However, it is also possible to
construct a file where this is not the case, and implementers should be aware
that itemref loops are possible.

||||
# Implementation notes

Data block size
:   It is up to the file writer to choose a suitable data block size. This is a
    compromise between compression efficiency and memory use: larger blocks
    compress better but also require more memory, both for reading and writing.
    Ncdu currently keeps 8 uncompressed blocks in memory when reading and one
    block per thread when writing. Ncdu starts with blocks of 64 KiB, but
    gradually increases the size to 2 MiB for very large directory trees, in
    order to not bloat the index size too much and to prevent running into the
    maximum data block number limit.

Testing
:   If you're implementing a custom writer for this format, make sure to check
    out the
    [ncdubinexp.pl](https://code.blicky.net/yorhel/ncdu/src/ncdubinexp.pl)
    script in the git repository. Ncdu only reads the parts of a file that it
    actually needs, so passing a file to ncdu is no guarantee that it is
    well-formed. The ncdubinexp.pl script is more thorough in validating file
    correctness but misses a few invariants that ncdu does check for, so the
    best way to verify a file is to run both:

    ```
    ncdu -f file.ncdu -o/dev/null  # Read entire tree and export to /dev/null
    ncdubinexp.pl <file.ncdu       # Read and verify the entire file
    ```