Add page for yxml + misc changes

Yorhel 2013-09-25 15:26:43 +02:00
parent 60c0840b94
commit 17e0e2b91c
7 changed files with 179 additions and 19 deletions


@@ -11,10 +11,6 @@ used.
B<Source code: > L<nccolour.c|http://dev.yorhel.nl/download/code/nccolour.c>
(L<syntax highlighted version|http://p.blicky.net/xu35c>)
Some screenshots can be found below, but more screenshots are always welcome!
Please send your (.png) screenshots to projects@yorhel.nl.
=head2 Notes / observations
=over


@@ -80,14 +80,6 @@ on OS X, L<this stackoverflow answer|http://stackoverflow.com/a/438892>
may be helpful.
=head2 This "Generating certificates..." is taking ages!
If you're on Linux with ncdc 1.14 or older and a GnuTLS older than 3.0, then
creating the certificates required access to C</dev/random> to obtain random
data. This could take a while to complete, especially on headless servers and
low-end devices. Updating to ncdc 1.15 or later should fix this.
=head2 Ncdc crashes a lot!
Ncdc 1.17 has no known bugs that may cause a crash. If you're running an


@@ -49,7 +49,7 @@ L<Alpine Linux|http://alpinelinux.org/packages?title_op=%3D&title=ncdu> -
L<ALT Linux|http://sisyphus.ru/en/srpm/Sisyphus/ncdu> -
L<Arch Linux|http://www.archlinux.org/packages/?q=ncdu> -
L<CRUX|http://crux.nu/portdb/?q=ncdu&a=search> -
L<Cygwin|http://cygwin.com/packages/ncdu/> -
L<Cygwin|http://cygwin.com/cgi-bin2/package-grep.cgi?grep=ncdu> -
L<Debian|http://packages.debian.org/ncdu> -
L<Fedora|https://admin.fedoraproject.org/pkgdb/acls/name/ncdu> -
L<FreeBSD|http://www.freshports.org/sysutils/ncdu/> -
@@ -61,13 +61,16 @@ Mac OS X (L<Fink|http://pdb.finkproject.org/pdb/package.php/ncdu> - L<Homebrew|h
L<Pardus|http://packages.pardus.org.tr/info/2011/testing/source/ncdu.html> -
L<Puppy Linux|http://www.murga-linux.com/puppy/viewtopic.php?t=35024> -
L<Solaris|http://www.opencsw.org/packages/ncdu> -
Slackware (L<Slackbuilds|http://slackbuilds.org/repository/13.37/system/ncdu/> - L<Slackers.it|http://www.slackers.it/repository/ncdu/>) -
Slackware (L<Slackbuilds|http://slackbuilds.org/repository/14.0/system/ncdu/> - L<Slackers.it|http://www.slackers.it/repository/ncdu/>) -
L<Ubuntu|http://packages.ubuntu.com/search?searchon=sourcenames&keywords=ncdu> -
L<Zenwalk|http://zur.zenwalk.org/view/package/name/ncdu>
Packages for CentOS, RHEL and (open)SUSE can be found on the
L<Open Build Service|https://build.opensuse.org/package/show?package=ncdu&project=utilities>.
Packages for NetBSD, DragonFlyBSD, MirBSD and others can be found on
L<pkgsrc|http://pkgsrc.se/sysutils/ncdu>.
=head2 Similar projects


@@ -1,6 +1,6 @@
=pod
This document describes the file format that ncdu 1.9 uses for its
This document describes the file format that ncdu 1.9 and 1.10 use for their
export/import feature (the C<-o> and C<-f> options). Check the L<ncdu
manual|http://dev.yorhel.nl/ncdu/man> for a description on how to use that
feature.
@@ -29,7 +29,7 @@ the existing format.
=head2 Metadata
The C<< <metadata> >> element is a JSON object holding whatever (short)
metadata you'd want. This block is currently (1.9) ignored by ncdu when
metadata you'd want. This block is currently (1.9-1.10) ignored by ncdu when
importing, but it writes out the following keys when exporting:
=over
@@ -40,7 +40,7 @@ String, name of the program that generated the file, i.e. C<"ncdu">.
=item progver
String, version of the program that generated the file, e.g. C<"1.9">.
String, version of the program that generated the file, e.g. C<"1.10">.
=item timestamp

dat/yxml

@@ -0,0 +1,166 @@
=pod
I<*But see the L<Bugs and Limitations|/Bugs and Limitations> below.>
Yxml is a small (C<6 KiB>) non-validating yet mostly conforming XML parser
written in C. Its primary goals are small binary size, simplicity and
correctness. It also happens to be L<pretty fast|/Comparison>.
The code can be obtained from the L<git repo|http://g.blicky.net/yxml.git> and
is available under a permissive MIT license. The only two files you need are
L<yxml.c|http://g.blicky.net/yxml.git/plain/yxml.c> and
L<yxml.h|http://g.blicky.net/yxml.git/plain/yxml.h>, which can easily be
included and compiled as part of your project. Minimal documentation is
included in yxml.h; more complete documentation is pending.
The API follows a simple, mostly buffer-less design and only consists of two
functions:
  void yxml_init(yxml_t *x, char *stack, size_t stacksize);
  yxml_ret_t yxml_parse(yxml_t *x, int ch);
Be aware that I<simple> is not necessarily I<easy> or I<convenient>. The API is
relatively low-level and designed to integrate into pretty much any application
and for any use case. This includes incrementally parsing data from a socket in
an event-driven fashion and parsing large XML files on memory-restricted
devices. It is possible to implement a more convenient and high-level API on
top of yxml, but I'm not very fond of libraries that do more than what I
strictly need.
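To give a feel for what a character-at-a-time design with a caller-supplied stack looks like, here is a minimal sketch. It is I<not> yxml: C<toy_t>, C<toy_init()>, C<toy_feed()> and C<toy_feed_chunk()> are hypothetical names for a toy that only understands plain C<< <name> >> and C<< </name> >> tags. It merely mirrors the shape of the two prototypes above and assumes nothing about how yxml works internally.

```c
#include <stddef.h>

/* Toy stand-in for a push parser; NOT the yxml API. */
typedef enum { TOY_ERR = -1, TOY_OK = 0, TOY_OPEN = 1, TOY_CLOSE = 2 } toy_ret_t;

typedef struct {
    char *stack;  /* caller-supplied buffer: open element names, each NUL-terminated */
    size_t size;  /* capacity of that buffer */
    size_t len;   /* bytes currently in use */
    size_t top;   /* start of the innermost open name while matching a close tag */
    size_t cmp;   /* next byte of that name to compare against */
    int state;    /* 0: text, 1: just saw '<', 2: in open tag name, 3: in close tag name */
} toy_t;

void toy_init(toy_t *t, char *stack, size_t stacksize) {
    t->stack = stack;
    t->size = stacksize;
    t->len = t->top = t->cmp = 0;
    t->state = 0;
}

/* Append one byte to the stack; fails when the caller's buffer is full. */
static int toy_push(toy_t *t, char c) {
    if (t->len >= t->size) return -1;
    t->stack[t->len++] = c;
    return 0;
}

/* Feed one character. All state lives in *t, so input may arrive in any chunking. */
toy_ret_t toy_feed(toy_t *t, int ch) {
    if (ch == 0) return TOY_ERR;
    switch (t->state) {
    case 0:
        if (ch == '<') t->state = 1;
        return TOY_OK;
    case 1:
        if (ch == '>') return TOY_ERR;        /* a tag needs a name */
        if (ch == '/') {
            if (t->len == 0) return TOY_ERR;  /* nothing is open */
            t->top = t->len - 1;              /* index of the trailing NUL */
            while (t->top > 0 && t->stack[t->top - 1] != 0) t->top--;
            t->cmp = t->top;
            t->state = 3;
            return TOY_OK;
        }
        t->state = 2;
        return toy_push(t, (char)ch) ? TOY_ERR : TOY_OK;
    case 2:
        if (ch == '>') {
            if (toy_push(t, 0)) return TOY_ERR;
            t->state = 0;
            return TOY_OPEN;
        }
        return toy_push(t, (char)ch) ? TOY_ERR : TOY_OK;
    default:
        if (ch == '>') {
            if (t->stack[t->cmp] != 0) return TOY_ERR;  /* close name too short */
            t->len = t->top;                            /* pop the name */
            t->state = 0;
            return TOY_CLOSE;
        }
        if (t->stack[t->cmp] != (char)ch) return TOY_ERR;
        t->cmp++;
        return TOY_OK;
    }
}

/* Feed a chunk and count events; returns -1 on the first error. */
int toy_feed_chunk(toy_t *t, const char *buf, size_t n, int *opens, int *closes) {
    size_t i;
    for (i = 0; i < n; i++) {
        toy_ret_t r = toy_feed(t, (unsigned char)buf[i]);
        if (r == TOY_ERR) return -1;
        if (r == TOY_OPEN) (*opens)++;
        if (r == TOY_CLOSE) (*closes)++;
    }
    return 0;
}
```

Because the parser never looks ahead or behind in the input, a caller can hand it whatever a C<read()> on a socket happens to return, one chunk at a time, and the only memory involved is the stack buffer the caller chose.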
Yxml is still in a beta stage and hasn't been very thoroughly tested yet. There
are no tarball releases available at the moment. The API and ABI may still
change a bit, so I strongly advise against dynamic linking (I'm not sure if
I'll ever promise a stable ABI, but the API should certainly get stabilized at
some point).
=head3 Features
=over
=item * Simple and low-level API.
=item * Does not require C<malloc()>.
=item * Pure C, should be very portable.
=item * Recognizes and consumes the UTF-8 BOM.
=item * Parses entity references (C<&amp;>) and character references (C<&#x26;>).
=item * Verifies most well-formedness constraints, including the correct
nesting of elements.
=item * Parses XML documents in any ASCII-compatible encoding.
=back
But let's not be I<too> optimistic, because there are also...
=head3 Bugs and Limitations
=over
=item * Element and attribute names may only consist of ASCII characters.
=item * Does not verify that non-ASCII characters in attribute values or
element contents are within the allowed character ranges.
=item * A conditional section in a C<< <!DOCTYPE ..> >> declaration will result
in a parse error.
=item * Allows multiple C<< <!DOCTYPE ..> >> declarations.
=item * Information encoded in the XML and doctype declarations is currently
not available through the API.
=back
I hope to have these issues fixed in the near future.
=head3 Non-features
And now follows a list of things that are not supported and probably never will
be. Most items on this list can be implemented on top of yxml.
=over
=item * Does not verify all well-formedness constraints. In particular, does
not verify that attribute names within the same element are unique, and does
not verify that the contents of a C<< <!DOCTYPE ..> >> declaration follow the
XML grammar.
=item * No helper functions to deal with namespaces. Yxml will parse XML files
with namespaces just fine, but it's up to the application to do the rest.
=item * No support for custom entity references, neither through the API nor
using C<< <!ENTITY> >>.
=item * No DTD or XML Schema validation.
=item * No XSLT.
=item * No XPath.
=item * Can't parse documents in a non-ASCII-compatible encoding. You'll have
to convert it to UTF-8 or something similar first.
=item * Doesn't do your household chores.
=back
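As an illustration of building one of these checks on top of a parser, the following sketch detects duplicate attribute names within a single element without using C<malloc()>. The names C<attrset_t>, C<attrset_reset()> and C<attrset_add()> and both size limits are hypothetical. The sketch assumes the application already has each attribute name as a NUL-terminated string; how that name is obtained from yxml is not shown here.

```c
#include <string.h>

#define ATTRSET_MAX     32  /* assumed limit: attributes per element */
#define ATTRSET_NAMELEN 64  /* assumed limit: bytes per name, including the NUL */

/* Fixed-size set of the attribute names seen on the current element. */
typedef struct {
    char names[ATTRSET_MAX][ATTRSET_NAMELEN];
    int count;
} attrset_t;

/* Call whenever a new element starts. */
void attrset_reset(attrset_t *s) {
    s->count = 0;
}

/* Returns 1 if the name is new, 0 if it is a duplicate, and -1 if the name
 * is too long or the set is full. */
int attrset_add(attrset_t *s, const char *name) {
    size_t len = strlen(name);
    int i;
    if (len >= ATTRSET_NAMELEN) return -1;
    for (i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) return 0;
    if (s->count == ATTRSET_MAX) return -1;
    memcpy(s->names[s->count++], name, len + 1);
    return 1;
}
```

A linear scan is adequate here because elements rarely carry more than a handful of attributes; an application that expects hundreds per element would want a different structure.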
=head2 Comparison
The following benchmark compares L<expat|http://expat.sourceforge.net/>,
L<libxml2|http://xmlsoft.org/> and
L<Mini-XML|http://www.msweet.org/projects.php?Z3> with yxml. A L<strlen(3)>
implementation is also included as an indication of the "theoretical" minimum.
                                      SIZE         PERFORMANCE
  LIB      VER    LICENSE           OBJ   STATIC   WIKI  DISCOGS
  strlen                                  25 816   0.16     0.09
  expat    2.1.0  MIT           162 139  194 432   1.47     1.09
  libxml2  2.9.1  MIT           464 328  518 816   2.53     1.75
  mxml     2.7    LGPL2+static   32 733   75 832  12.38     7.80
  yxml     git    MIT             6 015   31 448   1.18     0.73
The code for these benchmarks is available in the
L<bench/|http://g.blicky.net/yxml.git/tree/bench> directory on git. Some
explanatory notes:
=over
=item * C<OBJ> is the total size of all object code of the library, measured
with L<size(1)>.
=item * C<STATIC> is the file size of a minimal statically linked binary when
linked against L<musl|http://www.musl-libc.org/> 0.9.13, measured with
L<wc(1)> after running L<strip(1)>.
=item * The performance is the time, in seconds, to load a large XML file.
C<WIKI> refers to C<enwiki-20130805-abstract5.xml> (162 MiB) from a L<Wikipedia
Dump|http://dumps.wikimedia.org/enwiki/>, C<DISCOGS> refers to
C<discogs_20130801_labels.xml> (94 MiB) from a L<Discogs Data
Dump|http://www.discogs.com/data/>.
=item * Libxml2 has been compiled with most of its features disabled with
C<./configure>, but it still manages to be the very definition of bloat.
=item * Everything has been compiled with gcc 4.8.1 at C<-O2>.
=item * Benchmarks are run on Linux 3.10.7 with a 3 GHz Intel Core 2 Duo E8400
and 4 GB of RAM.
=back
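The C<strlen> row serves as a lower bound because L<strlen(3)> has to touch every byte of the input exactly once, which any parser that reads the whole file must do as well. The sketch below shows one way such a baseline could be timed and converted to throughput. It is not the code from the bench/ directory: C<time_strlen()> and C<mib_per_sec()> are hypothetical helpers, and using C<clock()> for the measurement is an assumption.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Convert a byte count and a duration in seconds to MiB/s. */
double mib_per_sec(size_t bytes, double seconds) {
    return (double)bytes / (1024.0 * 1024.0) / seconds;
}

/* Time a single strlen() pass over a NUL-terminated buffer, in CPU seconds.
 * The measured length is stored in *len_out. */
double time_strlen(const char *buf, size_t *len_out) {
    clock_t start = clock();
    size_t n = strlen(buf);
    clock_t end = clock();
    *len_out = n;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Applied to the C<-O2> figures for the 162 MiB C<WIKI> file, 0.16 seconds for C<strlen> works out to roughly 1010 MiB/s, and 1.18 seconds for yxml to roughly 137 MiB/s.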
And just for fun, here's the same comparison when compiled with C<-Os>, i.e.
optimized for small size. Interestingly enough, Mini-XML actually runs faster
with C<-Os> than with C<-O2>.
                                      SIZE         PERFORMANCE
  LIB      VER    LICENSE           OBJ   STATIC   WIKI  DISCOGS
  strlen                                  25 816   0.16     0.09
  expat    2.1.0  MIT           113 314  145 632   1.58     1.20
  libxml2  2.9.1  MIT           356 948  412 256   3.01     2.08
  mxml     2.7    LGPL2+static   27 725   71 704  11.70     7.44
  yxml     git    MIT             4 835   30 264   1.72     1.05