ncdc 1.18.1 + yxml manual + dcstats + minor restyle
...I need to commit more often.
parent 610b0fb31c
commit 57e7bb546e
20 changed files with 339 additions and 56 deletions
5
dat/doc
@ -6,6 +6,11 @@ rare occasions are published on this page.
=over

=item C<2014-01-09 > - L<Some Measurements on Direct Connect File Lists|http://dev.yorhel.nl/doc/dcstats>

The report of a short measurement study on the file lists obtained from a
Direct Connect hub. Lots of graphs!

=item C<2012-02-15 > - L<A Distributed Communication System for Modular Applications|http://dev.yorhel.nl/doc/commvis>

In this article I explain a vision of mine, and the results of a small research
222
dat/doc-dcstats
Normal file
@ -0,0 +1,222 @@
Some Measurements on Direct Connect File Lists

=pod

(Published on B<2014-01-09>.)

=head1 Introduction

I've been working on Direct Connect related projects for a while now. This
includes maintaining L<ncdc|http://dev.yorhel.nl/ncdc> and
L<Globster|http://dev.yorhel.nl/globster>, and doing a bit of research into
improving the downloading performance and scalability (to be published at some
later date). Whether I'm writing code or trying to set up experiments for
research, there's one thing that helps a lot in making decisions: measurements
from an actual network.

Because useful measurements are often missing, I decided to do some myself.
There's a lot to measure in an actual P2P network, but I restricted myself to
information that can be gathered quite easily from file lists.

=head1 Obtaining the Data

Different hubs will likely have totally different patterns in terms of what is
being shared. In order to keep this experiment simple, I limited myself to a
single hub. And in order to get as much data as possible, I chose the hub that
is commonly known as "Ozerki", famous for being one of the larger hubs in
existence.

My approach to getting as many file lists as possible from this hub was perhaps
a bit too simple. I simply modified ncdc to have an "add the file list from all
users to the download queue" key, and to save all downloaded lists to a
directory instead of opening them.

I started this downloading process on a Monday around noon, when there were a
little over 11k users online. I hit my hacked download-all-filelists key two
more times later that day in order to get the file lists from those users who
joined the hub at a later time. I let this downloading process run until the
evening.

One thing I learned from this experience was that the downloading algorithm in
ncdc (1.18.1) does not scale particularly well. Every 60 seconds, it would try
to open a connection with B<all> users listed in the download queue. You can
imagine that trying to connect to 11k users simultaneously put a significantly
heavier load on the hub than would have been necessary. Not good. Not something
a well-behaving netizen would do. Surprisingly enough, the hub didn't seem to
mind too much and handled the load fine. This might have been because Mondays
are typically not the busiest days in P2P land. Weekends tend to be busier.

Despite that scalability issue, I managed to download the file lists of almost
everyone who remained online long enough to get their list downloaded. In
total I downloaded 14143 file lists (that's one list too many for
C<10000*sqrt(2)>; I should have stopped the process a bit earlier). The total
bzip2-compressed size of these lists is 6.5 GiB.

For obvious reasons, I won't be sharing my modifications to ncdc. I already
tarnished the reputation of ncdc enough in that single day. If you wish to
repeat this experiment, please do so with a scalable downloading
implementation. :-)
=head1 Obtaining the Stats

And then comes the challenge of aggregating statistics on 6.5 GiB of compressed
XML files. This didn't really sound like much of a challenge: after all, all
one needs to do is decompress the file lists, do some XML parsing, and update
some values. Most of the CPU time in this process would likely be spent on
bzip2 decompression, so I figured I'd just pipe the output of L<bzcat(1)> to a
Perl script and be done with it.

To get the statistics on the sizes and the distribution of unique files, a data
structure containing information on all unique files in the lists was
necessary. Perl being the perfect language for data manipulation, I made use of
its great support for hash tables to store this information. It turned out,
rather unsurprisingly, that Perl isn't all that conservative with respect to
memory usage. Neither my 4GB of RAM nor the extra 4GB of swap turned out to be
enough to run the script to completion. I tried rewriting the script to use a
disk-based data structure, but that slowed things down to a crawl. Some other
solution was needed.

When faced with such a problem, some people will try to optimize the algorithm,
others will throw extra hardware at it, and I did what I do best: optimize away
the constants. That is, I rewrote the data analysis program in C. Using the
excellent L<khash|https://github.com/attractivechaos/klib> hash table library
to keep track of the file information and the equally awesome
L<yxml|http://dev.yorhel.nl/yxml> library (a little bit of self-promotion
doesn't hurt, right?) to do the XML parsing, I was able to do all the necessary
processing in 30 minutes using at most 3.6GB of RAM.

Long story short, here's my analysis program:
L<dcfilestats.c|http://g.blicky.net/dcstats.git/tree/dcfilestats.c>.
=head1 A Look at the Stats

Some lists didn't decompress/parse correctly, so the actual number of file
lists used in these stats is B<14137>. The total compressed size of these lists
is B<6,945,269,469> bytes (6.5 GiB), and uncompressed B<25,533,519,352> bytes
(24 GiB). In total these lists mentioned B<197,413,253> files. After taking
duplicate listings into account, there are still B<84,131,932> unique files.

And now for some graphs...

=head2 Size of the File Lists

Behold, the compressed and uncompressed size of the downloaded file lists:

[img graph dclistsize.png ]

Nothing too surprising here, I guess. 100 KiB seems to be a common size for a
compressed file list, but lists of 1 MiB aren't too weird, either. The largest
file list in this set is 34.8 MiB compressed and 120 MiB uncompressed. The
uncompressed size of a list tends to be (*gasp*) a bit larger, but we can't
easily infer the compression ratio from this graph. Hence, another graph:

[img graph dclistcomp.png ]

Most file lists compress to about 24% - 35% of their original size. This seems
to be consistent with L<similar
measurements|http://forum.dcbase.org/viewtopic.php?f=18&t=667> done in 2010.

The raw data for these graphs is found in
L<dclistsize|http://g.blicky.net/dcstats.git/tree/dclistsize>, which lists the
compressed and uncompressed size, respectively, for each file list. The gnuplot
script for the first graph is
L<dclistsize.plot|http://g.blicky.net/dcstats.git/tree/dclistsize.plot> and
L<dclistcomp.plot|http://g.blicky.net/dcstats.git/tree/dclistcomp.plot> for the
second.
=head2 Number of Files Per List

So how many files are people sharing? Let's find out.

[img graph dcnumfiles.png ]

As expected, this graph looks very similar to the one about the size of the
file list. The size of a list tends to be linear in the number of items it
holds, after all.

The raw data for this graph is found in
L<dcnumfiles|http://g.blicky.net/dcstats.git/tree/dcnumfiles>, which lists the
unique and total number of files, respectively, for each file list. The gnuplot
script is
L<dcnumfiles.plot|http://g.blicky.net/dcstats.git/tree/dcnumfiles.plot>.

=head2 File Sizes

And how large are the files being shared? Well,

[img graph dcfilesize.png ]

This graph is fun, and rather hard to explain without knowing what kind of
files we're dealing with. I'm not going to do any further analysis on what kind
of files these file sizes represent exactly, but I am going to make some
guesses. The files below 1 MiB could be anything: text files, images,
subtitles, source code, etc. And considering that the hub in question doesn't
put a whole lot of effort into weeding out spammers and bots, it's likely that
some malicious users will be sharing small variations of the same virus within
the 100 KiB range. The peak of files between 7 and 10 MiB likely consists of
audio files. The number of files larger than, say, 20 MiB drops significantly,
but there are still a few million files in the 20 MiB to 1 GiB range.

I cut off the graph after 10 GiB, but there's apparently someone who claims to
share a file between 1 and 2 TiB (I don't know the exact size due to the
binning). Since I can't imagine why someone would share a file that large, I
expect it to be a fake file list entry. Note that there could be more fakes in
my data set. I can't tell which files are fake and which are genuine from the
information in the file lists, but I don't expect the number of fake files to
be very significant.

The "raw" data for this graph is found in
L<dcfilesize|http://g.blicky.net/dcstats.git/tree/dcfilesize>. Because I wasn't
interested in dealing with a text file of 84 million lines, the data is already
binned. The first column is the bin number and the second column the number of
unique files in that bin. The file sizes that each bin represents are between
C<2^(bin+9)> and C<2^(bin+10)>, with the exception of bin 0, which starts at a
file size of 0. The gnuplot script is
L<dcfilesize.plot|http://g.blicky.net/dcstats.git/tree/dcfilesize.plot>.
=head2 Distribution of Files

Another interesting thing to measure is how often files are shared. That is,
how many users have the same file?

[img graph dcfiledist.png ]

Many files are only available from a single user. That's not really a good sign
when you wish to download such a file, but luckily there are also tons of files
that I<are> available from multiple users. What is interesting in this graph
isn't that it follows the L<power law|https://en.wikipedia.org/wiki/Power_law>,
but what those outliers could possibly be. There's a collection of 269 files
that has been shared among 831 users, and there appears to be a similar group
of around 510-515 files that is shared among 20 or so users. I've honestly no
idea what those collections could be. Well, yes, I could probably figure that
out from the file lists, but my analysis program doesn't tell me which files
it's talking about and I'm too lazy to fix that.

The graph has been clipped to 600, but there's another interesting outlier: a
single file that has been shared by 5668 users. I'm going to guess that this is
the empty file. There are so many ways to get an empty file somewhere in your
filesystem, after all.

The raw data for this graph is found in
L<dcfiledist|http://g.blicky.net/dcstats.git/tree/dcfiledist>, which lists the
number of times shared and the aggregate number of files. The gnuplot script is
L<dcfiledist.plot|http://g.blicky.net/dcstats.git/tree/dcfiledist.plot>.

=head1 Final Notes

So, erm, what conclusions can we draw from this? That stats are fun, I guess.
If anyone (including me) is going to repeat this experiment on a fresh data
set, make sure to use a more scalable downloading process than I did. My
approach shouldn't be repeated if we wish to keep the Direct Connect network
alive.

Furthermore, keep in mind that this is just a snapshot of a single day on a
single hub. The graphs may look very different when the file lists are
harvested at some other time. And it's also quite likely that different hubs
will have very different share profiles. It could be interesting to try and
graph everything, but I don't have I<that> kind of free time.
12
dat/ncdc
@ -10,14 +10,14 @@ ncurses interface.
=item Latest version

1.18 ([dllink ncdc-1.18.tar.gz download]
1.18.1 ([dllink ncdc-1.18.1.tar.gz download]
- L<changes|http://dev.yorhel.nl/ncdc/changes>
- L<mirror|https://sourceforge.net/projects/ncdc/files/ncdc/>)

Convenient static binaries for Linux:
L<64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.tar.gz> -
L<32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.tar.gz> -
L<ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.tar.gz>. Check the
L<64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.1.tar.gz> -
L<32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.1.tar.gz> -
L<ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.1.tar.gz>. Check the
L<installation instructions|http://dev.yorhel.nl/ncdc/install> for more info.

=item Development version
@ -45,6 +45,7 @@ C<adc://dc.blicky.net:2780/> - If the mailing list is too slow for you.
Are available for the following systems:
L<Arch Linux|http://aur.archlinux.org/packages.php?ID=50949> -
L<Fedora|https://apps.fedoraproject.org/packages/ncdc/overview/> -
L<FreeBSD|http://www.freshports.org/net-p2p/ncdc/> -
L<Frugalware|http://frugalware.org/packages/136807> -
L<Gentoo|http://packages.gentoo.org/package/net-p2p/ncdc> -
@ -55,6 +56,9 @@ L<OpenSUSE|http://packman.links2linux.org/package/ncdc>
I also have a few packages on the L<Open Build
Service|https://build.opensuse.org/package/show?package=ncdc&project=home%3Ayorhel>.

A convenient installer is available for
L<Android|http://code.ivysaur.me/ncdcinstaller.html>.

=back

=cut
@ -1,3 +1,8 @@
1.18.1 - 2013-10-05
- Fix crash when downloading files from multiple sources
- Use the yxml library to parse files.xml.bz2 files
- Fix various XML conformance bugs in parsing files.xml.bz2 files

1.18 - 2013-09-25
- Add support for segmented downloading
- Support $MyINFO without flags byte on NMDC hubs
@ -38,11 +38,11 @@ compiling and/or installing it, I also offer statically linked binaries:
=over

=item * L<Linux, 64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.tar.gz>
=item * L<Linux, 64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.1.tar.gz>

=item * L<Linux, 32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.tar.gz>
=item * L<Linux, 32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.1.tar.gz>

=item * L<Linux, ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.tar.gz>
=item * L<Linux, ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.1.tar.gz>

=back
@ -58,6 +58,12 @@ architecture, please bug me and I'll see what I can do.
=head1 System-specific instructions

=head2 Android

A L<convenient installer|http://code.ivysaur.me/ncdcinstaller.html> is
available for Android 2.3 and later, which makes use of the static binary.

=head2 Arch Linux

Ncdc is available on L<AUR|https://aur.archlinux.org/packages.php?ID=50949>, to
@ -70,6 +76,15 @@ favorite, go for the manual approach:
    makepkg -si

=head2 Fedora

There's a L<package|https://apps.fedoraproject.org/packages/ncdc/overview/>
available for Fedora.

Alternatively, I also have packages on the L<Open Build
Service|http://software.opensuse.org/download/package?project=home:yorhel&package=ncdc>.

=head2 FreeBSD
@ -115,9 +130,9 @@ First install some required packages (as root):
Then, fetch the ncdc source tarball, extract and build as follows:

    wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
    tar -xf ncdc-1.18.tar.gz
    cd ncdc-1.18
    wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
    tar -xf ncdc-1.18.1.tar.gz
    cd ncdc-1.18.1
    export PATH="$PATH:/usr/perl5/5.10.0/bin"
    ./configure --prefix=/usr LDFLAGS='-L/usr/gnu/lib -R/usr/gnu/lib'
    make
@ -165,9 +180,9 @@ required libraries:
Then run the following commands to download and install ncdc:

    wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
    tar -xf ncdc-1.18.tar.gz
    cd ncdc-1.18
    wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
    tar -xf ncdc-1.18.1.tar.gz
    cd ncdc-1.18.1
    ./configure --prefix=/usr
    make
    sudo make install
@ -209,8 +224,8 @@ website|http://cygwin.com/> and use it to install the following packages:
Then open a Cygwin terminal and run the following commands to download,
compile, and install ncdc:

    wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
    tar -xf ncdc-1.18.tar.gz
    cd ncdc-1.18
    wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
    tar -xf ncdc-1.18.1.tar.gz
    cd ncdc-1.18.1
    ./configure --prefix=/usr
    make install
4
dat/ncdu
@ -41,7 +41,7 @@ notifications for new releases.
=head2 Packages and ports

Ncdu has been packaged for quite a few systems already, here's a list of the ones I am aware of:
Ncdu has been packaged for quite a few systems, here's a list of the ones I am aware of:

L<AgiliaLinux|http://packages.agilialinux.ru/search.php?tag=sys-fs> -
L<AIX|http://www.perzl.org/aix/index.php?n=Main.Ncdu> -
@ -68,7 +68,7 @@ L<Zenwalk|http://zur.zenwalk.org/view/package/name/ncdu>
Packages for CentOS, RHEL and (open)SUSE can be found on the
L<Open Build Service|https://build.opensuse.org/package/show?package=ncdu&project=utilities>.

Packages for NetBSD, DragonFlyBSD, MirBSD and others and be found on
Packages for NetBSD, DragonFlyBSD, MirBSD and others can be found on
L<pkgsrc|http://pkgsrc.se/sysutils/ncdu>.
35
dat/yxml
@ -11,14 +11,15 @@ The code can be obtained from the L<git repo|http://g.blicky.net/yxml.git> and
is available under a permissive MIT license. The only two files you need are
L<yxml.c|http://g.blicky.net/yxml.git/plain/yxml.c> and
L<yxml.h|http://g.blicky.net/yxml.git/plain/yxml.h>, which can easily be
included and compiled as part of your project. Minimal documentation is
included in yxml.h, more complete documentation is pending.
included and compiled as part of your project. Complete API documentation is
available in L<the manual|http://dev.yorhel.nl/yxml/man>.

The API follows a simple, mostly buffer-less design and only consists of two
functions:
The API follows a simple and mostly buffer-less design, and only consists of
three functions:

    void yxml_init(yxml_t *x, char *stack, size_t stacksize);
    void yxml_init(yxml_t *x, void *buf, size_t bufsize);
    yxml_ret_t yxml_parse(yxml_t *x, int ch);
    yxml_ret_t yxml_eof(yxml_t *x);
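A typical way to drive this API is to feed the document one byte at a time and
act on the returned tokens; a rough sketch (a reading of the yxml.h
declarations above, not authoritative, with error handling kept minimal):

    /* Illustrative yxml driver loop; `doc` is a NUL-terminated XML
     * document and `stack` is the parser's scratch buffer. */
    char stack[8*1024];
    yxml_t x;
    yxml_init(&x, stack, sizeof stack);

    for(const char *p = doc; *p; p++) {
        yxml_ret_t r = yxml_parse(&x, *p);
        if(r < 0)
            return -1;                     /* parse error */
        if(r == YXML_ELEMSTART)
            printf("element: %s\n", x.elem);
    }
    return yxml_eof(&x) < 0 ? -1 : 0;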
Be aware that I<simple> is not necessarily I<easy> or I<convenient>. The API is
relatively low-level and designed to integrate into pretty much any application
@ -28,11 +29,9 @@ devices. It is possible to implement a more convenient and high-level API on
top of yxml, but I'm not very fond of libraries that do more than what I
strictly need.

Yxml is still in a beta stage and hasn't been very thoroughly tested yet. There
are no tarball releases available at the moment. The API and ABI may still
change a bit, so I strongly advise against dynamic linking (I'm not sure if
I'll ever promise a stable ABI, but the API should certainly get stabilized at
some point).
There are no tarball releases available at the moment. The API is relatively
stable, but I won't currently promise any ABI stability. Dynamic linking
against yxml is therefore not a very good idea.

=head3 Features
@ -95,11 +94,11 @@ using C<< <!ENTITY> >>.
=back

These conformance issues are the result of the byte-oriented and minimal design
of yxml, and I do not intend to fix these directly within the library. All of
the above mentioned issues can be fixed on top of yxml (by the application, or
by a wrapper) if strict conformance is required. With the exception of custom
entity references, but I have a simple idea on how to support that in the
future, too.
These conformance issues are the result of the byte-oriented and minimal design
of yxml, and I do not intend to fix these directly within the library. The
intention is to make sure that all of the above mentioned issues can be fixed
on top of yxml (by the application, or by a wrapper) if strict conformance is
required, but the required functionality to support custom entity references
and DTD handling has not been implemented yet.

=head3 Non-features
@ -136,7 +135,7 @@ implementation is also included as an indication of the "theoretical" minimum.
    expat 2.1.0    MIT            162 139   194 432    1.47   1.09
    libxml2 2.9.1  MIT            464 328   518 816    2.53   1.75
    mxml 2.7       LGPL2+static    32 733    75 832   12.38   7.80
    yxml git       MIT              5 935    31 384    1.14   0.74
    yxml git       MIT              5 971    31 416    1.15   0.74

The code for these benchmarks is available in the
L<bench/|http://g.blicky.net/yxml.git/tree/bench> directory on git. Some
@ -177,7 +176,7 @@ with C<-Os> than with C<-O2>.
    expat 2.1.0    MIT            113 314   145 632    1.58   1.20
    libxml2 2.9.1  MIT            356 948   412 256    3.01   2.08
    mxml 2.7       LGPL2+static    27 725    71 704   11.70   7.44
    yxml git       MIT              4 835    30 264    1.72   1.05
    yxml git       MIT              4 955    30 392    1.67   1.02

=head2 Validating vs. non-validating
@ -204,6 +203,6 @@ It should be noted that a lot of XML documents found in the wild are not
described with a DTD, but instead use an alternative technology such as XML
schema. Wikipedia L<has more
information|https://en.wikipedia.org/wiki/XML#Schemas_and_validation> on this.
Using a validating parser for such documents would only introduce bloat and may
Using a validating parser for such documents would only add bloat and may
introduce L<potential security
vulnerabilities|https://en.wikipedia.org/wiki/Billion_laughs>.
1
dat/yxml-man
Symbolic link
@ -0,0 +1 @@
../../yxml/yxml.pod