ncdc 1.18.1 + yxml manual + dcstats + minor restyle
...I need to commit more often.
parent
610b0fb31c
commit
57e7bb546e
20 changed files with 339 additions and 56 deletions
4
Bug.pm
|
|
@ -110,6 +110,10 @@ sub dbSave {
|
|||
# TODO: pagination / filtering
|
||||
sub htmlListing {
|
||||
my($s, $l, $lnk) = @_;
|
||||
if(!@$l) {
|
||||
p class => 'bug_nolisting', 'No bugs found! Yay?';
|
||||
return;
|
||||
}
|
||||
table class => 'bug_listing';
|
||||
thead; Tr;
|
||||
td class => 'bug_col_id', 'Id';
|
||||
|
|
|
|||
5
dat/doc
|
|
@ -6,6 +6,11 @@ rare occasions are published on this page.
|
|||
|
||||
=over
|
||||
|
||||
=item C<2014-01-09 > - L<Some Measurements on Direct Connect File Lists|http://dev.yorhel.nl/doc/dcstats>
|
||||
|
||||
The report of a short measurement study on the file lists obtained from a
|
||||
Direct Connect hub. Lots of graphs!
|
||||
|
||||
=item C<2012-02-15 > - L<A Distributed Communication System for Modular Applications|http://dev.yorhel.nl/doc/commvis>
|
||||
|
||||
In this article I explain a vision of mine, and the results of a small research
|
||||
|
|
|
|||
222
dat/doc-dcstats
Normal file
|
|
@ -0,0 +1,222 @@
|
|||
Some Measurements on Direct Connect File Lists
|
||||
|
||||
=pod
|
||||
|
||||
(Published on B<2014-01-09>.)
|
||||
|
||||
=head1 Introduction
|
||||
|
||||
I've been working on Direct Connect related projects for a while now. This
|
||||
includes maintaining L<ncdc|http://dev.yorhel.nl/ncdc> and
|
||||
L<Globster|http://dev.yorhel.nl/globster>, and doing a bit of research into
|
||||
improving the downloading performance and scalability (to be published at some
|
||||
later date). Whether I'm writing code or trying to set up experiments for
|
||||
research, there's one thing that helps a lot in making decisions. Measurements
|
||||
from an actual network.
|
||||
|
||||
Because useful measurements are often missing, I decided to do some myself.
|
||||
There's a lot to measure in an actual P2P network, but I restricted myself to
|
||||
information that can be gathered quite easily from file lists.
|
||||
|
||||
|
||||
=head1 Obtaining the Data
|
||||
|
||||
Different hubs will likely have totally different patterns in terms of what is
|
||||
being shared. In order to keep this experiment simple, I limited myself to a
|
||||
single hub. And in order to get as much data as possible, I chose the hub that
|
||||
is commonly known as "Ozerki", famous for being one of the larger hubs in
|
||||
existence.
|
||||
|
||||
My approach to getting as many file lists as possible from this hub was perhaps
|
||||
a bit too simple. I simply modified ncdc to have an "add the file list from all
|
||||
users to the download queue" key, and to save all downloaded lists to a
|
||||
directory instead of opening them.
|
||||
|
||||
I started this downloading process on a Monday around noon when there were a
|
||||
little over 11k users online. I hit my hacked download-all-filelists-key two
|
||||
more times later that day in order to get the file lists from those users who
|
||||
joined the hub at a later time. I left this downloading process running until
|
||||
the evening.
|
||||
|
||||
One thing I learned from this experience was that the downloading algorithm in
|
||||
ncdc (1.18.1) does not scale particularly well. Every 60 seconds, it would try
|
||||
to open a connection with B<all> users listed in the download queue. You can
|
||||
imagine that trying to connect to 11k users simultaneously put a significantly
|
||||
heavier load on the hub than would have been necessary. Not good. Not something
|
||||
a well-behaving netizen would do. Surprisingly enough, the hub didn't seem to
|
||||
mind too much and handled the load fine. This might have been because Mondays
|
||||
are typically not the busiest days in P2P land. Weekends tend to be busier.
|
||||
|
||||
Despite that scalability issue, I successfully managed to download the file
|
||||
lists of almost everyone who remained online for long enough to finally get
|
||||
their list downloaded. In total I managed to download 14143 file lists (that's
|
||||
one list too many for C<10000*sqrt(2)>, I should have stopped the process a bit
|
||||
earlier). The total bzip2-compressed size of these lists is 6.5 GiB.
|
||||
|
||||
For obvious reasons, I won't be sharing my modifications to ncdc. I already
|
||||
tarnished the reputation of ncdc enough in that single day. If you wish to
|
||||
repeat this experiment, please do so with a scalable downloading
|
||||
implementation. :-)
|
||||
|
||||
|
||||
=head1 Obtaining the Stats
|
||||
|
||||
And then comes the challenge of aggregating statistics on 6.5 GiB of compressed
|
||||
XML files. This didn't really sound like much of a challenge. After all, all
|
||||
one needs to do is decompress the file lists, do some XML parsing and update
|
||||
some values. Most of the CPU time in this process would likely be spent on
|
||||
bzip2 decompression, so I figured I'd just pipe the output of L<bzcat(1)> to a
|
||||
Perl script and be done with it.
|
||||
|
||||
To get the statistics on the sizes and the distribution of unique files, a data
|
||||
structure containing information on all unique files in the lists was
|
||||
necessary. Perl being the perfect language for data manipulation, I made use of
|
||||
its great support for hash tables to store this information. It turned out,
|
||||
rather unsurprisingly, that Perl isn't all that conservative with respect to
|
||||
memory usage. Neither my 4GB of RAM nor the extra 4GB of swap turned out to be
|
||||
enough to run the script to completion. I tried rewriting the script to use a
|
||||
disk-based data structure, but that slowed things down to a crawl. Some other
|
||||
solution was needed.
|
||||
|
||||
When faced with such a problem, some people will try to optimize the algorithm,
|
||||
others will throw extra hardware at it, and I did what I do best: Optimize away
|
||||
the constants. That is, I rewrote the data analysis program in C. Using the
|
||||
excellent L<khash|https://github.com/attractivechaos/klib> hash table library
|
||||
to keep track of the file information and the equally awesome
|
||||
L<yxml|http://dev.yorhel.nl/yxml> library (a little bit of self-promotion
|
||||
doesn't hurt, right?) to do the XML parsing, I was able to do all the necessary
|
||||
processing in 30 minutes using at most 3.6GB of RAM.
|
||||
|
||||
Long story short, here's my analysis program:
|
||||
L<dcfilestats.c|http://g.blicky.net/dcstats.git/tree/dcfilestats.c>.
|
||||
|
||||
|
||||
=head1 A Look at the Stats
|
||||
|
||||
Some lists didn't decompress/parse correctly, so the actual number of file
|
||||
lists used in these stats is B<14137>. The total compressed size of these lists
|
||||
is B<6,945,269,469> bytes (6.5 GiB), and uncompressed B<25,533,519,352> bytes
|
||||
(24 GiB). In total these lists mentioned B<197,413,253> files. After taking
|
||||
duplicate listings into account, there are still B<84,131,932> unique files.
|
||||
|
||||
And now for some graphs...
|
||||
|
||||
=head2 Size of the File Lists
|
||||
|
||||
Behold, the compressed and uncompressed size of the downloaded file lists:
|
||||
|
||||
[img graph dclistsize.png ]
|
||||
|
||||
Nothing too surprising here, I guess. 100 KiB seems to be a common size for a
|
||||
compressed file list, but lists of 1 MiB aren't too weird, either. The largest
|
||||
file list in this set is 34.8 MiB compressed and 120 MiB uncompressed. The
|
||||
uncompressed size of a list tends to be (*gasp*) a bit larger, but we can't
|
||||
easily infer the compression ratio from this graph. Hence, another graph:
|
||||
|
||||
[img graph dclistcomp.png ]
|
||||
|
||||
Most file lists compress to about 24% - 35% of their original size. This seems
|
||||
to be consistent with L<similar
|
||||
measurements|http://forum.dcbase.org/viewtopic.php?f=18&t=667> done in 2010.
|
||||
|
||||
The raw data for these graphs is found in
|
||||
L<dclistsize|http://g.blicky.net/dcstats.git/tree/dclistsize>, which lists the
|
||||
compressed and uncompressed size, respectively, for each file list. The gnuplot
|
||||
script for the first graph is
|
||||
L<dclistsize.plot|http://g.blicky.net/dcstats.git/tree/dclistsize.plot> and
|
||||
L<dclistcomp.plot|http://g.blicky.net/dcstats.git/tree/dclistcomp.plot> for the
|
||||
second.
|
||||
|
||||
=head2 Number of Files Per List
|
||||
|
||||
So how many files are people sharing? Let's find out.
|
||||
|
||||
[img graph dcnumfiles.png ]
|
||||
|
||||
As expected, this graph looks very similar to the one about the size of the
|
||||
file list. The size of a list tends to be linear in the number of items it
|
||||
holds, after all.
|
||||
|
||||
The raw data for this graph is found in
|
||||
L<dcnumfiles|http://g.blicky.net/dcstats.git/tree/dcnumfiles>, which lists the
|
||||
unique and total number of files, respectively, for each file list. The gnuplot
|
||||
script is
|
||||
L<dcnumfiles.plot|http://g.blicky.net/dcstats.git/tree/dcnumfiles.plot>.
|
||||
|
||||
=head2 File Sizes
|
||||
|
||||
And how large are the files being shared? Well,
|
||||
|
||||
[img graph dcfilesize.png ]
|
||||
|
||||
This graph is fun, and rather hard to explain without knowing what kind of
|
||||
files we're dealing with. I'm not going to do any further analysis on what kind
|
||||
of files these file sizes represent exactly, but I am going to make some
|
||||
guesses. The files below 1 MiB could be anything, text files, images,
|
||||
subtitles, source code, etc. And considering that the hub in question doesn't
|
||||
put a whole lot of effort in weeding out spammers and bots, it's likely that
|
||||
some malicious users will be sharing small variations of the same virus within
|
||||
the 100 KiB range. The peak of files between 7 and 10 MiB would likely be
|
||||
audio files. The number of files larger than, say, 20 MiB drops significantly,
|
||||
but there are still a few million files in the 20 MiB to 1 GiB range.
|
||||
|
||||
I cut off the graph after 10 GiB, but there's apparently someone who claims to
|
||||
share a file between 1 and 2 TiB (don't know the exact size due to the
|
||||
binning). Since I can't imagine why someone would share a file that large, I
|
||||
expect it to be a fake file list entry. Note that there could be more fakes in
|
||||
my data set. I can't tell which files are fake and which are genuine from the
|
||||
information in the file lists, but I don't expect the number of fake files to
|
||||
be very significant.
|
||||
|
||||
The "raw" data for this graph is found in
|
||||
L<dcfilesize|http://g.blicky.net/dcstats.git/tree/dcfilesize>. Because I wasn't
|
||||
interested in dealing with a text file of 84 million lines, the data is already
|
||||
binned. The first column is the bin number and the second column the number of
|
||||
unique files in that bin. The file sizes that each bin represents are between
|
||||
C<2^(bin+9)> and C<2^(bin+10)>, with the exception of bin 0, which starts at a
|
||||
file size of 0. The source of the gnuplot script is
|
||||
L<dcfilesize.plot|http://g.blicky.net/dcstats.git/tree/dcfilesize.plot>.
|
||||
|
||||
=head2 Distribution of Files
|
||||
|
||||
Another interesting thing to measure is how often files are shared. That is,
|
||||
how many users have the same file?
|
||||
|
||||
[img graph dcfiledist.png ]
|
||||
|
||||
Many files are only available from a single user. That's not really a good sign
|
||||
when you wish to download such a file, but luckily there are also tons of files
|
||||
that I<are> available from multiple users. What is interesting in this graph
|
||||
isn't that it follows the L<power law|https://en.wikipedia.org/wiki/Power_law>,
|
||||
but what those outliers could possibly be. There's a collection
|
||||
of 269 files that has been shared among 831 users, and there appears to be a
|
||||
similar group of around 510-515 files that is shared among 20 or so users. I've
|
||||
honestly no idea what those collections could be. Well, yes, I could probably
|
||||
figure that out from the file lists, but my analysis program doesn't tell me
|
||||
which files it's talking about and I'm too lazy to fix that.
|
||||
|
||||
The graph has been clipped to 600, but there's another interesting outlier. A
|
||||
single file that has been shared by 5668 users. I'm going to guess that this is
|
||||
the empty file. There are so many ways to get an empty file somewhere in your
|
||||
filesystem, after all.
|
||||
|
||||
The raw data for this graph is found in
|
||||
L<dcfiledist|http://g.blicky.net/dcstats.git/tree/dcfiledist>, which lists the
|
||||
number of times shared and the aggregate number of files. The gnuplot script is
|
||||
L<dcfiledist.plot|http://g.blicky.net/dcstats.git/tree/dcfiledist.plot>.
|
||||
|
||||
|
||||
=head1 Final Notes
|
||||
|
||||
So, erm, what conclusions can we draw from this? That stats are fun, I guess.
|
||||
If anyone (including me) is going to repeat this experiment on a fresh data
|
||||
set, make sure to use a more scalable downloading process than I did. My
|
||||
approach shouldn't be repeated if we wish to keep the Direct Connect network
|
||||
alive.
|
||||
|
||||
Furthermore, keep in mind that this is just a snapshot of a single day on a
|
||||
single hub. The graphs may look very different when the file lists are
|
||||
harvested at some other time. And it's also quite likely that different hubs
|
||||
will have very different share profiles. It could be interesting to try and
|
||||
graph everything, but I don't have I<that> kind of free time.
|
||||
|
||||
12
dat/ncdc
|
|
@ -10,14 +10,14 @@ ncurses interface.
|
|||
|
||||
=item Latest version
|
||||
|
||||
1.18 ([dllink ncdc-1.18.tar.gz download]
|
||||
1.18.1 ([dllink ncdc-1.18.1.tar.gz download]
|
||||
- L<changes|http://dev.yorhel.nl/ncdc/changes>
|
||||
- L<mirror|https://sourceforge.net/projects/ncdc/files/ncdc/>)
|
||||
|
||||
Convenient static binaries for Linux:
|
||||
L<64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.tar.gz> -
|
||||
L<32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.tar.gz> -
|
||||
L<ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.tar.gz>. Check the
|
||||
L<64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.1.tar.gz> -
|
||||
L<32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.1.tar.gz> -
|
||||
L<ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.1.tar.gz>. Check the
|
||||
L<installation instructions|http://dev.yorhel.nl/ncdc/install> for more info.
|
||||
|
||||
=item Development version
|
||||
|
|
@ -45,6 +45,7 @@ C<adc://dc.blicky.net:2780/> - If the mailing list is too slow for you.
|
|||
|
||||
Are available for the following systems:
|
||||
L<Arch Linux|http://aur.archlinux.org/packages.php?ID=50949> -
|
||||
L<Fedora|https://apps.fedoraproject.org/packages/ncdc/overview/> -
|
||||
L<FreeBSD|http://www.freshports.org/net-p2p/ncdc/> -
|
||||
L<Frugalware|http://frugalware.org/packages/136807> -
|
||||
L<Gentoo|http://packages.gentoo.org/package/net-p2p/ncdc> -
|
||||
|
|
@ -55,6 +56,9 @@ L<OpenSUSE|http://packman.links2linux.org/package/ncdc>
|
|||
I also have a few packages on the L<Open Build
|
||||
Service|https://build.opensuse.org/package/show?package=ncdc&project=home%3Ayorhel>.
|
||||
|
||||
A convenient installer is available for
|
||||
L<Android|http://code.ivysaur.me/ncdcinstaller.html>.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
|
|
|||
|
|
@ -1,3 +1,8 @@
|
|||
1.18.1 - 2013-10-05
|
||||
- Fix crash when downloading files from multiple sources
|
||||
- Use the yxml library to parse files.xml.bz2 files
|
||||
- Fix various XML conformance bugs in parsing files.xml.bz2 files
|
||||
|
||||
1.18 - 2013-09-25
|
||||
- Add support for segmented downloading
|
||||
- Support $MyINFO without flags byte on NMDC hubs
|
||||
|
|
|
|||
|
|
@ -38,11 +38,11 @@ compiling and/or installing it, I also offer statically linked binaries:
|
|||
|
||||
=over
|
||||
|
||||
=item * L<Linux, 64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.tar.gz>
|
||||
=item * L<Linux, 64-bit|http://dev.yorhel.nl/download/ncdc-linux-x86_64-1.18.1.tar.gz>
|
||||
|
||||
=item * L<Linux, 32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.tar.gz>
|
||||
=item * L<Linux, 32-bit|http://dev.yorhel.nl/download/ncdc-linux-i486-1.18.1.tar.gz>
|
||||
|
||||
=item * L<Linux, ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.tar.gz>
|
||||
=item * L<Linux, ARM|http://dev.yorhel.nl/download/ncdc-linux-arm-1.18.1.tar.gz>
|
||||
|
||||
=back
|
||||
|
||||
|
|
@ -58,6 +58,12 @@ architecture, please bug me and I'll see what I can do.
|
|||
|
||||
=head1 System-specific instructions
|
||||
|
||||
=head2 Android
|
||||
|
||||
A L<convenient installer|http://code.ivysaur.me/ncdcinstaller.html> is
|
||||
available for Android 2.3 and later, which makes use of the static binary.
|
||||
|
||||
|
||||
=head2 Arch Linux
|
||||
|
||||
Ncdc is available on L<AUR|https://aur.archlinux.org/packages.php?ID=50949>, to
|
||||
|
|
@ -70,6 +76,15 @@ favorite, go for the manual approach:
|
|||
makepkg -si
|
||||
|
||||
|
||||
=head2 Fedora
|
||||
|
||||
There's a L<package|https://apps.fedoraproject.org/packages/ncdc/overview/>
|
||||
available for Fedora.
|
||||
|
||||
Alternatively, I also have packages on the L<Open Build
|
||||
Service|http://software.opensuse.org/download/package?project=home:yorhel&package=ncdc>.
|
||||
|
||||
|
||||
|
||||
=head2 FreeBSD
|
||||
|
||||
|
|
@ -115,9 +130,9 @@ First install some required packages (as root):
|
|||
|
||||
Then, fetch the ncdc source tarball, extract and build as follows:
|
||||
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
|
||||
tar -xf ncdc-1.18.tar.gz
|
||||
cd ncdc-1.18
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
|
||||
tar -xf ncdc-1.18.1.tar.gz
|
||||
cd ncdc-1.18.1
|
||||
export PATH="$PATH:/usr/perl5/5.10.0/bin"
|
||||
./configure --prefix=/usr LDFLAGS='-L/usr/gnu/lib -R/usr/gnu/lib'
|
||||
make
|
||||
|
|
@ -165,9 +180,9 @@ required libraries:
|
|||
|
||||
Then run the following commands to download and install ncdc:
|
||||
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
|
||||
tar -xf ncdc-1.18.tar.gz
|
||||
cd ncdc-1.18
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
|
||||
tar -xf ncdc-1.18.1.tar.gz
|
||||
cd ncdc-1.18.1
|
||||
./configure --prefix=/usr
|
||||
make
|
||||
sudo make install
|
||||
|
|
@ -209,8 +224,8 @@ website|http://cygwin.com/> and use it to install the following packages:
|
|||
Then open a Cygwin terminal and run the following commands to download,
|
||||
compile, and install ncdc:
|
||||
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.tar.gz
|
||||
tar -xf ncdc-1.18.tar.gz
|
||||
cd ncdc-1.18
|
||||
wget http://dev.yorhel.nl/download/ncdc-1.18.1.tar.gz
|
||||
tar -xf ncdc-1.18.1.tar.gz
|
||||
cd ncdc-1.18.1
|
||||
./configure --prefix=/usr
|
||||
make install
|
||||
|
|
|
|||
4
dat/ncdu
|
|
@ -41,7 +41,7 @@ notifications for new releases.
|
|||
|
||||
=head2 Packages and ports
|
||||
|
||||
Ncdu has been packaged for quite a few systems already, here's a list of the ones I am aware of:
|
||||
Ncdu has been packaged for quite a few systems, here's a list of the ones I am aware of:
|
||||
|
||||
L<AgiliaLinux|http://packages.agilialinux.ru/search.php?tag=sys-fs> -
|
||||
L<AIX|http://www.perzl.org/aix/index.php?n=Main.Ncdu> -
|
||||
|
|
@ -68,7 +68,7 @@ L<Zenwalk|http://zur.zenwalk.org/view/package/name/ncdu>
|
|||
Packages for CentOS, RHEL and (open)SUSE can be found on the
|
||||
L<Open Build Service|https://build.opensuse.org/package/show?package=ncdu&project=utilities>.
|
||||
|
||||
Packages for NetBSD, DragonFlyBSD, MirBSD and others and be found on
|
||||
Packages for NetBSD, DragonFlyBSD, MirBSD and others can be found on
|
||||
L<pkgsrc|http://pkgsrc.se/sysutils/ncdu>.
|
||||
|
||||
|
||||
|
|
|
|||
35
dat/yxml
|
|
@ -11,14 +11,15 @@ The code can be obtained from the L<git repo|http://g.blicky.net/yxml.git> and
|
|||
is available under a permissive MIT license. The only two files you need are
|
||||
L<yxml.c|http://g.blicky.net/yxml.git/plain/yxml.c> and
|
||||
L<yxml.h|http://g.blicky.net/yxml.git/plain/yxml.h>, which can easily be
|
||||
included and compiled as part of your project. Minimal documentation is
|
||||
included in yxml.h, more complete documentation is pending.
|
||||
included and compiled as part of your project. Complete API documentation is
|
||||
available in L<the manual|http://dev.yorhel.nl/yxml/man>.
|
||||
|
||||
The API follows a simple, mostly buffer-less design and only consists of two
|
||||
functions:
|
||||
The API follows a simple and mostly buffer-less design, and only consists of
|
||||
three functions:
|
||||
|
||||
void yxml_init(yxml_t *x, char *stack, size_t stacksize);
|
||||
void yxml_init(yxml_t *x, void *buf, size_t bufsize);
|
||||
yxml_ret_t yxml_parse(yxml_t *x, int ch);
|
||||
yxml_ret_t yxml_eof(yxml_t *x);
|
||||
|
||||
Be aware that I<simple> is not necessarily I<easy> or I<convenient>. The API is
|
||||
relatively low-level and designed to integrate into pretty much any application
|
||||
|
|
@ -28,11 +29,9 @@ devices. It is possible to implement a more convenient and high-level API on
|
|||
top of yxml, but I'm not very fond of libraries that do more than what I
|
||||
strictly need.
|
||||
|
||||
Yxml is still in a beta stage and hasn't been very thoroughly tested yet. There
|
||||
are no tarball releases available at the moment. The API and ABI may still
|
||||
change a bit, so I strongly advise against dynamic linking (I'm not sure if
|
||||
I'll ever promise a stable ABI, but the API should certainly get stabilized at
|
||||
some point).
|
||||
There are no tarball releases available at the moment. The API is relatively
|
||||
stable, but I won't currently promise any ABI stability. Dynamic linking
|
||||
against yxml is therefore not a very good idea.
|
||||
|
||||
=head3 Features
|
||||
|
||||
|
|
@ -95,11 +94,11 @@ using C<< <!ENTITY> >>.
|
|||
=back
|
||||
|
||||
These conformance issues are the result of the byte-oriented and minimal design
|
||||
of yxml, and I do not intend to fix these directly within the library. All of
|
||||
the above mentioned issues can be fixed on top of yxml (by the application, or
|
||||
by a wrapper) if strict conformance is required. With the exception of custom
|
||||
entity references, but I have a simple idea on how to support that in the
|
||||
future, too.
|
||||
of yxml, and I do not intend to fix these directly within the library. The
|
||||
intention is to make sure that all of the above mentioned issues can be fixed
|
||||
on top of yxml (by the application, or by a wrapper) if strict conformance is
|
||||
required, but the required functionality to support custom entity references
|
||||
and DTD handling has not been implemented yet.
|
||||
|
||||
=head3 Non-features
|
||||
|
||||
|
|
@ -136,7 +135,7 @@ implementation is also included as an indication of the "theoretical" minimum.
|
|||
expat 2.1.0 MIT 162 139 194 432 1.47 1.09
|
||||
libxml2 2.9.1 MIT 464 328 518 816 2.53 1.75
|
||||
mxml 2.7 LGPL2+static 32 733 75 832 12.38 7.80
|
||||
yxml git MIT 5 935 31 384 1.14 0.74
|
||||
yxml git MIT 5 971 31 416 1.15 0.74
|
||||
|
||||
The code for these benchmarks is available in the
|
||||
L<bench/|http://g.blicky.net/yxml.git/tree/bench> directory on git. Some
|
||||
|
|
@ -177,7 +176,7 @@ with C<-Os> than with C<-O2>.
|
|||
expat 2.1.0 MIT 113 314 145 632 1.58 1.20
|
||||
libxml2 2.9.1 MIT 356 948 412 256 3.01 2.08
|
||||
mxml 2.7 LGPL2+static 27 725 71 704 11.70 7.44
|
||||
yxml git MIT 4 835 30 264 1.72 1.05
|
||||
yxml git MIT 4 955 30 392 1.67 1.02
|
||||
|
||||
|
||||
=head2 Validating vs. non-validating
|
||||
|
|
@ -204,6 +203,6 @@ It should be noted that a lot of XML documents found in the wild are not
|
|||
described with a DTD, but instead use an alternative technology such as XML
|
||||
schema. Wikipedia L<has more
|
||||
information|https://en.wikipedia.org/wiki/XML#Schemas_and_validation> on this.
|
||||
Using a validating parser for such documents would only introduce bloat and may
|
||||
Using a validating parser for such documents would only add bloat and may
|
||||
introduce L<potential security
|
||||
vulnerabilities|https://en.wikipedia.org/wiki/Billion_laughs>.
|
||||
|
|
|
|||
1
dat/yxml-man
Symbolic link
|
|
@ -0,0 +1 @@
|
|||
../../yxml/yxml.pod
|
||||
17
download/ncdc-1.18.1.tar.gz.asc
Normal file
|
|
@ -0,0 +1,17 @@
|
|||
-----BEGIN PGP SIGNATURE-----
|
||||
Version: GnuPG v2.0.20 (GNU/Linux)
|
||||
|
||||
iQIcBAABCgAGBQJSUBb8AAoJEGI5TGmMJzn6wlAQANWirjw2yB5MLBVJlj5WfCOH
|
||||
ky6giyilDJnn8C2PPOXmvI8ddTUm6OSNCpGrK2gQ82bUqurx10aIYQfy+GZHvSgj
|
||||
5pi8GIQiQ1Js4b9BdX54zgeE6GhCk10mN0R4+xoMgszN+5mwRrg66WGsF2A6dZiB
|
||||
vvm3rQnpk3ydDb9kn/vGun8CT9e2xX43aXZ4EJtheHXkdx+o76Pb7LmORElZnKqm
|
||||
10Dw+SGZQ272xuiwnj1UROvMUnbs8yhq8ADpRONL7L1UGlORUpFjmAkB3UE8kLFE
|
||||
4CjTgYLKG6X3L8CFRMttQWmGNTMuvCBxGjqASUQunC1HV10B87ItbuHIiNbjxsEB
|
||||
pj+6dGye4gSI7J3piAXrIe4PG27UhQ9BnbQbhv7NtVM9IPSlb3hyMjzqZOethnRX
|
||||
PgTdwdEy53TI50hJf8tZAC8ZEOyPKtxFqufm7dx8NgoVsX0Si6NLwaAEpxu4h/Ny
|
||||
2QSWB3zGFZinaQ0QWuvXbzJMqjyzYvXGDpPGCUaGx4vcdE/Rz85ywzYeuBIiEH24
|
||||
N7NFD+nVaIpRQthH7GRqOwzgn1qHFPvm6DDj+jv3gzpUSkOz16iWVcIyatXPFJg1
|
||||
BFvlaZm8e2/lQXEyTu8QW6soGq4Pp+nxvow7xpMBSKNAk9gw7GBRPn1TaDQaD83t
|
||||
BLtM/ZkDdqrJMqQnWq9n
|
||||
=CwkC
|
||||
-----END PGP SIGNATURE-----
|
||||
1
download/ncdc-1.18.1.tar.gz.md5
Normal file
|
|
@ -0,0 +1 @@
|
|||
c0070916c8bb8a0409d01f6663ca0c6a ncdc-1.18.1.tar.gz
|
||||
1
download/ncdc-1.18.1.tar.gz.sha1
Normal file
|
|
@ -0,0 +1 @@
|
|||
184dce59b5b51563f869a43d81971a1537cdc438 ncdc-1.18.1.tar.gz
|
||||
BIN
img/dcfiledist.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 5.8 KiB |
BIN
img/dcfilesize.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 4.4 KiB |
BIN
img/dclistcomp.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 4.8 KiB |
BIN
img/dclistsize.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 5.2 KiB |
BIN
img/dcnumfiles.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 4.9 KiB |
BIN
img/yxml-apistates.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 23 KiB |
45
index.cgi
|
|
@ -12,6 +12,9 @@ BEGIN { ($ROOT = abs_path $0) =~ s{index\.cgi$}{}; }
|
|||
|
||||
|
||||
my @changes = (
|
||||
[ '2014-01-09', '/doc/dcstats', 'Uploaded an article on DC file list stats' ],
|
||||
[ '2013-11-14', '/yxml/man', 'yxml now has a manual' ],
|
||||
[ '2013-10-05', '/ncdc', 'ncdc 1.18.1 released' ],
|
||||
[ '2013-09-25', '/ncdc', 'ncdc 1.18 released' ],
|
||||
[ '2013-09-03', '/yxml', 'Announcing yxml: A small, fast and correct XML parser' ],
|
||||
[ '2013-07-05', '/dump/insbench', 'Documented a little data structure benchmark' ],
|
||||
|
|
@ -75,7 +78,7 @@ my @changes = (
|
|||
[ '2009-04-30', undef, 'Site redesign and reorganisation.' ],
|
||||
);
|
||||
|
||||
my %feeds = map +($_,1), qw|ncdc ncdu globster tuwf|;
|
||||
my %feeds = map +($_,1), qw|ncdc ncdu globster tuwf yxml|;
|
||||
my $feedreg = join '|', keys %feeds;
|
||||
|
||||
|
||||
|
|
@ -102,9 +105,11 @@ TUWF::register(
|
|||
qr{tuwf/changes} => sub { changelog(shift, 'tuwf-changelog', 'TUWF', 'tuwf', 'changes', 'TUWF Changelog') },
|
||||
qr{ylib} => sub { podpage(shift, 'ylib/README.pod', 'ylib', '', 'Ylib') },
|
||||
qr{yxml} => sub { podpage(shift, 'yxml', 'yxml', '', 'Yxml - A small, fast and correct* XML parser') },
|
||||
qr{yxml/man} => sub { podpage(shift, 'yxml-man', 'yxml', 'man', 'Yxml Manual', 1) },
|
||||
qr{doc} => sub { podpage(shift, 'doc', 'doc', '', 'Articles') },
|
||||
qr{doc/sqlaccess} => sub { podpage(shift, 'sqlaccess', 'doc', '', 'Multi-threaded Access to an SQLite3 Database', 1) },
|
||||
qr{doc/commvis} => sub { podpage(shift, 'doc-commvis', 'doc', '', 'A Distributed Communication System for Modular Applications', 1) },
|
||||
qr{doc/dcstats} => sub { podpage(shift, 'doc-dcstats', 'doc', '', 'Some Measurements on Direct Connect File Lists', 1) },
|
||||
qr{dump} => sub { podpage(shift, 'dump', 'dump', '', 'Code dump') },
|
||||
qr{demo} => sub { podpage(shift, 'dump-demo', 'dump', 'demo', 'Demos') },
|
||||
qr{dump/awshrink} => sub { podpage(shift, 'dump-awshrink', 'dump', 'awshrink', 'AWStats Data File Shrinker') },
|
||||
|
|
@ -112,10 +117,10 @@ TUWF::register(
|
|||
qr{dump/nccolour} => sub { podpage(shift, 'dump-nccolour', 'dump', 'nccolour', 'Colours in NCurses') },
|
||||
qr{dump/insbench} => sub { podpage(shift, 'dump-insbench', 'dump', 'insbench', 'Insertion Performance Benchmarks') },
|
||||
qr{(?:($feedreg)/)?feed\.atom} => \&atom,
|
||||
qr{(ncdc|ncdu|globster)/bug} => \&bug_list,
|
||||
qr{(ncdc|ncdu|globster)/bug/post} => \&bug_post,
|
||||
qr{(ncdc|ncdu|globster)/bug/new} => \&bug_new,
|
||||
qr{(ncdc|ncdu|globster)/bug/([1-9][0-9]*)} => \&bug_item,
|
||||
qr{(ncdc|ncdu|globster|yxml)/bug} => \&bug_list,
|
||||
qr{(ncdc|ncdu|globster|yxml)/bug/post} => \&bug_post,
|
||||
qr{(ncdc|ncdu|globster|yxml)/bug/new} => \&bug_new,
|
||||
qr{(ncdc|ncdu|globster|yxml)/bug/([1-9][0-9]*)} => \&bug_item,
|
||||
);
|
||||
|
||||
TUWF::set(
|
||||
|
|
@ -195,7 +200,7 @@ sub atom {
|
|||
|
||||
my $n = 0;
|
||||
for(@changes) {
|
||||
next if $sub && $_->[1] !~ /^\/\Q$sub/;
|
||||
next if $sub && (!$_->[1] || $_->[1] !~ /^\/\Q$sub/);
|
||||
last if $n++ >= 10;
|
||||
tag 'entry';
|
||||
tag id => 'http://dev.yorhel.nl'.($_->[1]||'/').'#'.$_->[0];
|
||||
|
|
@ -420,7 +425,7 @@ sub genChanges {
|
|||
sub htmlHeader {
|
||||
my $s = shift;
|
||||
my %o = (
|
||||
spec => { map +($_,1), qw|ncdu ncdc globster tuwf| },
|
||||
spec => { map +($_,1), qw|ncdu ncdc globster tuwf yxml| },
|
||||
page => '',
|
||||
sec => '',
|
||||
sec2 => '',
|
||||
|
|
@ -440,17 +445,15 @@ sub htmlHeader {
|
|||
div class => 'notes';
|
||||
txt 'Yoran Heling'; br;
|
||||
a href => 'mailto:projects@yorhel.nl', 'projects@yorhel.nl';
|
||||
br; a href => 'http://yorhel.nl', 'yh';
|
||||
txt ' - '; a href => 'http://g.blicky.net', 'git';
|
||||
txt ' - '; a href => 'http://pgp.mit.edu:11371/pks/lookup?search=0x8c2739fa', 'pgp';
|
||||
br;br;
|
||||
lit q|
|
||||
<form action="https://www.paypal.com/cgi-bin/webscr" method="post"><fieldset style="border:0">
|
||||
<input type="hidden" name="cmd" value="_s-xclick" />
|
||||
<input type="hidden" name="encrypted" value="-----BEGIN PKCS7-----MIIHFgYJKoZIhvcNAQcEoIIHBzCCBwMCAQExggEwMIIBLAIBADCBlDCBjjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRYwFAYDVQQHEw1Nb3VudGFpbiBWaWV3MRQwEgYDVQQKEwtQYXlQYWwgSW5jLjETMBEGA1UECxQKbGl2ZV9jZXJ0czERMA8GA1UEAxQIbGl2ZV9hcGkxHDAaBgkqhkiG9w0BCQEWDXJlQHBheXBhbC5jb20CAQAwDQYJKoZIhvcNAQEBBQAEgYCukWoZm+KyKZ6D0GzhtVdPoSKwCFaiiH2qku6EbCz6l0wQptWk9nPTcFVyRXr/WkoUAMSJBP8nFdzNHEXwKhRmDwJIzTd15L6BWLe9iQzqwEWfNFCOg/VUflJ1YSnZLk96d7M7H65/+uX3UgQKaG5xfKDpLAZLRieTM3O0QGHbpTELMAkGBSsOAwIaBQAwgZMGCSqGSIb3DQEHATAUBggqhkiG9w0DBwQIM6MXKluujROAcGE4dE5oixMuUPpljrDdw3gyIkbcv5yitn8YtrO53ial5XsFQKuQKJOJXzxHwaznE6a8qYTVW1ozZoJETrzY+O0PY+IOgemhnDduAG02fcPchqBqau+3f6hVnkolsXj+1QrubZxfAzt2cPIy9m7RYTSgggOHMIIDgzCCAuygAwIBAgIBADANBgkqhkiG9w0BAQUFADCBjjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRYwFAYDVQQHEw1Nb3VudGFpbiBWaWV3MRQwEgYDVQQKEwtQYXlQYWwgSW5jLjETMBEGA1UECxQKbGl2ZV9jZXJ0czERMA8GA1UEAxQIbGl2ZV9hcGkxHDAaBgkqhkiG9w0BCQEWDXJlQHBheXBhbC5jb20wHhcNMDQwMjEzMTAxMzE1WhcNMzUwMjEzMTAxMzE1WjCBjjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRYwFAYDVQQHEw1Nb3VudGFpbiBWaWV3MRQwEgYDVQQKEwtQYXlQYWwgSW5jLjETMBEGA1UECxQKbGl2ZV9jZXJ0czERMA8GA1UEAxQIbGl2ZV9hcGkxHDAaBgkqhkiG9w0BCQEWDXJlQHBheXBhbC5jb20wgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAMFHTt38RMxLXJyO2SmS+Ndl72T7oKJ4u4uw+6awntALWh03PewmIJuzbALScsTS4sZoS1fKciBGoh11gIfHzylvkdNe/hJl66/RGqrj5rFb08sAABNTzDTiqqNpJeBsYs/c2aiGozptX2RlnBktH+SUNpAajW724Nv2Wvhif6sFAgMBAAGjge4wgeswHQYDVR0OBBYEFJaffLvGbxe9WT9S1wob7BDWZJRrMIG7BgNVHSMEgbMwgbCAFJaffLvGbxe9WT9S1wob7BDWZJRroYGUpIGRMIGOMQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0ExFjAUBgNVBAcTDU1vdW50YWluIFZpZXcxFDASBgNVBAoTC1BheVBhbCBJbmMuMRMwEQYDVQQLFApsaXZlX2NlcnRzMREwDwYDVQQDFAhsaXZlX2FwaTEcMBoGCSqGSIb3DQEJARYNcmVAcGF5cGFsLmNvbYIBADAMBgNVHRMEBTADAQH/MA0GCSqGSIb3DQEBBQUAA4GBAIFfOlaagFrl71+jq6OKidbWFSE+Q4FqROvdgIONth+8kSK//Y/4ihuE4Ymvzn5ceE3S/iBSQQMjyvb+s2TWbQYDwcp129OPIbD9epdr4tJOUNiSojw7BHwYRiPh58S1xGlFgHFXwrEBb3dgNbMUa+u4qectsMAXpVHnD9wIyfmHMYIBmjCCAZYCAQEwgZQwgY4xCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJDQTEWMB
QGA1UEBxMNTW91bnRhaW4gVmlldzEUMBIGA1UEChMLUGF5UGFsIEluYy4xEzARBgNVBAsUCmxpdmVfY2VydHMxETAPBgNVBAMUCGxpdmVfYXBpMRwwGgYJKoZIhvcNAQkBFg1yZUBwYXlwYWwuY29tAgEAMAkGBSsOAwIaBQCgXTAYBgkqhkiG9w0BCQMxCwYJKoZIhvcNAQcBMBwGCSqGSIb3DQEJBTEPFw0xMjAyMTQyMTM1NTBaMCMGCSqGSIb3DQEJBDEWBBQ1cTUdt+dHu7f5zToLjuWqv4T5OTANBgkqhkiG9w0BAQEFAASBgBjI8TO90fmKmBmOazqFUhAWN3AbU6I3y04XtFEP5vazfiwq5fn2OaekjF1RwcaKAnDU6rC6wRBQ8nNSrT7NFCARqzxVXx4YRfxiFYhCkEYF3oYCbdNOPr+Q3/P1nETnTHnidaJmEz/HTV3nta9D4PypZCaSxIJKMOofW+VkEAV2-----END PKCS7-----" />
|
||||
<input type="image" src="https://www.paypalobjects.com/en_US/i/btn/btn_donate_SM.gif" name="submit" alt="PayPal - The safer, easier way to pay online!" style="border: 0" />
|
||||
<img alt="" src="https://www.paypalobjects.com/nl_NL/i/scr/pixel.gif" style="width:1px; height:1px; border:0" />
|
||||
</fieldset></form>|;
|
||||
br; a href => 'http://yorhel.nl', 'home';
|
||||
txt ' - '; a href => 'http://g.blicky.net', 'git repos';
|
||||
br; b '= donate =';
|
||||
a href => 'https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=BBF8LGT2LLNFN&lc=US&currency_code=EUR&bn=PP%2dDonationsBF%3abtn_donate_SM%2egif%3aNonHosted', 'paypal';
|
||||
txt ' - '; a href => 'bitcoin:1PhXZaKbPFhuz4KbRcfUL9VveB58psa8R', 'bitcoin';
|
||||
br; b '= pgp =';
|
||||
a href => 'http://yorhel.nl/key.asc', 'key';
|
||||
txt ' - '; a href => 'http://pgp.mit.edu:11371/pks/lookup?search=0x8c2739fa', 'mit';
|
||||
br; i '7446 0D32 B808 10EB A9AF A2E9 6239 4C69 8C27 39FA';
|
||||
end;
|
||||
img id => 'scissors', src => '/img/scissors.png', alt => 'Cute decorative scissors, cutting through your code.';
|
||||
end 'div';
|
||||
|
|
@ -523,14 +526,18 @@ sub htmlMenu {
|
|||
$m->('/tuwf/man/xml', '::XML', $o{sec} eq 'man' && $o{sec2} eq 'xml');
|
||||
});
|
||||
$m->('/tuwf/changes', 'Changelog', $o{sec} eq 'changes');
|
||||
} elsif($o{page} eq 'yxml') {
|
||||
$m->('/yxml', 'Info', !$o{sec});
|
||||
$m->('/yxml/man', 'Manual', $o{sec} eq 'man');
|
||||
$m->('/yxml/bug', 'Bug tracker', $o{sec} eq 'bug');
|
||||
} else {
|
||||
$m->('/', 'Home', !$o{page});
|
||||
$m->('/ncdu', 'Ncdu ');
|
||||
$m->('/ncdc', 'Ncdc ');
|
||||
$m->('/globster', 'Globster ');
|
||||
$m->('/tuwf', 'Tuwf ');
|
||||
$m->('/yxml', 'Yxml ');
|
||||
$m->('/ylib', 'Ylib', $o{page} eq 'ylib');
|
||||
$m->('/yxml', 'Yxml', $o{page} eq 'yxml');
|
||||
$m->('/doc', 'Articles', $o{page} eq 'doc');
|
||||
$m->('/dump', 'Code dump', $o{page} eq 'dump', sub {
|
||||
$m->('/dump', 'Misc.', $o{page} eq 'dump' && !$o{sec});
|
||||
|
|
|
|||
|
|
@ -19,8 +19,10 @@ html,body { background: #ccc; text-align: center; height: 100% }
|
|||
#left li a.small { font-size: 10px }
|
||||
#left .menusel { color: #03a }
|
||||
#left .notes { margin-top: 50px; text-align: center }
|
||||
#left .notes, #left .notes a { font-size: 9px; text-decoration: none }
|
||||
#left .notes, #left .notes a, #left .notes b { font-size: 9px; text-decoration: none }
|
||||
#left .notes a:hover { text-decoration: underline }
|
||||
#left .notes i { font-size: 7px; display: block; margin-top: -2px; margin-bottom: -10px; font-style: normal }
|
||||
#left .notes b { display: block; margin-top: 10px; margin-bottom: 2px }
|
||||
#scissors { position: relative; top: 30px; left: 113px; }
|
||||
img.right { float: right; margin: 0 0 5px 10px }
|
||||
.indexgroup { margin: 30px 10px 0px 20px }
|
||||
|
|
|
|||