194 commits

Author SHA1 Message Date
Yorhel
577e519e7d Fedora 26 is EOL 2018-07-17 17:02:33 +02:00
Yorhel
617a76eeba Add CentOS 6.10 and FreeBSD 11.2 2018-07-06 07:51:13 +02:00
Yorhel
cfc656bf12 Convert README to markdown + update git URLs 2018-06-15 10:52:21 +02:00
Yorhel
7bb4397f9b Add CentOS 7.5 + Fix Fedora 28 updates repo 2018-06-02 07:34:54 +02:00
Yorhel
7aa89145ca indexer: Re-use memory buffer when reading RPM repo data
This avoids reading the entire uncompressed XML into a buffer.
2018-05-04 15:25:35 +02:00
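The buffer re-use described in 7aa89145ca can be illustrated with a minimal sketch. It assumes line-oriented input and a hypothetical helper named `count_matching_lines`; the actual indexer parses RPM repository XML and its code is not shown in this log.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical helper: counts the lines that contain `needle`.
/// A single `String` is allocated once and reused for every line, so
/// memory use is bounded by the longest line rather than the whole input.
fn count_matching_lines<R: Read>(input: R, needle: &str) -> std::io::Result<usize> {
    let mut reader = BufReader::new(input);
    let mut line = String::new();
    let mut count = 0;
    loop {
        // clear() drops the contents but keeps the allocated capacity.
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        if line.contains(needle) {
            count += 1;
        }
    }
    Ok(count)
}
```

Any `Read` source works here, for example a decompression stream or, for testing, a byte slice such as `"...".as_bytes()`.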
Yorhel
cec70a59c4 Merge branch 'master' of site 2018-05-01 20:40:47 +02:00
Yorhel
bb236c1f0e Add Ubuntu 18.04 + Fedora 28 2018-05-01 20:30:27 +02:00
Yorhel
2c7bf1507a indexer: Update crates to latest version
With the exception of Hyper, because the new tokio-based version is...
different.
2018-03-25 10:36:29 +02:00
Yorhel
8487d37aad web: Updating Cargo deps + fixes for stable 2018-03-25 08:15:22 +02:00
Yorhel
5480fc206e Link updates 2018-03-25 08:14:53 +02:00
Yorhel
b382189fb6 Hopefully properly fixed the weird system layout on home page 2018-01-21 08:34:01 +01:00
Yorhel
8031a90989 Some improvements to the about text + link to the new Arch man pages 2018-01-21 08:23:57 +01:00
Yorhel
035538f156 Add CentOS 2018-01-21 08:13:05 +01:00
Yorhel
b3e94e3d51 Stop syncing Ubuntu Zesty 2018-01-19 10:23:06 +01:00
Yorhel
b89c7625d5 Add FreeBSD 10.4 2018-01-15 21:39:45 +01:00
Yorhel
fd657824b3 Add database downloads 2017-12-23 14:28:16 +01:00
Yorhel
6120b2fa5a Add Fedora 27 2017-11-16 12:15:54 +01:00
Yorhel
aedb1795c1 Favor more recent packages in man page selection 2017-10-27 09:00:46 +02:00
Yorhel
2388aaefcc Stop syncing Fedora 24; Add FreeBSD 11.1 and Ubuntu 17.10 2017-10-20 21:37:33 +02:00
Yorhel
cb2d970d3a Add Fedora 26 + stop syncing old Ubuntu 2017-07-22 12:27:40 +02:00
Yorhel
8aa0fc02a5 Add Debian Stretch 2017-06-23 17:44:03 +02:00
Yorhel
72cb1ff184 Add Ubuntu 17.04 2017-04-14 17:09:18 +02:00
Yorhel
34e8ee8603 Add link to Debian man pages 2017-02-26 09:07:37 +01:00
Yorhel
8e5fa1e165 www: Include .so mans if found in the same package
Unfortunately, this can lead to slightly confusing scenarios, because
the exact package of the displayed man page is not very well defined.
It's possible that, when browsing from a package listing to a man page,
you may see an included file that does not come from the package you
browsed from.
E.g. https://manned.org/pwrite/5f2909f6 - that man page simply includes
pread.2, but from the URL it's unclear from which package or system it
should be included.

The only way to fix this is to add the package ID to the link format.
2017-02-26 08:52:23 +01:00
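As a rough illustration of what "simply includes" means in 8e5fa1e165, the sketch below detects a man page whose only content is a roff `.so` request. The function name `so_target` and the rule for skipping comment lines are assumptions made for this example, not the site's actual logic.

```rust
/// Hypothetical helper: if `page` consists of a single `.so <file>` request
/// (ignoring blank lines and roff comment lines), return the included file.
fn so_target(page: &str) -> Option<&str> {
    let mut lines = page
        .lines()
        .filter(|l| !l.starts_with(".\\\"") && !l.trim().is_empty());
    let first = lines.next()?;
    if lines.next().is_some() {
        return None; // more than one meaningful line: a regular man page
    }
    let rest = first.strip_prefix(".so")?;
    if !rest.starts_with(char::is_whitespace) {
        return None; // some other request that merely begins with ".so"
    }
    let target = rest.trim();
    if target.is_empty() {
        None
    } else {
        Some(target)
    }
}
```

For a page such as pwrite.2 that contains only `.so man2/pread.2`, this returns `Some("man2/pread.2")`. Which package that file is then looked up in is exactly the ambiguity the commit message describes.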
Yorhel
81e2c99503 Friendlier pagination on package listings 2017-01-25 10:42:01 +01:00
Yorhel
bb46087068 Add Fedora 1 - 25 2017-01-21 09:05:34 +01:00
Yorhel
06694fd131 Style changes 2017-01-20 09:55:43 +01:00
Yorhel
8235fb28b8 indexer: Fix link resolution and hardlink handling for rpm
Unlike tar, cpio does not have a separate entry for each directory, so
the link resolution can't assume that directory entries exist for each
path component.

I also mistakenly assumed that cpio handled hardlinks similarly to tar,
but that's clearly not the case. libarchive does help a bit, but these
differences still suck.
2017-01-18 13:07:42 +01:00
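One way to resolve links without relying on directory entries, as 8235fb28b8 requires for cpio, is to work on the path strings alone. The sketch below is a purely lexical resolver under that assumption; `resolve_link` is a hypothetical name, and the real indexer may also need to follow symlinked directories, which this does not do.

```rust
/// Hypothetical helper: resolve a symlink `target` relative to the directory
/// that contains `link_path`, using only the path strings. No directory
/// entries are consulted. Returns None if `..` would climb above the root.
fn resolve_link(link_path: &str, target: &str) -> Option<String> {
    let mut parts: Vec<&str> = if target.starts_with('/') {
        Vec::new() // absolute target: start from the root
    } else {
        let mut p: Vec<&str> = link_path.split('/').filter(|s| !s.is_empty()).collect();
        p.pop(); // drop the link's own file name
        p
    };
    for comp in target.split('/') {
        match comp {
            "" | "." => {}
            ".." => {
                parts.pop()?;
            }
            other => parts.push(other),
        }
    }
    Some(format!("/{}", parts.join("/")))
}
```

For example, a link at `/usr/share/man/man1/a.1` pointing to `../man3/b.3` resolves to `/usr/share/man/man3/b.3`, whether or not the archive ever listed `/usr/share/man/man3` as an entry.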
Yorhel
608f79eb93 indexer: Add support for indexing RPM repositories
This code hasn't been thoroughly tested, I'll see how things go when
indexing a live repo.

And XML parsing sucks in every language.
2017-01-17 17:05:03 +01:00
Yorhel
f77db5f541 indexer: Add bare RPM directory indexing
This is for a few special cases, most RPM repos will have proper
metadata and all.
2017-01-17 12:50:25 +01:00
Yorhel
d720441fb4 indexer: Rust crate updates 2017-01-17 11:01:11 +01:00
Yorhel
1923b9901d Support bold+italic in HTML conversion 2017-01-16 09:52:32 +01:00
Yorhel
746889851c A few more HTML conversion improvements
- Fix segfault on empty output (bug was in XS code)
- Still better end-of-URL detection
- Recognize a few common multicharacter sections in man references
2017-01-15 20:27:16 +01:00
Yorhel
1ccc86ce86 Whole bunch of HTML conversion improvements
- Grotty escape sequences are now better interpreted. I feel rather
  stupid for not realizing the idea behind how those codes are supposed
  to work earlier. It finally hit me when I read the BSD ul(1) source
  code.
- URL end detection is slightly better (much better than the old C code)
- Man page references with : are recognized now (common in Perl modules).
- More efficient HTML escaping, no need to escape > and ".

There's still a bunch of improvements to make, but I have much more
confidence in the current implementation already.
2017-01-15 17:07:03 +01:00
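The reference to ul(1) in 1ccc86ce86 suggests the backspace overstrike convention: `c BS c` for bold and `_ BS c` for underline or italic. The sketch below assumes that convention and treats the line as a row of cells, as a terminal would. The names `Cell`, `parse_overstrike` and `to_html` are hypothetical and this is not the project's converter. In line with the last bullet of the commit, it escapes only `&` and `<`.

```rust
/// One output column with its accumulated attributes.
#[derive(Clone, Copy)]
struct Cell {
    ch: char,
    bold: bool,
    italic: bool,
}

/// Interpret backspace overstrikes: writing the same character twice marks
/// the cell bold, and combining a character with '_' marks it italic.
fn parse_overstrike(input: &str) -> Vec<Cell> {
    let mut cells: Vec<Cell> = Vec::new();
    let mut pos = 0usize;
    for c in input.chars() {
        if c == '\u{8}' {
            pos = pos.saturating_sub(1); // backspace: move one cell left
            continue;
        }
        if pos < cells.len() {
            let cell = &mut cells[pos];
            if cell.ch == '_' && c != '_' {
                cell.ch = c;
                cell.italic = true;
            } else if c == '_' && cell.ch != '_' {
                cell.italic = true;
            } else if cell.ch == c {
                cell.bold = true;
            } else {
                cell.ch = c; // unrelated overwrite: keep the newer character
            }
        } else {
            cells.push(Cell { ch: c, bold: false, italic: false });
        }
        pos += 1;
    }
    cells
}

/// Emit HTML, opening and closing <b>/<i> whenever the attributes change.
fn to_html(cells: &[Cell]) -> String {
    let mut out = String::new();
    let (mut bold, mut italic) = (false, false);
    for cell in cells {
        if (cell.bold, cell.italic) != (bold, italic) {
            if italic { out.push_str("</i>"); }
            if bold { out.push_str("</b>"); }
            if cell.bold { out.push_str("<b>"); }
            if cell.italic { out.push_str("<i>"); }
            bold = cell.bold;
            italic = cell.italic;
        }
        match cell.ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            c => out.push(c),
        }
    }
    if italic { out.push_str("</i>"); }
    if bold { out.push_str("</b>"); }
    out
}
```

Because attributes accumulate per cell, a sequence such as `_ BS A BS A` ends up both bold and italic, which is the combined case 1923b9901d mentions.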
Yorhel
6114b17389 Experimental rewrite of grotty to html conversion in Rust
The previous C code was troublesome.
- Didn't handle long lines
- I couldn't convince myself that it was free of memory safety issues
- Needed improving anyway, there are some formatting bugs. These are
  hard to fix in the current code.

I mostly replicated the formatting bugs of the old C implementation in
Rust, and possibly added a few new bugs as well. It's not a significant
improvement right now, more testing and fixing will be needed.

The performance of both implementations is comparable, with the Rust
version being slightly faster in many cases (and slower in some others).
I did spend more time trying to optimize this Rust version than I did
with the old C code. I initially tried a naive-ish conversion of the C
code to Rust, but that turned out to be much slower and I had to resort
to using regexes and different data structures to fix that.
2017-01-15 12:17:34 +01:00
Yorhel
8a3af4aee2 util/freebsd.sh: Fix copy-paste error in package dates 2016-12-30 18:10:46 +01:00
Yorhel
8d6e7bc2d8 indexer: Prioritize 7bit encodings when decoding man pages
Fixes parsing of https://manned.org/xshisen/ae5d469f
2016-12-29 09:27:19 +01:00
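Commit 8d6e7bc2d8 does not say which encodings are involved. A plausible reading is that a 7-bit encoding such as ISO-2022-JP also passes as valid ASCII or UTF-8, so it has to be checked first. The sketch below illustrates that ordering only; `guess_encoding`, the escape list and the fallback to ISO-8859-1 are assumptions, not the indexer's actual detection code.

```rust
/// Hypothetical detector: returns an encoding label for `data`.
/// The 7-bit check runs first, because ISO-2022-JP text consists solely of
/// bytes below 0x80 and would otherwise be accepted as UTF-8.
fn guess_encoding(data: &[u8]) -> &'static str {
    // Common ISO-2022-JP shift sequences (ESC $ B, ESC $ @, ESC ( B, ESC ( J).
    const ISO2022JP_ESCAPES: [&[u8]; 4] = [b"\x1b$B", b"\x1b$@", b"\x1b(B", b"\x1b(J"];

    let seven_bit = data.iter().all(|&b| b < 0x80);
    let has_escape = ISO2022JP_ESCAPES
        .iter()
        .any(|esc| data.windows(esc.len()).any(|w| w == *esc));
    if seven_bit && has_escape {
        return "iso-2022-jp";
    }
    if std::str::from_utf8(data).is_ok() {
        return "utf-8";
    }
    "iso-8859-1" // assumed 8-bit fallback for this example
}
```

With the order reversed, the first example input in a test of this function (`ESC $ B ... ESC ( B`) would be labelled UTF-8 and rendered as garbage.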
Yorhel
eac4b6ac77 Don't index ELF binaries + remove some non-man-pages 2016-12-18 16:35:25 +01:00
Yorhel
d153004532 indexer: Support FreeBSD 9.3+; remove now obsolete add_index.pl 2016-12-18 15:08:56 +01:00
Yorhel
b9764fce4a indexer: Remove openssl + replace siphash with sha1 in cache filename
HTTPS isn't used, so removing it saves some space.

The std SipHash API has been deprecated, and since hashing performance
isn't exactly critical in this case I've replaced it with SHA1, which
was already being used in man.rs.
2016-12-11 13:41:10 +01:00
Yorhel
defaa032f8 indexer: Support for indexing FreeBSD <9.3 repositories 2016-12-11 10:59:54 +01:00
Yorhel
1ca0cd4325 Indexer: Remove pointless check 2016-11-27 10:59:31 +01:00
Yorhel
b79ecfb284 indexer: Fix bug in Contents file parsing + decrease cron verbosity
Turns out that not all Contents files have a header.
2016-11-27 10:48:35 +01:00
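The header issue fixed in b79ecfb284 can be sketched as follows. The example assumes that a header, when present, ends with a `FILE ... LOCATION` line and that each entry is a path followed by whitespace and a location. `parse_contents` is a hypothetical name, not the indexer's function.

```rust
/// Hypothetical parser for a Debian Contents file, returning (path, location)
/// pairs. If a "FILE ... LOCATION" line exists, entries start after it;
/// otherwise the file has no header and entries start at the first line.
fn parse_contents(text: &str) -> Vec<(String, String)> {
    let lines: Vec<&str> = text.lines().collect();
    let start = lines
        .iter()
        .position(|l| l.starts_with("FILE") && l.trim_end().ends_with("LOCATION"))
        .map_or(0, |i| i + 1);
    lines[start..]
        .iter()
        .filter_map(|l| {
            let l = l.trim_end();
            // The location is the last whitespace-separated field; the path
            // itself may contain spaces.
            let idx = l.rfind(|c: char| c == ' ' || c == '\t')?;
            let path = l[..idx].trim_end();
            let location = &l[idx + 1..];
            if path.is_empty() || location.is_empty() {
                None
            } else {
                Some((path.to_string(), location.to_string()))
            }
        })
        .collect()
}
```

Treating "no header line found" as "start at line 0" is the key point: a parser that unconditionally skips to the header would return nothing for header-less files.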
Yorhel
eb15b6e2c7 indexer: Improve Debian Contents file parsing performance by 5.2x
Further improvements can be gained by caching the results of
get_contents(), since the same Contents file is often parsed multiple
times in a single cron run. But this is already a significant
achievement.
2016-11-26 16:57:05 +01:00
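The caching idea mentioned in eb15b6e2c7 amounts to memoizing the parsed result per Contents file. A minimal sketch, assuming a hypothetical `ContentsCache` type and a deliberately simplistic parse step (the first field of each line):

```rust
use std::collections::HashMap;

/// Hypothetical cache keyed by Contents file name. `parse_calls` exists only
/// to make the effect of the cache observable in this example.
struct ContentsCache {
    parsed: HashMap<String, Vec<String>>,
    parse_calls: usize,
}

impl ContentsCache {
    fn new() -> Self {
        ContentsCache { parsed: HashMap::new(), parse_calls: 0 }
    }

    /// Return the paths listed in `raw`, parsing it only on the first
    /// request for `name` and serving later requests from the map.
    fn get_contents(&mut self, name: &str, raw: &str) -> &[String] {
        if !self.parsed.contains_key(name) {
            self.parse_calls += 1;
            let paths: Vec<String> = raw
                .lines()
                .filter_map(|l| l.split_whitespace().next())
                .map(str::to_string)
                .collect();
            self.parsed.insert(name.to_string(), paths);
        }
        &self.parsed[name]
    }
}
```

Calling `get_contents` twice with the same name within one run parses the file once. The trade-off is that parsed results stay in memory for the lifetime of the cache.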
Yorhel
de28175cd3 Misc. indexing fixes 2016-11-20 16:41:08 +01:00
Yorhel
46a6e2ff7c Use Rust indexer for Ubuntu + script cleanup 2016-11-20 15:01:22 +01:00
Yorhel
2ee2f7495b Reorganize indexing scripts + use Rust for Debian 2016-11-20 12:34:02 +01:00
Yorhel
5d44d0e2ec Indexer: Add --dryrun and workarounds for old deb repos 2016-11-20 11:39:00 +01:00
Yorhel
ecb1a9e25b Indexer: Support reading date from .deb archives 2016-11-20 09:01:33 +01:00
Yorhel
a1e5a2d80d Indexer: Improve logging + cache management 2016-11-20 07:31:55 +01:00