160 commits

Author SHA1 Message Date
Yorhel
6114b17389 Experimental rewrite of grotty to html conversion in Rust
The previous C code was troublesome.
- Didn't handle long lines
- I couldn't convince myself that it was free of memory safety issues
- Needed improving anyway, there are some formatting bugs. These are
  hard to fix in the current code.

I mostly replicated the formatting bugs of the old C implementation in
Rust, and possibly added a few new bugs as well. It's not a significant
improvement right now; more testing and fixing will be needed.

The performance of both implementations is comparable, with the Rust
version being slightly faster in many cases (and slower in some others).
I did spend more time trying to optimize this Rust version than I did
with the old C code. I initially tried a naive-ish conversion of the C
code to Rust, but that turned out to be much slower and I had to resort
to using regexes and different data structures to fix that.
2017-01-15 12:17:34 +01:00
Yorhel
8a3af4aee2 util/freebsd.sh: Fix copy-paste error in package dates 2016-12-30 18:10:46 +01:00
Yorhel
8d6e7bc2d8 indexer: Prioritize 7bit encodings when decoding man pages
Fixes parsing of https://manned.org/xshisen/ae5d469f
2016-12-29 09:27:19 +01:00
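A rough sketch of why 7-bit encodings may need to be checked first: an ISO-2022-JP page consists only of bytes below 0x80, so it also passes as valid UTF-8 and would be decoded as gibberish if UTF-8 were tried first. The function name `guess_encoding`, the returned labels, and the fallback order are assumptions made for illustration; this is not the indexer's actual detection code.

```rust
/// Hypothetical helper: guess an encoding label for raw man page bytes.
/// 7-bit encodings are tested before UTF-8, because their byte
/// sequences are also valid ASCII and therefore valid UTF-8.
fn guess_encoding(bytes: &[u8]) -> &'static str {
    // ISO-2022-JP switches to JIS X 0208 with ESC $ @ or ESC $ B.
    let has_jis_escape = bytes
        .windows(3)
        .any(|w| w == b"\x1b$B" || w == b"\x1b$@");

    if bytes.is_ascii() && has_jis_escape {
        "iso-2022-jp"
    } else if std::str::from_utf8(bytes).is_ok() {
        "utf-8"
    } else {
        // Assumed last resort: a single-byte encoding that accepts any input.
        "iso-8859-1"
    }
}
```

Plain ASCII input without escape sequences still falls through to the UTF-8 branch, so only pages that contain the escape sequences are affected by the reordering.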
Yorhel
eac4b6ac77 Don't index ELF binaries + remove some non-man-pages 2016-12-18 16:35:25 +01:00
Yorhel
d153004532 indexer: Support FreeBSD 9.3+; remove now obsolete add_index.pl 2016-12-18 15:08:56 +01:00
Yorhel
b9764fce4a indexer: Remove openssl + replace siphash with sha1 in cache filename
HTTPS isn't used, so removing it saves some space.

The std SipHash API has been deprecated, and since hashing performance
isn't exactly critical in this case I've replaced it with SHA1, which
was already being used in man.rs.
2016-12-11 13:41:10 +01:00
Yorhel
defaa032f8 indexer: Support for indexing FreeBSD <9.3 repositories 2016-12-11 10:59:54 +01:00
Yorhel
1ca0cd4325 Indexer: Remove pointless check 2016-11-27 10:59:31 +01:00
Yorhel
b79ecfb284 indexer: Fix bug in Contents file parsing + decrease cron verbosity
Turns out that not all Contents files have a header.
2016-11-27 10:48:35 +01:00
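A minimal sketch of header-tolerant parsing, assuming the usual Debian Contents layout: an optional free-text header terminated by a "FILE ... LOCATION" line, followed by entries of the form "path  section/package[,section/package...]". The function name `parse_contents` and the header test are illustrative assumptions, not the code from this commit.

```rust
/// Hypothetical parser: returns (path, packages) pairs from the text
/// of a Contents file, with or without a leading header.
fn parse_contents(text: &str) -> Vec<(&str, Vec<&str>)> {
    // If a "FILE ... LOCATION" line exists, entries start after it;
    // otherwise the file has no header and entries start at line 0.
    let start = text
        .lines()
        .position(|l| l.starts_with("FILE") && l.trim_end().ends_with("LOCATION"))
        .map_or(0, |i| i + 1);

    text.lines()
        .skip(start)
        .filter_map(|line| {
            // Paths may contain spaces, so split at the last whitespace:
            // the final field is the comma-separated package list.
            let (path, pkgs) = line.trim_end().rsplit_once(char::is_whitespace)?;
            Some((path.trim_end(), pkgs.split(',').collect()))
        })
        .collect()
}
```

Lines without any whitespace (including empty lines) are skipped rather than treated as errors, which is a design choice of this sketch.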
Yorhel
eb15b6e2c7 indexer: Improve Debian Contents file parsing performance by 5.2x
Further improvements can be gained by caching the results of
get_contents(), since the same Contents file is often parsed multiple
times in a single cron run. But this is already a significant
achievement.
2016-11-26 16:57:05 +01:00
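The caching idea mentioned above could be sketched as a small memoizing wrapper. Only the name get_contents() comes from the commit message; the `ContentsCache` type, its key (assumed to be the Contents file URL or path), and the `get_or_parse` method are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical cache of parsed Contents files, keyed by URL or path.
struct ContentsCache<T> {
    entries: HashMap<String, T>,
}

impl<T> ContentsCache<T> {
    fn new() -> Self {
        ContentsCache { entries: HashMap::new() }
    }

    /// Returns the cached value for `key`. The `parse` closure (standing
    /// in for a get_contents() call) runs only the first time a key is seen.
    fn get_or_parse<F: FnOnce() -> T>(&mut self, key: &str, parse: F) -> &T {
        self.entries.entry(key.to_string()).or_insert_with(parse)
    }
}
```

A cache like this would live for the duration of a single cron run, so repeated requests for the same Contents file pay the parsing cost once.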
Yorhel
de28175cd3 Misc. indexing fixes 2016-11-20 16:41:08 +01:00
Yorhel
46a6e2ff7c Use Rust indexer for Ubuntu + script cleanup 2016-11-20 15:01:22 +01:00
Yorhel
2ee2f7495b Reorganize indexing scripts + use Rust for Debian 2016-11-20 12:34:02 +01:00
Yorhel
5d44d0e2ec Indexer: Add --dryrun and workarounds for old deb repos 2016-11-20 11:39:00 +01:00
Yorhel
ecb1a9e25b Indexer: Support reading date from .deb archives 2016-11-20 09:01:33 +01:00
Yorhel
a1e5a2d80d Indexer: Improve logging + cache management 2016-11-20 07:31:55 +01:00
Yorhel
4bdd91f65e Indexer: Initial support for debian repos 2016-11-19 15:27:24 +01:00
Yorhel
50fe17a604 Indexer: Support .deb archives 2016-11-15 21:15:35 +01:00
Yorhel
1f05463c3a About page: Remove TOC feature as planned 2016-11-09 19:01:24 +01:00
Yorhel
aa01365e60 Move nav menu a bit up to create space
This is where the old nav menu used to be. This involved shrinking the
width of the locations/versions selector, but that never needed the full
page width anyway. Unfortunately I suck at CSS so the nav menu and
selector thing won't look too great on smaller screen sizes; but that's
just a minor visual ugliness.
2016-11-09 18:58:34 +01:00
Yorhel
09af881767 Add TOC listing + move section/lang select back into a nav menu 2016-11-09 18:43:10 +01:00
Yorhel
20141aa980 indexer: Improve charset detection + lower file cache time 2016-11-09 18:41:53 +01:00
Yorhel
7d2abfb3a4 indexer: Fix storing locale as NULL when empty
Perhaps it's better to get rid of NULL and make empty the default value.
But for now this'll do.
2016-11-06 16:24:45 +01:00
Yorhel
cb81bedac1 Add arch/encoding metadata to DB + Fetch Arch Linux x86_64
The encoding metadata will be very useful in finding badly decoded man
pages. The package 'arch' is necessary to properly identify which
package was used, which is not obvious now that I'm going to switch more
systems to the (more common) x86_64 arch.
2016-11-06 16:05:16 +01:00
Yorhel
b8a1945d38 Merge branch 'indexer' 2016-11-06 15:26:42 +01:00
Yorhel
5e39af459f Replace old Arch Linux scripts with new indexer 2016-11-06 15:26:20 +01:00
Yorhel
1ca43665a1 indexer: Add file caching + Arch Linux indexing 2016-11-06 13:34:22 +01:00
Yorhel
35fab522d6 Indexer: Support HTTP fetching + misc improvements 2016-11-06 09:21:53 +01:00
Yorhel
aff68205b0 Add postgres package indexing + cli options 2016-11-05 10:22:31 +01:00
Yorhel
0cab758665 Add support for man page reading & decoding 2016-10-30 11:06:14 +01:00
Yorhel
c8bb4da246 Use libarchive3-sys crate directly + improve archread API
This all should offer a more convenient and robust interface to handle
all sorts of archives.
2016-10-29 09:33:39 +02:00
Yorhel
9db73b2709 Fix CSS I accidentally removed 2016-10-26 21:31:47 +02:00
Yorhel
863fae2476 Add link to manpag.es 2016-10-26 19:27:57 +02:00
Yorhel
25a39c6fe4 Improved pagination on package info pages 2016-10-26 19:25:23 +02:00
Yorhel
022e9acc4f WIP: Rewritten man page indexer in Rust
Currently just figuring out how to read archives. Turns out to not be as
simple as I had expected.
2016-10-22 14:54:37 +02:00
Yorhel
965aa9a2f6 Add Ubuntu 16.10 2016-10-19 07:30:49 +02:00
Yorhel
7535218a06 Add FreeBSD 11.0 2016-10-18 07:09:27 +02:00
Yorhel
a7352d27b9 Fix possible wrapping of MANNEDINCLUDE by removing space
This doesn't really guarantee that it won't wrap, but fixes at least one
man page.

- https://manned.org/BlockSelectionDCOPInterface/6dfdf921
2016-10-16 10:28:44 +02:00
Yorhel
5436435c3f Improve handling of man names with special characters
The 'source' link was broken for mans with [ or ] characters.
All links were broken for mans with space characters.

Man page of the week:
https://manned.org/KGenericFactory_%20KTypeList_%20Product,%20ProductListTail%20_,%20KTypeList_%20ParentType,%20ParentTypeListTail%20_%20_/dfc33ca6

There are 5 man pages left with a '%' or '#' character. I've no idea if
it's worth handling those; a fix for these isn't going to be as trivial
as this commit.
2016-10-16 10:19:27 +02:00
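For illustration, escaping a man page name for use in a URL path can be sketched as below. This is an assumption about the general technique, not the site's code; the function name `encode_segment` is hypothetical, and the sketch is stricter than the link above, which leaves commas unescaped.

```rust
/// Hypothetical helper: percent-encodes every byte outside the
/// RFC 3986 "unreserved" set, so spaces, '[', ']', '%' and '#'
/// cannot be misread as URL syntax.
fn encode_segment(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    for &b in name.as_bytes() {
        match b {
            // Unreserved characters are copied through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(b as char)
            }
            // Everything else becomes %XX with uppercase hex digits.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}
```

Encoding '%' itself as %25 is what makes the round trip unambiguous, which is the part that names containing '%' or '#' depend on.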
Yorhel
8a0fac08b6 DB cleanup: Remove some non-manpages & fix wrongly-detected locales 2016-10-16 10:03:34 +02:00
Yorhel
17fc298217 Fix handling of URLs ending in a ⟩
I've known about this issue before, but didn't realize it was so
widespread. This fixes many links.
2016-10-16 09:11:15 +02:00
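One plausible reading of this fix is that a detected URL should not swallow the closing bracket that follows it in the rendered page. A minimal sketch under that assumption follows; the function name `trim_url` and the exact set of trimmed characters are hypothetical.

```rust
/// Hypothetical helper: drops trailing closing angle brackets from
/// a URL candidate found in rendered man page text.
fn trim_url(url: &str) -> &str {
    // '\u{27E9}' is the mathematical right angle bracket (⟩).
    url.trim_end_matches(|c: char| c == '\u{27E9}' || c == '>')
}
```

Only trailing characters are removed, so a bracket that appears inside the URL is left alone.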
Yorhel
7d31f41ba8 Add FreeBSD 10.3 2016-10-15 22:37:58 +02:00
Yorhel
4affcec7c3 Homepage: Add "less" button after clicking on "more" 2016-10-15 16:55:23 +02:00
Yorhel
6740dc2546 A few more links to other man page sites 2016-10-15 16:46:03 +02:00
Yorhel
44df29ea18 Fix 404 on /(pkg|man)/<hash> 2016-10-15 16:06:18 +02:00
Yorhel
20daba820f Complete revamp of navigation menu on man pages
This removes the navigation menu on the right, leaving more space for
the actual contents. Instead, there are now a few links/tabs at the top
of the page. There's also a 'permalink' now.

The previous navigation combined the selection of man page versions,
translations and sections in a single menu. While handy in some cases,
in most cases it was just slow and messy. It also didn't scale very
well: some man pages have so many versions that it significantly
affected the page load time.

The 'locations' table has now also been moved into a tab and is loaded
asynchronously as well, for the same performance reasons.

I had hoped that this new navigation would be much easier and more
convenient, but honestly, it's still a mess. At least the new code is
more maintainable, so perhaps I'll be able to make some incremental
improvements in the future.
2016-10-15 16:06:18 +02:00
Yorhel
3f40896679 Add FreeBSD 10.2 2016-10-14 08:09:53 +02:00
Yorhel
c04e6b3b6a Add FreeBSD 10.1 2016-10-12 17:02:37 +02:00
Yorhel
1106b0c08d Add FreeBSD 10.0 2016-10-10 17:19:08 +02:00
Yorhel
b7328cc039 Reorganize links on homepage a bit 2016-10-09 11:34:55 +02:00