- Fix segfault on empty output (bug was in XS code)
- Still better end-of-URL detection
- Recognize a few common multicharacter sections in man references
- Grotty escape sequences are now better interpreted. I feel rather
stupid for not realizing the idea behind how those codes are supposed
to work earlier. It finally hit me when I read the BSD ul(1) source
code.
- URL end detection is slightly better (much better than the old C code)
- Man page references with : are recognized now (common in Perl modules).
- More efficient HTML escaping, no need to escape > and ".
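To illustrate the HTML escaping point above, here is a minimal sketch, not the actual code from this repository. The function name escape_html is hypothetical, and it assumes the escaped text only ever ends up in element content. Inside a double-quoted attribute value the " character would still need escaping.

```rust
/// Escape text for use inside HTML element content (hypothetical helper).
/// Only `&` and `<` are replaced; `>` and `"` are passed through as-is,
/// which is safe as long as the result is never placed in an attribute value.
fn escape_html(input: &str) -> String {
    // Reserve at least the input length; escaped output can only be longer.
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            _ => out.push(ch),
        }
    }
    out
}
```

With only two characters to check, text that contains neither of them is copied through unchanged.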
There's still a bunch of improvements to make, but I have much more
confidence in the current implementation already.
The previous C code was troublesome.
- Didn't handle long lines
- I couldn't convince myself that it was free of memory safety issues
- Needed improving anyway, there are some formatting bugs. These are
hard to fix in the current code.
I mostly replicated the formatting bugs of the old C implementation in
Rust, and possibly added a few new bugs as well. It's not a significant
improvement right now; more testing and fixing will be needed.
The performance of both implementations is comparable, with the Rust
version being slightly faster in many cases (and slower in some others).
I did spend more time trying to optimize this Rust version than I did
with the old C code. I initially tried a naive-ish conversion of the C
code to Rust, but that turned out to be much slower and I had to resort
to using regexes and different data structures to fix that.
HTTPS isn't used, so removing it saves some space.
The std SipHash API has been deprecated, and since hashing performance
isn't exactly critical in this case I've replaced it with SHA1, which
was already being used in man.rs.
Further improvements can be gained by caching the results of
get_contents(), since the same Contents file is often parsed multiple
times in a single cron run. But this is already a significant
achievement.
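The caching idea mentioned above could look roughly like the sketch below. This is an assumption about the shape of such a cache, not the real implementation. ContentsCache and get_or_parse are hypothetical names, the key is assumed to be whatever identifies a Contents file (for example its URL), and the closure stands in for the actual get_contents() call, whose signature is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory cache for parsed Contents files, meant to live
/// for the duration of a single cron run.
struct ContentsCache<T> {
    entries: HashMap<String, T>,
}

impl<T> ContentsCache<T> {
    fn new() -> Self {
        ContentsCache { entries: HashMap::new() }
    }

    /// Return the cached value for `key`, calling `parse` only on a miss.
    /// `parse` is a placeholder for the expensive fetch-and-parse step.
    fn get_or_parse<F>(&mut self, key: &str, parse: F) -> &T
    where
        F: FnOnce() -> T,
    {
        self.entries.entry(key.to_string()).or_insert_with(parse)
    }
}
```

A caller would create one ContentsCache per run and route every lookup through get_or_parse, so each Contents file is parsed at most once. Error handling is left out; a real version would likely need to deal with a failing parse step.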
This is where the old nav menu used to be. This involved shrinking the
width of the locations/versions selector, but that never needed the full
page width anyway. Unfortunately I suck at CSS so the nav menu and
selector thing won't look too great on smaller screen sizes; but that's
just a minor visual ugliness.
The encoding metadata will be very useful in finding badly decoded man
pages. The package 'arch' is necessary to properly identify which
package was used, which is not obvious now that I'm going to switch more
systems to the (more common) x86_64 arch.
This removes the navigation menu on the right, leaving more space for
the actual contents. Instead, there are now a few links/tabs at the top
of the page. There's also a 'permalink' now.
The previous navigation combined the selection of man page versions,
translations and sections in a single menu. While handy in some cases,
in most cases it was just slow and messy. It also didn't scale very
well: some man pages have so many versions that it significantly
affected the page load time.
The 'locations' table has now also been moved into a tab and is loaded
asynchronously as well, for the same performance reasons.
I had hoped that this new navigation would be much easier and more
convenient, but honestly, it's still a mess. At least the new code is
more maintainable, so perhaps I'll be able to make some incremental
improvements in the future.