The encoding metadata will be very useful in finding badly decoded man
pages. The package 'arch' is necessary to properly identify which
package a man page came from, which is no longer obvious now that I'm
going to switch more systems to the (more common) x86_64 arch.
Man selection has to be performed over several thousand rows in some
cases. Loading all of those into Perl and then doing the selection there
isn't very efficient[1]. The getman() implementation was also buggy: the
comparison function used to determine which man page should be preferred
was not associative[2], so the result depended on the order in which the
man pages were compared. This led to wrong selections in some cases.
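To illustrate the kind of bug described above, here is a minimal sketch
in Python (not the actual Perl or SQL code). The Page fields and both
preference rules are hypothetical and were chosen only to make the
problem visible: folding a non-associative pairwise "prefer" function
over the candidates gives a different winner depending on input order,
while ranking every candidate by a single sort key does not.

```python
from functools import reduce
from itertools import permutations
from typing import NamedTuple


class Page(NamedTuple):
    # Hypothetical fields, not the real database schema.
    system: str
    section: int
    released: int


def prefer(a: Page, b: Page) -> Page:
    """Pairwise preference with mixed criteria (hypothetical rules).

    Same system: prefer the lower section. Different systems: prefer the
    newer release. Mixing criteria like this can produce cycles.
    """
    if a.system == b.system:
        return a if a.section <= b.section else b
    return a if a.released >= b.released else b


def select_pairwise(pages: list[Page]) -> Page:
    # Order-dependent: the winner depends on the comparison order.
    return reduce(prefer, pages)


def select_by_key(pages: list[Page]) -> Page:
    # Order-independent: one total ordering applied to all candidates.
    return min(pages, key=lambda p: (p.section, -p.released, p.system))


A = Page("arch", 1, 2010)
B = Page("arch", 3, 2012)
C = Page("debian", 2, 2011)

# A beats B, B beats C, and C beats A, so the fold has no stable winner.
pairwise_results = {select_pairwise(list(p)) for p in permutations([A, B, C])}
keyed_results = {select_by_key(list(p)) for p in permutations([A, B, C])}

print(len(pairwise_results), len(keyed_results))
```

Ranking by a single key is roughly what an ORDER BY over all candidates
does in SQL, which is one way a query can avoid the order dependence.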
While I was at it, I also made the selection more strict:
- /man/unknown-hash would previously ignore the hash and select a man
page anyway. Now it results in a 404.
- Same with /man.unknown-section.
- /man.section/hash is now disallowed; it's either /man.section or
/man/hash.
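The stricter rules listed above could be sketched roughly as follows.
This is an illustration in Python, not the real route handler: the
resolve() function, the Page fields, the exact-match semantics for
section and hash, and the tie-break are all assumptions. Returning None
stands in for a 404.

```python
from typing import NamedTuple, Optional


class Page(NamedTuple):
    # Hypothetical fields, not the real database schema.
    name: str
    section: str
    hash: str


def resolve(pages: list[Page], name: str,
            section: Optional[str] = None,
            page_hash: Optional[str] = None) -> Optional[Page]:
    """Return the selected page, or None where the site would send a 404."""
    # /man.section/hash: combining both selectors is disallowed.
    if section is not None and page_hash is not None:
        return None

    candidates = [p for p in pages if p.name == name]
    if page_hash is not None:
        # /man/hash: an unknown hash no longer falls back to another page.
        candidates = [p for p in candidates if p.hash == page_hash]
    if section is not None:
        # /man.section: an unknown section no longer falls back either.
        candidates = [p for p in candidates if p.section == section]

    if not candidates:
        return None
    # Placeholder tie-break; the real preference rules are not shown here.
    return min(candidates, key=lambda p: (p.section, p.hash))


pages = [
    Page("ls", "1", "aaaa1111"),
    Page("ls", "1p", "bbbb2222"),
]

print(resolve(pages, "ls"))                        # some "ls" page
print(resolve(pages, "ls", page_hash="deadbeef"))  # None: unknown hash
print(resolve(pages, "ls", section="9"))           # None: unknown section
```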
[1] Note that all possible man pages are currently still loaded into Perl
anyway, because the ugly navigation menu on the right needs them. I plan
to revamp that entire menu to be more efficient and usable.
[2] Initially I wrote the SQL implementation in a similar fashion to the
Perl implementation, and ended up with the same bug. I wasted more than
a day before I finally got to the current CTE query.