Util: Add to_bool() and use it for JSON, Pg & query encoding

To improve interop with legacy modules.
Yorhel 2025-02-25 09:10:03 +01:00
parent 06e2f950fe
commit c7a3415485
10 changed files with 141 additions and 37 deletions

@@ -8,6 +8,7 @@ use POSIX ();
use experimental 'builtin';
our @EXPORT_OK = qw/
+to_bool
json_format json_parse
utf8_decode uri_escape uri_unescape
query_decode query_encode
@@ -52,12 +53,12 @@ sub query_encode :prototype($) ($o) {
my($k, $v) = ($_, $o->{$_});
$k = uri_escape $k;
map {
-my $a = $_;
-$a = $a->TO_QUERY() if builtin::blessed($a) && $a->can('TO_QUERY');
-!defined $a || (builtin::is_bool($a) && !$a)
-? ()
-: builtin::is_bool($a) ? $k
-: $k.'='.uri_escape($a)
+my $x = $_;
+$x = $x->TO_QUERY() if builtin::blessed($x) && $x->can('TO_QUERY');
+my $bool = to_bool($x);
+!defined $x || !($bool//1) ? ()
+: $bool ? $k
+: $k.'='.uri_escape($x)
} ref $v eq 'ARRAY' ? @$v : ($v);
} sort keys %$o;
}
@@ -75,7 +76,7 @@ sub httpdate_format :prototype($) ($time) {
}
sub httpdate_parse :prototype($) ($str) {
-return if $str !~ /^$httpdays, ([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{4}) ([0-9]{2}):([0-9]{2}):([0-9]{2}) GMT$/;
+return if $str !~ /^\s*$httpdays, ([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{4}) ([0-9]{2}):([0-9]{2}):([0-9]{2}) GMT\s*$/;
my ($mday, $mon, $year, $hour, $min, $sec) = ($1, $httpmonths{$2}, $3, $4, $5, $6);
return if !defined $mon;
# mktime() interprets the broken down time as our local timezone,
@@ -105,6 +106,30 @@ doesn't believe in the concept of a "batteries included" standard library.
=head1 DESCRIPTION
+=head2 Boolean Stuff
+Perl has had a builtin boolean type since version 5.36 and FU uses that where
+appropriate, but there's still a lot of older code out there using different
+conventions. The following function should help when interacting with older
+code and provide a gradual migration path to the new builtin booleans.
+=over
+=item to_bool($val)
+Returns C<undef> if C<$val> is not likely to be a distinct boolean type,
+otherwise it returns a normalized C<builtin::true> or C<builtin::false>.
+This function recognizes the builtin booleans, C<\0>, C<\1>,
+L<Types::Serialiser> (which is used by L<JSON::XS>, L<JSON::SIMD>, L<CBOR::XS>
+and others), L<JSON::PP> (also used by L<Cpanel::JSON::XS> and others),
+L<JSON::Tiny> and L<Mojo::JSON>.
+This function is ambiguous in contexts where a bare scalar reference is a valid
+value for C<$val>, due to C<\0> and C<\1> being considered booleans.
+=back
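To make the three-way result of C<to_bool()> concrete, here is a rough pure-Perl sketch of the documented behaviour. It is not the code added by this commit: C<sketch_to_bool> is a hypothetical name, and the class names in C<%BOOL_CLASS> are assumptions about how the listed modules bless their boolean objects. On perls older than 5.36 the builtin-boolean branch is simply skipped.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# ASSUMPTION: classes that legacy JSON modules bless their booleans into.
# Illustrative only; FU::Util may check a different set.
my %BOOL_CLASS = map { $_ => 1 } qw(
    JSON::PP::Boolean
    Types::Serialiser::Boolean
    JSON::Tiny::_Bool
    Mojo::JSON::_Bool
);

# Hypothetical stand-in for to_bool(): returns undef for "not a boolean",
# otherwise a normalized true/false value.
sub sketch_to_bool {
    my ($val) = @_;
    return undef if !defined $val;

    if (!ref $val) {
        # Builtin booleans need builtin::is_bool (Perl 5.36+). Without it,
        # plain scalars are never treated as booleans.
        my $is_bool = builtin->can('is_bool') or return undef;
        no warnings;    # is_bool still warns as experimental on 5.36/5.38
        return $is_bool->($val) ? !!$val : undef;
    }

    # Only scalar references can be \0, \1 or one of the boolean objects.
    return undef if (reftype($val) // '') ne 'SCALAR';
    my $inner = $$val;

    if (defined(my $class = blessed $val)) {
        return $BOOL_CLASS{$class} ? !!$inner : undef;
    }

    # Unblessed reference: accept exactly \0 and \1.
    return undef if !defined $inner;
    return $inner eq '1' ? !!1 : $inner eq '0' ? !!0 : undef;
}
```

The unblessed branch is also where the ambiguity described above comes from: a caller that passes C<\1> as an ordinary scalar reference gets it back as a boolean.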
=head2 JSON parsing & formatting
This module comes with a custom C-based JSON parser and formatter. These
@@ -112,10 +137,9 @@ functions conform strictly to L<RFC-8259|https://tools.ietf.org/html/rfc8259>,
non-standard extensions are not supported and never will be. It also happens to
be pretty fast, refer to L<FU::Benchmarks> for some numbers.
-JSON booleans are parsed into C<builtin::true> and C<builtin::false>. When
-formatting, those builtin constants are the I<only> recognized boolean values -
-alternative representations such as C<JSON::PP::true> and C<JSON::PP::false>
-are not recognized and attempting to format such values will croak.
+JSON booleans are parsed into C<builtin::true> and C<builtin::false>. In the
+other direction, the C<to_bool()> function above is used to recognize which
+values to represent as JSON boolean.
JSON numbers that are too large to fit into a Perl integer are parsed into a
floating point value instead. This obviously loses precision, but is consistent
@@ -230,11 +254,11 @@ Maximum permitted nesting depth of Perl values. Defaults to 512.
(Why the hell yet another JSON codec when CPAN is already full of them!? Well,
L<JSON::XS> is pretty cool but isn't going to be updated to support Perl's new
builtin booleans. L<JSON::PP> is slow and while L<Cpanel::JSON::XS> is
-perfectly adequate, its codebase is too large and messy for my taste - too many
-unnecessary features and C<#ifdef>s to support ancient perls and esoteric
-configurations. Still, if you need anything not provided by these functions,
-L<JSON::PP> and L<Cpanel::JSON::XS> are perfectly fine alternatives.
-L<JSON::SIMD> and L<Mojo::JSON> also look like good and maintained candidates.)
+perfectly adequate, its codebase is way too large and messy for what I need -
+it has too many unnecessary features and C<#ifdef>s to support ancient perls
+and esoteric configurations. Still, if you need anything not provided by these
+functions, L<JSON::PP> and L<Cpanel::JSON::XS> are perfectly fine alternatives.
+L<JSON::SIMD> and L<JSON::Tiny> also look like good and maintained candidates.)
=head2 URI-Related Functions
@@ -289,7 +313,7 @@ characters, as per C<utf8_decode>.
=item query_encode($hashref)
The opposite of C<query_decode>. Takes a hashref of similar structure and
-returns an ASCII-encoded query string. Keys with C<undef> or C<builtin::false>
+returns an ASCII-encoded query string. Keys with C<undef> or C<to_bool()> false
values are omitted in the output.
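The omission rule can be illustrated with a small standalone sketch that mirrors the C<map> logic in the C<query_encode> hunk above. All C<sketch_*> names are hypothetical. The boolean check is deliberately reduced to C<\0> and C<\1>, the C<&> separator is an assumption (the joining code is outside the shown hunk), and C<TO_QUERY()> handling is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every byte outside the RFC 3986
# "unreserved" set.
sub sketch_uri_escape {
    my ($str) = @_;
    utf8::encode($str);    # operate on UTF-8 bytes
    $str =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $str;
}

# Simplified boolean check: only \0 and \1 count. The documented
# to_bool() recognizes more representations than this.
sub sketch_is_bool_ref {
    my ($v) = @_;
    return undef unless ref $v eq 'SCALAR' && defined $$v;
    return $$v eq '1' ? 1 : $$v eq '0' ? 0 : undef;
}

sub sketch_query_encode {
    my ($o) = @_;
    return join '&', map {
        my $k = sketch_uri_escape($_);
        my $v = $o->{$_};
        map {
            my $bool = sketch_is_bool_ref($_);
            !defined $_ || !($bool // 1) ? ()     # undef or false: omit the key
          : $bool                        ? $k     # true: bare key, no value
          :                                "$k=" . sketch_uri_escape($_);
        } ref $v eq 'ARRAY' ? @$v : ($v);
    } sort keys %$o;
}

# Prints "debug&q=a%20b&tag=x&tag=y": "n" and "old" are omitted.
print sketch_query_encode({
    q => 'a b', debug => \1, old => \0, tag => ['x', 'y'], n => undef,
}), "\n";
```

The C<$bool // 1> step is what separates "not a boolean" (C<undef>, so the value is encoded normally) from an actual false boolean (the key is dropped).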
If a given value is a blessed object with a C<TO_QUERY()> method, that method
@@ -350,8 +374,8 @@ descriptor was received. The returned C<$message> is undef on error or an empty
string on EOF.
Like regular socket I/O, a single C<fdpass_send()> message may be split across
-multiple C<fdpass_recv()> calls; in that case the C<$fd> will only be received
-on the first call.
+multiple C<fdpass_recv()> calls; in that case the C<$fd> is only received on
+the first call.
Don't use this function if the sender may include multiple file descriptors in
a single message, weird things can happen. File descriptors received this way