% An Opinionated Survey of Functional Web Development

(Published on **2017-05-28**)

# Intro
TL;DR: In this article I provide an overview of the frameworks and libraries
available for creating websites in statically-typed functional programming
languages.
I recommend you now skip directly to the next section, but if you're interested
in some context and don't mind a rant, feel free to read on. :-)
**<Rant mode>**
When compared to native desktop application development, web development just
sucks. Native development is relatively simple with toolkits such as
[Qt](https://www.qt.io/), [GTK+](https://www.gtk.org/) and others: You have
convenient widget libraries, and you can describe your entire application, from
interface design to all behavioural aspects, in a single programming language.
You're also largely free to structure code in whichever way makes most sense.
You can describe what a certain input field looks like, what happens when the
user interacts with it and what will happen with the input data, all succinctly
in a single file. There are even drag-and-drop UI builders to speed up
development.
Web development is the exact opposite of that. There are several different
technologies you're forced to work with even when creating the most mundane
website, and there's a necessary but annoying split between code that runs on
the server and code that runs in the browser. Creating a simple input field
requires you to consider and maintain several ends:

- The back end (server-side code) that describes how the input field interacts
  with the database.
- Some JavaScript code to describe how the user can interact with the input
  field.
- Some CSS to describe what the input field looks like.
- And then there's HTML to act as a glue between the above.

In many web development setups, all four of the above technologies are
maintained in different files. If you want to add, remove or modify an input
field, or just about anything else on a page, you'll be editing at least four
different files with different syntax and meaning. I don't know how other
developers deal with this, but the only way I've been able to keep these places
synchronized is to just edit one or two places, test if it works in a browser,
and then edit the other places accordingly to fix whatever issues I find. This
doesn't always work well: I don't get a warning if I remove an HTML element
somewhere and forget to also remove the associated CSS. Heck, in larger
projects I can't even tell whether it's safe to remove or edit a certain line
of CSS because I have no way to know for sure that it's not still being used
elsewhere. Perhaps this particular case can be solved with proper organization
and discipline, but similar problems exist with the other technologies.
Yet despite that, why do I still create websites in my free time? Because it is
the only environment with high portability and low friction - after all, pretty
much anyone can browse the web. I would not have been able to create a useful
"[Visual Novel Database](https://vndb.org/)" any other way than through a
website. And the entire purpose of [Manned.org](https://manned.org/) is to
provide quick access to man pages from anywhere, which is not easily possible
with native applications.
**</Rant mode>**
Fortunately, I am not the only one who sees the problems with the "classic"
development strategy mentioned above. There are many existing attempts to
improve on that situation. A popular approach to simplify development is the
[Single-page
application](https://en.wikipedia.org/wiki/Single-page_application) (SPA). The
idea is to move as much code as possible to the front end, and keep only a
minimal back end. Both the HTML and the entire behaviour of the page can be
defined in the same language and same file. With libraries such as
[React](https://facebook.github.io/react/) and browser support for [Web
components](https://developer.mozilla.org/en-US/docs/Web/Web_Components), the
split between files described above can be largely eliminated. And if
JavaScript isn't your favorite language, there are many alternative languages
that compile to JavaScript. (See [The JavaScript
Minefield](http://walkercoderanger.com/blog/2014/02/javascript-minefield/) for
an excellent series of articles on that topic).
While that approach certainly has the potential to make web development more
pleasant, it has a very significant drawback: Performance. For some
applications, such as web based email clients or CRM systems, it can be
perfectly acceptable to have a megabyte of JavaScript as part of the initial
page load. But for most other sites, such as this one, or the two sites I
mentioned earlier, or sites like Wikipedia, a slow initial page load is
something I consider to be absolutely unacceptable. The web can be really fast,
and developer laziness is not a valid excuse to ruin it. (If you haven't seen
or read [The Website Obesity
Crisis](http://idlewords.com/talks/website_obesity.htm) yet, please do so now).
I'm much more interested in the opposite approach to SPA: Move as much code as
possible to the back end, and only send a minimal amount of JavaScript to the
browser. This is arguably how web development has always been done in the past,
and there's little reason to deviate from it. The difference, however, is that
people tend to expect much more "interactivity" from web sites nowadays, so the
amount of JavaScript is increasing. And that is alright, so long as the
JavaScript doesn't prevent the initial page from loading quickly. But this
increase in JavaScript does amplify the "multiple files" problem I ranted about
earlier.
So my ideal solution is a framework where I can describe all aspects of a site
in a single language, and organize the code among files in a way that makes
sense to me. That is, I want the same kind of freedom that I get with native
desktop software development. Such a framework should run on the back end, and
automatically generate efficient JavaScript and, optionally, CSS for the front
end. As an additional requirement (or rather, strong preference), all this
should be in a statically-typed language - because I am seemingly incapable of
writing large reliable applications with dynamic typing - and in a language
from functional heritage - because programming in functional languages has
spoiled me.
I'm confident that what I describe is possible, and it's evident that I'm not
the only person to want this, as several (potential) solutions like this do
indeed exist. I've been looking around for these solutions and have
experimented with a few that looked promising. This article provides an
overview of what I have found so far.
# OCaml
My adventure began with [OCaml](https://ocaml.org/). It's been a few years
since I last used OCaml for anything, but development of the language and its
ecosystem has by no means halted. [Real World OCaml](https://realworldocaml.org/)
has been a great resource to get me up to speed again.
## Ocsigen
For OCaml there is one project that has it all: [Ocsigen](http://ocsigen.org/).
It comes with an OCaml to JavaScript compiler, a web server, several handy
libraries, and a [framework](http://ocsigen.org/eliom/) to put everything
together. Its [syntax
extension](http://ocsigen.org/eliom/6.2/manual/ppx-syntax) allows you to mix
front and back end code, and you can easily share code between both ends. The
final result is a binary that runs the server and a JavaScript file that
handles everything on the client side.
The framework comes with an embedded DSL with which you can conveniently
generate HTML without actually typing HTML. And best of all, this DSL works on
both the client and the server: On the server side it generates an HTML string
that can be sent to the client, and running the same code on the client side
will result in a DOM element that is ready to be used.
Ocsigen makes heavy use of the OCaml type system to statically guarantee the
correctness of various aspects of the application. The HTML DSL not only
ensures that the generated HTML is well-formed, but also prevents you from
incorrectly nesting certain elements and using the wrong attributes on the
wrong elements. Similarly, an HTML element generated on the server side can be
referenced from client side code without having to manually assign a unique ID
to the element. This prevents accidental typos in the ID naming and guarantees
that the element that the client side code refers to actually exists. URL
routing and links to internal pages are also checked at compile time.
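
The nesting guarantee can be illustrated with phantom types. The following is a minimal sketch, written in Haskell to match the later sections rather than in OCaml. It is not Eliom's actual API: the names `Node`, `Block`, `Inline`, `wrap` and the element functions are all made up for illustration, and the content model is reduced to two categories.

```haskell
-- Hypothetical mini-DSL; the names are illustrative, not Eliom's API.

-- Phantom tags for the two content categories distinguished here.
data Block
data Inline

-- An HTML fragment, tagged at the type level with its content category.
newtype Node kind = Node String

render :: Node kind -> String
render (Node s) = s

-- Text is inline content; escape the characters that matter in HTML.
text :: String -> Node Inline
text = Node . concatMap escape
  where
    escape '<' = "&lt;"
    escape '>' = "&gt;"
    escape '&' = "&amp;"
    escape c   = [c]

-- Internal helper: wrap rendered children in an element.
wrap :: String -> [Node a] -> Node b
wrap tag kids =
  Node ("<" ++ tag ++ ">" ++ concatMap render kids ++ "</" ++ tag ++ ">")

-- <em> is inline and may only contain inline content.
em :: [Node Inline] -> Node Inline
em = wrap "em"

-- <p> is a block element that may only contain inline content.
p :: [Node Inline] -> Node Block
p = wrap "p"

-- <div> is a block element that contains other block elements.
div_ :: [Node Block] -> Node Block
div_ = wrap "div"

page :: Node Block
page = div_ [p [text "1 < 2, ", em [text "obviously"]]]

-- bad = p [div_ []]   -- rejected by the type checker: Block is not Inline

main :: IO ()
main = putStrLn (render page)
```

Running this prints `<div><p>1 &lt; 2, <em>obviously</em></p></div>`, while uncommenting `bad` turns a nesting mistake into a compile error. A real library needs a far richer content model, which is part of why the resulting type signatures get hard to read.
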
Ocsigen almost exactly matches what I previously described as the perfect
development framework. Unfortunately, it has a few drawbacks:

- The generated JavaScript is quite large, a bit over 400 KiB for a hello
world. In my brief experience with the framework, this also results in a
noticeably slower page load. I don't know if it was done for performance
reasons, but subsequent page views are by default performed via in-browser
XHR requests, which do not require all the JavaScript to be re-parsed and
evaluated, and are thus much faster. This, however, doesn't work well if the
user opens pages in multiple tabs or performs a page reload for whatever
reason. And as I mentioned, I care a lot about the initial page loading time.
- The framework has a steep learning curve, and the available documentation is
nowhere near complete enough to help you. I've found myself wondering many
times how I was supposed to use a certain API and have had to look for
example code for enlightenment. At some point I ended up just reading the
source code instead of going for the documentation. What doesn't help here is
that, because of the heavy use of the type system to ensure code correctness,
most of the function signatures are far from intuitive and are sometimes very
hard to interpret. This problem is made even worse with the generally
unhelpful error messages from the compiler. (A few months with
[Rust](https://www.rust-lang.org/) and its excellent error messages have
really spoiled me in this respect, I suppose).
- I believe they went a bit too far with the compile-time verification of
certain correctness properties. Apart from making the framework harder to
learn, it also increases the verbosity of the code and removes a lot of
flexibility. For instance, in order for internal links to be checked, you
have to declare your URLs (or _services_, as they call them) somewhere central
so that the view part of your application can access them. Then elsewhere you
have to register a handler for each service. This adds boilerplate and
enforces a certain code structure. And the gain of all this is, in my
opinion, pretty small: In the 15 years that I have been building web sites, I
don't remember a single occurrence where I mistyped the URL in an internal
link. I do suppose that this feature makes it easy to change URLs without
causing breakage, but there is a trivial counter-argument to that: [Cool URIs
don't change](https://www.w3.org/Provider/Style/URI.html). (Also, somewhat
ironically, I have found more dead internal links on the Ocsigen website than
on any other site I have visited in the past year, so perhaps this was indeed
a problem they considered worth fixing. Too bad it didn't seem to work out so
well for them).

Despite these drawbacks, I am really impressed with what the Ocsigen project
has achieved, and it has set a high bar for the future frameworks that I will
be considering.
# Haskell
I have always seen Haskell as that potentially awesome language that I just
can't seem to wrap my head around, despite several attempts in the past to
learn it. Apparently the only thing I was missing in those attempts was a
proper goal: When I finally started playing around with some web frameworks I
actually managed to get productive in Haskell with relative ease. What also
helped me this time was a practical introductory Haskell reference, [What I
Wish I Knew When Learning Haskell](http://dev.stephendiehl.com/hask/), in
addition to the more theoretical [Learn You A Haskell for Great
Good](http://learnyouahaskell.com/).
Haskell itself already has a few advantages when compared to OCaml: For one, it
has a larger ecosystem, so for any task you can think of there is probably
already at least one existing library. As an example, I was unable to find an
actively maintained SQL DSL for OCaml, while there are several available for
Haskell. Another advantage that I found was the friendlier and more
detailed error messages generated by the Haskell compiler, GHC. In terms of
build systems, Haskell has standardized on
[Cabal](https://www.haskell.org/cabal/), which works alright most of the time.
Packaging is still often complex and messy, but it's certainly improving as
[Stack](http://haskellstack.org/) is gaining more widespread adoption. Finally,
I feel that the Haskell syntax is slightly less verbose, and more easily lends
itself to convenient DSLs.
Despite Haskell's larger web development community, I could not find a single
complete and integrated client/server development framework such as Ocsigen.
Instead, there are a whole bunch of different projects focussing on either the
back end or the front end. I'll explore some of them with the idea that,
perhaps, it's possible to mix and match different libraries and frameworks in
order to get the perfect development environment. And indeed, this seems to be
a common approach in many Haskell projects.
## Server-side
Let's start with a few back end frameworks.
Scotty
: [Scotty](https://github.com/scotty-web/scotty) is a web framework inspired by
[Sinatra](http://www.sinatrarb.com/). I have no experience with (web)
development in Ruby and have never used Sinatra, but it has some similarities
to what I have been using for a long time: [TUWF](https://dev.yorhel.nl/tuwf).
Scotty is a very minimalist framework: it does routing (that is, mapping URLs
to Haskell functions), it has some functions to access request data and some
functions to create and modify a response. That's it. No database handling,
session management, HTML generation, form handling or other niceties. But
that's alright, because there are many generic libraries to help you out there.
Thanks to its minimalism, I found Scotty to be very easy to learn and get used
to. Even as a Haskell newbie I had a simple website running within a day. The
documentation is adequate, but the idiomatic way of combining Scotty with
other libraries is through the use of Monad Transformers, and a few more
examples in this area would certainly have helped.
Spock
: Continuing with the Star Trek franchise, there's
[Spock](https://www.spock.li/). Spock is very similar to Scotty, but comes with
type-safe routing and various other goodies such as session and state
management, [CSRF](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
protection and database helpers.
As with everything that is (supposedly) more convenient, it also comes with a
slightly steeper learning curve. I haven't, for example, figured out yet how to
do regular expression based routing. I don't even know if that's still possible
in the latest version - the documentation isn't very clear. Likewise, it's
unclear to me what the session handling does exactly (Does it store something?
And where? Is there a timeout?) and how that interacts with CSRF protection.
Spock seems useful, but requires more than just a cursory glance.
Servant
: [Servant](http://haskell-servant.github.io/) is another minimalist web
framework, although it is primarily designed for creating RESTful APIs.
Servant distinguishes itself from Scotty and Spock by not only featuring
type-safe routing, it furthermore allows you to describe your complete public
API as a type, and get strongly typed responses for free. This also enables
support for automatically generated documentation and client-side API wrappers.
Servant would be an excellent back end for a SPA, but it does not seem like an
obvious approach to building regular websites.
Happstack / Snap / Yesod
: [Happstack](http://www.happstack.com/), [Yesod](http://www.yesodweb.com/) and
[Snap](http://snapframework.com/) are three large frameworks with many
auxiliary libraries. They all come with a core web server, routing, state and
database management. Many of the libraries are not specific to the framework
and can be used together with other frameworks. I won't go into a detailed
comparison between the three projects because I have no personal experience
with any of them, and fortunately [someone else already wrote a
comparison](http://softwaresimply.blogspot.nl/2012/04/hopefully-fair-and-useful-comparison-of.html)
in 2012 - though I don't know how accurate that still is today.
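
To make "routing" and "type-safe routing" a bit more concrete, here is a minimal sketch that uses only the Haskell base library. It does not use the API of Scotty, Spock or Servant. The `Route` type, the three example pages and all function names are assumptions made up for this illustration.

```haskell
import Data.Char (isDigit)

-- Hypothetical site with three pages; every valid URL has a constructor.
data Route
  = Home
  | Article Int    -- /article/<id>
  | Tag String     -- /tag/<name>
  deriving (Eq, Show)

-- Build the URL for a route. Internal links are created from Route values,
-- so a misspelled link is a compile error rather than a 404.
renderRoute :: Route -> String
renderRoute Home        = "/"
renderRoute (Article n) = "/article/" ++ show n
renderRoute (Tag t)     = "/tag/" ++ t

-- Split a path on '/' and drop empty segments.
segments :: String -> [String]
segments path = case break (== '/') path of
  (seg, [])       -> [seg | not (null seg)]
  (seg, _ : rest) -> [seg | not (null seg)] ++ segments rest

-- Map an incoming request path to a route, if one matches.
parseRoute :: String -> Maybe Route
parseRoute path = case segments path of
  [] -> Just Home
  ["article", n]
    | not (null n), all isDigit n -> Just (Article (read n))
  ["tag", t] -> Just (Tag t)
  _ -> Nothing

-- Handlers are ordinary functions over routes; GHC can warn about a
-- constructor that is not handled (-Wincomplete-patterns).
handle :: Route -> String
handle Home        = "Index, see " ++ renderRoute (Article 1)
handle (Article n) = "Article #" ++ show n
handle (Tag t)     = "Articles tagged " ++ t

dispatch :: String -> String
dispatch = maybe "404 Not Found" handle . parseRoute

main :: IO ()
main = mapM_ (putStrLn . dispatch) ["/", "/article/42", "/tag/haskell", "/nope"]
```

A minimalist framework in the style of Scotty roughly corresponds to `parseRoute` and `dispatch` alone, with patterns given as strings. The typed variants additionally tie link generation (`renderRoute`) to the same description of the URL space, at the cost of declaring the routes up front.
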
So there are a fair number of frameworks to choose from, and they can all work
together with other libraries to implement additional functions. Apart from the
framework, another important aspect of web development is how you generate the
HTML to send to the client. In true Haskell style, there are several answers.
For those who prefer embedded DSLs, there are
[xhtml](http://hackage.haskell.org/package/xhtml),
[BlazeHTML](https://jaspervdj.be/blaze/) and
[Lucid](https://github.com/chrisdone/lucid). The xhtml package is not being
used much nowadays and has been superseded by BlazeHTML, which is both faster
and offers a more readable DSL using Haskell's do-notation. Lucid is heavily
inspired by Blaze, and attempts to [fix several of its
shortcomings](http://chrisdone.com/posts/lucid). Having used Lucid a bit
myself, I can attest that it is easy to get started with and pretty convenient
in use.
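
The do-notation style of these DSLs can be approximated in a few lines. This is a minimal sketch assuming a simple writer-style monad; it is neither the BlazeHTML nor the Lucid API, and every name in it (`Html`, `raw`, `toHtml`, `element`, `h1_`, `ul_`, `li_`) is chosen only for illustration.

```haskell
-- A tiny writer-style monad: each action appends markup to the output.
-- All names are hypothetical; this is not the BlazeHTML or Lucid API.
newtype Html a = Html (a, ShowS)

instance Functor Html where
  fmap f (Html (a, out)) = Html (f a, out)

instance Applicative Html where
  pure a = Html (a, id)
  Html (f, o1) <*> Html (a, o2) = Html (f a, o1 . o2)

instance Monad Html where
  Html (a, o1) >>= k =
    let Html (b, o2) = k a
    in Html (b, o1 . o2)

renderHtml :: Html a -> String
renderHtml (Html (_, out)) = out ""

-- Emit markup verbatim.
raw :: String -> Html ()
raw s = Html ((), showString s)

-- Emit text with HTML special characters escaped.
toHtml :: String -> Html ()
toHtml = raw . concatMap escape
  where
    escape '<' = "&lt;"
    escape '>' = "&gt;"
    escape '&' = "&amp;"
    escape c   = [c]

-- Wrap the markup produced by the body in an element.
element :: String -> Html a -> Html a
element tag body = do
  raw ("<" ++ tag ++ ">")
  result <- body
  raw ("</" ++ tag ++ ">")
  pure result

h1_, ul_, li_ :: Html a -> Html a
h1_ = element "h1"
ul_ = element "ul"
li_ = element "li"

-- A page is ordinary Haskell code: loops, functions and arguments all work.
page :: [String] -> Html ()
page items = do
  h1_ (toHtml "Frameworks")
  ul_ (mapM_ (li_ . toHtml) items)

main :: IO ()
main = putStrLn (renderHtml (page ["Scotty", "Spock & friends"]))
```

This prints `<h1>Frameworks</h1><ul><li>Scotty</li><li>Spock &amp; friends</li></ul>`. The real libraries add attributes, efficient builders and proper text types on top of the same basic idea.
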
I definitely prefer to generate HTML using DSLs as that keeps the entire
application in a single host language and with consistent syntax, but the
alternative approach, templating, is also fully supported in Haskell. The Snap
framework comes with [Heist](https://github.com/snapframework/heist), which
provides run-time interpreted templates, much like the systems found in most
other languages.
Yesod comes with [Shakespeare](http://hackage.haskell.org/package/shakespeare),
which is a type-safe templating system with support for inlining the templates
in Haskell code. Interestingly, Shakespeare also has explicit support for
templating JavaScript code. Too bad that this doesn't take away the need to
write the JavaScript yourself, so I don't see how this is an improvement over
some other JavaScript solution that uses JSON for communication with the back
end.
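
The practical difference between run-time and compile-time templates shows up in a toy interpreter. The sketch below assumes a made-up `$name` placeholder syntax that is unrelated to Heist's or Shakespeare's actual template languages, and it skips HTML escaping. Its only point is that a missing variable is discovered when the template is rendered, not when the program is compiled.

```haskell
import Data.Char (isAlphaNum)

-- Expand $name placeholders using a list of bindings. The template is just
-- a string, so an unbound variable can only be reported at run time.
-- (Hypothetical syntax, and no HTML escaping, to keep the sketch short.)
renderTemplate :: [(String, String)] -> String -> Either String String
renderTemplate env = go
  where
    go [] = Right []
    go ('$' : rest) =
      let (name, after) = span isAlphaNum rest
      in case lookup name env of
           Just val -> (val ++) <$> go after
           Nothing  -> Left ("unbound template variable: " ++ name)
    go (c : rest) = (c :) <$> go rest

main :: IO ()
main = do
  let tpl = "<h1>$title</h1><p>By $author</p>"
  -- All variables bound: rendering succeeds.
  print (renderTemplate [("title", "Hello"), ("author", "me")] tpl)
  -- "author" is missing: the error only appears when this line runs.
  print (renderTemplate [("title", "Hello")] tpl)
```

A type-safe system moves the second case to compile time by checking the template against the variables in scope, which is the guarantee the embedded DSLs give for free.
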
## Client-side
It is rather unusual to have multiple compiler implementations targeting
JavaScript for the same source language, but Haskell has three of them. All
three can be used to write front end code without touching a single line of
JavaScript, but there are large philosophical differences between the three
projects.
Fay
: [Fay](https://github.com/faylang/fay/wiki) compiles Haskell code directly to
JavaScript. The main advantage of Fay is that it does not come with a large
runtime, resulting in small and efficient JavaScript. The main downside is that it
only [supports a subset of
Haskell](https://github.com/faylang/fay/wiki/What-is-not-supported?). The
result is a development environment that is very browser-friendly, but where
you can't share much code between the front and back ends. You're basically
back to the separated front and back end situation in classic web development,
but at least you can use the same language for both - somewhat.
Fay itself doesn't come with many convenient UI libraries, but
[Cinder](http://crooney.github.io/cinder/index.html) covers that with a
convenient HTML DSL and DOM manipulation library.
Fay is still seeing sporadic development activity, but there is not much of
a lively community around it. Most people have moved on to other solutions.
GHCJS
: [GHCJS](https://github.com/ghcjs/ghcjs) uses GHC itself to compile Haskell to a
low-level intermediate language, and then compiles that language to JavaScript.
This allows GHCJS to achieve excellent compatibility with native Haskell code,
but comes, quite predictably, at the high cost of duplicating a large part of
the Haskell runtime into the JavaScript output. The generated JavaScript code
is typically measured in megabytes rather than kilobytes, which is (in my
opinion) far too large for regular web sites. The upside of this high
compatibility, of course, is that you can re-use a lot of code between the
front and back ends, which will certainly make web development more tolerable.
The community around GHCJS seems to be more active than that of Fay. GHCJS
integrates properly with the Stack package manager, and there are a [whole
bunch](http://hackage.haskell.org/packages/search?terms=ghcjs) of libraries
available.
Haste
: [Haste](https://github.com/valderman/haste-compiler) provides a middle ground
between Fay and GHCJS. Like GHCJS, Haste is based on GHC, but instead of
using low-level compiler output, it uses a higher-level intermediate
language. This results in good compatibility with regular Haskell code while
keeping the output size in check. Haste has a JavaScript runtime of around 60
KiB and the compiled code is roughly as space-efficient as Fay.
While it should be possible to share a fair amount of code between the front
and back ends, not all libraries work well with Haste. I tried to use Lucid
within a Haste application, for example, but that did not work. Apparently one
of its dependencies (probably the UTF-8 codec, as far as I could debug the
problem) performs some low-level performance optimizations that are
incompatible with Haste.
Haste itself is still being sporadically developed, but not actively enough to be
called alive. The compiler lags behind on the GHC version, and the upcoming 0.6
version has stayed unreleased and in limbo state for at least 4 months on the
git repository. The community around Haste is in a similar state. Various
libraries do exist, such as [Shade](https://github.com/takeoutweight/shade)
(HTML DSL, Reactive UI), [Perch](https://github.com/agocorona/haste-perch)
(another HTML DSL), [haste-markup](https://github.com/ajnsit/haste-markup) (yet
another HTML DSL) and
[haste-dome](https://github.com/wilfriedvanasten/haste-dome) (_yet_ another
HTML DSL), but they're all pretty much dead.
Despite having three options available, only Haste provides enough benefit of
code reuse while remaining efficient enough for the kind of site that I
envision. Haste really deserves more love than it is currently getting.
## More Haskell
In my quest for Haskell web development frameworks and tools, I came across a
few other interesting libraries. One of them is
[Clay](http://fvisser.nl/clay/), a CSS preprocessor as a DSL. This will by
itself not solve the CSS synchronisation problem that I mentioned at the start
of this article, but it could still be used to keep the CSS closer to code
implementing the rest of the site.
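
As a rough illustration of what "CSS as a DSL" can buy, the sketch below models a stylesheet as plain Haskell data. It is not Clay's API: the `Rule` type, `renderCss` and the `searchBox` class name are assumptions made up for this example.

```haskell
import Data.List (intercalate)

-- Hypothetical, much-simplified model of a stylesheet; not Clay's API.
data Rule = Rule
  { selector     :: String
  , declarations :: [(String, String)]   -- property/value pairs
  }

renderRule :: Rule -> String
renderRule (Rule sel decls) = sel ++ " { " ++ concatMap decl decls ++ "}"
  where
    decl (prop, val) = prop ++ ": " ++ val ++ "; "

renderCss :: [Rule] -> String
renderCss = intercalate "\n" . map renderRule

-- The class name is defined once and used by both markup and stylesheet.
searchBox :: String
searchBox = "search-box"

stylesheet :: [Rule]
stylesheet =
  [ Rule ("." ++ searchBox) [("border", "1px solid #888"), ("padding", "2px")]
  , Rule "body" [("margin", "0")]
  ]

searchField :: String
searchField = "<input class=\"" ++ searchBox ++ "\" name=\"q\">"

main :: IO ()
main = do
  putStrLn (renderCss stylesheet)
  putStrLn searchField
```

Because `searchBox` is an ordinary binding, deleting it from the markup side without touching the stylesheet (or the other way around) at least produces an unused-binding warning or a compile error, which is a small step towards the synchronisation that plain CSS files lack.
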
It also would not do to write an article on Haskell web development and not
mention a set of related projects: [MFlow](https://github.com/agocorona/MFlow),
[HPlayground](https://github.com/agocorona/hplayground) and the more recent
[Axiom](https://github.com/transient-haskell/axiom). These are ambitious
efforts at building a very high-level and functional framework for both front
and back end web development. I haven't spent nearly enough time on these
projects to fully understand their scope, but I'm afraid that they are a bit
too high-level. This invariably results in reduced flexibility (i.e. too many
opinions being hard-coded in the API) and less efficient JavaScript output.
Axiom being based on GHCJS reinforces the latter concern.
# Other languages
I've covered OCaml and Haskell now, but there are relevant projects in other
languages, too:
PureScript
: [PureScript](http://www.purescript.org/) is the spiritual successor of Fay -
except it does not try to be compatible with Haskell, and in fact
[intentionally deviates from
Haskell](https://github.com/purescript/documentation/blob/master/language/Differences-from-Haskell.md)
at several points. Like Fay, and perhaps even more so, PureScript compiles down
to efficient and small JavaScript.
Because PureScript is a not-quite-Haskell language, sharing code between a
PureScript front end and a Haskell back end is not possible: the differences
are simply too large. It is, however, possible to go in the other direction:
PureScript could also
run on the back end in a NodeJS environment. I don't really know how well this
is supported by the language ecosystem, but I'm not sure I'm comfortable with
replacing the excellent quality of Haskell back end frameworks with a fragile
NodeJS back end (or such is my perception, I admittedly don't have too much
faith in most JavaScript-heavy projects).
The PureScript community is very active and many libraries are available in the
[Pursuit](https://pursuit.purescript.org/) package repository. Of note is
[Halogen](https://pursuit.purescript.org/packages/purescript-halogen), a
high-level reactive UI library. One thing to be aware of is that not all
libraries are written with space efficiency as their highest priority: the
simple [Halogen
button](https://github.com/slamdata/purescript-halogen/tree/v2.0.1/examples/basic)
example already compiles down to a hefty 300 KB for me.
Elm
: [Elm](http://elm-lang.org/) is similar to PureScript, but rather than trying to
be a generic something-to-JavaScript compiler, Elm focuses exclusively on
providing a good environment to create web UIs. The reactive UI libraries are
well maintained and part of the core Elm project. Elm has a strong focus on
being easy to learn and comes with good documentation and many examples to get
started with.
Ur/Web
: [Ur/Web](http://www.impredicative.com/ur/) is an ML and Haskell inspired
programming language specifically designed for client/server programming. Based
on its description, Ur/Web is exactly the kind of thing I'm looking for: It
uses a single language for the front and back ends and provides convenient
methods for communication between the two.
This has been a low priority on my to-try list because it seems to be primarily
a one-man effort, and the ecosystem around it is pretty small. Using Ur/Web for
practical applications will likely involve writing your own libraries or
wrappers for many common tasks, such as for image manipulation or advanced text
processing. Nonetheless, I definitely should be giving this a try sometime.
(Besides, who still uses frames in this day and age? :-)
Opa
: I'll be moving out of the functional programming world for a bit.
[Opa](http://opalang.org/) is another language and environment designed for
client/server programming. Opa takes a similar approach to "everything in
PureScript": Just compile everything to JavaScript and run the server-side code
on NodeJS. The main difference with other to-JavaScript compilers is that Opa
supports mixing back end code with front end code, and it can automatically
figure out where the code should be run and how the back and front ends
communicate with each other.
Opa, as a language, is reminiscent of a statically-typed JavaScript with
various syntax extensions. While it does support SQL databases, its database
API seems to strongly favor object-oriented use rather than relational database
access.
GWT
: Previously I compared web development to native GUI application development.
There is no reason why you can't directly apply native development structure
and strategies onto the web, and that's exactly what
[GWT](http://www.gwtproject.org/) does. It provides a widget-based programming
environment that eventually runs on the server and compiles the client-side
part to JavaScript. I haven't really considered it further, as Java is not a
language I can be very productive in.
Webtoolkit
: In the same vein, there's [Wt](https://www.webtoolkit.eu/wt). The name might
suggest that it is a web-based clone of Qt, and indeed that's what it looks
like. Wt is written in C++, but there are wrappers for [other
languages](https://www.webtoolkit.eu/wt/other_language). None of the languages
really interest me much, however.
That said, if I had to write a web UI for a resource-constrained device, this
seems like an excellent project to consider.
# To conclude
To be honest, I am a bit overwhelmed by the number of options. On the one hand,
it makes me very happy to see that a lot is happening in this world, and that
alternatives to boring web frameworks do exist. Yet after all this research I
still have no clue what I should use to develop my next website. I do like the
mix and match culture of Haskell, which has the potential to form a development
environment entirely to my own taste and with my own chosen trade-offs. On the
other hand, the client-side Haskell solutions are simply too immature and
integration with the back end frameworks is almost nonexistent.
Almost none of the frameworks I discussed attempt to tackle the CSS problem
that I mentioned in the introduction, so there is clearly room for more
research in this area.
There are a few technologies that I should spend more time on to familiarize
myself with. Ur/Web is an obvious candidate here, but perhaps it is possible to
create a Haskell interface to Wt. Or maybe some enhancements to the Haste
ecosystem could be enough to make that a workable solution instead.