% An Opinionated Survey of Functional Web Development

(Published on **2017-05-28**)

# Intro

TL;DR: In this article I provide an overview of the frameworks and libraries
available for creating websites in statically-typed functional programming
languages.

I recommend you now skip directly to the next section, but if you're interested
in some context and don't mind a rant, feel free to read on. :-)

**&lt;Rant mode>**

When compared to native desktop application development, web development just
sucks. Native development is relatively simple with toolkits such as
[Qt](https://www.qt.io/), [GTK+](https://www.gtk.org/) and others: You have
convenient widget libraries, and you can describe your entire application, from
interface design to all behavioural aspects, in a single programming language.
You're also largely free to structure code in whichever way makes most sense.
You can describe what a certain input field looks like, what happens when the
user interacts with it and what will happen with the input data, all succinctly
in a single file.  There are even drag-and-drop UI builders to speed up
development.

Web development is the exact opposite of that. There are several different
technologies you're forced to work with even when creating the most mundane
website, and there's a necessary but annoying split between code that runs on
the server and code that runs in the browser. Creating a simple input field
requires you to consider and maintain several ends:

- The back end (server-side code) that describes how the input field interacts
  with the database.
- Some JavaScript code to describe how the user can interact with the input
  field.
- Some CSS to describe what the input field looks like.
- And then there's HTML to act as a glue between the above.

In many web development setups, all four of the above technologies are
maintained in different files. If you want to add, remove or modify an input
field, or just about anything else on a page, you'll be editing at least four
different files with different syntax and meaning. I don't know how other
developers deal with this, but the only way I've been able to keep these places
synchronized is to just edit one or two places, test if it works in a browser,
and then edit the other places accordingly to fix whatever issues I find. This
doesn't always work well: I don't get a warning if I remove an HTML element
somewhere and forget to also remove the associated CSS. Heck, in larger
projects I can't even tell whether it's safe to remove or edit a certain line
of CSS because I have no way to know for sure that it's not still being used
elsewhere. Perhaps this particular case can be solved with proper organization
and discipline, but similar problems exist with the other technologies.

Yet despite that, why do I still create websites in my free time? Because it is
the only environment with high portability and low friction - after all, pretty
much anyone can browse the web. I would not have been able to create a useful
"[Visual Novel Database](https://vndb.org/)" any other way than through a
website. And the entire purpose of [Manned.org](https://manned.org/) is to
provide quick access to man pages from anywhere, which is not easily possible
with native applications.

**&lt;/Rant mode>**

Fortunately, I am not the only one who sees the problems with the "classic"
development strategy mentioned above. There are many existing attempts to
improve on that situation. A popular approach to simplify development is the
[Single-page
application](https://en.wikipedia.org/wiki/Single-page_application) (SPA). The
idea is to move as much code as possible to the front end, and keep only a
minimal back end. Both the HTML and the entire behaviour of the page can be
defined in the same language and same file.  With libraries such as
[React](https://facebook.github.io/react/) and browser support for [Web
components](https://developer.mozilla.org/en-US/docs/Web/Web_Components), the
split between files described above can be largely eliminated. And if
JavaScript isn't your favorite language, there are many alternative languages
that compile to JavaScript. (See [The JavaScript
Minefield](http://walkercoderanger.com/blog/2014/02/javascript-minefield/) for
an excellent series of articles on that topic).

While that approach certainly has the potential to make web development more
pleasant, it has a very significant drawback: Performance. For some
applications, such as web based email clients or CRM systems, it can be
perfectly acceptable to have a megabyte of JavaScript as part of the initial
page load. But for most other sites, such as this one, or the two sites I
mentioned earlier, or sites like Wikipedia, a slow initial page load is
something I consider to be absolutely unacceptable. The web can be really fast,
and developer laziness is not a valid excuse to ruin it. (If you haven't seen
or read [The Website Obesity
Crisis](http://idlewords.com/talks/website_obesity.htm) yet, please do so now).

I'm much more interested in the opposite approach to SPA: Move as much code as
possible to the back end, and only send a minimal amount of JavaScript to the
browser. This is arguably how web development has always been done in the past,
and there's little reason to deviate from it. The difference, however, is that
people tend to expect much more "interactivity" from web sites nowadays, so the
amount of JavaScript is increasing. And that is alright, so long as the
JavaScript doesn't prevent the initial page from loading quickly. But this
increase in JavaScript does amplify the "multiple files" problem I ranted about
earlier.

So my ideal solution is a framework where I can describe all aspects of a site
in a single language, and organize the code among files in a way that makes
sense to me. That is, I want the same kind of freedom that I get with native
desktop software development. Such a framework should run on the back end, and
automatically generate efficient JavaScript and, optionally, CSS for the front
end. As an additional requirement (or rather, strong preference), all this
should be in a statically-typed language - because I am seemingly incapable of
writing large reliable applications with dynamic typing - and in a language
from functional heritage - because programming in functional languages has
spoiled me.

I'm confident that what I describe is possible, and it's evident that I'm not
the only person to want this, as several (potential) solutions like this do
indeed exist.  I've been looking around for these solutions and have
experimented with a few that looked promising.  This article provides an
overview of what I have found so far.

# OCaml

My adventure began with [OCaml](https://ocaml.org/). It's been a few years
since I last used OCaml for anything, but development on the language and its
ecosystem has been anything but halted. [Real World OCaml](https://realworldocaml.org/)
has been a great resource to get me up to speed again.

## Ocsigen

For OCaml there is one project that has it all: [Ocsigen](http://ocsigen.org/).
It comes with an OCaml to JavaScript compiler, a web server, several handy
libraries, and a [framework](http://ocsigen.org/eliom/) to put everything
together.  Its [syntax
extension](http://ocsigen.org/eliom/6.2/manual/ppx-syntax) allows you to mix
front and back end code, and you can easily share code between both ends. The
final result is a binary that runs the server and a JavaScript file that
handles everything on the client side.

The framework comes with an embedded DSL with which you can conveniently
generate HTML without actually typing HTML. And best of all, this DSL works on
both the client and the server: On the server side it generates an HTML string
that can be sent to the client, and running the same code on the client side
will result in a DOM element that is ready to be used.

Ocsigen makes heavy use of the OCaml type system to statically guarantee the
correctness of various aspects of the application. The HTML DSL not only
ensures that the generated HTML is well-formed, but also prevents you from
incorrectly nesting certain elements and using the wrong attributes on the
wrong elements. Similarly, an HTML element generated on the server side can be
referenced from client side code without having to manually assign a unique ID
to the element. This prevents accidental typos in the ID naming and guarantees
that the element that the client side code refers to actually exists. URL
routing and links to internal pages are also checked at compile time.
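
To give a feel for how a type system can rule out invalid nesting, here is a
minimal sketch. It is written in Haskell rather than OCaml and is not Ocsigen's
actual API, which distinguishes far more content categories. The names
`Inline`, `Block`, `p_`, `div_` and friends are made up for illustration.

```haskell
-- Hypothetical mini-DSL: two content categories, each with its own type.
newtype Inline = Inline String
newtype Block = Block String

unInline :: Inline -> String
unInline (Inline s) = s

unBlock :: Block -> String
unBlock (Block s) = s

-- Plain text is inline content. Escaping is omitted to keep the sketch short.
text_ :: String -> Inline
text_ = Inline

-- <em> may only contain inline content.
em_ :: [Inline] -> Inline
em_ children = Inline ("<em>" ++ concatMap unInline children ++ "</em>")

-- <p> may only contain inline content, but is itself a block.
p_ :: [Inline] -> Block
p_ children = Block ("<p>" ++ concatMap unInline children ++ "</p>")

-- <div> may only contain blocks.
div_ :: [Block] -> Block
div_ children = Block ("<div>" ++ concatMap unBlock children ++ "</div>")

-- Accepted: a paragraph nested in a div.
example :: Block
example = div_ [p_ [text_ "Hello, ", em_ [text_ "world"]]]

-- Rejected at compile time, because a div is not inline content:
-- broken = p_ [div_ []]
```

Evaluating `unBlock example` in GHCi gives
`<div><p>Hello, <em>world</em></p></div>`, while uncommenting `broken` turns
the nesting mistake into a type error instead of a rendering bug.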

Ocsigen almost exactly matches what I previously described as the perfect
development framework. Unfortunately, it has a few drawbacks:

- The generated JavaScript is quite large, a bit over 400 KiB for a hello
  world. In my brief experience with the framework, this also results in a
  noticeably slower page load. I don't know if it was done for performance
  purposes, but subsequent page views are by default performed via in-browser
  XHR requests, which do not require all the JavaScript to be re-parsed and
  evaluated, and are thus much faster. This, however, doesn't work well if the
  user opens pages in multiple tabs or performs a page reload for whatever
  reason. And as I mentioned, I care a lot about the initial page loading time.
- The framework has a steep learning curve, and the available documentation is
  nowhere near complete enough to help you. I've found myself wondering many
  times how I was supposed to use a certain API and have had to look for
  example code for enlightenment. At some point I ended up just reading the
  source code instead of going for the documentation. What doesn't help here is
  that, because of the heavy use of the type system to ensure code correctness,
  most of the function signatures are far from intuitive and are sometimes very
  hard to interpret. This problem is made even worse by the generally
  unhelpful error messages from the compiler. (A few months with
  [Rust](https://www.rust-lang.org/) and its excellent error messages have
  really spoiled me on this aspect, I suppose).
- I believe they went a bit too far with the compile-time verification of
  certain correctness properties. Apart from making the framework harder to
  learn, it also increases the verbosity of the code and removes a lot of
  flexibility.  For instance, in order for internal links to be checked, you
  have to declare your URLs (or _services_, as they call it) somewhere central
  such that the view part of your application can access it. Then elsewhere you
  have to register a handler to that service. This adds boilerplate and
  enforces a certain code structure. And the gain of all this is, in my
  opinion, pretty small: In the 15 years that I have been building web sites, I
  don't remember a single occurrence where I mistyped the URL in an internal
  link. I do suppose that this feature makes it easy to change URLs without
  causing breakage, but there is a trivial counter-argument to that: [Cool URIs
  don't change](https://www.w3.org/Provider/Style/URI.html). (Also, somewhat
  ironically, I have found more dead internal links on the Ocsigen website than
  on any other site I have visited in the past year, so perhaps this was indeed
  a problem they considered worth fixing. Too bad it didn't seem to work out so
  well for them).
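
The declare-then-register pattern behind compile-time checked links can be
sketched as follows. This is a simplified illustration in Haskell, not Eliom's
service API; the `Route` type, the page names and the helper functions are all
hypothetical.

```haskell
-- Hypothetical site with three pages, declared in one central place.
data Route
  = Home
  | ManPage String -- a man page identified by name
  | About

-- The one and only place where URLs are spelled out.
routePath :: Route -> String
routePath Home           = "/"
routePath (ManPage name) = "/man/" ++ name
routePath About          = "/about"

-- Internal links take a Route, so a link to a nonexistent page
-- cannot be written down. Escaping is omitted for brevity.
link :: Route -> String -> String
link route label = "<a href=\"" ++ routePath route ++ "\">" ++ label ++ "</a>"

-- Handlers live apart from the declaration above.
handle :: Route -> String
handle Home           = "Welcome! See " ++ link (ManPage "ls") "ls(1)" ++ "."
handle (ManPage name) = "Manual page for " ++ name
handle About          = "About this site"
```

The sketch also shows the cost: every new page touches the `Route` declaration,
`routePath` and `handle`, which is the kind of boilerplate and enforced
structure described above.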

Despite these drawbacks, I am really impressed with what the Ocsigen project
has achieved, and it has set a high bar for the future frameworks that I will
be considering.

# Haskell

I have always seen Haskell as that potentially awesome language that I just
can't seem to wrap my head around, despite several attempts in the past to
learn it. Apparently the only thing I was missing in those attempts was a
proper goal: When I finally started playing around with some web frameworks I
actually managed to get productive in Haskell with relative ease. What also
helped me this time was a practical introductory Haskell reference, [What I
Wish I Knew When Learning Haskell](http://dev.stephendiehl.com/hask/), in
addition to the more theoretical [Learn You A Haskell for Great
Good](http://learnyouahaskell.com/).

Haskell itself already has a few advantages when compared to OCaml: For one, it
has a larger ecosystem, so for any task you can think of there is probably
already at least one existing library. As an example, I was unable to find an
actively maintained SQL DSL for OCaml, while there are several available for
Haskell. Another advantage that I found was the much more friendly and
detailed error messages generated by the Haskell compiler, GHC. In terms of
build systems, Haskell has standardized on
[Cabal](https://www.haskell.org/cabal/), which works alright most of the time.
Packaging is still often complex and messy, but it's certainly improving as
[Stack](http://haskellstack.org/) is gaining more widespread adoption. Finally,
I feel that the Haskell syntax is slightly less verbose, and more easily lends
itself to convenient DSLs.

Despite Haskell's larger web development community, I could not find a single
complete and integrated client/server development framework such as Ocsigen.
Instead, there are a whole bunch of different projects focussing on either the
back end or the front end. I'll explore some of them with the idea that,
perhaps, it's possible to mix and match different libraries and frameworks in
order to get the perfect development environment. And indeed, this seems to be
a common approach in many Haskell projects.

## Server-side

Let's start with a few back end frameworks.

Scotty
:   [Scotty](https://github.com/scotty-web/scotty) is a web framework inspired by
    [Sinatra](http://www.sinatrarb.com/). I have no experience with (web)
    development in Ruby and have never used Sinatra, but it has some similarities
    to what I have been using for a long time: [TUWF](https://dev.yorhel.nl/tuwf).

    Scotty is a very minimalist framework: it does routing (that is, mapping URLs
    to Haskell functions), it has some functions to access request data and some
    functions to create and modify a response. That's it. No database handling,
    session management, HTML generation, form handling or other niceties. But
    that's alright, because there are many generic libraries to help you out there.

    Thanks to its minimalism, I found Scotty to be very easy to learn and get used
    to. Even as a Haskell newbie I had a simple website running within a day. The
    documentation is adequate, but the idiomatic way of combining Scotty with
    other libraries is through the use of Monad Transformers, and a few more
    examples in this area would certainly have helped.

Spock
:   Continuing with the Star Trek franchise, there's
    [Spock](https://www.spock.li/). Spock is very similar to Scotty, but comes with
    type-safe routing and various other goodies such as session and state
    management, [CSRF](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
    protection and database helpers.

    As with everything that is (supposedly) more convenient, it also comes with a
    slightly steeper learning curve. I haven't, for example, figured out yet how to
    do regular expression based routing. I don't even know if that's still possible
    in the latest version - the documentation isn't very clear. Likewise, it's
    unclear to me what the session handling does exactly (Does it store something?
    And where? Is there a timeout?) and how that interacts with CSRF protection.
    Spock seems useful, but requires more than just a cursory glance.

Servant
:   [Servant](http://haskell-servant.github.io/) is another minimalist web
    framework, although it is primarily designed for creating RESTful APIs.

    Servant distinguishes itself from Scotty and Spock by going beyond type-safe
    routing: it allows you to describe your complete public API as a type, and
    get strongly typed responses for free. This also enables support for
    automatically generated documentation and client-side API wrappers.

    Servant would be an excellent back end for a SPA, but it does not seem like an
    obvious approach to building regular websites.

Happstack / Snap / Yesod
:   [Happstack](http://www.happstack.com/), [Yesod](http://www.yesodweb.com/) and
    [Snap](http://snapframework.com/) are three large frameworks with many
    auxiliary libraries. They all come with a core web server, routing, state and
    database management. Many of the libraries are not specific to the framework
    and can be used together with other frameworks. I won't go into a detailed
    comparison between the three projects because I have no personal experience
    with any of them, and fortunately [someone else already wrote a
    comparison](http://softwaresimply.blogspot.nl/2012/04/hopefully-fair-and-useful-comparison-of.html)
    in 2012 - though I don't know how accurate that still is today.
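
The idea of describing an API as a type, as Servant does, may sound abstract,
so here is a heavily simplified imitation that uses only GHC's `base` library.
The operator names `:>` and `:<|>` mirror Servant's, but the `HasPaths` class,
the `Api` type and everything else are hypothetical and much less capable than
the real thing.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Data.List (intercalate)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- A static path segment followed by the rest of a route.
data (segment :: Symbol) :> rest
infixr 4 :>

-- A choice between two routes.
data left :<|> right
infixr 3 :<|>

-- The end of a route: an endpoint that answers GET requests.
data Get

-- Hypothetical API of a tiny site, written down purely as a type.
type Api = "man" :> Get
      :<|> "man" :> "search" :> Get

-- One possible interpretation of an API type: list the paths it describes.
class HasPaths api where
  paths :: Proxy api -> [[String]]

instance HasPaths Get where
  paths _ = [[]]

instance (KnownSymbol segment, HasPaths rest) => HasPaths (segment :> rest) where
  paths _ = map (symbolVal (Proxy :: Proxy segment) :) (paths (Proxy :: Proxy rest))

instance (HasPaths left, HasPaths right) => HasPaths (left :<|> right) where
  paths _ = paths (Proxy :: Proxy left) ++ paths (Proxy :: Proxy right)

-- Documentation derived from the type alone.
endpoints :: [String]
endpoints = ["GET /" ++ intercalate "/" p | p <- paths (Proxy :: Proxy Api)]
```

Here `endpoints` evaluates to `["GET /man","GET /man/search"]`. A second type
class over the same `Api` type could just as well produce a router or a client,
which is the property that makes this approach attractive for APIs.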

So there are a fair number of frameworks to choose from, and they can all work
together with other libraries to implement additional functions. Apart from the
framework, another important aspect of web development is how you generate the
HTML to send to the client. In true Haskell style, there are several answers.

For those who prefer embedded DSLs, there are
[xhtml](http://hackage.haskell.org/package/xhtml),
[BlazeHTML](https://jaspervdj.be/blaze/) and
[Lucid](https://github.com/chrisdone/lucid). The xhtml package is not being
used much nowadays and has been superseded by BlazeHTML, which is both faster
and offers a more readable DSL using Haskell's do-notation. Lucid is heavily
inspired by Blaze, and attempts to [fix several of its
shortcomings](http://chrisdone.com/posts/lucid). Having used Lucid a bit
myself, I can attest that it is easy to get started with and pretty convenient
in use.
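
For readers who have not seen such a DSL, the following sketch shows roughly
how do-notation can be used to build HTML. It depends only on `base` and is not
the API of BlazeHTML or Lucid; the `Html` type and the `element`, `toHtml` and
`render` helpers are simplified stand-ins that I made up for this example.

```haskell
-- A tiny writer-style monad: a result plus the HTML produced so far.
newtype Html a = Html (a, String)

instance Functor Html where
  fmap f (Html (a, out)) = Html (f a, out)

instance Applicative Html where
  pure a = Html (a, "")
  Html (f, out1) <*> Html (a, out2) = Html (f a, out1 ++ out2)

instance Monad Html where
  Html (a, out1) >>= f =
    let Html (b, out2) = f a
    in Html (b, out1 ++ out2)

-- Escape the characters that have a special meaning in HTML.
escape :: String -> String
escape = concatMap escapeChar
  where
    escapeChar '<' = "&lt;"
    escapeChar '>' = "&gt;"
    escapeChar '&' = "&amp;"
    escapeChar '"' = "&quot;"
    escapeChar c   = [c]

-- Emit escaped text.
toHtml :: String -> Html ()
toHtml s = Html ((), escape s)

-- Wrap the output of the children in an opening and a closing tag.
element :: String -> Html a -> Html a
element tag (Html (a, out)) =
  Html (a, "<" ++ tag ++ ">" ++ out ++ "</" ++ tag ++ ">")

h1_, p_, body_ :: Html a -> Html a
h1_   = element "h1"
p_    = element "p"
body_ = element "body"

render :: Html a -> String
render (Html (_, out)) = out

-- Pages are ordinary Haskell functions.
greeting :: String -> Html ()
greeting name = body_ $ do
  h1_ (toHtml "Greetings")
  p_ (toHtml ("Hello, " ++ name ++ "!"))
```

`render (greeting "<World>")` produces
`<body><h1>Greetings</h1><p>Hello, &lt;World&gt;!</p></body>`. The real
libraries add attributes, efficient string builders and a complete set of
elements, but the way pages compose is similar.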

I definitely prefer to generate HTML using DSLs as that keeps the entire
application in a single host language and with consistent syntax, but the
alternative approach, templating, is also fully supported in Haskell. The Snap
framework comes with [Heist](https://github.com/snapframework/heist), which
provides run-time interpreted templates, like similar systems in most other
languages.
Yesod comes with [Shakespeare](http://hackage.haskell.org/package/shakespeare),
which is a type-safe templating system with support for inlining the templates
in Haskell code. Interestingly, Shakespeare also has explicit support for
templating JavaScript code. Too bad that this doesn't take away the need to
write the JavaScript yourself, so I don't see how this is an improvement over
some other JavaScript solution that uses JSON for communication with the back
end.

## Client-side

It is rather unusual to have multiple compiler implementations targeting
JavaScript for the same source language, but Haskell has three of them. All
three can be used to write front end code without touching a single line of
JavaScript, but there are large philosophical differences between the three
projects.

Fay
:   [Fay](https://github.com/faylang/fay/wiki) compiles Haskell code directly to
    JavaScript. The main advantage of Fay is that it does not come with a large
    runtime, resulting in small and efficient JavaScript. The main downside is that it
    only [supports a subset of
    Haskell](https://github.com/faylang/fay/wiki/What-is-not-supported?).  The
    result is a development environment that is very browser-friendly, but where
    you can't share much code between the front and back ends. You're basically
    back to the separated front and back end situation in classic web development,
    but at least you can use the same language for both - somewhat.

    Fay itself doesn't come with many convenient UI libraries, but
    [Cinder](http://crooney.github.io/cinder/index.html) covers that with a
    convenient HTML DSL and DOM manipulation library.

    Fay is still seeing sporadic development activity, but there is not much of
    a lively community around it. Most people have moved on to other solutions.

GHCJS
:   [GHCJS](https://github.com/ghcjs/ghcjs) uses GHC itself to compile Haskell to a
    low-level intermediate language, and then compiles that language to JavaScript.
    This allows GHCJS to achieve excellent compatibility with native Haskell code,
    but comes, quite predictably, at the high cost of duplicating a large part of
    the Haskell runtime into the JavaScript output. The generated JavaScript code
    is typically measured in megabytes rather than kilobytes, which is (in my
    opinion) far too large for regular web sites. The upside of this high
    compatibility, of course, is that you can re-use a lot of code between the
    front and back ends, which will certainly make web development more tolerable.

    The community around GHCJS seems to be more active than that of Fay. GHCJS
    integrates properly with the Stack package manager, and there are a [whole
    bunch](http://hackage.haskell.org/packages/search?terms=ghcjs) of libraries
    available.

Haste
:   [Haste](https://github.com/valderman/haste-compiler) provides a middle ground
    between Fay and GHCJS. Like GHCJS, Haste is based on GHC, but instead of
    using low-level compiler output, Haste uses a higher-level intermediate
    language. This results in good compatibility with regular Haskell code while
    keeping the output size in check. Haste has a JavaScript runtime of around 60
    KiB and the compiled code is roughly as space-efficient as Fay.

    While it should be possible to share a fair amount of code between the front
    and back ends, not all libraries work well with Haste. I tried to use Lucid
    within a Haste application, for example, but that did not work. Apparently one
    of its dependencies (probably the UTF-8 codec, as far as I could debug the
    problem) performs some low-level performance optimizations that are
    incompatible with Haste.

    Haste itself is still being developed sporadically, but not actively enough to
    be called alive. The compiler lags behind on the GHC version, and the upcoming
    0.6 version has sat unreleased and in limbo in the git repository for at least
    4 months. The community around Haste is in a similar state. Various
    libraries do exist, such as [Shade](https://github.com/takeoutweight/shade)
    (HTML DSL, Reactive UI), [Perch](https://github.com/agocorona/haste-perch)
    (another HTML DSL), [haste-markup](https://github.com/ajnsit/haste-markup) (yet
    another HTML DSL) and
    [haste-dome](https://github.com/wilfriedvanasten/haste-dome) (_yet_ another
    HTML DSL), but they're all pretty much dead.

Despite having three options available, only Haste provides enough benefit of
code reuse while remaining efficient enough for the kind of site that I
envision. Haste really deserves more love than it is currently getting.
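
To make "code reuse" a bit more concrete, this is the kind of module I would
want to share between both ends: pure logic without I/O, here a form
validation. It is a sketch that only uses `base`; the rules and the names
`validateUsername` and `describeError` are invented for the example, and
whether any particular module compiles with Fay, Haste or GHCJS still depends
on the language features and libraries it pulls in.

```haskell
-- Hypothetical module shared by the front end and the back end.
import Data.Char (isAlphaNum)

data UsernameError = TooShort | TooLong | InvalidCharacter Char
  deriving (Eq, Show)

-- Pure validation logic: no I/O, no DOM, no database.
validateUsername :: String -> Either UsernameError String
validateUsername name
  | length name < 3  = Left TooShort
  | length name > 20 = Left TooLong
  | otherwise =
      case filter (not . isAllowed) name of
        (c : _) -> Left (InvalidCharacter c)
        []      -> Right name
  where
    isAllowed c = isAlphaNum c || c == '_'

-- Message shown to the user, identical on both ends.
describeError :: UsernameError -> String
describeError TooShort             = "Username must be at least 3 characters."
describeError TooLong              = "Username must be at most 20 characters."
describeError (InvalidCharacter c) = "Username may not contain " ++ show c ++ "."
```

The browser could call `validateUsername` for immediate feedback while the
server calls the very same function before touching the database, so the two
checks cannot drift apart.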

## More Haskell

In my quest for Haskell web development frameworks and tools, I came across a
few other interesting libraries. One of them is
[Clay](http://fvisser.nl/clay/), a CSS preprocessor as a DSL. This will by
itself not solve the CSS synchronisation problem that I mentioned at the start
of this article, but it could still be used to keep the CSS closer to code
implementing the rest of the site.
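
As an illustration of what "closer to the code" could mean, consider a sketch
in which CSS class names are ordinary Haskell values used by both the
stylesheet and the HTML generation. This is not Clay's API; the `Class` type
and all functions below are hypothetical and rely only on `base`.

```haskell
-- Hypothetical CSS classes used by the site, declared once.
data Class = InputField | ErrorMessage
  deriving (Eq, Show, Enum, Bounded)

className :: Class -> String
className InputField   = "input-field"
className ErrorMessage = "error-message"

-- Every class gets its declarations here. When compiled with -Wall,
-- GHC warns if a constructor is not covered.
declarations :: Class -> [(String, String)]
declarations InputField   = [("border", "1px solid #888"), ("padding", "2px")]
declarations ErrorMessage = [("color", "red")]

-- Render the complete stylesheet from the declarations above.
stylesheet :: String
stylesheet = concatMap rule [minBound .. maxBound]
  where
    rule c = "." ++ className c ++ "{" ++ concatMap decl (declarations c) ++ "}"
    decl (property, value) = property ++ ":" ++ value ++ ";"

-- HTML generation refers to the same constructors.
classAttr :: Class -> String
classAttr c = "class=\"" ++ className c ++ "\""
```

Removing a constructor from `Class` makes every remaining use of it, in the
stylesheet or in the HTML, a compile error. That covers only class names, not
selectors based on element structure, so it is a partial remedy at best.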

It also would not do to write an article on Haskell web development and not
mention a set of related projects: [MFlow](https://github.com/agocorona/MFlow),
[HPlayground](https://github.com/agocorona/hplayground) and the more recent
[Axiom](https://github.com/transient-haskell/axiom). These are ambitious
efforts at building a very high-level and functional framework for both front
and back end web development. I haven't spent nearly enough time on these
projects to fully understand their scope, but I'm afraid of these being a bit
too high level. This invariably results in reduced flexibility (i.e. too many
opinions being hard-coded in the API) and less efficient JavaScript output.
Axiom being based on GHCJS reinforces the latter concern.

# Other languages

I've covered OCaml and Haskell now, but there are relevant projects in other
languages, too:

PureScript
:   [PureScript](http://www.purescript.org/) is the spiritual successor of Fay -
    except it does not try to be compatible with Haskell, and in fact
    [intentionally deviates from
    Haskell](https://github.com/purescript/documentation/blob/master/language/Differences-from-Haskell.md)
    at several points. Like Fay, and perhaps even more so, PureScript compiles down
    to efficient and small JavaScript.

    Because PureScript is a not-quite-Haskell language, sharing code between a
    PureScript front end and a Haskell back end is not possible; the differences
    are simply too large. It is, however, possible to go in the other direction:
    PureScript could also run on the back end in a NodeJS environment. I don't
    really know how well this
    is supported by the language ecosystem, but I'm not sure I'm comfortable with
    replacing the excellent quality of Haskell back end frameworks with a fragile
    NodeJS back end (or such is my perception, I admittedly don't have too much
    faith in most JavaScript-heavy projects).

    The PureScript community is very active and many libraries are available in the
    [Pursuit](https://pursuit.purescript.org/) package repository. Of note is
    [Halogen](https://pursuit.purescript.org/packages/purescript-halogen), a
    high-level reactive UI library. One thing to be aware of is that not all
    libraries are written with space efficiency as their highest priority: the
    simple [Halogen
    button](https://github.com/slamdata/purescript-halogen/tree/v2.0.1/examples/basic)
    example already compiles down to a hefty 300 KB for me.

Elm
:   [Elm](http://elm-lang.org/) is similar to PureScript, but rather than trying to
    be a generic something-to-JavaScript compiler, Elm focuses exclusively on
    providing a good environment to create web UIs. The reactive UI libraries are
    well maintained and part of the core Elm project. Elm has a strong focus on
    being easy to learn and comes with good documentation and many examples to get
    started with.

Ur/Web
:   [Ur/Web](http://www.impredicative.com/ur/) is an ML and Haskell inspired
    programming language specifically designed for client/server programming. Based
    on its description, Ur/Web is exactly the kind of thing I'm looking for: It
    uses a single language for the front and back ends and provides convenient
    methods for communication between the two.

    This has been a low priority on my to-try list because it seems to be primarily
    a one-man effort, and the ecosystem around it is pretty small. Using Ur/Web for
    practical applications will likely involve writing your own libraries or
    wrappers for many common tasks, such as for image manipulation or advanced text
    processing.  Nonetheless, I definitely should be giving this a try sometime.

    (Besides, who still uses frames in this day and age? :-)

Opa
:   I'll be moving out of the functional programming world for a bit.

    [Opa](http://opalang.org/) is another language and environment designed for
    client/server programming. Opa takes a similar approach to "everything in
    PureScript": Just compile everything to JavaScript and run the server-side code
    on NodeJS. The main difference with other to-JavaScript compilers is that Opa
    supports mixing back end code with front end code, and it can automatically
    figure out where the code should be run and how the back and front ends
    communicate with each other.

    Opa, as a language, is reminiscent of a statically-typed JavaScript with
    various syntax extensions. While it does support SQL databases, its database
    API seems to strongly favor object-oriented use rather than relational database
    access.

GWT
:   Previously I compared web development to native GUI application development.
    There is no reason why you can't directly apply native development structure
    and strategies onto the web, and that's exactly what
    [GWT](http://www.gwtproject.org/) does. It provides a widget-based programming
    environment that eventually runs on the server and compiles the client-side
    part to JavaScript. I haven't really considered it further, as Java is not a
    language I can be very productive in.

Webtoolkit
:   In the same vein, there's [Wt](https://www.webtoolkit.eu/wt). The name might
    suggest that it is a web-based clone of Qt, and indeed that's what it looks
    like. Wt is written in C++, but there are wrappers for [other
    languages](https://www.webtoolkit.eu/wt/other_language). None of the languages
    really interest me much, however.

    That said, if I had to write a web UI for a resource-constrained device, this
    seems like an excellent project to consider.

# To conclude

To be honest, I am a bit overwhelmed by the number of options. On the one hand,
it makes me very happy to see that a lot is happening in this world, and that
alternatives to boring web frameworks do exist. Yet after all this research I
still have no clue what I should use to develop my next website. I do like the
mix and match culture of Haskell, which has the potential to form a development
environment entirely to my own taste and with my own chosen trade-offs. On the
other hand, the client-side Haskell solutions are simply too immature and
integration with the back end frameworks is almost nonexistent.

Almost none of the frameworks I discussed attempt to tackle the CSS problem
that I mentioned in the introduction, so there is clearly room for more
research in this area.

There are a few technologies that I should spend more time on to familiarize
myself with. Ur/Web is an obvious candidate here, but perhaps it is possible to
create a Haskell interface to Wt. Or maybe some enhancements to the Haste
ecosystem could be enough to make that a workable solution instead.