A Distributed Communication System for Modular Applications

=pod

(Published on B<2012-02-15>. Also available in L<POD|http://dev.yorhel.nl/dat/doc-commvis>.)


=head1 Introduction

I have a vision. A vision in which rigid point-to-point IPC is replaced with a
far more flexible and distributed communication system. A vision in which
different components in the same program can interact with each other without
having to worry about each other's internal state. A vision where programs can
be designed in a modular way, without even worrying about whether to use
threads or an event-based model. A vision where every component communicates
with others, and where you can communicate with every component. And more
importantly, a vision in which each component can be implemented in a different
programming language, without the need for specific code to glue everything
together.

If that sounds interesting to you, then please read on. As a small research
project of mine, I've been looking into ways to realize the above vision, and I
believe I have found an answer. In this article I'll try to explain my ideas
and how they may be used to realize this vision.

My ideas have been heavily inspired by
L<Linda|http://en.wikipedia.org/wiki/Linda_(coordination_language)>. If you're
already familiar with that, then what I present here probably won't be very
revolutionary. Still, there are several aspects in which my ideas differ
significantly from Linda, so you won't be bored reading this. :-)



=head1 The Concept

In this section I'll try to introduce the overall concept and some terminology.
This is going to be somewhat abstract and technical, but please bear with me.
I promise that things will get more interesting in the later sections.

Let me first define an abstract communications framework. We have a B<network>
and a bunch of B<sessions> connected to that network. Sessions can communicate
with each other through this network (that's usually what a network is for,
after all). These sessions do not have to be static: they may come and go.
Keep in mind that, for the purpose of explaining this concept, these terms are
very abstract: a session can be anything. A process, thread, a single function,
an object, or even your mobile phone. Anything. In the same way, the network is
nothing more than an abstract way to connect these sessions. It could be
sockets, pipes, an HTTP server, a broadcast network or just shared memory
between threads. If it allows sessions to communicate I'll call it a network.

Unlike many communication systems, this network does not have the concept of
I<addresses>. There is no direct way for one session to identify another, and
indeed there is no need to do so for the purposes of communication. Instead,
the primary means of communication is by using B<tuples> and patterns.

A tuple is an ordered set (list, array, whatever terminology you prefer) of
zero or more elements. Each element may have a different type, so it can hold
booleans, integers, floating point numbers, strings and even more complex data
structures such as arrays or maps. You may think of a tuple as an array in
L<JSON|http://json.org/> notation, if that makes things easier to understand.

Sessions send and receive tuples to communicate with each other. On the sending
side, a session simply "passes" a tuple to the network. This is a non-blocking,
asynchronous operation. In fact, it makes no sense to make this a blocking
action, because the sender cannot know whether it will be received by any
other session anyway. The tuple may be received by many other sessions, or
there may not even be a single session interested in the tuple at all.

On the receiving side, sessions B<register> patterns. A pattern itself is
mostly just a tuple, but with a more limited set of allowed types: only those
types for which exact matching makes sense, like booleans, integers and
strings. A pattern of C<n> elements matches an incoming tuple if the first
C<n> elements of the tuple exactly match the corresponding elements of the
pattern; any further elements of the tuple are ignored. A special I<wildcard>
element may be used to match any value of any type.
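
The matching rule can be sketched in a few lines of Python. This is only an
illustration: the names C<WILDCARD> and C<matches> are made up for this
example, and two details are assumptions of the sketch rather than settled
semantics: a tuple that is shorter than the pattern never matches, and two
values only count as equal when they also have the same type.

```python
# Hypothetical sketch of prefix matching; WILDCARD and matches() are
# illustrative names, not part of any existing library.
WILDCARD = object()  # sentinel that matches any value of any type


def matches(pattern, tup):
    """Return True if the first len(pattern) elements of tup match pattern."""
    if len(pattern) > len(tup):
        return False  # assumption: a tuple shorter than the pattern never matches
    for want, got in zip(pattern, tup):
        if want is WILDCARD:
            continue
        # Assumption: compare the type as well as the value, so that the
        # integer 1 does not match True or the string "1".
        if type(want) is not type(got) or want != got:
            return False
    return True


print(matches(["fs", "delete", WILDCARD], ["fs", "delete", "a.txt"]))  # True
print(matches(["fs", "delete"], ["fs", "list"]))                       # False
print(matches(["fs", "delete"], ["fs"]))                               # False
```

Elements of the tuple beyond the length of the pattern are never inspected,
which is what makes a short pattern act as a prefix filter.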

A session thus only receives tuples from other sessions if it has registered a
pattern that matches them. As mentioned, it is not illegal to send a tuple
for which no other sessions have registered. In this case, the tuple will just
be discarded. It is also possible that many sessions have registered for a
matching pattern, in which case all of these sessions will receive the tuple.
As an additional rule, if a session sends out a tuple that matches one of its
own patterns, then it will receive its own tuple. (However, programming
interfaces might allow this to be detected and/or disabled if this eases the
implementation of a session).
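
To make the sending and receiving rules a bit more tangible, here is a toy
in-process network in Python. Everything in it is hypothetical: the class
C<Network>, its methods C<register>, C<send> and C<run>, and the use of plain
callbacks to stand in for sessions are choices made for this sketch only. The
type-insensitive comparison in C<matches> is likewise just the simplest thing
that works here.

```python
from collections import deque

WILDCARD = object()  # hypothetical wildcard marker


def matches(pattern, tup):
    """Prefix match: each pattern element is a wildcard or equals the tuple element."""
    return len(pattern) <= len(tup) and all(
        p is WILDCARD or p == t for p, t in zip(pattern, tup)
    )


class Network:
    """Toy in-process network; all names here are illustrative."""

    def __init__(self):
        self.registrations = []  # (pattern, callback) pairs
        self.pending = deque()   # tuples that were sent but not yet delivered

    def register(self, pattern, callback):
        self.registrations.append((pattern, callback))

    def send(self, tup):
        # Non-blocking: the tuple is only queued. The sender learns nothing
        # about who (if anyone) is going to receive it.
        self.pending.append(list(tup))

    def run(self):
        # Deliver every queued tuple to all registrations that match it.
        # A tuple without any matching registration is simply dropped.
        while self.pending:
            tup = self.pending.popleft()
            for pattern, callback in list(self.registrations):
                if matches(pattern, tup):
                    callback(tup)


net = Network()
seen_by_a, seen_by_b = [], []
net.register(["chat", WILDCARD], seen_by_a.append)
net.register(["chat"], seen_by_b.append)

net.send(["chat", "hello"])    # received by both sessions
net.send(["weather", "rain"])  # nobody registered: discarded
net.run()
```

After C<run> returns, both lists hold the single C<["chat", "hello"]> tuple.
A real implementation would deliver tuples from its own thread or event loop
instead of an explicit C<run> call.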

Finally, there is the concept of a B<return-path>. Upon sending out a tuple, a
session may indicate that it is interested in receiving replies. The network
is then responsible for providing a return-path: a way for receivers of the
tuple to reply to it. When a tuple is received, the session has the option to
reply to it: a reply consists of one or more tuples that are sent directly to
the session from which the tuple originated, using this return-path. When a
receiver is done replying to the tuple or when it has no intention of sending
back a reply, it should close the return-path to indicate this. The session
that sent the original tuple is then notified that the return-path is closed,
and no more replies will be received. If there is no session that has
registered for the tuple, the return-path is closed immediately (or at least,
the sending session is notified that there won't be a reply). If the tuple is
received by multiple sessions, then the replies will be interleaved over the
return-path, and the path is closed when all of the receiving sessions have
closed their end.
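
The life cycle of a return-path could look roughly like the following Python
sketch. The classes C<ReturnPath>, C<ReplyEnd> and C<Network> are invented for
this example. To keep it short, delivery is synchronous and every C<send>
asks for a return-path; a real network would make both of these optional and
asynchronous.

```python
WILDCARD = object()  # hypothetical wildcard marker


def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is WILDCARD or p == t for p, t in zip(pattern, tup)
    )


class ReturnPath:
    """Sender's view: collects replies until every receiver has closed."""

    def __init__(self, receivers):
        self.open_ends = receivers  # receivers that have not closed yet
        self.replies = []

    @property
    def closed(self):
        return self.open_ends == 0


class ReplyEnd:
    """Receiver's view of a return-path."""

    def __init__(self, path):
        self.path = path
        self.done = False

    def reply(self, tup):
        if self.done:
            raise RuntimeError("return-path already closed")
        self.path.replies.append(tup)

    def close(self):
        if not self.done:
            self.done = True
            self.path.open_ends -= 1


class Network:
    def __init__(self):
        self.registrations = []

    def register(self, pattern, callback):
        self.registrations.append((pattern, callback))

    def send(self, tup):
        """Deliver tup and return the ReturnPath on which replies arrive."""
        targets = [cb for pat, cb in self.registrations if matches(pat, tup)]
        path = ReturnPath(len(targets))  # zero receivers: closed immediately
        for callback in targets:
            callback(tup, ReplyEnd(path))
        return path


def responder(name):
    def on_tuple(tup, end):
        end.reply(["pong", name])
        end.close()
    return on_tuple


net = Network()
net.register(["ping"], responder("a"))
net.register(["ping"], responder("b"))

answered = net.send(["ping"])             # two receivers, two replies
ignored = net.send(["nobody", "listens"])  # no receiver at all
```

Here C<answered.replies> ends up with one reply per receiver and the path
reports itself closed once both have called C<close>, while C<ignored> is
closed from the start and never receives a reply.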



=head1 Common design patterns and solutions

The previous section was rather abstract. This section provides several
examples of how to implement common tasks and design patterns using the
previously described concepts.


=head2 Broadcast notifications

This is commonly implemented in OOP systems using the I<Observer pattern>.
Implementing the same using tuples and patterns is an order of magnitude
simpler, as broadcast notifications are pretty much the native means of
communication.

In OOP you have the "observers" that can add themselves to the "observer list"
of any "object". This observer list is usually managed by the object that is to
be observed. If something happens to the object, it will walk through the
observer list and notify each observer.

If you represent an object as a session and define a notification as a tuple
that follows a certain pattern, then you very easily achieve the same
functionality as with an OOP implementation. In fact, there are some advantages
to doing it this way:

=over

=item *

Sessions stay registered to the same notifications even if the "object" (the
session that is being observed) is restarted or replaced with something else.
It's the network itself that keeps track of the registrations, not the sessions
that provide the notifications. Of course, this can be seen as a drawback, but
you can easily emulate OOP behaviour by providing an extra notification when
the "object" is shut down, indicating that the observing sessions can remove
their patterns.

=item *

Since there is no need for the session that is being observed to keep a list of
sessions that are observing it, it also doesn't have to walk the list and send out
multiple notifications. Notifying the observers is as simple as sending out a
single tuple.

=item *

Many implementations of the Observer pattern maintain only a single list of
observers per object, and each listed observer will be notified for every
change to the object. For example, if an object maintains a list and provides
notifications when something is added to or deleted from the list, every observer
will be notified of both the "added" action and the "deleted" action. The use
of tuples and patterns allows observers to register for all actions, or just
for a single one. If an "add" action would be notified with a tuple of
C<["object", "add", id]> and a "delete" action with
C<["object", "delete", id]>, then an observing session can register with the
pattern C<["object", *]> to be notified for both actions, or just
C<["object", "add"]> to register only for additions.

=back
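
The selective registration from the last point can be tried out with a very
small Python model. The C<Network> class and its synchronous C<send> are
assumptions made for this sketch; only the tuples and patterns come from the
example above.

```python
WILDCARD = object()  # hypothetical stand-in for the * wildcard


def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is WILDCARD or p == t for p, t in zip(pattern, tup)
    )


class Network:
    """Minimal synchronous network, for illustration only."""

    def __init__(self):
        self.registrations = []

    def register(self, pattern, callback):
        self.registrations.append((pattern, callback))

    def send(self, tup):
        for pattern, callback in list(self.registrations):
            if matches(pattern, tup):
                callback(tup)


net = Network()
all_changes, additions = [], []

# One observer wants every action, the other only additions.
net.register(["object", WILDCARD], all_changes.append)
net.register(["object", "add"], additions.append)

# The observed session sends a single tuple per change and keeps no
# observer list of its own.
net.send(["object", "add", 1])
net.send(["object", "delete", 1])
```

The first observer receives both tuples, the second only the C<"add"> one.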

Of course, this is only one way to implement a notification mechanism. There
are also solutions that more accurately mimic the behaviour of the OOP Observer
pattern in cases where that is desired.


=head2 Commands

A I<command> is what I call something along the lines of one session telling
another session to do something. Suppose we have a session representing a file
system. A command for this session could then be something like "delete file
X".

In a sense, this isn't much different from a notification as described above.
The file system session would have registered a pattern like
C<["fs", "delete", *]>, where the wildcard is used for the file name. If
another session then wants to have a file deleted, the only thing it will have to
do is send out a tuple matching that pattern, and the file system session will
take care of deleting it.

In the above scenario, the session sending the command has no feedback
whatsoever on whether the command has been successfully executed or not.
Whether this is acceptable depends of course on the specific application. One
way of still providing some form of feedback is to have the file system session
send out a notification tuple, e.g. C<["fs", "deleted", "file"]>. (Note that the
second element is now C<deleted> rather than C<delete>: using the same tuple
for actions and notifications would get very messy.) This way the
session sending the command, in addition to any other sessions that happen to
be interested in file deletion, will be notified of the deletion of the file.
An alternative solution is to use the RPC-like method, as described below.


=head2 RPC

L<RPC|http://en.wikipedia.org/wiki/Remote_procedure_call> is in essence nothing
more than providing an interface similar to a regular function call to a
component that can't be reached via a regular function call (e.g. because the
object isn't inside the address space of the program). RPC is generally a
request-response type of interaction, and making use of the return-path
facility as I described earlier, all of the functionality of RPC is also
available with the concept of tuple communication.

=head3 Commands, the RPC-way

Take the previous file system example. Instead of just sending the command
tuple to delete the file, the session could indicate that it is interested in
replies and the network will create a return-path. If the return-path is closed
before any replies have been received, then the commanding session knows that
the file system session is either down or broken. Otherwise, the file system
session has the ability to send back a response. This could be a simple "okay,
file has been deleted" tuple if things went alright, or an error indication if
things didn't go too well. The commanding session has the option to either
block and wait for a reply (or a close of the return-path), or continue doing
whatever it wanted to do and asynchronously check for a reply.
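
The blocking variant can be sketched with two threads and Python's standard
C<queue> module. This is a deliberately reduced model: pattern matching is
left out and the "network" is a single inbox queue, so that the example can
focus on waiting for either a reply or the close of the return-path. The
names C<fs_session>, C<call> and C<CLOSED> are invented for this sketch.

```python
import queue
import threading

CLOSED = object()  # hypothetical marker: the return-path has been closed


def fs_session(inbox, files):
    """Toy file system session handling ["fs", "delete", name] commands."""
    while True:
        item = inbox.get()
        if item is None:  # shutdown signal used by this example only
            break
        tup, return_path = item
        name = tup[2]
        if name in files:
            files.remove(name)
            return_path.put(["ok"])
        else:
            return_path.put(["error", "no such file"])
        return_path.put(CLOSED)  # done replying to this command


def call(inbox, tup, timeout=1.0):
    """Send tup, block until the return-path closes, and return all replies."""
    return_path = queue.Queue()
    inbox.put((tup, return_path))
    replies = []
    while True:
        # Raises queue.Empty if the other side stays silent for too long.
        reply = return_path.get(timeout=timeout)
        if reply is CLOSED:
            return replies
        replies.append(reply)


files = {"a.txt"}
inbox = queue.Queue()
worker = threading.Thread(target=fs_session, args=(inbox, files))
worker.start()

first = call(inbox, ["fs", "delete", "a.txt"])   # file exists
second = call(inbox, ["fs", "delete", "a.txt"])  # already gone

inbox.put(None)
worker.join()
```

An empty list returned by C<call> would correspond to the case described
above where the return-path is closed before any reply has arrived. A session
that prefers not to block can poll the same queue with C<get_nowait> instead.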

The downside of using the return-path rather than the previously mentioned
notification approach is that other sessions can't easily be notified of file
deletion. Of course, another session can register for the same pattern as the
file system did and thus receive the same command, but it would have no way of
knowing whether the delete was actually successful or not. For other sessions
to be notified as well, the file system session would probably have to send out
a notification tuple. Of course, whether this is necessary depends on the
application: you only have to implement the functionality that you need for
your purposes.

=head3 Requesting information

Another use of RPC, and thus also of the return-path, is to allow sessions to
request information from each other. Using the same example again, the file
system session could register for a pattern such as C<["fs", "list"]>. Upon
receiving a tuple matching that pattern, the session would send a list of all
its files over the return-path. Other sessions can then request this list by
simply sending out the right tuple and waiting for the replies.




=head1 Advantages over other systems

Now that I've hopefully convinced you that my communication concept is powerful
enough to build applications with it, you may be wondering why you should use
it instead of the other technologies. After all, you can achieve pretty much
the same functionality with just regular OOP, RPC, message passing, or other
systems. Let me present some of the inherent advantages that this system has
compared to others, and why it will help in designing flexible and modular
applications.

=head2 Loose coupling of components

Sessions (representing the components of a system) do not have to have a lot of
knowledge about each other. Sessions implicitly provide abstracted I<services>
using tuple communications, in much the same way as interfaces explicitly do in
OOP.

Very much unlike OOP, however, is that sessions do not even have to know of
each other how they should be used in threaded or event-based environments. For
example, threading in OOP is a pain: which objects should implement
synchronisation and which shouldn't? The answer to this question is not nearly
as obvious as it should be. With event-based systems, you'll always need to
worry about how long a certain function call blocks the caller's thread. Since
communication between the different sessions is completely asynchronous, these
worries are gone.

=head2 Location independence

Sessions can communicate with other sessions without knowing I<where> they are.
The major advantage of this is that a session can be moved around without having
to change a single line of code in any of the sessions relying on its service.
This allows sessions that communicate a lot with each other to be placed in the
same process, while resource-heavy sessions may be distributed among several
physical devices.

=head2 Programming language independence

All communication is solely done with tuples, which can be represented as
abstract objects and serialized and deserialized (or marshalled/unmarshalled,
whichever terminology you prefer) for communication. I used a JSON array as an
example of a tuple earlier, and perhaps it's not such a bad one: JSON data can
be interchanged between many programming languages, and is usually not that
annoying to use. Still, there are many other alternatives (Bencoding, XML,
binary encodings, etc.), and it all depends on the exact data types and values
you wish to use for communication.
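
As a small illustration of the JSON option, this is what a round trip of a
tuple looks like with Python's standard C<json> module. The tuple contents
are made up for the example, and using JSON as the wire format is only one
of the possible choices mentioned above.

```python
import json

# A tuple as a plain list of JSON-compatible values (example data).
tup = ["fs", "deleted", "notes.txt", 1024, True]

wire = json.dumps(tup)       # what a sender could put on the wire
received = json.loads(wire)  # what a receiver written in any language could parse

print(wire)  # ["fs", "deleted", "notes.txt", 1024, true]
```

Strings, numbers, booleans, arrays and maps survive the round trip unchanged.
JSON has no native type for raw binary data, so that would need an extra
encoding step such as Base64.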

Language independence allows each session to be (re)implemented in a different
language, again without affecting any other sessions. Did you write an
application in a high-level language and notice that performance wasn't as
good as you wanted? Then you can very easily rewrite the most resource-heavy
sessions in a low-level language such as C. Similarly, it allows developers to
hook into your application even when they are not familiar with your favorite
programming language.

=head2 Easy debugging

Not only can other applications and/or plugins hook into your application, you
can also connect a simple debugger to the network. The debugger just has to
register for a pattern and then print out any received tuples, allowing you to
see exactly what is being sent over the network and whether the sessions react
as expected. Similarly, the debugger could allow you to send tuples back to the
network and see whether the sessions react as they should. Unfortunately, what
is being sent over a return-path is generally not visible to anyone but the
receiver of the replies, although a network implementation might allow a
debugging application to look into that as well.
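
A debugger of this kind needs very little code. In the Python sketch below,
the C<Network> class is again a hypothetical, synchronous toy model. The
debugger registers the empty pattern, which under the prefix-matching rule
matches every tuple, and then injects two tuples of its own.

```python
WILDCARD = object()  # hypothetical wildcard marker


def matches(pattern, tup):
    return len(pattern) <= len(tup) and all(
        p is WILDCARD or p == t for p, t in zip(pattern, tup)
    )


class Network:
    """Minimal synchronous network, for illustration only."""

    def __init__(self):
        self.registrations = []

    def register(self, pattern, callback):
        self.registrations.append((pattern, callback))

    def send(self, tup):
        for pattern, callback in list(self.registrations):
            if matches(pattern, tup):
                callback(tup)


net = Network()

# An ordinary session: remembers which files it was asked to delete.
deleted = []
net.register(["fs", "delete", WILDCARD], lambda tup: deleted.append(tup[2]))

# The debugger: sees every tuple, including the ones it sends itself.
trace = []


def debugger(tup):
    trace.append(tup)
    print("tuple:", tup)


net.register([], debugger)  # the empty pattern is a prefix of every tuple

net.send(["fs", "delete", "a.txt"])  # injected by the debugger
net.send(["fs", "list"])             # no session handles this one here
```

The trace contains both tuples, while the file system stand-in only reacted
to the first. Replies travelling over a return-path would not show up in
such a trace unless the network offers a dedicated hook for them.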



=head1 Where to go from here

What I've described above is nothing more than a bunch of ideas. To actually
use this, there's a lot to be done.

=over

=item Defining a "tuple"

What types can be used in tuples? Should a tuple have some maximum size or a
maximum number of elements? Should a C<NULL> type be included? What about a
boolean type, why not use the integers 1 and 0 for that? Should it be possible
to interchange binary data, or only UTF-8 strings?

What will be the size of an integer that a session can reasonably assume to be
available? Specifying something like "infinite" is going to be either
inefficient in terms of memory and CPU overhead or will require extra overhead
(in terms of code) in usage. Specifying that everything should fit in a 64-bit
integer is a lot more practical, but may be somewhat annoying to cope with in
many dynamically typed languages running on 32-bit architectures. Specifying
that integers are 32 bits will definitely ease the implementation of the network
library in interpreted languages, but lowers the usefulness of the integer type
and is still a pain to use in OCaml (which has 31-bit integers on 32-bit
platforms).

These choices greatly affect the ease of implementing a networking library for
specific programming languages and the ease of using the network to actually
develop an application.

=item The exact semantics of matching

Somewhat similar to the previous point, the semantics of matching tuples with
patterns should also be defined in some way. One related question is whether
values of different types may be equivalent. For example, is the string
C<"1234"> equivalent to an integer with that value? What about NULL and/or
boolean types? If there is a floating point type, you probably won't need exact
matching on those values (floating points are too imprecise for that anyway),
but you might still want the floating point number C<10.0> to match the integer
C<10> to ease the use in dynamic languages where the distinction between
integer and float is blurred.

=item Defining the protocol(s)

Making my vision of modularity and ease of use a reality requires that any
session can easily communicate with another session, even if they have a
vastly different implementation. To do this, we need a protocol to connect
multiple processes together, whether they run on a local machine or over a
physical network.

=item Coding the stuff

Obviously, all of this remains a mere concept if nothing ever gets
implemented. Easy-to-use libraries are needed for several programming
languages. And more importantly, actual applications will have to be developed
using these libraries.

=back
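
The matching questions raised in the list above become concrete as soon as
you try to write the comparison function. The following Python sketch
contrasts two possible answers. Both C<strict_equal> and C<numeric_equal> are
hypothetical names, and neither is meant as the final word on the semantics.

```python
def strict_equal(a, b):
    """Values match only if they have the same type and the same value."""
    return type(a) is type(b) and a == b


def numeric_equal(a, b):
    """Like strict_equal, but an integer also matches a float of equal value."""
    numbers = (int, float)
    both_numeric = isinstance(a, numbers) and isinstance(b, numbers)
    # In Python, bool is a subclass of int, so it is excluded explicitly
    # to keep True from matching the integer 1.
    either_bool = isinstance(a, bool) or isinstance(b, bool)
    if both_numeric and not either_bool:
        return a == b
    return strict_equal(a, b)


print(strict_equal(10, 10.0))         # False
print(numeric_equal(10, 10.0))        # True
print(numeric_equal("1234", 1234))    # False
print(numeric_equal(1, True))         # False
```

Which of the two is preferable depends on the languages that have to be
supported: the strict version is easier to specify, the numeric version is
friendlier to languages that blur integers and floats.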

Of course, realizing all of the above is an iterative process. You can't write
an implementation without knowing what data types a tuple is made of, but it is
equally impossible to determine the exact definition of a tuple without having
experimented with an actual implementation.


=head2 What's the plan?

I've been working on documenting the basics of the semantics and the
point-to-point communication protocol, and have started on an early
implementation in the Go programming language to experiment with. I've dubbed
the project B<Tanja>, and have published my progress on a
L<git repo|http://g.blicky.net/tanja.git/>.

My intention is to also write implementations for C and Perl, experiment with
that, and see if I can refine the semantics to make this concept one that is
both efficient and easy to use.

Since I still have no idea whether this concept is actually a convenient one to
write large applications with, I'd love to experiment with that as well. My
original intention has always been to write a flexible client for the Direct
Connect network, possibly extending it to other P2P or chat networks in the
future.  So I'd love to write a large application using this concept, and see
how things work out.

In either case, if this article managed to get you interested in this concept
or in project Tanja, and you have any questions, feedback or (gasp!) feel like
helping out, don't hesitate to contact me! I'm available as 'Yorhel' on Direct
Connect at C<adc://blicky.net:2780> and IRC at C<irc.synirc.net>, or just drop
me a mail at C<projects@yorhel.nl>.
