Multi-threaded Access to an SQLite3 Database

=pod

(Published on B<2011-11-26>. Also available in L<POD|https://dev.yorhel.nl/dat/sqlaccess>.)

(Minor 2013-04-06 update: I abstracted my message passing solution from ncdc
and implemented it in a POSIX C library for general use. It's called
I<sqlasync> and is part of my L<Ylib library collection|https://dev.yorhel.nl/ylib>.)

=head1 Introduction

As I was porting L<ncdc|https://dev.yorhel.nl/ncdc> over to use SQLite3 as its
storage backend, I stumbled on a problem: the program uses a few threads for
background jobs, and it would be nice to give these threads access to the
database.

Serializing all database access through the main thread wouldn't have been
very hard to implement in this particular case, but it would have been far
from optimal. The main thread is also responsible for keeping the user
interface responsive and for handling most of the network interaction. The
overall responsiveness of the program would improve significantly if the
threads could access the database without involving the main thread.

Which brought me to the following questions: what solutions are available for
providing multi-threaded access to an SQLite database, and what problems might
I run into? I was unable to find a good overview of this area on the net, so I
wrote this article in the hope of improving that situation.

=head1 SQLite3 and threading

Let's first see what SQLite3 itself has to offer in terms of threading
support. The official documentation mentions threading support several times
in various places, but this information is scattered around and no good
overview is given. Someone has tried to organize this before on a L<single
page|http://www.sqlite.org/cvstrac/wiki?p=MultiThreading>, and while this
indeed gives a nice overview, it has unfortunately not been updated since
2006. The advice there is therefore a little on the conservative side.

Nonetheless, it is wise to remain portable across different SQLite versions,
especially when writing programs that dynamically link with whatever version
is installed on someone's system. It should be fairly safe to assume that the
SQLite binaries provided by most systems, if not all, are compiled with thread
safety enabled. Unfortunately, this doesn't mean all that much: the only thing
I<thread safe> means in this context is that you can use SQLite3 in multiple
threads, but a single database connection should still stay within a single
thread.

Since SQLite 3.3.1, released in early 2006, it is possible to move a single
database connection between multiple threads. Doing this with older versions
is not advisable, as explained in L<the SQLite
FAQ|http://www.sqlite.org/faq.html#q6>. But even with 3.3.1 and later there is
an annoying restriction: a connection can only be passed to another thread
when all outstanding statements have been finalized. In practice this means
that it is not possible to keep a prepared statement in memory for later
executions.

Since SQLite 3.5.0, released in 2007, a single SQLite connection can be used
from multiple threads simultaneously. SQLite will internally manage locks to
avoid any data corruption. I can't recommend making use of this facility,
however, as there are still many issues with the API. The L<error fetching
functions|http://www.sqlite.org/c3ref/errcode.html> and
L<sqlite3_last_insert_rowid()|http://www.sqlite.org/c3ref/last_insert_rowid.html>,
among others, are still useless without explicit locking in the application. I
also believe that the previously mentioned restriction on having to finalize
statements has been relaxed in this version, so keeping prepared statements in
memory and passing them between different threads becomes possible.

When using multiple database connections within a single process, SQLite
offers a facility to allow L<sharing of its
cache|http://www.sqlite.org/sharedcache.html>, in order to reduce memory usage
and disk I/O. The semantics of this feature have changed between SQLite
versions and appear to have stabilised in 3.5.0. This feature may prove useful
to optimize certain situations, but does not open up new ways of communicating
with a shared database.

=head1 Criteria

Before looking at some available solutions, let's first determine the criteria
we can use to evaluate them.

=over

=item Implementation size

Obviously, a solution that requires only a few lines of code to implement is
preferable to one that requires several levels of abstraction in order to be
usable. I won't be giving actual implementations here, so the sizes will be
rough estimates for comparison purposes. The actual size of an implementation
is of course heavily dependent on the programming environment as well.

=item Memory/CPU overhead

The most efficient solution for a single-threaded application is to simply
have direct access to a single database connection. Every solution is in
principle a modification or extension of this idea, and will therefore add a
certain overhead. This overhead manifests itself in both increased CPU and
memory usage. The balance between the two varies between solutions.

=item Prepared statement re-use

Is it possible to prepare a statement once and keep using it for the lifetime
of the program? Or will prepared statements have to be thrown away and
recreated every time? Keeping statement handles in memory will result in a
nice performance boost for applications that run the same SQL statement many
times.

=item Transaction grouping

A somewhat similar issue to prepared statement re-use: from a performance
point of view, it is very important to try to batch many UPDATE/DELETE/INSERT
statements within a single transaction, as opposed to running each modifying
query separately. Running each query separately forces SQLite to flush the
data to disk every time, whereas a single transaction batch-flushes all the
changes to disk in a single go. Some solutions allow for grouping multiple
statements in a single transaction quite easily, while others require more
involved steps.

=item Background processing

In certain situations it may be desirable to queue a query for later
processing, without explicitly waiting for it to complete. For example, if
something in the database has to be modified as a result of user interaction
in a UI thread, then the application will feel a lot more responsive if the
UPDATE query is simply queued to be processed in a background thread than if
the query runs in the UI thread itself. A database access solution with
built-in support for background processing of queries will significantly help
with building a responsive application.

=item Concurrency

Concurrency indicates how well the solution allows for concurrent access. The
worst possible concurrency is achieved when a single database connection is
used for all threads, as only a single action can be performed on the database
at any point in time. Maximum concurrency is achieved when each thread has its
own SQLite connection. Note that maximum concurrency doesn't mean that the
database can be accessed in a I<fully> concurrent manner. SQLite uses internal
database-level locks to avoid data corruption, and these will limit the actual
maximum concurrency. I am not too knowledgeable about the inner workings of
these locks, but it is at least possible to have a large number of truly
concurrent database I<reads>. Database I<writes> from multiple threads may
still allow for significantly more concurrency than when they are manually
serialized over a single database connection.

=item Portability

What is the minimum SQLite version required to implement the solution? Does it
require any special OS features or SQLite compilation settings? As outlined
above, different versions of SQLite offer different features with regard to
threading. Relying on one of the relatively new features will decrease
portability.

=back

=head1 The Solutions

Here I present four solutions for allowing database access from multiple
threads. Note that this list may not be exhaustive; these are just the
solutions that I am aware of. Also note that none of the solutions presented
here are in any way new. Most of these paradigms date back to the beginnings
of concurrent programming, and have been applied in software for decades.

=head2 Connection sharing

By far the simplest solution to implement: keep a single database connection
throughout your program and allow every thread to access it. Of course, you
will need to be careful to always put locks around the code where you access
the database handle. An example implementation could look like the following:

  // The global SQLite connection
  sqlite3 *db;

  int main(int argc, char **argv) {
    if(sqlite3_open("database.sqlite3", &db))
      exit(1);

    // start some threads
    // wait until the threads are finished

    sqlite3_close(db);
    return 0;
  }

  void *some_thread(void *arg) {
    sqlite3_mutex_enter(sqlite3_db_mutex(db));
    // Perform some queries on the database
    sqlite3_mutex_leave(sqlite3_db_mutex(db));
    return NULL;
  }

=over

=item Implementation size

This is where connection sharing shines: there is little extra code required
when compared to using a database connection in a single-threaded context. All
you need to be careful of is to lock the mutex before using the database, and
to unlock it again afterwards.

=item Memory/CPU overhead

As the only addition to the single-threaded case are the locks, this solution
has practically no memory overhead. The mutexes are provided by SQLite, after
all. CPU overhead is also as minimal as it can be: mutexes are the most
primitive type provided by threading libraries to serialize access to a shared
resource, and are therefore very efficient.

=item Prepared statement re-use

Prepared statements can be safely re-used inside a single enter/leave block.
However, if you want to remain portable with SQLite versions before 3.5.0,
then any prepared statements B<must> be finalized before the mutex is
unlocked. This can be a major downside if the enter/leave blocks themselves
are relatively short but entered quite often. If portability with older
versions is not an issue, then this restriction is gone and prepared
statements can be re-used easily.

=item Transaction grouping

A reliable implementation will not allow transactions to span multiple
enter/leave blocks. So as with prepared statements, transactions need to be
committed to disk before the mutex is unlocked. As with prepared statement
re-use, this limitation may prove to be a significant problem in optimizing
application performance, disk I/O in particular. One way to lessen the effects
of this limitation is to increase the size of a single enter/leave block, thus
allowing more work to be done in a single transaction. Code restructuring may
be required in order to implement this efficiently. Another way to get around
this problem is to allow a transaction to span multiple enter/leave blocks.
Implementing this reliably may not be an easy task, however, and will most
likely require application-specific knowledge.

=item Background processing

Background processing is not natively supported with connection sharing. It is
possible to spawn a background thread to perform database operations each time
this is desirable, but care should be taken to make sure that these background
threads execute dependent queries in the correct order. For example, if thread
A spawns a background thread, say B, to execute an UPDATE query, and later
thread A wants to read that same data back, it must first wait for thread B to
finish execution. This may add more inter-thread communication than is
preferable.

=item Concurrency

There is no concurrency at all here. Since the database connection is
protected by an exclusive lock, only a single thread can operate on the
database at any point in time. Additionally, one may be tempted to increase
the size of an enter/leave block in order to allow for larger transactions or
better re-use of prepared statements. However, any time spent within such an
enter/leave block on operations that do not directly use the database will
lower the maximum possible database concurrency even further.

=item Portability

Connection sharing requires at least SQLite 3.3.1 in order to pass the same
database connection around. SQLite must be compiled with threading support
enabled. If prepared statements are kept around outside of an enter/leave
block, then version 3.5.0 or higher will be required.

=back

=head2 Message passing

An alternative approach is to allow only a single thread to access the
database. Any other thread that wants to access the database in any way will
then have to communicate with this database thread. This communication is done
by sending messages (I<requests>) to the database thread and, when query
results are required, receiving back one or more I<response> messages.

Message passing schemes and libraries are available for many programming
languages and come in many different forms. For this article, I am going to
assume that an asynchronous and unbounded FIFO queue is used to pass around
messages, but most of the following discussion applies to bounded queues as
well. I'll try to note the important differences between the two where
applicable.

A very simple and naive implementation of a message passing solution is given
below. Here I assume that C<queue_create()> will create a message queue (type
C<message_queue>) and that C<queue_get()> will return the next message in the
queue, or block if the queue is empty. C<thread_create(func, arg)> will run
I<func> in a newly created thread and pass I<arg> as its argument. Error
handling has been omitted to keep this example concise.

  void *db_thread(void *arg) {
    message_queue *q = arg;

    sqlite3 *db;
    if(sqlite3_open("database.sqlite3", &db))
      return ERROR;

    request_msg *m;
    while((m = queue_get(q)) != NULL) {
      if(m->action == QUIT)
        break;
      if(m->action == EXEC)
        sqlite3_exec(db, m->query, NULL, NULL, NULL);
    }

    sqlite3_close(db);
    return OK;
  }

  int main(int argc, char **argv) {
    message_queue *db_queue = queue_create();
    thread_create(db_thread, db_queue);
    // Do work.
    return 0;
  }

This example implementation has a single database thread running in the
background that accepts the messages C<QUIT>, to stop processing queries and
close the database, and C<EXEC>, to run a certain query on the database. No
support is available yet for passing query results back to the thread that
sent the message. This can be implemented by including a separate
C<message_queue> object in the request messages, to which the results can be
sent.

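One possible shape for such a scheme, sketched with POSIX threads (this is my
illustration, not ncdc's actual implementation): a minimal unbounded FIFO
queue, plus a request structure that carries an optional reply queue.

```c
#include <pthread.h>
#include <stdlib.h>

/* A minimal unbounded FIFO message queue, as assumed in the example above. */
typedef struct node { void *data; struct node *next; } node;
typedef struct {
    node *head, *tail;
    pthread_mutex_t lock;
    pthread_cond_t avail;
} message_queue;

void queue_init(message_queue *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->avail, NULL);
}

void queue_push(message_queue *q, void *data) {
    node *n = malloc(sizeof(node));
    n->data = data;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if(q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->avail);
    pthread_mutex_unlock(&q->lock);
}

void *queue_get(message_queue *q) {  /* blocks while the queue is empty */
    pthread_mutex_lock(&q->lock);
    while(!q->head)
        pthread_cond_wait(&q->avail, &q->lock);
    node *n = q->head;
    q->head = n->next;
    if(!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    void *data = n->data;
    free(n);
    return data;
}

/* A request that carries a return path: the database thread runs `query` and
 * pushes response messages onto `reply_to` when it is non-NULL. */
typedef struct {
    const char *query;
    message_queue *reply_to;  /* NULL when no results are wanted */
} request_msg;
```

A thread that wants results creates its own reply queue, stores it in the
request, and blocks on C<queue_get()> of that queue after pushing the request.
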
=over

=item Implementation size

This will largely depend on the programming environment used and the
complexity of the database thread. If your environment already comes with a
message queue implementation, and constructing the request/response messages
is relatively simple, then a simple implementation as shown above will not
require much code. On the other hand, if you have to implement your own
message queue or want more intelligence in the database thread to improve
efficiency, then the complete implementation may be significantly larger than
that of connection sharing.

=item Memory/CPU overhead

Constructing and passing around messages will incur a CPU overhead, though
with an efficient implementation this should not be significant enough to
worry about. Memory usage is highly dependent on the size of the messages
being passed around and the length of the queue. If messages are queued faster
than they are processed and there is no bound on the queue length, then a
process may quickly run out of memory. On the other hand, if messages are
processed fast enough then the queue will generally not hold more than a
single message, and the memory overhead will remain fairly small.

=item Prepared statement re-use

As the database connection never leaves the database thread, prepared
statements can be kept in memory and re-used without problems.

=item Transaction grouping

A naive but robust implementation will handle each message in its own
transaction. A more clever database thread, however, could wait for multiple
messages to be queued and then batch-execute them in a single transaction.
Correctly implementing this may require some additional information to be
specified along with the request, such as whether the query may be combined in
a single transaction or whether it may only be executed outside of a
transaction. Some threads may want confirmation that the data has been
successfully written to disk, in which case responsiveness will not improve if
such actions are queued for later processing. Nonetheless, since the database
thread has all the knowledge about the state of the database and any
outstanding actions, transaction grouping can be implemented quite reliably.

=item Background processing

Background processing is supported natively with a message passing
implementation: a thread that isn't interested in query results can simply
queue the action to be performed by the database thread without indicating a
return path for the results. Of course, if a thread queues many messages that
do not require results followed by one that does, it will have to wait for all
earlier messages to be processed before receiving any results for the last
one. In the case that the actions are not dependent on each other, the
database thread may re-order the messages in order to process the last request
first. This requires knowledge about dependencies, however, and may
significantly complicate the implementation.

=item Concurrency

As with a shared database connection, database access is exclusive: only a
single action can be performed on the database at a time. Unlike connection
sharing, however, any processing within the application will not further
degrade the maximum attainable concurrency. As long as unbounded asynchronous
queues are used to pass around messages, the database thread will be able to
continue working on the database without waiting for another thread to process
the results.

=item Portability

This is where message passing shines: SQLite is only used within the database
thread, so no other thread will need to call any SQLite function. This allows
any version of SQLite to be used, even one that has not been compiled with
thread safety enabled.

=back

=head2 Thread-local connections

A rather different approach to giving each thread access to a single database
is to simply open a new database connection for each thread. This way each
connection will be local to a specific thread, which in turn has the power to
do with it as it likes, without worrying about what the other threads do. The
following short example illustrates the idea:

  void *some_thread(void *arg) {
    sqlite3 *db;
    if(sqlite3_open("database.sqlite3", &db))
      return ERROR;

    // Do some work on the database

    sqlite3_close(db);
    return OK;
  }

  int main(int argc, char **argv) {
    int i;
    for(i=0; i<10; i++)
      thread_create(some_thread, NULL);

    // Wait until the threads are done

    return 0;
  }

=over

=item Implementation size

Giving each thread its own connection is in practice not much different from
the single-threaded case where there is only one database connection. And as
the example shows, this can be implemented quite trivially.

=item Memory/CPU overhead

If we assume that threads are not created very often and each thread has a
relatively long life, then the CPU and I/O overhead caused by opening a new
connection for each thread will not be very significant. On the other hand, if
threads are created quite often and lead a relatively short life before they
are destroyed again, then opening a new connection each time will soon require
more resources than running the queries themselves.

There is a significant memory overhead: every new database connection requires
memory. If each connection also has a separate cache, then every thread will
quickly require several megabytes only to interact with the database. Since
version 3.5.0, SQLite allows this cache to be shared with the other threads,
which will reduce this memory overhead.

=item Prepared statement re-use

Prepared statements can be re-used without limitations within a single thread.
This allows full re-use of prepared statements if each thread has a different
task, in which case every thread will have different queries and access
patterns anyway. But when every thread runs the same code, and thus also the
same queries, each will still need its own copy of the prepared statement.
Prepared statements are specific to a single database connection, so they
can't be passed around between the threads. The same argument as for CPU
overhead applies here: as long as threads are long-lived, this will not be a
very large problem.

=item Transaction grouping

Each thread has full access to its own database connection, so it can easily
batch many queries in a single transaction. It is not possible, however, to
group queries from the other threads in this same transaction as well. The
grouping may therefore not be as optimal as a message passing solution could
provide, but it is still a large improvement compared to connection sharing.

=item Background processing

Background processing is not easily possible. While it is possible to spawn a
separate thread for each query that needs to be processed in the background, a
new database connection will have to be opened every time this is done. This
will obviously not be very efficient.

=item Concurrency

In general, it is not possible to get better concurrency than by providing
each thread with its own database connection. This solution definitely wins in
this area.

=item Portability

Thread-local connections are very portable: the only requirement is that
SQLite has been built with threading support enabled. Connections are not
passed around between threads, so any SQLite version will do. In order to make
use of the shared cache feature, however, SQLite 3.5.0 is required.

=back

=head2 Connection pooling

A common approach in server-like applications is to use a connection pool.
When a thread wishes to access the database, it requests a database connection
from a pool of (currently) unused database connections. If no unused
connections are available, it can either wait until one becomes available, or
create a new database connection on its own. When a thread is done with a
connection, it adds it back to the pool to allow it to be re-used by another
thread.

The following example illustrates a basic connection pool implementation in
which a thread creates a new database connection when no connections are
available. A global C<db_pool> is defined, on which any thread can call
C<pool_pop()> to get an SQLite connection if one is available, and
C<pool_push()> can be used to push a connection back to the pool. This pool
can be implemented as any kind of set: a FIFO or a stack could do the trick,
as long as it can be accessed from multiple threads concurrently.

  // Some global pool of database connections
  pool_t *db_pool;

  sqlite3 *get_database() {
    sqlite3 *db = pool_pop(db_pool);
    if(db)
      return db;
    if(sqlite3_open("database.sqlite3", &db))
      return NULL;
    return db;
  }

  void *some_thread(void *arg) {
    // Do some work

    sqlite3 *db = get_database();

    // Do some work on the database

    pool_push(db_pool, db);
    return OK;
  }

  int main(int argc, char **argv) {
    int i;
    for(i=0; i<10; i++)
      thread_create(some_thread, NULL);

    // Wait until the threads are done

    return 0;
  }

=over

=item Implementation size

A connection pool is in essence not very different from thread-local
connections. The only major difference is that the call to sqlite3_open() is
replaced with a function call to obtain a connection from the pool, and
sqlite3_close() with one to give it back to the pool. As shown above, these
functions can be fairly simple. Note, however, that unlike with thread-local
connections it is advisable to "open" and "close" a connection more often in
long-running threads, in order to give other threads a chance to use the
connection as well.

=item Memory/CPU overhead

This mainly depends on the number of connections you allow to be in memory at
any point in time. If this number is not bounded, as in the above example,
then you can assume that after running your program for a certain time, there
will always be enough unused connections available in the pool. Requesting a
connection will then be very fast, since the overhead of creating a new
connection, as would have been done with thread-local connections, is
completely gone.

In terms of memory usage, however, it would be more efficient to put a maximum
limit on the number of open connections, and have a thread wait until another
thread gives a connection back to the pool. As with thread-local connections,
memory usage can be decreased by using SQLite's cache sharing feature.

=item Prepared statement re-use

Unfortunately, this is where connection pooling borrows from connection
sharing. Prepared statements must be cleaned up before passing a connection to
another thread if one aims to be portable. But even if you drop that
portability requirement, prepared statements are always specific to a single
connection. Since you can't assume that you will always get the same
connection from the pool, caching prepared statements is not practical.

On the other hand, a connection pool does allow you to use a single connection
for a longer period of time than with connection sharing, without negatively
affecting concurrency. Unless, of course, there is a limit on the number of
open connections, in which case using a connection for a long period of time
may starve another thread.

=item Transaction grouping

Pretty much the same arguments as for re-using prepared statements apply to
transaction grouping: transactions should be committed to disk before passing
a connection back to the pool.

=item Background processing

This is also where a connection pool shares a lot of similarity with
connection sharing. With thread-local connections, creating a worker thread to
perform database operations in the background would be very inefficient. But
since this inefficiency is tackled by allowing connection re-use with a
connection pool, it is not a problem here. The same warning with regard to
dependent queries still applies, though.

=item Concurrency

Connection pooling gives you fine-grained control over how much concurrency
you'd like to have. For maximum concurrency, don't put a limit on the number
of open database connections. If there is a limit, then that will decrease the
maximum concurrency in favor of lower memory usage.

=item Portability

Since database connections are being passed between threads, connection
pooling will require at least SQLite 3.3.1, compiled with thread safety
enabled. Making use of its cache sharing capabilities to reduce memory usage
will require SQLite 3.5.0 or higher.

=back

=head1 Final notes

As for what I used in ncdc: I initially chose connection sharing, for its
simplicity. When I noticed that the UI became less responsive than I found
acceptable, I added a simple queue for background processing of queries. Later
I stumbled upon the main problem with that solution: I wanted to read back a
value that was written in a background thread, and had no way of knowing
whether the background thread had finished executing that query or not. I then
decided to expand the background thread to allow for passing back query
results, and transformed everything into a full message passing solution. This
appears to be working well at the moment, and my current implementation
supports both prepared statement re-use and transaction grouping, which
measurably increased performance.

To summarize, there isn't really a I<best> solution that works for every
application. Connection sharing works well for applications where
responsiveness and concurrency aren't of major importance. Message passing
works well for applications that aim to be responsive, and is flexible enough
to optimize CPU and I/O usage by re-using prepared statements and grouping
queries into larger transactions. Thread-local connections are suitable for
applications that have a relatively fixed number of threads, whereas
connection pooling works better for applications with a varying number of
worker threads.

=cut