Multi-threaded Access to an SQLite3 Database

=pod

(Published on B<2011-11-26>. Also available in L<POD|https://dev.yorhel.nl/dat/sqlaccess>.)

(Minor 2013-04-06 update: I abstracted my message passing solution from ncdc
and implemented it in a POSIX C library for general use. It's called
I<sqlasync> and is part of my L<Ylib library collection|https://dev.yorhel.nl/ylib>.)

=head1 Introduction

As I was porting L<ncdc|https://dev.yorhel.nl/ncdc> over to use SQLite3 as its
storage backend, I stumbled on a problem: the program uses a few threads for
background jobs, and it would be nice to give these threads access to the
database.

Serializing all database access through the main thread wouldn't have been
very hard to implement in this particular case, but it would have been far
from optimal. The main thread is also responsible for keeping the user
interface responsive and for handling most of the network interaction. The
overall responsiveness of the program would improve significantly if the
threads could access the database without involving the main thread.

Which brought me to the following questions: what solutions are available for
providing multi-threaded access to an SQLite database, and what problems might
I run into? I was unable to find a good overview of this area on the net, so I
wrote this article in the hope of improving that situation.

=head1 SQLite3 and threading

Let's first see what SQLite3 itself has to offer in terms of threading
support. The official documentation mentions threading support several times
in various places, but this information is scattered around and no good
overview is given. Someone has tried to organize this before on a L<single
page|http://www.sqlite.org/cvstrac/wiki?p=MultiThreading>, and while this
indeed gives a nice overview, it has unfortunately not been updated since
2006. The advice there is therefore a little on the conservative side.

Nonetheless, it is wise to remain portable across different SQLite versions,
especially when writing programs that dynamically link with whatever version
is installed on someone's system. It should be fairly safe to assume that the
SQLite binaries provided by most systems, if not all, are compiled with thread
safety enabled. Unfortunately, this doesn't mean all that much: the only thing
I<thread safe> means in this context is that you can use SQLite3 in multiple
threads, but a single database connection should still stay within a single
thread.

Since SQLite 3.3.1, released in early 2006, it is possible to move a single
database connection between multiple threads. Doing this with older versions
is not advisable, as explained in L<the SQLite
FAQ|http://www.sqlite.org/faq.html#q6>. But even with 3.3.1 and later there is
an annoying restriction: a connection can only be passed to another thread
when all outstanding statements have been finalized. In practice this means
that it is not possible to keep a prepared statement in memory for later
executions.

Since SQLite 3.5.0, released in 2007, a single SQLite connection can be used
from multiple threads simultaneously. SQLite will internally manage locks to
avoid any data corruption. I can't recommend making use of this facility,
however, as there are still many issues with the API. The L<error fetching
functions|http://www.sqlite.org/c3ref/errcode.html> and
L<sqlite3_last_insert_rowid()|http://www.sqlite.org/c3ref/last_insert_rowid.html>,
among others, are still useless without explicit locking in the application. I
also believe that the previously mentioned restriction on having to finalize
statements has been relaxed in this version, so keeping prepared statements in
memory and passing them between different threads becomes possible.

When using multiple database connections within a single process, SQLite
offers a facility to allow L<sharing of its
cache|http://www.sqlite.org/sharedcache.html>, in order to reduce memory usage
and disk I/O. The semantics of this feature have changed between SQLite
versions and appear to have stabilised in 3.5.0. This feature may prove useful
to optimize certain situations, but does not open up new ways of communicating
with a shared database.

=head1 Criteria

Before looking at some available solutions, let's first determine the criteria
we can use to evaluate them.

=over

=item Implementation size

Obviously, a solution that requires only a few lines of code to implement is
preferable to one that requires several levels of abstraction in order to be
usable. I won't be giving actual implementations here, so the sizes will be
rough estimates for comparison purposes. The actual size of an implementation
is of course heavily dependent on the programming environment as well.

=item Memory/CPU overhead

The most efficient solution for a single-threaded application is to simply
have direct access to a single database connection. Every solution is in
principle a modification or extension of this idea, and will therefore add a
certain overhead. This overhead manifests itself in both increased CPU and
memory usage. The balance between the two varies between solutions.

=item Prepared statement re-use

Is it possible to prepare a statement once and keep using it for the lifetime
of the program? Or will prepared statements have to be thrown away and
recreated every time? Keeping statement handles in memory will result in a
nice performance boost for applications that run the same SQL statement many
times.

=item Transaction grouping

A somewhat similar issue to prepared statement re-use: from a performance
point of view, it is very important to try to batch many UPDATE/DELETE/INSERT
statements within a single transaction, as opposed to running each modifying
query separately. Running each query separately forces SQLite to flush the
data to disk every time, whereas a single transaction batch-flushes all the
changes to disk in a single go. Some solutions allow for grouping multiple
statements in a single transaction quite easily, while others require more
involved steps.

=item Background processing

In certain situations it may be desirable to queue a query for later
processing, without explicitly waiting for it to complete. For example, if
something in the database has to be modified as a result of user interaction
in a UI thread, then the application will feel a lot more responsive if the
UPDATE query is simply queued to be processed in a background thread than if
the query runs in the UI thread itself. A database access solution with
built-in support for background processing of queries will significantly help
with building a responsive application.

=item Concurrency

Concurrency indicates how well the solution allows for concurrent access. The
worst possible concurrency is achieved when a single database connection is
used for all threads, as only a single action can be performed on the database
at any point in time. Maximum concurrency is achieved when each thread has its
own SQLite connection. Note that maximum concurrency doesn't mean that the
database can be accessed in a I<fully> concurrent manner. SQLite uses internal
database-level locks to avoid data corruption, and these will limit the actual
maximum concurrency. I am not too knowledgeable about the inner workings of
these locks, but it is at least possible to have a large number of truly
concurrent database I<reads>. Database I<writes> from multiple threads may
still allow for significantly more concurrency than when they are manually
serialized over a single database connection.

=item Portability

What is the minimum SQLite version required to implement the solution? Does it
require any special OS features or SQLite compilation settings? As outlined
above, different versions of SQLite offer different features with regard to
threading. Relying on one of the relatively new features will decrease
portability.

=back

=head1 The Solutions

Here I present four solutions for allowing database access from multiple
threads. Note that this list may not be exhaustive; these are just the
solutions that I am aware of. Also note that none of the solutions presented
here are in any way new. Most of these paradigms date back to the beginnings
of concurrent programming, and have been applied in software for decades.

=head2 Connection sharing

By far the simplest solution to implement: keep a single database connection
throughout your program and allow every thread to access it. Of course, you
will need to be careful to always put locks around the code where you access
the database handle. An example implementation could look like the following:

  // The global SQLite connection
  sqlite3 *db;

  int main(int argc, char **argv) {
    if(sqlite3_open("database.sqlite3", &db))
      exit(1);

    // start some threads
    // wait until the threads are finished

    sqlite3_close(db);
    return 0;
  }

  void *some_thread(void *arg) {
    sqlite3_mutex_enter(sqlite3_db_mutex(db));
    // Perform some queries on the database
    sqlite3_mutex_leave(sqlite3_db_mutex(db));
    return NULL;
  }

=over

=item Implementation size

This is where connection sharing shines: there is little extra code required
when compared to using a database connection in a single-threaded context. All
you need to be careful of is to lock the mutex before using the database, and
to unlock it again afterwards.

=item Memory/CPU overhead

As the only addition to the single-threaded case are the locks, this solution
has practically no memory overhead. The mutexes are provided by SQLite, after
all. CPU overhead is also as minimal as it can be: mutexes are the most
primitive type provided by threading libraries to serialize access to a shared
resource, and are therefore very efficient.

=item Prepared statement re-use

Prepared statements can be safely re-used inside a single enter/leave block.
However, if you want to remain portable with SQLite versions before 3.5.0,
then any prepared statements B<must> be finalized before the mutex is
unlocked. This can be a major downside if the enter/leave blocks themselves
are relatively short but entered quite often. If portability with older
versions is not an issue, then this restriction is gone and prepared
statements can be re-used easily.

=item Transaction grouping

A reliable implementation will not allow transactions to span multiple
enter/leave blocks. So as with prepared statements, transactions need to be
committed to disk before the mutex is unlocked. As with prepared statement
re-use, this limitation may prove to be a significant problem in optimizing
application performance, disk I/O in particular. One way to lessen the effects
of this limitation is to increase the size of a single enter/leave block, thus
allowing more work to be done in a single transaction. Code restructuring may
be required in order to implement this efficiently. Another way to get around
this problem is to allow a transaction to span multiple enter/leave blocks.
Implementing this reliably may not be an easy task, however, and will most
likely require application-specific knowledge.

=item Background processing

Background processing is not natively supported with connection sharing. It is
possible to spawn a background thread to perform database operations each time
this is desirable, but care should be taken to make sure that these background
threads execute dependent queries in the correct order. For example, if thread
A spawns a background thread, say B, to execute an UPDATE query, and later
thread A wants to read that same data back, it must first wait for thread B to
finish execution. This may add more inter-thread communication than is
preferable.

=item Concurrency

There is no concurrency at all here. Since the database connection is
protected by an exclusive lock, only a single thread can operate on the
database at any point in time. Additionally, one may be tempted to increase
the size of an enter/leave block in order to allow for larger transactions or
better re-use of prepared statements. However, any time spent within such an
enter/leave block on operations that do not directly use the database will
lower the maximum possible database concurrency even further.

=item Portability

Connection sharing requires at least SQLite 3.3.1 in order to pass the same
database connection around. SQLite must be compiled with threading support
enabled. If prepared statements are kept around outside of an enter/leave
block, then version 3.5.0 or higher will be required.

=back

=head2 Message passing

An alternative approach is to allow only a single thread to access the
database. Any other thread that wants to access the database in any way will
then have to communicate with this database thread. This communication is done
by sending messages (I<requests>) to the database thread and, when query
results are required, receiving back one or more I<response> messages.

Message passing schemes and libraries are available for many programming
languages and come in many different forms. For this article, I am going to
assume that an asynchronous and unbounded FIFO queue is used to pass around
messages, but most of the following discussion applies to bounded queues as
well. I'll try to note the important differences between the two where
applicable.

A very simple and naive implementation of a message passing solution is given
below. Here I assume that C<queue_create()> will create a message queue (type
C<message_queue>) and that C<queue_get()> will return the next message in the
queue, or block if the queue is empty. C<thread_create(func, arg)> will run
I<func> in a newly created thread and pass I<arg> as its argument. Error
handling has been omitted to keep this example concise.

  void *db_thread(void *arg) {
    message_queue *q = arg;

    sqlite3 *db;
    if(sqlite3_open("database.sqlite3", &db))
      return ERROR;

    request_msg *m;
    while((m = queue_get(q)) != NULL) {
      if(m->action == QUIT)
        break;
      if(m->action == EXEC)
        sqlite3_exec(db, m->query, NULL, NULL, NULL);
    }

    sqlite3_close(db);
    return OK;
  }

  int main(int argc, char **argv) {
    message_queue *db_queue = queue_create();
    thread_create(db_thread, db_queue);
    // Do work.
    return 0;
  }

This example implementation has a single database thread running in the
background that accepts the messages C<QUIT>, to stop processing queries and
close the database, and C<EXEC>, to run a certain query on the database. No
support is available yet for passing query results back to the thread that
sent the message. This can be implemented by including a separate
C<message_queue> object in the request messages, to which the results can be
sent.

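One possible shape for such a scheme, sketched with POSIX threads (this is my
illustration, not ncdc's actual implementation): a minimal unbounded FIFO
queue, plus a request structure that carries an optional reply queue.

```c
#include <pthread.h>
#include <stdlib.h>

/* A minimal unbounded FIFO message queue, as assumed in the example above. */
typedef struct node { void *data; struct node *next; } node;
typedef struct {
    node *head, *tail;
    pthread_mutex_t lock;
    pthread_cond_t avail;
} message_queue;

void queue_init(message_queue *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->avail, NULL);
}

void queue_push(message_queue *q, void *data) {
    node *n = malloc(sizeof(node));
    n->data = data;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if(q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->avail);
    pthread_mutex_unlock(&q->lock);
}

void *queue_get(message_queue *q) {  /* blocks while the queue is empty */
    pthread_mutex_lock(&q->lock);
    while(!q->head)
        pthread_cond_wait(&q->avail, &q->lock);
    node *n = q->head;
    q->head = n->next;
    if(!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    void *data = n->data;
    free(n);
    return data;
}

/* A request that carries a return path: the database thread runs `query` and
 * pushes response messages onto `reply_to` when it is non-NULL. */
typedef struct {
    const char *query;
    message_queue *reply_to;  /* NULL when no results are wanted */
} request_msg;
```

A thread that wants results creates its own reply queue, stores it in the
request, and blocks on C<queue_get()> of that queue after pushing the request.
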
=over

=item Implementation size

This will largely depend on the programming environment used and the
complexity of the database thread. If your environment already comes with a
message queue implementation, and constructing the request/response messages
is relatively simple, then a simple implementation as shown above will not
require much code. On the other hand, if you have to implement your own
message queue or want more intelligence in the database thread to improve
efficiency, then the complete implementation may be significantly larger than
that of connection sharing.

=item Memory/CPU overhead

Constructing and passing around messages will incur a CPU overhead, though
with an efficient implementation this should not be significant enough to
worry about. Memory usage is highly dependent on the size of the messages
being passed around and the length of the queue. If messages are queued faster
than they are processed and there is no bound on the queue length, then a
process may quickly run out of memory. On the other hand, if messages are
processed fast enough then the queue will generally not hold more than a
single message, and the memory overhead will remain fairly small.

=item Prepared statement re-use

As the database connection never leaves the database thread, prepared
statements can be kept in memory and re-used without problems.

=item Transaction grouping

A naive but robust implementation will handle each message in its own
transaction. A more clever database thread, however, could wait for multiple
messages to be queued and then batch-execute them in a single transaction.
Correctly implementing this may require some additional information to be
specified along with the request, such as whether the query may be combined in
a single transaction or whether it may only be executed outside of a
transaction. Some threads may want confirmation that the data has been
successfully written to disk, in which case responsiveness will not improve if
such actions are queued for later processing. Nonetheless, since the database
thread has all the knowledge about the state of the database and any
outstanding actions, transaction grouping can be implemented quite reliably.

=item Background processing

Background processing is supported natively with a message passing
implementation: a thread that isn't interested in query results can simply
queue the action to be performed by the database thread without indicating a
return path for the results. Of course, if a thread queues many messages that
do not require results followed by one that does, it will have to wait for all
earlier messages to be processed before receiving any results for the last
one. In the case that the actions are not dependent on each other, the
database thread may re-order the messages in order to process the last request
first. This requires knowledge about dependencies, however, and may
significantly complicate the implementation.

=item Concurrency

As with a shared database connection, database access is exclusive: only a
single action can be performed on the database at a time. Unlike connection
sharing, however, any processing within the application will not further
degrade the maximum attainable concurrency. As long as unbounded asynchronous
queues are used to pass around messages, the database thread will be able to
continue working on the database without waiting for another thread to process
the results.

=item Portability

This is where message passing shines: SQLite is only used within the database
thread, so no other thread will need to call any SQLite function. This allows
any version of SQLite to be used, even one that has not been compiled with
thread safety enabled.

=back

=head2 Thread-local connections

A rather different approach to giving each thread access to a single database
is to simply open a new database connection for each thread. This way each
connection will be local to a specific thread, which in turn has the power to
do with it as it likes, without worrying about what the other threads do. The
following short example illustrates the idea:

  void *some_thread(void *arg) {
    sqlite3 *db;
    if(sqlite3_open("database.sqlite3", &db))
      return ERROR;

    // Do some work on the database

    sqlite3_close(db);
    return OK;
  }

  int main(int argc, char **argv) {
    int i;
    for(i=0; i<10; i++)
      thread_create(some_thread, NULL);

    // Wait until the threads are done

    return 0;
  }

=over

=item Implementation size

Giving each thread its own connection is in practice not much different from
the single-threaded case where there is only one database connection. And as
the example shows, this can be implemented quite trivially.

=item Memory/CPU overhead

If we assume that threads are not created very often and each thread has a
relatively long life, then the CPU and I/O overhead caused by opening a new
connection for each thread will not be very significant. On the other hand, if
threads are created quite often and lead a relatively short life before they
are destroyed again, then opening a new connection each time will soon require
more resources than running the queries themselves.

There is a significant memory overhead: every new database connection requires
memory. If each connection also has a separate cache, then every thread will
quickly require several megabytes only to interact with the database. Since
version 3.5.0, SQLite allows this cache to be shared with the other threads,
which will reduce this memory overhead.

=item Prepared statement re-use

Prepared statements can be re-used without limitations within a single thread.
This allows full re-use of prepared statements if each thread has a different
task, in which case every thread will have different queries and access
patterns anyway. But when every thread runs the same code, and thus also the
same queries, each will still need its own copy of the prepared statement.
Prepared statements are specific to a single database connection, so they
can't be passed around between the threads. The same argument as for CPU
overhead applies here: as long as threads are long-lived, this will not be a
very large problem.

=item Transaction grouping

Each thread has full access to its own database connection, so it can easily
batch many queries in a single transaction. It is not possible, however, to
group queries from the other threads in this same transaction as well. The
grouping may therefore not be as optimal as a message passing solution could
provide, but it is still a large improvement compared to connection sharing.

=item Background processing

Background processing is not easily possible. While it is possible to spawn a
separate thread for each query that needs to be processed in the background, a
new database connection will have to be opened every time this is done. This
will obviously not be very efficient.

=item Concurrency

In general, it is not possible to get better concurrency than by providing
each thread with its own database connection. This solution definitely wins in
this area.

=item Portability

Thread-local connections are very portable: the only requirement is that
SQLite has been built with threading support enabled. Connections are not
passed around between threads, so any SQLite version will do. In order to make
use of the shared cache feature, however, SQLite 3.5.0 is required.

=back

=head2 Connection pooling

A common approach in server-like applications is to use a connection pool.
When a thread wishes to access the database, it requests a database connection
from a pool of (currently) unused database connections. If no unused
connections are available, it can either wait until one becomes available, or
create a new database connection on its own. When a thread is done with a
connection, it adds it back to the pool to allow it to be re-used by another
thread.

The following example illustrates a basic connection pool implementation in
which a thread creates a new database connection when no connections are
available. A global C<db_pool> is defined, on which any thread can call
C<pool_pop()> to get an SQLite connection if one is available, and
C<pool_push()> can be used to push a connection back to the pool. This pool
can be implemented as any kind of set: a FIFO or a stack could do the trick,
as long as it can be accessed from multiple threads concurrently.

  // Some global pool of database connections
  pool_t *db_pool;

  sqlite3 *get_database() {
    sqlite3 *db = pool_pop(db_pool);
    if(db)
      return db;
    if(sqlite3_open("database.sqlite3", &db))
      return NULL;
    return db;
  }

  void *some_thread(void *arg) {
    // Do some work

    sqlite3 *db = get_database();

    // Do some work on the database

    pool_push(db_pool, db);
    return OK;
  }

  int main(int argc, char **argv) {
    int i;
    for(i=0; i<10; i++)
      thread_create(some_thread, NULL);

    // Wait until the threads are done

    return 0;
  }

=over

=item Implementation size

A connection pool is in essence not very different from thread-local
connections. The only major difference is that the call to sqlite3_open() is
replaced with a function call to obtain a connection from the pool, and
sqlite3_close() with one to give it back to the pool. As shown above, these
functions can be fairly simple. Note, however, that unlike with thread-local
connections it is advisable to "open" and "close" a connection more often in
long-running threads, in order to give other threads a chance to use the
connection as well.

=item Memory/CPU overhead

This mainly depends on the number of connections you allow to be in memory at
any point in time. If this number is not bounded, as in the above example,
then you can assume that after running your program for a certain time, there
will always be enough unused connections available in the pool. Requesting a
connection will then be very fast, since the overhead of creating a new
connection, as would have been done with thread-local connections, is
completely gone.

In terms of memory usage, however, it would be more efficient to put a maximum
limit on the number of open connections, and have a thread wait until another
thread gives a connection back to the pool. As with thread-local connections,
memory usage can be decreased by using SQLite's cache sharing feature.

=item Prepared statement re-use

Unfortunately, this is where connection pooling borrows from connection
sharing. Prepared statements must be cleaned up before passing a connection to
another thread if one aims to be portable. But even if you drop that
portability requirement, prepared statements are always specific to a single
connection. Since you can't assume that you will always get the same
connection from the pool, caching prepared statements is not practical.

On the other hand, a connection pool does allow you to use a single connection
for a longer period of time than with connection sharing, without negatively
affecting concurrency. Unless, of course, there is a limit on the number of
open connections, in which case using a connection for a long period of time
may starve another thread.

=item Transaction grouping

Pretty much the same arguments as for re-using prepared statements apply to
transaction grouping: transactions should be committed to disk before passing
a connection back to the pool.

=item Background processing

This is also where a connection pool shares a lot of similarity with
connection sharing. With thread-local connections, creating a worker thread to
perform database operations in the background would be very inefficient. But
since this inefficiency is tackled by allowing connection re-use with a
connection pool, it is not a problem here. The same warning with regard to
dependent queries still applies, though.

=item Concurrency

Connection pooling gives you fine-grained control over how much concurrency
you'd like to have. For maximum concurrency, don't put a limit on the number
of open database connections. If there is a limit, then that will decrease the
maximum concurrency in favor of lower memory usage.

=item Portability

Since database connections are being passed between threads, connection
pooling will require at least SQLite 3.3.1, compiled with thread safety
enabled. Making use of its cache sharing capabilities to reduce memory usage
will require SQLite 3.5.0 or higher.

=back

=head1 Final notes

As for what I used in ncdc: I initially chose connection sharing, for its
simplicity. When I noticed that the UI became less responsive than I found
acceptable, I added a simple queue for background processing of queries. Later
I stumbled upon the main problem with that solution: I wanted to read back a
value that was written in a background thread, and had no way of knowing
whether the background thread had finished executing that query or not. I then
decided to expand the background thread to allow for passing back query
results, and transformed everything into a full message passing solution. This
appears to be working well at the moment, and my current implementation
supports both prepared statement re-use and transaction grouping, which
measurably increased performance.

To summarize, there isn't really a I<best> solution that works for every
application. Connection sharing works well for applications where
responsiveness and concurrency aren't of major importance. Message passing
works well for applications that aim to be responsive, and is flexible enough
to optimize CPU and I/O usage by re-using prepared statements and grouping
queries into larger transactions. Thread-local connections are suitable for
applications that have a relatively fixed number of threads, whereas
connection pooling works better for applications with a varying number of
worker threads.

=cut