A technique that can be used with any kind of shared-memory approach, but most commonly with threads.
To wit: each database handle is actually a process: a DB client process that parses the SQL, authenticates to the DB server, and talks over the network, handing your query to the DB in a more structured (internal) format. So anytime you call, say:
# Connect to the database. Returns a handle.
my $dbconn = DBI->connect($datasource,$username,$password);
You're actually
forking a process.
Go you.
A step up in performance comes from persistent connections. PHP developers will recognize:
$db = pg_pconnect("host=localhost port=5432 dbname=mary");
In this case, pg_pconnect will look for an existing connection and try to use that; if one does not yet exist, it will create one, which later invocations of pg_pconnect will reuse. So your server only needs to fork the DB client driver once per, say, Apache process. That's a nice bit of win. (Currently there are some weird design failures in PHP's pconnect: transaction blocks are carried over between pages, for instance, because PHP doesn't send an 'abort' to the handle during page cleanup.)
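To make the pconnect semantics concrete, here's a minimal sketch in Python (sqlite3 stands in for a networked DB client, and the function names are illustrative, not any real API): one cached connection per data source, reused across invocations instead of forking a fresh client each time.

```python
import sqlite3
import threading

_connections = {}          # dsn -> live connection, kept across "pages"
_lock = threading.Lock()   # guard the cache if the server is threaded

def pconnect(dsn):
    """Return the cached connection for dsn, creating it on first use."""
    with _lock:
        conn = _connections.get(dsn)
        if conn is None:
            # First use: pay the connect (i.e. fork-the-client) cost once.
            conn = sqlite3.connect(dsn, check_same_thread=False)
            _connections[dsn] = conn
        else:
            # Avoid the PHP pconnect gotcha above: roll back any
            # transaction state left over from the previous page.
            conn.rollback()
        return conn
```

Note the rollback on reuse: that's exactly the page-cleanup 'abort' that PHP's pconnect currently fails to send.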
Can we do better? First, with a process-based server like Apache, each process has to create its own DB handle; with a large webserver, your DB might become very unhappy holding open all those client handles. Second, how frequently do you actually need the DB? Might it be better to hold a handle only for a dynamic page request, and not for a static file request? You could timeshare a smaller number of DB client processes and make all the servers happy.
Enter the threaded server. All the threads play in the same memory space, so they can share handles. You can set a small number of handles, and any thread serving a dynamic page can request one. Then the overhead of (in AOLserver Tcl)
set db [ns_db gethandle]
is throwing down a
Mutex and grabbing a
pointer. This approach also makes it convenient to keep persistent connections to DBs you use infrequently but that a few pages need. You can define a new
pool:
ns_section ns/db/pool/crazy-foreign-db
ns_param connections 5
ns_param verbose $debug
ns_param extendedtableinfo true
ns_param logsqlerrors $debug
and not worry about it chewing up resources out of proportion to its usefulness, because, again:
set db [ns_db gethandle crazy-foreign-db]
The 'pool' is where we get our terminology. With this approach, rather than handling DB connections in each page, you define a pool and then each page just asks for a handle. A big win, in execution time, space requirements, and clean code.
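Here's a minimal sketch of the gethandle pattern in Python (again sqlite3 stands in for the real DB driver, and the Pool class and its method names are illustrative, not AOLserver's actual implementation): a named pool of connections created up front, where gethandle blocks until a handle is free and releasehandle returns it.

```python
import queue
import sqlite3

class Pool:
    """A fixed pool of DB handles shared by all threads in one process."""

    def __init__(self, dsn, connections=5):
        # Pay the client-fork cost once per connection, at startup.
        self._free = queue.Queue()
        for _ in range(connections):
            self._free.put(sqlite3.connect(dsn, check_same_thread=False))

    def gethandle(self, timeout=None):
        # Queue.get does the mutex-and-pointer work: it blocks until
        # some other thread has released a handle.
        return self._free.get(timeout=timeout)

    def releasehandle(self, db):
        self._free.put(db)

# Named pools, in the spirit of ns_db gethandle crazy-foreign-db.
pools = {"main": Pool(":memory:", connections=2)}

db = pools["main"].gethandle()
db.execute("select 1")
pools["main"].releasehandle(db)
```

Any thread serving a dynamic page grabs a handle, uses it, and puts it back; static requests never touch the pool at all.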