Database Savoir Faire
June 19, 2000
Earlier we saw in our pre-loading script, startup.pl, a reference to
fastdb.pl. This mysterious script contains a variety of optimizations
which significantly grease the wheel of database queries between our Perl
scripts and the database management system, and is a recommended approach if
your Perl scripts perform any more than a smidgen of database queries.
Unfortunately, without these optimizations, database queries are horrifyingly
inefficient. Often times, each script invocation must establish a connection
to the database, which itself is a very costly process. Beyond that,
statement handles must be established and SQL statements need to be parsed
and compiled by the database driver. The more often you repeat similar
queries, the more inefficient this whole process becomes -- recall our
dishwashing example. Or, to use a different analogy, imagine unlocking and
entering a toolshed to retrieve a single tool, and then repeating this
process in its entirety for each of 10 individual tools.
The optimizations we've used in fastdb.pl are inspired and barely
modified from the work of
Jeffrey W. Baker,
who has nicely integrated a number of different database optimizations.
The first and foremost optimization when it comes to database queries is
persistent connections. As we said earlier, it is very costly to
re-establish a connection to the database every time a Perl script is
executed by Apache. Large savings are gained by opening a database
connection at the Apache parent server level, thus allowing each child
process to inherit the already open connection. In effect, this is like
unlocking and opening the toolshed door once, and leaving it open for
subsequent entries. Makes sense!
Sending an SQL to a database also involves a number of steps, including the
creation of a statement handle and the compilation of the SQL statement by
the database driver. An elegant way to optimize compilation of SQL
statements is to construct a statement "map" for the database
driver, which maps out what a particular SQL statement will look like. This
allows the driver to compile the statement once, and re-use this on
subsequent calls of the same statement. For example, suppose a typical
query performed by your Perl script requests the name of a user with a
known ID value:
select FIRSTNAME,LASTNAME
from USERTABLE
where ID = ?
Notice the question mark, which marks a placeholder in this SQL statement.
The database driver can compile this statement map, and when necessary, can
insert the actual value for ID in real-time by virtue of
parameter binding --
a feature of the DBI module which lets you assign a
specific value to a placeholder.
Using statement mapping, we can pre-construct the common SQL statements that
our script may need in the fastdb.pl script, both compiling these
statement maps in the database driver and establishing statement handles
that can be used "ready-to-wear" from within the Perl script. It
all sounds like a lot of nice talk, so let's look at fastdb.pl and
see how this works in action.
fastdb.pl
package MyPerl::FastDB;
#This package opens a persistent connection to the database
#(instead of using Apache::DBI), and prepares the primary SELECT
#statements we use, assigning each to a scalar statement handle.
#Optimization adapted heavily from Jeffrey W. Baker's guide to mod_perl performance
use strict;
use DBI;
sub connect {
if (defined $MyPerl::FastDB::conn) {
eval {
$MyPerl::FastDB::conn->ping;
};
if (!$@) {
return $MyPerl::FastDB::conn;
}
}
$MyPerl::FastDB::conn=DBI->connect('dbi:mysql:database_name',
'username','password',
{PrintError=>1,RaiseError=>1})
|| die $DBI::errstr;
#get username from ID query
$MyPerl::FastDB::selectUserFromID=$MyPerl::FastDB::conn->prepare(q{
select FIRSTNAME,LASTNAME
from USERTABLE
where ID = ?
});
#get ID from username query
$MyPerl::FastDB::selectIDfromUser=$MyPerl::FastDB::conn->prepare(q{
select ID
from USERTABLE
where FIRSTNAME = ?
and LASTNAME = ?
});
1;
As a package, fastdb.pl must end with a true value, hence the
1; on the last line. Moving to the top, the connect subroutine
will attempt to establish a database connection, if none exists, or else
will return the existing connection. This subroutine will be called from our
Perl script when it needs to retrieve a database handle (we'll see the code
shortly).
In the remainder of fastdb.pl, we pre-construct two statement maps:
one to retrieve a user's name given their ID, and one to retrieve the ID
given the user's name. Of course, these statements are based on a purely
fictional database that we simply imagine could exist. For each statement,
we prepare the statement map and assign the resulting statement handle to a
Perl scalar value, such as $MyPerl::FastDB::selectUserFromID. We'll
use these statement handles from within our Perl scripts, thereby enjoying
the benefits of the pre-compiled and cached SQL code.
More Who's Your Daddy
The Perl You Need to Know
Building the Optimized Beast
|