Run-time Timing - Page 159
January 22, 2001
While the benchmarking examples we've seen so far are compelling,
they've also been measured under laboratory conditions — in
this case, over many interations, outside of the context of a
larger script. As well, if you already have a large script and
want to start benchmarking, it may not be feasible to start
coding alternative subroutines all over the place.
Sometimes, a simple stopwatch approach is enough to draw your
focus to problem areas (no, not those problem areas).
Using Benchmark, we can simulate a stopwatch where we simply want
to trigger timing at a certain point in a script, and then stop
timing, and spit out the results, without stopping the flow of
the script.
So, imagine a back-end script that outputs some results to the
browser. Maybe we simply want to time the execution of the entire
script.
#!/usr/bin/perl
#This is an amazing back-end script
use Benchmark;
use Time::HiRes;
use LotsOfOtherStuff;
my $time0 = new Benchmark;
...all of the control code for the script...
my $time1 = new Benchmark;
print "Script Benchmark: ".timestr(timediff($t1,$t0));
Notice the use of Time::HiRes, which can help with
reporting benchmark times for fast scripts, where small fractions
of a second should be measurable. Like a stopwatch, we simply
take note of the time at the start of the script, and again at
the end, and report the difference. Assuming this is a CGI script
the results will probably be printed to the Web page (also
assuming proper headers, etc.), but you might alternatively wish
to open a filehandle to a log file and send the
print function to the log file.
Timing the entire script execution like this can be enlightening,
but should also be interpreted with caution. For one, remember
that this timer measures only the execution time of the script,
not the time it takes for its output to reach your browser.
Network performance is not being measured here. Web server
performance is also not generally being measured, although in
some cases -- such as Apache with mod_perl —
these numbers are partially reflecting the Web server's
performance at executing Perl code.
To get more use out of the stopwatch technique, consider adding
several stopwatches to your script, wrapped around different
portions of code. This way, you can get inside the machine's head
and see a behind-the-scenes peek at which portions of code are
taking how long to execute. A particularly effective technique is
to wrap each subroutine inside a stopwatch. Simply:
sub mySubroutine {
my $time0 = new Benchmark;
...original subroutine code...
my $time1 = new Benchmark;
print "mySubroutine benchmark: ".
timestr(timediff($t1,$t0));
}
Due to the scoping of my variables, you can re-use
$time0 and $time1 inside each
subroutine without conflict. Remember to modify the label for
each benchmark so that you can recognize which subroutine the
measurement applies to. When the script runs, for instance in a
Web environment, the resulting Web page may then contain a
rundown for each subroutine and its execution time. Because this
technique does not perform many iterations, you should invoke the
script a number of times to get a sense of consistency from the
measurements.
Ultimately, the goal of adding stopwatches to your script is to
find any abnormally slow subroutines or sections of code. This
quick diagnosis can point you to trouble spots quickly. In the
long run, focusing on the unusually slow sections of code will
yield greater efficiency than attempting to benchmark
alternatives for every routine in the script.
Troubleshooting in Context
Finding a slow segment of code is one thing — identifying
the hold-up, and a solution, can be quite another. As we saw in
earlier examples, there are many times that the Perl language
itself can be the source of slow performance. We've seen that
there can be a variety of ways to address a problem, using Perl,
and some are much faster than others. As a general rule of thumb,
Perl will perform faster when you rely on built-in functions; for
example, we saw that the map function performed much
faster than building a foreach loop of our own. The
lesson here is "don't re-invent the wheel" — let Perl do
the work, whenever possible.
Equally important to identifying slow Perl syntax, though, are
considerations beyond the language itself. For example, if a
particular subroutine is accessing data, consider where the data
is stored. Data retrieved from disk will likely be slower than
data retrieved from memory. Beyond that, the retrieval time of
data on disk can vary considerably, depending on whether the data
is stored in a simple, slow database or a fast, indexed database.
In any case, the delay in retrieving data will inflate the
execution time of such a subroutine, without being the fault of
the Perl code itself, but of the larger architecture.
Similarly, when a Perl subroutine retrieves data from a database,
does it establish a connection to the database on each
invocation, or can it rely on an existing connection already
established (known as "persistent connections")? Again, this
bottleneck can inflate the execution time of such a segment of
code.
Portions of code that make network requests are going to be
subject to all of the delays inherent in networks. Once again,
reducing these bottlenecks may not involve changing the Perl code
at all, but changing the architecture of the system, whether that
includes other pieces of software or the hardware itself.
Ultimately, the point is that measuring execution time of code is
important, but intepreting it in context is even more important:
the code itself may, or may not be, to blame for a bottleneck.
Apples to Apples, Oranges to Oranges - Page 158
The Perl You Need to Know
The Perl You Need to Know Part 22: Warts and All - Page 160
|