The &store_line Subroutine - Page 9
December 14, 2001
We previously saw how the &store_line subroutine
was being invoked to process each line from the log file that
didn't represent an image request. Now let's skip down to the
bottom of the script and see what that
&store_line subroutine actually does:

# script proper ends. subroutines follow.

sub store_line {

    # store one line's worth of visit data

    my($host, $date, $time, $url, $referer, $agent) = @_;
    my $seconds = &get_seconds($date, $time);

    if ($visit_num{$host}) {
        # there is a visit currently "working" for this host
        my $visit_num = $visit_num{$host};
        my $elapsed   = $seconds - $last_seconds{$visit_num};
        if (($expire_time) and ($elapsed > $expire_time)) {
            # this visit has expired, so start a new one
            &new_visit($host, $date, $time, $url, $seconds,
                $referer, $agent);
        } else {
            # this visit has not expired, so add to existing record
            &add_to_visit($host, $date, $time,
                $url, $seconds, $elapsed);
        }
    } else {
        # there is no visit currently "working" for this host
        &new_visit($host, $date, $time, $url, $seconds,
            $referer, $agent);
    }
}

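To watch the branches in that listing fire, the decision can be pulled out into a small stand-alone script. What follows is only a sketch, not the book's code: the classify_request name, the $visit_count counter, the sample host, and the 1800-second setting are all assumptions made for illustration, and the bookkeeping lines are a guess at what &new_visit and &add_to_visit record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the script's globals.
my $expire_time = 1800;   # assumed: seconds of inactivity that end a visit
my %visit_num;            # host      => number of that host's working visit
my %last_seconds;         # visit num => time of that visit's latest request
my $visit_count = 0;      # assumed counter; visit numbers start at 1

# Returns 'new' or 'add' to show which branch store_line would take,
# then updates the bookkeeping the way this sketch assumes.
sub classify_request {
    my ($host, $seconds) = @_;
    my $action = 'new';
    if ($visit_num{$host}) {
        my $elapsed = $seconds - $last_seconds{ $visit_num{$host} };
        # same two-part test as the listing, with the branches swapped
        $action = 'add' unless $expire_time and $elapsed > $expire_time;
    }
    $visit_num{$host} = ++$visit_count if $action eq 'new';
    $last_seconds{ $visit_num{$host} } = $seconds;
    return $action;
}

my @results = (
    classify_request('a.example.com', 1000),   # first sighting
    classify_request('a.example.com', 1600),   # 600 seconds later
    classify_request('a.example.com', 4000),   # 2400 seconds later
);
print "@results\n";   # prints "new add new"

$expire_time = 0;     # a false value: expiration is switched off
my $late = classify_request('a.example.com', 900_000);
print "$late\n";      # prints "add"
```

Note that the if test looks at the value stored under the host's key, not merely at whether the key exists, so the sketch numbers visits from 1 to keep every stored visit number true.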
Most of the Perl in this subroutine should look pretty familiar
by now. In essence, this routine is functioning as a traffic cop,
using the host and the time of this access to figure out if this
request represents a new visit, or another request in a
previously started visit. First, it checks to see if a key exists
for the current line's $host in the
%visit_num hash. If there is one, it means we've
previously processed a request from this host, so the script
checks to see if the currently working visit for this host has
"expired." That is, it looks to see if the time difference
between this host's last access and the current access is greater
than the value stored in the $expire_time
configuration variable. If it is, it means enough time has gone
by that this access needs to be considered the beginning of a new
visit, and the script invokes the &new_visit
subroutine. If it isn't, the script invokes the
&add_to_visit subroutine instead. Finally, if
there wasn't any key for the current $host in the
%visit_num hash, it means this host hasn't been seen
before at all. Accordingly, the &new_visit
subroutine is invoked to create a new entry for it.

This routine is also where we've implemented the feature of
turning off visit expiration for cases where the $expire_time
configuration variable has been set to 0. We've done that by
making the logical test that determines whether a visit has ended
actually contain two logical tests, both of which must be true
for the "true" branch to be invoked:

    if (($expire_time) and ($elapsed > $expire_time)) {

This works because joining two logical tests with
and requires both of them to be true for the
expression as a whole to be true. If $expire_time is
set to 0, which is a false value, the test can never
return true, and Perl doesn't even bother to evaluate the
second comparison.

So, again, three subroutines are invoked from within
this &store_line subroutine. The first is the
&get_seconds subroutine, which accepts as
arguments the date and time strings from the current log file
line and converts them to something called Unix seconds.
That routine, and the date arithmetic it performs, is the
subject of the next chapter. The other two, the &new_visit and
&add_to_visit routines, which handle the
updating of the script's data structure, will be covered in
Chapter 10.
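
As a preview of what "Unix seconds" buys us, here is a rough sketch of a date-to-seconds conversion. It is not the book's &get_seconds routine: the get_seconds_sketch name is made up, and the input formats ('14/Dec/2001' for the date, '10:15:30' for the time) are assumptions about what the log-parsing code hands over. It relies on the timegm function from the standard Time::Local module, which treats its arguments as UTC.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Assumed input formats (not taken from the book's script):
#   $date looks like '14/Dec/2001', $time looks like '10:15:30'.
my %month_index = (
    Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
    Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11,
);

# Hypothetical stand-in for get_seconds: returns the number of
# seconds between the start of 1970 (UTC) and the given moment.
sub get_seconds_sketch {
    my ($date, $time) = @_;
    my ($mday, $mon_name, $year) = split m{/}, $date;
    my ($hour, $min, $sec)       = split /:/,  $time;
    # timegm numbers months from 0, hence the lookup table
    return timegm($sec, $min, $hour, $mday,
        $month_index{$mon_name}, $year);
}

my $first   = get_seconds_sketch('14/Dec/2001', '10:15:30');
my $second  = get_seconds_sketch('14/Dec/2001', '10:50:30');
my $elapsed = $second - $first;
print "$elapsed\n";   # prints "2100"
```

Once both requests are plain integers, the $elapsed calculation in &store_line is ordinary subtraction. Here the two requests are 35 minutes (2100 seconds) apart, which would count as an expired visit under a hypothetical $expire_time of 1800.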
Perl for Web Site Management