All Sorts of Sorts: Lists
February 7, 2000
Sometimes the order of items matters ... for instance,
alphabetical order of people's names, or numerical order of
prices or test scores. The Perl sort function offers
great control in how we sort data. Starting with lists, the
default sort syntax performs an alphabetical sort of
list data, returning the sorted list as its result.
@names=("Mark","Wendy","John");
foreach $name (sort @names)
{ print "$name\n" }
The little foreach block above will print the list of
names in sorted order,
"John","Mark", and "Wendy". The
results aren't as perfect if we use numbers.
@grades=(80,100,95);
foreach $grade (sort @grades)
{ print "$grade\n" }
The results? 100, 80, 95. Uh-oh. The sort function
defaults to an alphabetical sort, in which case
"100" does come before "80". We need
a numerical sort ... in fact, the sort function lets
you sort on any criteria you wish. You can supply the code
to execute the criteria and sort supplies two values,
$a and $b. Your code should use these values to evaluate
any criteria you want, and you must return any positive
integer if $a should sort before $b, a negative integer if
$b should sort before $a, or zero if the two are equivalent.
This sounds more complicated than it really is.
foreach $grade (sort {$a<=>$b}
@grades)
{ print "$grade\n" }
Here we've added code for the sort function to use.
This code simply compares $a and $b using the "three-way
comparison" operator, sometimes known cheekily as the
spaceship operator. Simply put, the <=> symbol is
Perl shorthand which returns +1, 0, or -1 upon numerically
comparing $a against $b. Now, the script outputs the correct
sequence: 80, 95, 100.
Suppose you want to sort in reverse? Our numeric sort is
accurate but we'd like the results to be in the opposite
direction: 100, 95, 80. Easy enough! Just throw the word
reverse in front of the sort function and
the results will be sorted and returned in the opposite order.
foreach $grade (reverse sort {$a<=>$b}
@grades)
{ print "$grade\n" }
Although we've embedded our sort code inline with the
sort function, you can alternately call on a subroutine
to perform the comparison if your needs are more complex.
For instance, imagine that we have a list of full names:
@names=("john wallace","mike wallace","harris
bluebaum",
"gorky yeller","yehva stuntz");
However, we want to sort this list based on last name,
first name. Therefore, we'll need to code a custom
subroutine to perform these comparisons, called
by the sort function.
foreach $name (sort by_lastname @names)
{print "$name\n"}
sub by_lastname {
($a_firstname,$a_lastname)=$a=~/^(\w+)\s+(\w+)$/;
($b_firstname,$b_lastname)=$b=~/^(\w+)\s+(\w+)$/;
$sortResult=$a_lastname cmp $b_lastname;
if ($sortResult == 0)
{ $sortResult=$a_firstname cmp $b_firstname }
return $sortResult;
}
The call to sort sends $a and $b to our
by_lastname subroutine.
Inside this subroute we parse apart each of the names into
a first and last name using regular expression matching.
Then, the two lastnames are compared using the cmp
operator, which is another three-way comparison, the string
equivalent of the numeric <=> operator. If cmp returns
zero we conclude that the two last names are identical,
in which case we proceed to compare first names. Once
complete, the cmp results are returned back to the
sort function. And, yes, it works!
Hopefully the by_lastname subroutine illustrates
that your sort comparison can perform any actions it needs
to compare the values of $a and $b. In fact,
it needn't even compare the two directly -- for instance,
you can imagine that you might use $a and $b to look up
values in another data structure, such as a hash or even
a live database, and based on the values of those lookups
determine whether $a should sort before or after $b. As
they say about Lego, the possibilities are endless.
Refresher: Lists and Hashes
The Perl You Need to Know
All Sorts of Sorts: Hashes
|