Searching
July 19, 1998
Truthfully, the utilities you've learned in the
past few days will be enough to get you through 99% of your workday.
As a web technician, you will rarely need to do anything beyond file
creation and manipulation. UNIX-based process, hardware, and security
management will be tasks handled by the system administrator,
not you.
However, several utilities merit mention here because, once you
are comfortable with them, the remaining 1% of your time will be
much more pleasant.
The first set of tools you should become familiar
with is the searching tools, namely "grep" and "find". These two
utilities allow you to search the contents of files for keywords and
the file system for file names, respectively.
You can imagine how crucial these tools will be when you
begin to manage sites with hundreds or thousands of web pages.
More than likely you will forget where certain bits of information are
kept. Without "grep" and "find", finding these files could be like
finding a needle in a haystack.
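To make the division of labor concrete, here is a minimal sketch of both tools at work. It assumes a POSIX-style shell with "mktemp" available, and the directory layout, file names, and keyword are invented purely for illustration.

```shell
# Build a tiny throwaway "site" to search (all names here are made up).
site=$(mktemp -d)
mkdir -p "$site/products" "$site/about"
echo "Contact the webmaster for details" > "$site/about/contact.html"
echo "Our product list" > "$site/products/index.html"
echo "Welcome to the site" > "$site/index.html"

# grep -l lists the files whose contents mention the keyword "webmaster".
keyword_hits=$(grep -l "webmaster" "$site"/*.html "$site"/*/*.html)
echo "$keyword_hits"

# find walks the directory tree looking for files named index.html.
name_hits=$(find "$site" -name "index.html" | sort)
echo "$name_hits"

# Clean up the throwaway directory.
rm -rf "$site"
```

Here "grep" looks inside the files, while "find" looks only at their names. Both utilities are covered in detail later on.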
We'll discuss both these utilities in just a bit, but first
we need to say a word about "regular expressions" because regular
expressions (REGEX) are the foundations of searching.
We have already introduced a simple form of
regular expression when we talked about wild cards. If you recall,
UNIX allows you to use special characters in a file name to
represent some "pattern". For example,
"??" could substitute for any two characters in a file name.
Well, as it turns out, regular expressions appear again
in searching. They are, however, slightly different, so
let's take a look at how some of the most common
ones work.
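For contrast, here is the wild card behavior recalled above, as a minimal sketch. It assumes a POSIX-style shell with "mktemp" available, and the file names are hypothetical.

```shell
# Work in a throwaway directory with a few made-up file names.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch ab.html cd.html abc.html a.html

# "??" stands for exactly two characters, so abc.html and a.html are skipped.
two_char=$(echo ??.html)
echo "$two_char"
```

The shell expands the wild card before the command ever runs. The searching patterns described next are interpreted by the search utility itself and follow somewhat different rules.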
| Character | Explanation | Example |
|---|---|---|
| An ordinary character, such as a | Matches any text in which that character occurs | Searching for "a" will turn up "cat", "apple", and "a" (there must be an "a" somewhere in the match). |
| \ followed by a character, such as \. | Turns off any special meaning for the character that follows | "\." matches a literal period, so "index\.html" matches "index.html" but not "index-html". |
| ^ | Positions the matching cursor at the beginning of the line | "^cat" matches only if the line begins with "cat". Thus the line "The cat jumped" would not match, but if the text "I was looking at the yellow cat which jumped" is broken across two lines just before "cat", the second line would match ("cat" must be the first word on the line). |
| $ | Positions the matching cursor at the end of the line | "Eric$" would match the line "My name is Eric" but not "My name is Eric T" ("Eric" must be the last word on the line, with nothing after it, not even a period). |
| . | Matches any single character | A search for ".ris" would match "Kris" or "Cris". By the way, it would also match "kris", "cris", and "9ris". |
| * | Matches zero or more occurrences of the character that precedes it | "ww*" would match "when" and "www" (one "w" followed by zero or more further "w" characters). Note that this is not at all similar to the filename wild card "*", which stands for anything at all. As a regular expression, "h*lo" means zero or more "h" characters followed by "lo", not "h", then anything, then "lo". To say "h", then anything, then "lo", write "h.*lo", which matches both "helo" and "hello". |
| [characters] | Matches any one of the characters specified in the brackets | [xyz] would match any occurrence of x, y, or z. |
| [a-z] | Matches any lower case letter | "^[a-z]..c" would match "eric" but not "Eric" (the "e" must be lower case and at the beginning of the line). Likewise, it would match "marc" but not "Mark" or "mac" (there must be at least four characters, the fourth being "c"). |
| [A-Z] | Matches any upper case letter | Same as above but with upper case. |
| [a-zA-Z] | Matches any upper or lower case letter | Same as above, but either case is accepted. |
| [^A-Z] | Matches any character which is NOT an upper case letter | [^A-Z] would match "a" but not "A". |
| [0-9] | Matches any single digit | Chap[0-9] matches "Chap1" or "Chap2". |
| [^0-9A-Z] | Matches any character OTHER THAN a digit or a capital letter | Part[^0-9A-Z] matches "Parta" but not "Part2" or "PartC". |
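The anchors, the dot, and the star can be tried out with "grep" directly. The following is a minimal sketch that assumes a POSIX-style shell with "mktemp" available. The sample lines are made up for illustration.

```shell
# A few sample lines to search (made up for illustration).
sample=$(mktemp)
printf '%s\n' "cat which jumped" "The cat jumped" "My name is Eric" \
    "My name is Eric T" "Kris" "hello" "lo" > "$sample"

# ^ anchors the match to the start of the line.
starts_with_cat=$(grep '^cat' "$sample")

# $ anchors the match to the end of the line.
ends_with_eric=$(grep 'Eric$' "$sample")

# . stands for any single character.
any_ris=$(grep '.ris' "$sample")

# * repeats the character before it zero or more times, so with both ends
# anchored, h*lo describes whole lines such as "lo" or "hlo", not "hello".
star_only=$(grep '^h*lo$' "$sample")

# .* is the regular expression way to say "anything at all".
dot_star=$(grep '^h.*lo$' "$sample")

printf '%s\n' "$starts_with_cat" "$ends_with_eric" "$any_ris" "$star_only" "$dot_star"
rm -f "$sample"
```

The patterns are wrapped in single quotes so that the shell passes them to "grep" untouched instead of treating "*" and friends as filename wild cards.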
Okay, let's see how we actually use all these
regular expression tools. Let's look at grep.
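As a small preview, the bracket patterns from the table might be exercised like this. It is only a sketch: it assumes a POSIX-style shell with "mktemp" available, the sample names are invented, and the locale is forced to "C" so that ranges such as [a-z] cover exactly the plain ASCII letters.

```shell
# Force the plain C locale so ranges such as [a-z] mean exactly the ASCII letters.
LC_ALL=C
export LC_ALL

# Sample names to search (made up for illustration).
names=$(mktemp)
printf '%s\n' "Chap1" "ChapX" "Parta" "Part2" "PartC" "eric" "Eric" "mac" > "$names"

# [0-9] matches a single digit after "Chap".
digit_chapters=$(grep 'Chap[0-9]' "$names")

# [^0-9A-Z] matches one character that is neither a digit nor a capital letter.
lower_parts=$(grep 'Part[^0-9A-Z]' "$names")

# A lower case letter at the start of the line, any two characters, then "c".
lower_four=$(grep '^[a-z]..c' "$names")

printf '%s\n' "$digit_chapters" "$lower_parts" "$lower_four"
rm -f "$names"
```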
Introduction to UNIX for Web Developers