Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



Changing Delimiters - Page 14

March 23, 2001

You may have noticed that // and s/// look like q// and qq//. Just as with q// and qq//, we can change the delimiters when matching and substituting to make our regular expressions more readable. The same rules apply: any non-word character can be the delimiter, and paired delimiters such as <>, (), {}, and [] may be used, with two provisos.

First, if you change the delimiters on //, you must put an m (for 'match') in front of it. This lets perl still recognize it as a regular expression, rather than a block, a comment, or anything else.
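As a small sketch of this first proviso, the same match can be written with the default slashes or with other delimiters. The sample path and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the text above.
my $path = "/usr/local/share/doc";

# Default delimiters: every slash in the pattern must be escaped.
my $with_slashes = ($path =~ /^\/usr\/local\//) ? 1 : 0;

# Other delimiters: the leading m is required, and no escaping is needed.
my $with_braces = ($path =~ m{^/usr/local/}) ? 1 : 0;
my $with_bang   = ($path =~ m!^/usr/local/!) ? 1 : 0;

print "slashes: $with_slashes, braces: $with_braces, bang: $with_bang\n";
```

All three forms test for the same thing; only the amount of escaping differs.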

Second, if you use paired delimiters with substitution, you must use two pairs:

s/old text/new text/g;

becomes:

s{old text}{new text}g;

You may, however, leave spaces or newlines between the pairs for the sake of clarity:

s{old text} {new text}g;

The prime example of when you would want to do this is when you are dealing with file paths, which contain a lot of slashes. If you are, for instance, moving files on your Unix system from /usr/local/share/ to /usr/share/, you may want to munge the file names like this:

s/\/usr\/local\/share\//\/usr\/share\//g;

However, it's far easier and far less ugly to change the delimiters in this case:

s#/usr/local/share/#/usr/share/#g;
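Here is a minimal sketch of that substitution applied to a list of file names. The file names are invented for the example, and the last statement repeats the substitution with paired delimiters split across two lines:

```perl
use strict;
use warnings;

# Hypothetical file names, used only for illustration.
my @files = (
    '/usr/local/share/man/man1/perl.1',
    '/usr/local/share/doc/README',
);

my @moved;
for my $file (@files) {
    # Copy the name, then rewrite the copy so @files stays untouched.
    (my $new = $file) =~ s#/usr/local/share/#/usr/share/#g;
    push @moved, $new;
}

# The same substitution with two pairs of braces and a newline between them.
(my $paired = $files[0]) =~ s{/usr/local/share/}
                             {/usr/share/}g;

print "$_\n" for @moved, $paired;
```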

Modifiers

We've already seen the /i modifier used to indicate that a match should be case insensitive. We've also seen the /g modifier used to apply a substitution globally. What other modifiers are there?

/m - treat the string as multiple lines. Normally, ^ and $ match only at the very start and very end of the string ($ also matches just before a trailing newline). If the /m modifier is in play, they will match the starts and ends of individual lines (separated by \n). For example, given the string "one\ntwo", the pattern /^two$/ will not match, but /^two$/m will.
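The "one\ntwo" example can be checked with a short script. This is only a sketch; the variable names are made up:

```perl
use strict;
use warnings;

my $text = "one\ntwo";

# Without /m, ^ can only match at the very start of the string.
my $without_m = ($text =~ /^two$/)  ? "match" : "no match";

# With /m, ^ also matches just after the embedded newline.
my $with_m    = ($text =~ /^two$/m) ? "match" : "no match";

print "without m: $without_m\n";
print "with m: $with_m\n";
```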

/s - treat the string as a single line. Normally, . does not match a newline character; when /s is given, it will.
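A brief sketch of the difference, using an invented two-line string:

```perl
use strict;
use warnings;

# Hypothetical two-line string for illustration.
my $text = "first line\nsecond line";

# Without /s, .* stops at the newline, so "second" is out of reach.
my $plain  = ($text =~ /first.*second/)  ? 1 : 0;

# With /s, . matches the newline too, so the pattern spans both lines.
my $dotall = ($text =~ /first.*second/s) ? 1 : 0;

print "without s: $plain, with s: $dotall\n";
```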

/g - as well as replacing globally in a substitution, lets us match multiple times against the same string. When this modifier is in use, placing the \G anchor at the beginning of the regexp anchors it to the point where the previous match ended.
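The following sketch shows both uses of /g in a match. In list context it returns every match at once; in scalar context it steps through the string one match at a time, and \G forces each step to start exactly where the previous one stopped. The strings and variable names are made up for the example:

```perl
use strict;
use warnings;

# List context: /g returns all the captured matches together.
my $shopping = "10 apples, 20 pears, 30 plums";
my @numbers  = $shopping =~ /(\d+)/g;

# Scalar context: each evaluation finds the next match.
my $input = "aaabab";

# With \G, the run of matches must be contiguous, so counting
# stops at the first character that is not an "a".
my $leading_a = 0;
$leading_a++ while $input =~ /\Ga/g;

# Without \G, the search skips ahead over the "b" characters.
my $all_a = 0;
$all_a++ while $input =~ /a/g;

print "numbers: @numbers\n";
print "leading a: $leading_a, all a: $all_a\n";
```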

/x - allow the use of whitespace and comments inside a pattern. Whitespace in the pattern is then ignored unless it is escaped or inside a character class.

Regular expressions can get quite fiendish to read at times. The /x modifier is one way to stop them becoming so. For instance, if you're matching a string in a log file that contains a time, followed by a computer name in square brackets, then a message, the expression you'll create to extract the information may easily end up looking like this:

# Time in $1, machine name in $2, text in $3
/^([0-2]\d:[0-5]\d:[0-5]\d)\s+\[([^\]]+)\]\s+(.*)$/

However, if you use the /x modifier, you can stretch it out as follows:

/^
(               # First group: time
  [0-2]\d : [0-5]\d : [0-5]\d
)
\s+
\[              # Square bracket
(               # Second group: machine name
  [^\]]+        # Anything that isn't a square bracket
)
\]              # End square bracket
\s+
(               # Third group: everything else
  .*
)
$/x
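To see a commented pattern like this at work, here is a sketch that runs a slightly more compact /x version against a made-up log line. The log line and the variable names are assumptions for the example:

```perl
use strict;
use warnings;

# Made-up log line in the format described above.
my $line = "12:34:56 [gandalf] Connection established";

my ($time, $machine, $text);
if ($line =~ /^
        ( [0-2]\d : [0-5]\d : [0-5]\d )   # First group: time
        \s+
        \[ ( [^\]]+ ) \]                  # Second group: machine name
        \s+
        ( .* )                            # Third group: everything else
    $/x) {
    ($time, $machine, $text) = ($1, $2, $3);
    print "time=$time machine=$machine text=$text\n";
}
```

One thing to watch for: the comments must not contain the delimiter character (here, a slash), because perl would take it as the end of the pattern.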

Another way to tidy this up is to put each of the groups into variables and interpolate them:

my $time_re = '([0-2]\d:[0-5]\d:[0-5]\d)';
my $host_re = '\[([^\]]+)\]';
my $mess_re = '(.*)';
/^$time_re\s+$host_re\s+$mess_re$/;
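As a variation on this idea, the pieces can also be stored as precompiled patterns with qr// instead of single-quoted strings. That is a swapped-in technique, not the one shown above, and the log line below is invented for the sketch:

```perl
use strict;
use warnings;

# Each piece is compiled once with qr// and keeps its own capture group.
my $time_re = qr/([0-2]\d:[0-5]\d:[0-5]\d)/;
my $host_re = qr/\[([^\]]+)\]/;
my $mess_re = qr/(.*)/;

# Made-up log line for illustration.
my $line = "23:59:01 [frodo] Disk quota exceeded";

# In list context the match returns the three captured groups.
my @fields = $line =~ /^$time_re\s+$host_re\s+$mess_re$/;

print join(" | ", @fields), "\n";
```

Either way, the final pattern reads as a sentence of named parts, and each part can be tested on its own.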

Beginning Perl