Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



Well-Defined Repetition - Page 10

March 9, 2001

If you want to be more precise about how many times a character or group of characters may be repeated, you can specify the minimum and maximum number of repeats in curly brackets. For example, '2 or 3 spaces' can be written as \s{2,3}:

> perl matchtest.plx
Enter some text to find: \s{2,3}
'\s{2,3}' was not found.
>

So we have no doubled or trebled spaces in our string. Notice how we construct that: the minimum, a comma, and the maximum, all inside braces. Omitting the maximum signifies 'or more', so {2,} denotes '2 or more'. Omitting the minimum signifies 'or fewer', so {,3} is '3 or fewer'; note, however, that only Perl 5.34 and later accept an empty minimum, and older versions treat {,3} as literal text, so {0,3} is the portable way to write it. In these cases, the same warnings apply as for the star operator.
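As a minimal sketch of the range forms described above, the following script checks a few strings against \s{2,3} and \s{2,}. The helper names and the sample strings are made up for illustration and are not part of matchtest.plx.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text contains 2 or 3 whitespace
# characters in a row.
sub has_two_or_three_spaces {
    my ($text) = @_;
    return $text =~ /\s{2,3}/ ? 1 : 0;
}

# Hypothetical helper: true if $text contains 2 or more whitespace
# characters in a row.
sub has_two_or_more_spaces {
    my ($text) = @_;
    return $text =~ /\s{2,}/ ? 1 : 0;
}

# Made-up sample strings with one, two and four spaces between the words.
for my $text ("one space", "two  spaces", "four    spaces") {
    printf "%-16s {2,3}: %d   {2,}: %d\n",
        "'$text'", has_two_or_three_spaces($text), has_two_or_more_spaces($text);
}
```

Both checks fail for the single space and succeed for the other two strings. The four-space string still satisfies \s{2,3} because the pattern is not anchored: it is free to match just the first three of the four spaces.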

Finally, you can specify exactly how many things are to be in a row by simply putting that number inside the curly brackets. Here's the five-letter-word example tidied up a little:

> perl matchtest.plx
Enter some text to find: \b\w{5}\b
'\b\w{5}\b' was not found.
>
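To see why the word boundaries matter in that pattern, here is a small sketch comparing \w{5} with \b\w{5}\b. The helper names and the sample words are assumptions chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text contains five word characters in a
# row, even when they are only part of a longer word.
sub has_five_in_a_row {
    my ($text) = @_;
    return $text =~ /\w{5}/ ? 1 : 0;
}

# Hypothetical helper: true only if $text contains a whole word made of
# exactly five word characters.
sub has_five_letter_word {
    my ($text) = @_;
    return $text =~ /\b\w{5}\b/ ? 1 : 0;
}

# Made-up sample words of four, five and six letters.
for my $word ("what", "hello", "wonder") {
    print "$word: in a row = ", has_five_in_a_row($word),
          ", whole word = ", has_five_letter_word($word), "\n";
}
```

Without the boundaries, the six-letter word matches because its first five letters satisfy \w{5}. With \b on both sides, only the five-letter word matches.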

Summary Table

To refresh your memory, here are the various metacharacters we've seen so far:

Metacharacter   Meaning
[abc]           any one of the characters a, b, or c.
[^abc]          any one character other than a, b, or c.
[a-z]           any one ASCII character from a to z.
\d \D           a digit; a non-digit.
\w \W           a 'word' character; a non-'word' character.
\s \S           a whitespace character; a non-whitespace character.
\b              the boundary between a \w character and a \W character.
.               any character (apart from a new line).
(abc)           the phrase 'abc' as a group.
?               preceding character or group may be present 0 or 1 times.
+               preceding character or group is present 1 or more times.
*               preceding character or group may be present 0 or more times.
{x,y}           preceding character or group is present between x and y times.
{,y}            preceding character or group is present at most y times (Perl 5.34 and later; write {0,y} in older versions).
{x,}            preceding character or group is present at least x times.
{x}             preceding character or group is present exactly x times.

Backreferences

What if we want to know what a certain regular expression matched? It was easy when we were matching literal strings: we knew that 'Case' was going to match those four letters and nothing else. But now, what matches? If we have /\w{3}/, which three word characters are getting matched?

Perl has a series of special variables in which it stores anything that's matched with a group in parentheses. Each time it sees a set of parentheses, it copies the matched text inside into a numbered variable - the first matched group goes in $1, the second group in $2, and so on. By looking at these variables, which we call the backreference variables, we can see what triggered various parts of our match, and we can also extract portions of the data for later use.
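As a brief sketch of how these variables can be read after a match, the script below wraps \w{3} in parentheses and then uses two groups at once. The sample text and the helper name are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text captured by the group in
# /(\w{3})/, or undef if the match fails.
sub first_three {
    my ($text) = @_;
    return $text =~ /(\w{3})/ ? $1 : undef;
}

my $text = "Case 42";    # made-up sample text

print "Three word characters: ", first_three($text), "\n";

# Two groups: the first fills $1, the second fills $2.
if ($text =~ /(\w+)\s+(\d+)/) {
    print "Word: $1, number: $2\n";
}
```

The numbered variables are only read here after a successful match. A failed match does not reset them, so checking the result of the match first avoids picking up stale values from an earlier one.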

Beginning Perl