Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



Form Validation

May 10, 1999

Suppose that you have a web page that collects user registrations -- these registrations store user information in a simple database, such as telephone numbers and e-mail addresses. Of course, for the sake of ethics, we're assuming that the user has volunteered this information with upfront knowledge about its ultimate use. In fact, we'll look at coding this simple registration log in the next example -- first, though, let's look at validating the user's input. After all, this registration log isn't of much use if it contains invalid data.

Telephone numbers are easier to validate than e-mail addresses, so let's begin there. Assuming a standard North American telephone number, the format for such a number is an area code (3 digits) followed by seven digits. Spaces and dashes should be optional but tolerated by our validator.

The aim in this example is to verify that the telephone number provided in the user's registration is at least in a valid format -- obviously we can't be sure the number isn't fictional, but some validation is better than nothing. Within the HTML that constructs the registration form there is a field where the user inputs their telephone number:

Telephone # (including area code):
<input type="text" size="10" name="userphone"><br>

In our Perl program, we use the CGI object to retrieve the value of the userphone parameter:

$userphone=$cgiobject->param("userphone");

And we can use a conditional pattern match to assign a true or false value to a validation variable:

$fieldValid=$userphone=~/^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/;

Yikes! On the left-hand side of the assignment operator is our validation variable, $fieldValid. This variable will ultimately receive a true or false value, depending on the success of the right-hand operation. That right-hand operation is the now-familiar conditional pattern match. In this pattern match, the user's phone number ($userphone) is compared against a somewhat cryptic regexp. The logic behind our regexp can be eloquently stated as: "Starting at the beginning of the data there are zero or more non-digits (the \D character class) followed by exactly three digits (the \d character class), followed by zero or more non-digits, followed by exactly three digits, followed by zero or more non-digits, followed by exactly four digits, followed by zero or more non-digits, followed by the end of the data string." Whew!

Thanks to this tolerance, the regular expression will successfully match, for example, "(555) 555-2222" or "555-555-2222" or "5555552222" and so on, but will reject a phone number with missing or extra digits.
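To see that tolerance in action, here is a minimal sketch that runs the pattern against a handful of inputs. The helper name phone_ok and the sample numbers are made up for this illustration and are not part of register.cgi.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of register.cgi): returns 1 if the
# string holds exactly ten digits in a 3-3-4 grouping, with any
# non-digit characters allowed around and between the groups.
sub phone_ok {
    my ($phone) = @_;
    return $phone =~ /^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/ ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $candidate ("(555) 555-2222", "555-555-2222", "5555552222",
                   "555-2222", "1-555-555-2222") {
    printf "%-16s %s\n", $candidate,
        phone_ok($candidate) ? "valid" : "invalid";
}
```

The first three inputs should be reported as valid. The last two should be rejected: one has only seven digits, and the other has eleven digits with a leading "1" that does not fit the 3-3-4 grouping.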

If this field has validated successfully, we might also want to apply a substitution to $userphone, so that all phone numbers reside in the same format in our future registration log. We can simply strip out all non-digits from the user's entry:

if ($fieldValid)
 { $userphone=~s/\D//g }
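The validate-then-strip step can also be bundled into one small routine. The following is only a sketch under that assumption: the name normalize_phone is hypothetical, and it returns nothing for invalid input rather than setting a flag.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the ten bare digits for a valid number,
# or an undefined value (in scalar context) when validation fails.
sub normalize_phone {
    my ($phone) = @_;
    return unless defined $phone
        && $phone =~ /^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/;
    (my $digits = $phone) =~ s/\D//g;   # copy first, then strip non-digits
    return $digits;
}

print normalize_phone("(555) 555-2222"), "\n";   # prints 5555552222
```

Working on a copy leaves the user's original entry untouched, which can be handy if you want to echo it back in an error page.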

Thus far we've seen the bits and pieces of form validation. Let's reconsider the big picture -- we're using form validation as a precursor to storing a registration log. So, let's begin building our real-life registration script, register.cgi, focusing first on the validation code.

register.cgi (preliminary)

#!/usr/bin/perl

use CGI;


#create an instance of the CGI object 
$cgiobject = new CGI;


#grab the values submitted by the user
$userphone=$cgiobject->param("userphone");

#output HTML header to web browser
print $cgiobject->header;

#test form validation, output error if necessary 
#otherwise proceed to registration log
if ( &validateForm )
 { &registerForm }
else
 { &output_fail }

# subroutine which validates form fields and 
#returns a true or false result
sub validateForm
 { $failedFields="";
   $formValid=1; 
   $fieldValid=$userphone=~/^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/;
   if ($fieldValid)
    { $userphone=~s/\D//g }
   else
    { $failedFields.="Telephone Number,";
      $formValid=0 }   
   
   return $formValid
 }

#subroutine which outputs failure message 
#if form does not validate
sub output_fail
 { chop($failedFields);   #remove the trailing comma
   $resultPage="<html><head>".
               "<title>Uh-Oh: Registration Problem</title>".
               "</head><body bgcolor=\"white\">".
               "<h2>Sadly, there seems to be a problem with your ".
               "form submission. Specifically, the following ".
               "mandatory fields were filled in improperly:</h2>".
               "<Br><h3>$failedFields</h3>".
               "<Br>Please go back and try again.".
               "</body></html>";
   print $resultPage;
 }

In looking over the first version of register.cgi, we cover a fair amount of Perl territory. Notice the introduction of subroutines -- we use subroutines to "bundle" a section of code. Subroutines often return a result, such as true or false, which lets us call the subroutine from within a conditional statement -- in this example, we call the &validateForm subroutine from within an if statement (the ampersand preceding a subroutine name is usually optional; note that a call written as &validateForm, with no parentheses, hands the caller's current argument list @_ to the subroutine, which is harmless here because none of our subroutines take arguments). This if statement is the main control of program flow: if the form is valid then we proceed to the registration subroutine, which is still fictional at this point; if the form is not valid, we output an error message to the user's browser detailing which field(s) failed validation.
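A small sketch may help with two points about subroutines: a true or false return value can drive an if statement directly, and the ampersand call style without parentheses behaves differently from a call with parentheses. All subroutine names below are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutines, named only for this illustration.
sub count_args { return scalar @_ }        # how many arguments arrived?

sub via_ampersand { return &count_args }   # no parentheses: reuses our own @_
sub via_parens    { return count_args() }  # explicit empty argument list

# A true/false return value can drive an if statement directly.
sub is_five_digits {
    my ($text) = @_;
    return $text =~ /^\d{5}$/ ? 1 : 0;
}

print via_ampersand("a", "b", "c"), "\n";   # prints 3
print via_parens("a", "b", "c"), "\n";      # prints 0

if ( is_five_digits("54321") )
 { print "looks fine\n" }
else
 { print "try again\n" }
```

In register.cgi the difference does not matter, since the subroutines read their data from ordinary variables rather than from arguments. It is worth keeping in mind once subroutines start taking parameters.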

Returning attention to the task at hand, we probably want to validate other fields in addition to the user's telephone number. What other fields might we validate? Were this the Family Feud, and were I a quaintly amorous Richard Dawson, I'd hereby shout "Survey Says!?" -- and, ding, the number one answer would be "e-mail addresses"! So here is the bad news -- e-mail addresses are darn hard to validate. In fact, they are so difficult to validate that we won't attempt it in this article, but the Resources section will contain some links to information on this very matter. In brief, e-mail address validation is such a trauma because the valid syntax for an e-mail address is quite flexible, and too difficult to capture in a single regexp pattern match.

For simplicity's sake, then, let's say that we will add validation for the user's name and ZIP code. The name field must simply begin with alphabetical input, while the ZIP code should conform to either the traditional 5-digit number or the newfangled ZIP+4 (5 digits, a dash, and 4 more digits). We assume here that $username and $userZIP have been retrieved from their form fields with param, just as $userphone was. Modifying our &validateForm subroutine with the proper logic and regexp comparisons yields:

# subroutine which validates form fields and 
#returns a true or false result
sub validateForm
 { $failedFields="";
   $formValid=1; 
   #validate phone number   
   $fieldValid=$userphone=~/^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/;
   if ($fieldValid)
    { $userphone=~s/\D//g }
   else
    { $failedFields.="Telephone Number,";
      $formValid=0 }   
   
   #validate user name
   $fieldValid=$username=~/^[a-zA-Z]+/;
   unless ($fieldValid)
    { $failedFields.="User Name,";
      $formValid=0 }

   #validate ZIP code
   $fieldValid=$userZIP=~/^\d{5}(-\d{4})?$/;
   unless ($fieldValid)
    { $failedFields.="ZIP Code,";
      $formValid=0 }

   
   return $formValid
 }

Our new, beefier &validateForm subroutine simply builds on its predecessor. The user name test verifies that the name begins with at least one alphabetic character; because the pattern is anchored only at the start of the data, anything may follow those letters. The ZIP code test uses a regular expression to allow either a 54321 ZIP code or a 54321-1234 ZIP code. Quick narration of ZIP regexp logic: "Starting at beginning of data, there must be 5 digits. The group of characters represented by one dash followed by four digits may appear zero or one times, followed by the end of the data."
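The two new patterns can be exercised in isolation with a short sketch like the one below. The wrapper names zip_ok, name_ok and name_all_letters are hypothetical, and the letters-only variant is an assumption added for comparison, not something the subroutine above uses.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two patterns used in validateForm.
sub zip_ok  { my ($zip)  = @_; return $zip  =~ /^\d{5}(-\d{4})?$/ ? 1 : 0 }
sub name_ok { my ($name) = @_; return $name =~ /^[a-zA-Z]+/       ? 1 : 0 }

# A stricter variant (an assumption, not in the article): letters only,
# anchored at both ends of the data.
sub name_all_letters
 { my ($name) = @_; return $name =~ /^[a-zA-Z]+$/ ? 1 : 0 }

for my $zip ("54321", "54321-1234", "5432", "54321-12", "543211234") {
    printf "%-12s %s\n", $zip, zip_ok($zip) ? "valid" : "invalid";
}

for my $name ("Alice", "Al1ce", "42") {
    printf "%-12s start-anchored: %d  letters-only: %d\n",
        $name, name_ok($name), name_all_letters($name);
}
```

Only the first two ZIP codes should come back valid. For the names, "Al1ce" passes the start-anchored test but fails the letters-only variant, which shows how much the missing end anchor lets through.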

At the start of the subroutine we set a flag, the variable $formValid, to 1 -- meaning that we begin validation with the assumption that the form is valid (and that man is basically good). As we validate each field, if that field should fail, then $formValid is set to 0, tripping the flag to indicate that there is an invalid field in the form. We use the $failedFields variable to simply collect the names of fields as they fail, for later output to the browser.
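The same flag-and-collect idea can be sketched as a table of rules, which may be easier to extend as fields are added. This is only one possible arrangement: the @rules table, the failed_fields name, and the idea of passing the form values in as a hash are all assumptions made for this illustration.

```perl
use strict;
use warnings;

# Each rule: [ label shown to the user, form key, pattern to match ].
my @rules = (
    [ "Telephone Number", "userphone", qr/^\D*\d{3}\D*\d{3}\D*\d{4}\D*$/ ],
    [ "User Name",        "username",  qr/^[a-zA-Z]+/ ],
    [ "ZIP Code",         "userZIP",   qr/^\d{5}(-\d{4})?$/ ],
);

# Returns the labels of the fields that failed.
# An empty list means the whole form is valid.
sub failed_fields {
    my (%form) = @_;
    my @failed;
    for my $rule (@rules) {
        my ($label, $key, $pattern) = @$rule;
        my $value = defined $form{$key} ? $form{$key} : "";
        push @failed, $label unless $value =~ $pattern;
    }
    return @failed;
}

my @failed = failed_fields(
    userphone => "555-2222",
    username  => "Alice",
    userZIP   => "54321",
);
print @failed ? "Failed: " . join(", ", @failed) . "\n"
              : "Form is valid\n";
# prints: Failed: Telephone Number
```

Here the returned list plays both roles: its emptiness stands in for the $formValid flag, and its contents stand in for $failedFields, with join taking care of the commas.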

That, then, sums up the basics of form validation. Needless to say, there are many types of data that a web page may request, and validating different sorts of information often requires different strategies. Typically, though, regular expression pattern matching is a key tool in validating user input.
