6.4 User Authentication
July 3, 2002
Novice web programmers are sometimes surprised that web servers have no real idea
whom they are talking to when serving up files and running scripts. Programmers
who learned their skills on Unix and other user-oriented operating systems are accustomed
to having a function that returns a user ID. Mailing lists for most any web
development product get questions like "Where do I get the user's name?"
The web server knows the IP address of the requesting browser and an identifying
string that should indicate its make and model,[4] but not much more. Most ISPs recycle
IP addresses for dial-up users, and even if the address is static there is no guarantee
that a particular user will always use the same machine, so this information isn't useful
as a user ID.
4. Applications which use advanced HTML or client-side scripting rely on the HTTP_USER_AGENT
environment variable to identify the browser, so they can decide on which set of incompatible features
to use.
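As a minimal sketch of how little the server knows, the following CGI-style script reports the two facts mentioned above. It assumes the standard CGI environment variables REMOTE_ADDR and HTTP_USER_AGENT are set by the server; the client_facts helper is a hypothetical name used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the few facts a CGI script can learn about its client.
# Takes a hash reference of environment variables (normally \%ENV)
# so the function can be exercised outside a web server.
sub client_facts {
    my ($env) = @_;
    return {
        address => defined $env->{REMOTE_ADDR}     ? $env->{REMOTE_ADDR}     : 'unknown',
        browser => defined $env->{HTTP_USER_AGENT} ? $env->{HTTP_USER_AGENT} : 'unknown',
    };
}

my $facts = client_facts(\%ENV);
print "Content-type: text/plain\n\n";
print "Address: $facts->{address}\n";
print "Browser: $facts->{browser}\n";
```

Neither value identifies a person, which is why a separate authentication step is needed.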
There are two basic approaches to user authentication in a web application: use the
authentication protocol built into HTTP or do it yourself.
6.4.1 Using HTTP authentication
Chances are you've already encountered the HTTP authentication protocol:
you request a URL from your browser, and before a new page appears the browser
pops up a window or displays a prompt asking for your username and password.
That's the authentication protocol in progress.
What's actually going on is more complex than it appears. The protocol works
this way:
1 The browser sends the usual request to the web server for a URL.
2 The web server's configuration indicates that authentication is required for that
URL. It sends back a 401 response to the browser along with a realm for authentication;
the realm is a human-readable name used by the server to identify a
group of secured documents or applications.
3 If the browser implements authentication caching, it checks its cache for the
given realm and server ID. If it already has a username and password for the
realm, it uses them and skips the next step.
4 If the browser doesn't have a cache, or the realm isn't there, it displays a dialog
box or prompts the user for his username and password for the given realm. The
realm should be displayed here so that the user knows which username and password
to send.
5 The browser sends the URL request to the server again, including the username
and password in the request headers.
6 The server checks the URL, sees that it requires validation (again, because
HTTP is a stateless protocol), and sees that it has validation headers. It looks
up the given username and password in some form of database.
7 If the information is valid, the web server applies the authorization rules for
the URL and verifies that the user is authorized to read the associated document
or run the application. Everything proceeds as normal if so; if not, it sends back
an error page.
8 If the username and password didn't validate, the server sends another 401
response back to the browser, and the cycle continues.
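The exchange in steps 2, 5, and 6 can be sketched in Perl for the Basic scheme. This is an illustration of the wire format only, not how Apache implements it; Apache handles these headers itself. The function names and the sample credentials are hypothetical, and MIME::Base64 is assumed to be installed (it ships with current Perl distributions).

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Step 2: the server's challenge, naming the realm.
sub challenge {
    my ($realm) = @_;
    return "HTTP/1.0 401 Unauthorized\r\n"
         . "WWW-Authenticate: Basic realm=\"$realm\"\r\n\r\n";
}

# Step 5: the header a browser adds when it repeats the request.
sub authorization_header {
    my ($user, $password) = @_;
    return 'Authorization: Basic ' . encode_base64("$user:$password", '');
}

# Step 6: the server recovers the username and password from the header.
# Returns an empty list if the header is not a well-formed Basic header.
sub parse_authorization {
    my ($header) = @_;
    return unless $header =~ /^Authorization:\s*Basic\s+(\S+)\s*$/i;
    my ($user, $password) = split /:/, decode_base64($1), 2;
    return unless defined $password;
    return ($user, $password);
}

my $header = authorization_header('theo', 'secret');   # sample credentials
my ($user, $password) = parse_authorization($header);
print challenge('Administrator functions');
print "$header\n";
print "Server sees user '$user'\n";
```

Note that Base64 is an encoding, not encryption: anyone who can read the request can recover the password the same way parse_authorization does.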
The main advantage of using HTTP authentication is that it already works; Apache
has excellent support for it and comes with a few simple user database modules.
mod_perl extends Apache with a module that provides authentication against any
DBI database, making it trivial to keep your user IDs and other user data together (see
the Apache::DBI module's documentation for more information). Many databases
(including MySQL and PostgreSQL) have Apache authentication modules as well, so
slimmed-down Apache servers can share an authentication database with mod_perl
or other applications.
The primary disadvantage of the HTTP authentication mechanism is that it is
unfriendly to new users. GUI browsers display a small dialog box prompting for the
username and password without much in the way of helpful information. One way
to work around this problem is to send a helpful page of information when user
authentication fails, instructing the user on how to get an account or what to do at
the prompts; this also lets experienced users log in without hand-holding.
HTTP authentication is good for protecting static pages, download directories, or
other data for which you would not otherwise write a custom application. It's also
fine for administrative functions or private applications when the users will know
what to do.
The next section will discuss other reasons to handle authentication yourself. In the
meantime, let's look at an example using Apache's simple text file user database.
Suppose we want to protect a page of server status information: the
Apache::Status example from the previous chapter. Recall that it was configured in
mod_perl.conf like so:
# Server status
<Location /perl-status>
SetHandler perl-script
PerlHandler Apache::Status
order deny,allow
deny from all
allow from 192.168.
</Location>
The deny and allow directives restrict access to a protected network. For purposes
of remote administration it would be more helpful to set password protection on the
/perl-status URL. The new configuration to handle that is:
# Server status
<Location /perl-status>
SetHandler perl-script
PerlHandler Apache::Status
AuthUserFile data/admin_users
AuthName "Administrator functions"
AuthType basic
require valid-user
</Location>
Optionally we could keep the deny and allow directives to further restrict access.
The AuthUserFile directive gives the path to a password file to be used in
authenticating requests. Remember that all relative file paths are resolved against Apache's root
directory. AuthName gives the realm name for authentication, and AuthType
basic tells Apache that it shouldn't expect the browser to encrypt the information
(more on that later in the chapter). The require valid-user directive tells Apache
that any user with a valid password may retrieve the URL.
Now we need a password file. Apache comes with a utility for creating and managing
passwords: htpasswd. Run it with the -c switch to create a password file and add
a user:
/usr/local/apache/bin/htpasswd -c /usr/local/apache/data/admin_users theo
The name of the password file matches the path given in AuthUserFile earlier (if
you add Apache's root directory to the front). The program will prompt for a password,
or, with the -b switch, you can supply one on the command line after the username.
After performing these steps, restart Apache and try the /perl-status URL. If all
is well you will be prompted for the username and password you just created, and then will see the status
information page. That's all there is to adding password protection to important
pages.
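Once Apache has authenticated a request, a script running under the protected URL can finally answer the question "Where do I get the user's name?" The sketch below assumes a CGI-style environment in which the server sets the standard REMOTE_USER variable after successful authentication; authenticated_user is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the authenticated username from the given environment hash,
# or undef if the request did not pass through HTTP authentication.
sub authenticated_user {
    my ($env) = @_;
    my $user = $env->{REMOTE_USER};
    return (defined $user && length $user) ? $user : undef;
}

my $user = authenticated_user(\%ENV);
print "Content-type: text/plain\n\n";
print defined $user
    ? "Hello, $user\n"
    : "No authenticated user for this request\n";
```

If the script is reachable through an unprotected URL as well, REMOTE_USER will be unset there, so it is worth handling the undefined case as shown.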
There are more options than shown in this example. For instance, require can
list a subset of valid users, or specify groups instead of usernames. See the online
Apache manual for more information.
The password file created by htpasswd contains usernames and encrypted passwords.
Make sure that the file is readable by Apache's user. If your applications add
users automatically or let them change passwords, then the application's effective user
will need write access also.
The text file system is fine for pages that aren't accessed too often and only by a
small number of users. To validate the user, Apache has to scan through the file
sequentially until it matches the username, so this mechanism will be too slow for a
larger user base. Apache comes with hash file authentication modules that are more
efficient, but if you have a large user base you probably also have a relational database
somewhere. See the examples in the next section for ways to have Apache use your
database for authentication.
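To see why the text file scales poorly, here is a rough Perl sketch of a sequential lookup over username:password lines of the kind htpasswd writes. It illustrates the cost of the scan rather than reproducing Apache's own code; find_password_hash and the sample entries are hypothetical, and the placeholder hashes are not real password hashes.

```perl
use strict;
use warnings;

# Scan htpasswd-style lines ("username:hashed-password") from an open
# filehandle and return the stored hash for $wanted, or undef if absent.
# Every lookup reads the file from the top, so cost grows with file size.
sub find_password_hash {
    my ($fh, $wanted) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;      # skip blank lines and comments
        my ($user, $hash) = split /:/, $line, 2;
        return $hash if defined $hash && $user eq $wanted;
    }
    return undef;
}

# Hypothetical sample data standing in for data/admin_users.
my $sample = "theo:EXAMPLEHASH1\nalice:EXAMPLEHASH2\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $hash = find_password_hash($fh, 'alice');
close $fh;
print defined $hash ? "Found entry for alice\n" : "No such user\n";
```

A hash file or database index replaces this loop with a keyed lookup, which is the efficiency gain the alternatives above offer.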
Web Development with Apache and Perl