Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



CGI Sucks!

Well, as you might expect, for all its dynamism, CGI was not a holy grail. In fact, there are a lot of sysadmins out there who would be ecstatic if CGI were outlawed. CGI simply causes too many problems.

  • CGI introduces security holes. Lincoln Stein writes the following eloquent warning on the problem:
    Unfortunately, there's a lot to worry about [when running a web server with CGI]. The moment you install a Web server at your site, you've opened a window into your local network that the entire Internet can peer through. Most visitors are content to window shop, but a few will try to peek at things you don't intend for public consumption. Others, not content with looking without touching, will attempt to force the window open and crawl in.

    It's a maxim in system security circles that buggy software opens up security holes. It's a maxim in software development circles that large, complex programs contain bugs. Unfortunately, Web servers are large, complex programs that can (and in some cases have been proven to) contain security holes.

    Furthermore, the open architecture of Web servers allows arbitrary CGI scripts to be executed on the server's side of the connection in response to remote requests. Any CGI script installed at your site may contain bugs, and every such bug is a potential security hole.

    It is one thing to allow any freako on the Internet access to your web server when the communication is controlled through the boundaries defined by HTTP and implemented by web browsers. It is another thing to allow a stranger access to an unlimited number of applications housed on the same server through a renegade CGI script.

    In the WWW Security FAQ, Stein identifies four overlapping types of risk:

    • Private or confidential documents stored in the Web site's document tree may fall into the hands of unauthorized individuals.
    • Private or confidential information sent by the remote user to the server (such as credit card information) might be intercepted.
    • Information about the Web server's host machine might leak through, giving outsiders access to data that can potentially allow them to break into the host.
    • Bugs can allow outsiders to execute commands on the server's host machine, allowing them to modify and/or damage the system. This includes "denial of service" attacks, in which the attackers pummel the machine with so many requests that it is rendered effectively useless.

    I recommend checking out the following CGI Security sites if you are interested in getting more detailed information.

  • CGI is at the mercy of HTTP. It is important to note that HTTP only provides for a one-time, question/answer type of communication. After all, it was defined primarily for web browsers and web servers to exchange HTML documents. Thus, by definition, HTTP is not very dynamic.

    One-time, question/answer communication works like this: the web browser and the web server are only connected as long as it takes for the web browser to send one document request and the web server to send one requested document. If the browser wants a second document, it must recontact the server and ask again. Each request is new: the server maintains no ongoing connection and no record of past exchanges.

    While this is very efficient for network traffic (because the bandwidth is only used when information needs to be exchanged), it is a big pain in the butt when it comes to CGI, because CGI is about conversations, not about one-time questions and answers.

    Imagine that when talking on the phone you had to hang up and redial every time you said something and received an answer. Imagine further that every time you called back you had to go over every previous exchange before you could get to the next piece. That is the way web browsers work with web servers.

    This makes communication tough for three reasons.

    First, if the client and server are to maintain information over several exchanges, the CGI script must be responsible for keeping a running transcript of the conversation so that every time there is a new exchange, the web server can consult the record of the entire conversation up to that point. This is what CGI aficionados call "maintaining state". The CGI script must be able to keep track of certain information, like a username or the contents of a virtual shopping cart, for every "instance" of a script (1). That is, there must be a way to tie the current HTTP request to related ones that have gone on before. Maintaining state is possible with CGI using hidden variables, by encoding the URL, or by maintaining a state file on the server. It's just not easy or efficient (2).

    Second, every question/answer pair causes the web server to execute a unique instance of the CGI script. This is pretty expensive, especially on a high-volume web site which may have 100 instances of a CGI script executing at any given moment, each, perhaps, with its own Perl interpreter (3). Every one of those CGI scripts takes a little bit of oomph out of the server engine. If we were not limited to the question/answer format, we would not need to execute so many instances.

    Consider the following exchange with a CGI application:

      
        Client: Hello?

        Server: Welcome, what would you like
                (CGI script executed once)

        Client: I would like a list of products
                you are selling

        Server: Here is a list (another one)

        Client: I want to purchase this product

        Server: Okay.  (yep)

        Client: I'm done, can I check out?

        Server: Yes, what is your credit card number?
                (another script)

        Client: Here it is.

        Server: Thanks (another instance of the script
                which also emails the results to some
                store admin) (4)
        

    Yuck, this exchange caused 5 instances of the store script to be executed, along with 5 Perl interpreters if the CGI script was written in Perl.

    Third, CGI is extremely slow. Every time the client does something, the CGI script must recreate the entire dialog and execute a new request. Add a new item to a virtual shopping cart - new request. Calculate a running total - new request. Submit an order - yet another request. Each request takes time, since the CGI script must be executed again and everyone must wait on a busy Internet.

  • CGI is ugly. Finally, CGI scripts produce fairly ugly user-interfaces. Basically, CGI is limited to bland HTML-based forms and whatever bells and whistles can be provided by surrounding HTML layout. Thus, no CGI application looks like your swank bootleg copy of Word.

    This may not seem like a big issue at first, but when you start competing for web hits with multi-million dollar companies, image is indeed everything. CGI simply cannot compare with web-based applications which are not limited to HTML.
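To make the "maintaining state" problem from the list above a little more concrete, here is a minimal sketch of the state-file approach. It is written in Python rather than Perl purely for illustration, and it simulates each CGI invocation with a plain function call instead of a real server. The names `STATE_DIR`, `load_cart`, `save_cart` and `handle_request` are all hypothetical. The point it tries to show is that nothing survives between requests except the file on disk and the ID the client sends back.

```python
import json
import os
import re
import tempfile
import uuid

# Hypothetical location for per-visitor state files.
STATE_DIR = tempfile.mkdtemp(prefix="cgi_state_")

def _state_path(session_id):
    # The ID comes back from the client, so never trust it as a file name.
    if not re.fullmatch(r"[0-9a-f]{32}", session_id):
        raise ValueError("bad session id")
    return os.path.join(STATE_DIR, session_id + ".json")

def load_cart(session_id):
    """Read the cart saved by an earlier request, or start with an empty one."""
    path = _state_path(session_id)
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)

def save_cart(session_id, cart):
    """Write the cart to disk so the next request can find it."""
    with open(_state_path(session_id), "w") as f:
        json.dump(cart, f)

def handle_request(params):
    """Stand-in for one CGI invocation: it starts with no memory at all."""
    session_id = params.get("session_id") or uuid.uuid4().hex
    cart = load_cart(session_id)
    if "add" in params:
        cart.append(params["add"])
        save_cart(session_id, cart)
    # The session ID must travel back to the client (hidden field or URL).
    return {"session_id": session_id, "cart": cart}

first = handle_request({"add": "book"})
second = handle_request({"session_id": first["session_id"], "add": "pen"})
print(second["cart"])  # ['book', 'pen']
```

Note the check in `_state_path`: because the ID is supplied by the remote user, using it unchecked as a file name would be exactly the kind of bug that turns into a security hole.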

Well, those are some pretty damning flaws. Like I said, many systems administrators would love to see CGI fall off the face of the Earth. Unfortunately for those system administrators, the fact is that CGI has continued to be the workhorse of the web, powering 90% of the dynamic web pages out there.

The fact is that CGI, especially CGI/Perl, is easy to work with, and most non-technically oriented webmasters out there can get their needs filled, and filled right away. However amazingly, brand-fantasmagorically wonderful other technologies sound, they are still vaporware as far as the average web developer is concerned. Either the ISP does not provide those technologies, or the learning and development curve is too steep or expensive. And of course, for the small applications typical of most websites, the big guns of C or C++ are just overkill.

CGI, for all its flaws, works, and works pretty darn well if done carefully. "Intranet" developers with massive budgets can yack all they want to about servlets and SQL gateways and Server Side Includes and customized server applications written in Java, but for most "internet" developers out there, CGI is the only tool available for solving their problems. And with creativity and care, CGI can also be the right tool.

Footnotes

  1. You can think of an instance of a script as a unique and independent version of a generic script. It is called an "instance" because ten web surfers could all execute a CGI script at the same time. Though each web surfer would be using the same generic CGI script, each instance of that script would be personalized to that web surfer. Thus you may have ten instances of the exact same script running in parallel on the web server hardware.

  2. Hidden variables allow you to maintain state using the HTML "Hidden" form tag. Essentially, you include information in your HTML form that will not be visible to the user when they look at the form in their web browser window, but which will be transferred to the CGI script with the user-supplied data. The format of the tag looks something like the following:
    <INPUT TYPE = "HIDDEN" NAME = "first_name" VALUE = "selena">
    <INPUT TYPE = "HIDDEN" NAME = "last_name" VALUE = "sol">
    When the CGI script processes the information which the user enters into the HTML form, it will also receive the variable "first_name" with the value of "selena" as well as "last_name" equal to "sol".

    If the user is not using a FORM tag to navigate through a site, the admin can still encode state information in the URL by using the HTTP standard for URL encoding. For example, the following hyperlink would send the same info as above to the CGI script.

    <A HREF = "http://www.extropia.com/test.cgi?first_name=selena&last_name=sol">click here</A>
    Notice that variables to be passed along are listed after the question mark, name/value pairs are separated by the ampersand sign, and the variable name and variable values are separated by an equal sign.

    Finally, the CGI script may write out state information to a file on the server and then simply pass along the location of the file using one or both of the above methods. This is best when there is a large amount of state information.

    By the way, maintaining state can also be achieved using Netscape cookies. However, we will not address cookies here because they require their own article.

  3. Perl is a fun language to use because it keeps the nuts and bolts of machine code as invisible as possible. One of the ways Perl does this is by adding an extra step between you and the computer. This extra step is called a "Perl interpreter". This interpreter (which your sysadmin must install) reads a Perl program that you write and translates it "on the fly", every time the program runs, into instructions that it then carries out on your computer. Your script can then be moved to any other system with a Perl interpreter and be run with little or no change. Further, the code can be easily modified and understood. Unfortunately, in order to run your script, you must also run the interpreter, and this can be expensive in terms of server resources.

    In more intense languages like C or C++, there is no interpreter. You must use a special "compiler" program to translate your code into machine code. This affords greater power to your programs since you do not need to run a separate interpreter when you run your executable, but it does mean that executables are specific to each operating system and that the source code is stored separately from the executable code.

  4. Notice that CGI scripts must be smart enough to answer all sorts of questions.
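The hidden-variable and URL-encoding techniques from footnote 2 can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard `html` and `urllib.parse` modules rather than Perl, and the helper names `hidden_fields`, `state_link` and `read_state` are made up for this example. It shows the same name/value pairs being written into a form, written into a link, and read back from the query string.

```python
from html import escape
from urllib.parse import urlencode, parse_qs

# The same example state used in footnote 2.
state = {"first_name": "selena", "last_name": "sol"}

def hidden_fields(state):
    """Render each name/value pair as a hidden form input."""
    return "\n".join(
        '<INPUT TYPE="HIDDEN" NAME="%s" VALUE="%s">' % (escape(name), escape(value))
        for name, value in state.items()
    )

def state_link(base_url, state, text):
    """Render a hyperlink that carries the state after the question mark."""
    url = base_url + "?" + urlencode(state)
    # escape() turns & into &amp; so the URL is safe inside an HTML attribute.
    return '<A HREF="%s">%s</A>' % (escape(url), escape(text))

def read_state(query_string):
    """What the receiving script does: split the pairs back apart."""
    return {name: values[0] for name, values in parse_qs(query_string).items()}

print(hidden_fields(state))
print(state_link("http://www.extropia.com/test.cgi", state, "click here"))
print(read_state("first_name=selena&last_name=sol"))
```

Letting a library do the encoding matters once values contain spaces, ampersands or equal signs, which would otherwise be mistaken for separators.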

