Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions




The Perl You Need to Know: Part 5 "Processing and Parsing Web Pages"

August 9, 1999

In The Perl You Need to Know we've explored a variety of means to add content to web pages, but we've yet to see how to retrieve information from a web page using Perl. In fact, there are many possible reasons you'd want to read and access pages from your Perl scripts, instead of or in addition to generating web pages as output. Last month's exploits featured the use of templates to easily insert dynamic information into pre-structured pages such as the Smallville Gazette. This month we extend this concept, retrieving information from the web which will then be dynamically included in a template-based output page. That's a mouthful of jargon, to be sure, but the results are simple and elegant.

Our partner in this scheme is libwww-perl, the library for WWW access in Perl, thankfully also known simply as LWP. LWP encompasses a set of Perl modules which, like a Swiss army knife, provide a number of tools for chopping, carving, slicing, and dicing web pages. Some of LWP's capabilities can be quite complex to use, while others are graciously simple. We begin our look at LWP with its simpler uses, in combination with the template technique seen in Part 4 of The Perl You Need to Know.
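To give a rough feel for where we are headed, here is a minimal sketch of the idea: fetch a page with the get function from LWP::Simple, then drop something about it into a template. The URL, the %%CONTENT%% placeholder, and the fill_template routine are hypothetical names invented for this illustration; they are not the template syntax from Part 4 or the code developed later in this article.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);   # get($url) returns the page body, or undef on failure

# Hypothetical URL, used only for illustration.
my $url = 'http://www.example.com/';

# Hypothetical helper: replace every %%CONTENT%% placeholder in a
# template string with the supplied text.
sub fill_template {
    my ($template, $content) = @_;
    $template =~ s/%%CONTENT%%/$content/g;
    return $template;
}

# Retrieve the page; report the failure in the output rather than dying.
my $page = get($url);
my $summary = defined $page
    ? "Fetched " . length($page) . " characters from $url"
    : "Could not retrieve $url";

# A tiny stand-in for a pre-structured output page.
my $template = "<html><body>\n<p>%%CONTENT%%</p>\n</body></html>\n";
print fill_template($template, $summary);
```

The point of the sketch is only the shape of the flow: one call retrieves the page as a string, and from there it is ordinary Perl text processing before the result lands in the template.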


Contents:

A Simple Goal
Simply, LWP::Simple
Grasping for Tags
Pulling Tags Like Taffy: TokeParser
Parsing Attributes with Ease
The Proof is in the Parsing: A Web Page Summarizer

Conclusion

