Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions

Processing Text with Perl Modules - Page 11

September 24, 2001

In the previous article, we learned how to use Perl's built-in routines to perform many common text manipulation functions. In this final article of the series on text processing, we will take a tour through a cornucopia of useful text processing modules that will kick the tar out of some of those arduous text processing tasks.

The Power of CPAN

The Comprehensive Perl Archive Network (CPAN) is a group of servers around the world that provide access to the Perl source code and hundreds of Perl modules that have been contributed by volunteers. CPAN is one of the things I imagine the authors of other languages (like Java) wish they had, but don't. Fortunately for us, pre-built modules that bundle up the code and logic for performing many common tasks are freely available for the taking. See the list of resources on the last page of this article.

Installing Modules

Part of what makes CPAN powerful is the fact that Perl supports it directly with the CPAN.pm module, which has been distributed with the Perl source code for several years now. The module is capable of searching for, downloading, and installing modules directly from CPAN. It will even handle module dependencies where the module you're trying to install requires other modules from CPAN before it can be installed.

On most operating systems, you can install a CPAN module by typing:

perl -MCPAN -e 'install HTML::Parser'

where HTML::Parser is the name of the module you wish to install. This will automatically find, download, compile, and install the module onto your system.
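Before installing anything, it can be handy to check whether a module is already present. The following is a minimal sketch of one way to do that from Perl itself. The helper name module_available() is made up for this illustration and is not part of CPAN.pm; it simply tries to load the module and reports whether that worked.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of CPAN.pm): returns 1 if this perl
# can load the named module, 0 otherwise.
sub module_available {
    my ($module) = @_;

    # require() wants a file path when given a string, so turn
    # Some::Module into Some/Module.pm.
    (my $file = "$module.pm") =~ s{::}{/}g;

    # eval traps the error that require() raises for a missing module.
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(HTML::Parser HTML::Entities)) {
    printf "%-16s %s\n", $module,
        module_available($module) ? 'installed' : 'not installed';
}
```

If a module is reported as not installed, the install command shown above is the next step.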

If you are using ActiveState Perl and the module you are installing is available in ActiveState's repository, you can type:

ppm install GD

PPM is a command-line utility that is only available if you are using ActiveState Perl. Note that not all Perl modules from CPAN are available to PPM. If you're running ActiveState Perl on a Win32 platform and need a CPAN module that PPM does not offer, you will also need to have Visual C++ and nmake installed on your system to build and install it.

Making Text HTML Safe

I'm sure most of you have had at least one occasion where you needed to effectively cut and paste a text file into an HTML file. If that text file contained any reserved characters like & or <, you probably had to convert them to HTML-safe entities such as &lt; for < by hand. Or maybe you haven't fixed the text and you now have an invalid HTML document out there on your Web site.

Well, if you find yourself doing this hand tuning on a regular basis, or if you're routinely posting text into HTML files without checking to see if it's HTML safe, stop. CPAN has a module called HTML::Entities that does all of the work for you.

The module contains a function appropriately named encode_entities() that automatically encodes all HTML reserved characters. For example, if you have a string of text in a variable named $text that needs to be HTML encoded, you would first add the statement "use HTML::Entities;" to the top of your script and then type:

encode_entities($text);

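somewhere in the main body of your source code. Called this way, on its own as a statement, the function modifies $text in place. So if $text contained the string "Fred & Barney's Bowling Academy", it would be converted into "Fred &amp; Barney's Bowling Academy". (Recent releases of the module also encode the apostrophe by default, as &#39;.)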
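As a small sketch of how the call behaves, the example below encodes the same string two ways: once capturing the return value, which leaves the original untouched, and once as a bare statement, which rewrites the variable in place. The optional second argument, '<>&', limits encoding to those three characters; it is used here only so the result is the same regardless of which release of HTML::Entities is installed. The variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

my $original = "Fred & Barney's Bowling Academy";

# When the return value is used, encode_entities() works on a copy
# and $original is left unchanged.
my $copy = encode_entities($original, '<>&');

# Called as a bare statement (void context), it modifies its
# argument in place.
my $in_place = $original;
encode_entities($in_place, '<>&');

print "original: $original\n";
print "copy:     $copy\n";
print "in place: $in_place\n";
```

Capturing the return value is the safer habit when the unencoded text is still needed elsewhere in the script.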

We could also build a simple script that converts an entire file, so that we can execute the following on the command line:

html_encode.pl < sample.txt > newtext.txt

Or in plain English, we direct a text file called sample.txt to the script as input and write the resulting encoded text to newtext.txt. The source of the script would look like the following:

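#!/usr/bin/perl -w
use strict;
use HTML::Entities;

# Read the input one line at a time (standard input, or any files
# named on the command line); each line lands in $_.
while (<>) {
    encode_entities($_);   # encode the line in place
    print;                 # write the encoded line to standard output
}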
