Slapping together a search engine for your database is easy with PHP and MySQL
July 7, 2000
By Clay Johnson
So you've got a dynamic site, filled with all sorts of user input,
whether it's a 'phorum' or something like my own site at knowpost.com.
ht://dig will take care of indexing and searching your HTML pages, but
if you are like me, you have very few HTML pages, and most of your
"content" resides in BLOBs in your database. A LIKE '%searchword%'
query won't do anything useful for you here; the results just don't
come back relevant.
There has to be a better way, and indeed there is, with a few easy
steps. Here's how to slap one together:
Part one: BNR--Blob Noise Reduction
The first problem with your content is that it is filled with clunky
"noisewords" like "a," "the," "where," and "look": things that are
there to help us humans communicate, but really don't have anything to
do with relevance. We gotta get rid of those. I've included a big list
of noisewords (noisewords.txt) for you to use, modify or mutilate.
Essentially, what we're trying to do here is get all of those
noisewords out of your data, and build a table with two columns: the
word, and its indicator (qid, the id of the content the word came
from). We want something that will eventually look like this:
+------+-------------+
| qid  | word        |
+------+-------------+
|    6 | links       |
|    5 | Fire        |
|    5 | topics      |
|    5 | related     |
|    5 | Shakespeare |
|    4 | people      |
|    4 | Knowpost    |
|    3 | cuba        |
|    3 | cigar       |
+------+-------------+
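To make the noise-stripping step concrete, here is a minimal sketch in PHP. It is not the article's own code: the function names extract_keywords() and build_rows() are made up for illustration, the short $noiseWords array stands in for the contents of noisewords.txt, and the sample sentence is invented. It splits a blob into words, drops the noisewords and repeats, and pairs what is left with the qid of the content it came from.

```php
<?php
// Hypothetical helper: reduce one blob of text to its non-noise words.
// $noiseWords stands in for the list loaded from noisewords.txt.
function extract_keywords(string $text, array $noiseWords): array
{
    // Lowercased lookup table so the noise check is case-insensitive.
    $noise = array_flip(array_map('strtolower', $noiseWords));

    // Split on anything that is not a letter, digit, or apostrophe.
    $tokens = preg_split("/[^A-Za-z0-9']+/", $text, -1, PREG_SPLIT_NO_EMPTY);

    $keywords = [];
    foreach ($tokens as $token) {
        $key = strtolower($token);
        // Skip noisewords and words we have already collected.
        if (!isset($noise[$key]) && !isset($keywords[$key])) {
            $keywords[$key] = $token; // keep the first spelling seen
        }
    }
    return array_values($keywords);
}

// Hypothetical helper: pair every keyword with the id of its content.
function build_rows(int $qid, string $text, array $noiseWords): array
{
    $rows = [];
    foreach (extract_keywords($text, $noiseWords) as $word) {
        $rows[] = ['qid' => $qid, 'word' => $word];
    }
    return $rows;
}

// Tiny stand-in noise list and an invented piece of content with id 3.
$noiseWords = ['a', 'the', 'where', 'look', 'to', 'for', 'in'];
$rows = build_rows(3, 'Where to look for a cigar in Cuba', $noiseWords);
```

With that input, $rows holds two entries for qid 3, "cigar" and "Cuba", which is the shape of the table above. Whether to store words lowercased or as typed is a judgment call; this sketch keeps the first spelling it sees.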
Let's create our table now:
CREATE TABLE search_table (
    word VARCHAR(50),
    qid  INT
);
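Once the table exists, the (qid, word) pairs have to get into it somehow. The following is only a sketch under assumptions: build_insert() is a hypothetical helper, the two sample rows are taken from the example table, and the PDO connection in the final comment is assumed to exist already. It builds one multi-row, parameterized INSERT for search_table rather than one query per word.

```php
<?php
// Hypothetical helper: turn (qid, word) rows into a single parameterized
// INSERT for search_table, returning the SQL text and its parameter list.
function build_insert(array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('No rows to insert.');
    }

    $placeholders = [];
    $params = [];
    foreach ($rows as $row) {
        $placeholders[] = '(?, ?)';
        // VARCHAR(50) cannot hold longer words, so cut them to fit.
        // Note that substr() counts bytes, not characters.
        $params[] = substr($row['word'], 0, 50);
        $params[] = (int) $row['qid'];
    }

    $sql = 'INSERT INTO search_table (word, qid) VALUES '
         . implode(', ', $placeholders);

    return [$sql, $params];
}

[$sql, $params] = build_insert([
    ['qid' => 3, 'word' => 'cuba'],
    ['qid' => 3, 'word' => 'cigar'],
]);

// With a PDO connection in $pdo (assumed, not created here):
// $pdo->prepare($sql)->execute($params);
```

Binding the words as parameters instead of pasting them into the SQL string matters here, since they come straight out of user-submitted content.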
Contents:
Make All of Your Data Compatible
Search and Print