
Site Searches


cags


Question 1:

Ok, so this is probably a stupid question, but I'm not afraid of appearing stupid so here goes. How do search engines perform advanced searches? As an example, Google supports the site:http://www.phpfreaks.com syntax for only including results from a certain site and the define:alcoholic syntax to find a description of myself (along with other 'advanced' syntax). I assume this is a simple matter of parsing the search string (using Regular Expressions, for example) and then using the information both to structure an appropriate SQL query and to output the data in a specific style. So, for example, if you used the site: syntax it would simply append "site='$site'" to the WHERE clause. Is that the right general idea?

 

Question 2:

With regard to a more basic search, if I wanted to create a search box for my own site, how would I go about searching for results from different tables? As an example, I made a website for tracking various things within a pool league. This included information about venues, teams, players, etc. If I were to add a search box and the user entered, let's say, 'Lion', this should return the results {Venue: White Lion}, {Team: Lion Tamers} and {Player: John Lion}. Since all the information is stored in different tables, I'm not sure conceptually how to go about this. Should I run a separate query for each table, then collate the results using PHP? Is it possible to do it as a single query? Finally, whichever way this is achieved, what methods could be employed to sort the results by relevance?


Thanks for the information ignace, I'll consider getting that for some advanced reading. Since you say it's a good book, am I to assume you have actually read it? If so (or even if not), can you give me any quick tips to get me going in the right direction?


Question 1:

I believe that is the general idea: Google just uses regex to find the operator and applies it if it is present in the string. I do think that they only apply those if the site: is at index 0 of the string.

 

Question 2:

Searching multiple tables is actually not too bad. The main issue you will have is having to set up the search terms for each field that can be searched. Depending on your table structure and how the tables reference each other, you may be able to do it in one query (if they all link to each other). If not, I think you would be better off doing separate queries for each table. Then, when displaying, you can group the results into "categories", e.g. Players Found; Teams Found; Venues Found; etc.

 

I would just set it up to run three separate queries, but I would also put check boxes on the search form so users can choose what they want to search.
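
For what it's worth, here is a rough sketch of the single-query variant using UNION ALL, with the results tagged by category and then sorted by a very crude relevance score. It is written in Python with an in-memory SQLite database purely so it runs on its own; the table names, the `name` column, the sample rows, and the scoring rule are all assumptions made up for illustration, not the actual pool league schema.

```python
import sqlite3

def make_demo_db():
    """Create a throwaway in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE venues  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE teams   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO venues  (name) VALUES ('White Lion'), ('Red Cow');
        INSERT INTO teams   (name) VALUES ('Lion Tamers'), ('Sharks');
        INSERT INTO players (name) VALUES ('John Lion'), ('Jane Smith');
    """)
    return conn

# One statement, three tables. Each branch labels its rows with a category
# so the results can still be displayed in groups afterwards.
SEARCH_SQL = """
    SELECT 'Venue'  AS category, name FROM venues  WHERE name LIKE :pat
    UNION ALL
    SELECT 'Team'   AS category, name FROM teams   WHERE name LIKE :pat
    UNION ALL
    SELECT 'Player' AS category, name FROM players WHERE name LIKE :pat
"""

def score(name, term):
    """Crude relevance: lower is better (an assumed rule, not a standard)."""
    n, t = name.lower(), term.lower()
    if n == t:
        return 0          # exact match
    if n.startswith(t):
        return 1          # name begins with the term
    if t in n.split():
        return 2          # term appears as a whole word
    return 3              # term appears only inside a word

def search(conn, term):
    # The term is bound as a parameter, never concatenated into the SQL.
    # Note: '%' and '_' typed by the user are not escaped in this sketch.
    rows = conn.execute(SEARCH_SQL, {"pat": f"%{term}%"}).fetchall()
    return sorted(rows, key=lambda row: (score(row[1], term), row[1]))

if __name__ == "__main__":
    for category, name in search(make_demo_db(), "Lion"):
        print(f"{category}: {name}")
```

With the sample rows above, searching for 'Lion' lists Team: Lion Tamers first (the name starts with the term), then Player: John Lion and Venue: White Lion. Running three separate queries and merging in application code would work just as well; the UNION ALL form mainly saves round trips.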

 

Let me know if you want any more information :)


Chapter 5 discusses the different algorithms used in search queries, such as Boolean, natural-language, thesaurus, and term searches.

 

Chapter 6 discusses optimization of relevant results for a given query.

 

Chapter 7 discusses both PageRank and the HITS method.

 

These are the most interesting of the book's 8 chapters. Ch 5 deals specifically with what you are looking for. Check your local city library; if they don't have it, recommend it to them and pick it up later ;)

 

The book has ~100 pages in total, so if your math skills are not too bad you may be able to read it in one night.

 

If you're not too keen on writing your own search engine, you may want to try Lucene. Zend Framework has a decent implementation for interfacing with Lucene.


I assume this is a simple matter of parsing the search string (using Regular Expressions for example) and then using the information to both structure an appropriate SQL query and to output the data in a specific style. So for example if you used the site: syntax it would simply append "site='$site'" to the WHERE clause. Is that the right general idea?

Similar; however, a search engine will not have its data stored in a database, no way. Could you imagine how long it would take to search? They use massive indexes.
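
To give a feel for what "index" means here, below is a toy inverted index in Python: a map from each word to the set of documents containing it, so a lookup never has to scan the documents themselves. This is only a minimal sketch of the general idea; the function names and sample documents are made up, and real engines add compression, ranking, sharding and a lot more.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def lookup(index, query):
    """Return ids of documents containing every query token (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

if __name__ == "__main__":
    # Hypothetical sample documents.
    docs = {
        1: "White Lion pub",
        2: "Lion Tamers pool team",
        3: "Red Cow pub",
    }
    index = build_index(docs)
    print(lookup(index, "lion"))
    print(lookup(index, "lion pub"))
```

The index is built once up front; after that each query costs a few dictionary lookups and set intersections instead of a pass over every record.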

 

It's probably worth taking a look at Sphinx. For a search feature on a website, it is far better than querying the database directly. http://sphinxsearch.com


Question 1: I do think that they only apply those if the site: is at index 0 of the string.

Actually, you can do it anywhere in the string. Try out:

test site:phpfreaks.com
(and yeah, they just use regex)
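
Something along these lines would pick the operators out from anywhere in the string and turn them into a parameterized WHERE clause. It's a hedged sketch in Python, not how Google actually does it: the operator list, the `body` and `site` column names, and both function names are assumptions for illustration.

```python
import re

# Hypothetical operator set; \b stops "website:foo" from matching "site:".
OPERATOR_RE = re.compile(r"\b(site|define):(\S+)", re.IGNORECASE)

def parse_query(raw):
    """Split a raw search string into free-text terms and operator filters."""
    filters = {}
    for name, value in OPERATOR_RE.findall(raw):
        filters[name.lower()] = value
    # Strip the operators out, then collapse the leftover whitespace.
    terms = " ".join(OPERATOR_RE.sub(" ", raw).split())
    return terms, filters

def build_where(terms, filters):
    """Build a WHERE clause with placeholders instead of inlining user input."""
    clauses, params = [], []
    if terms:
        clauses.append("body LIKE ?")   # 'body' is an assumed column name
        params.append(f"%{terms}%")
    if "site" in filters:
        clauses.append("site = ?")      # 'site' is an assumed column name
        params.append(filters["site"])
    return " AND ".join(clauses) or "1=1", params

if __name__ == "__main__":
    terms, filters = parse_query("test site:phpfreaks.com")
    print(terms, filters)
    print(build_where(terms, filters))
```

Using placeholders rather than appending "site='$site'" directly keeps whatever the user typed out of the SQL text, which avoids injection problems.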

I second the motion to take a look at Sphinx. I've heard a lot of good things about it. I've never actually worked with it myself, but I know of a site whose developers I kept up with, and it works pretty well for them.


I second the motion to take a look at Sphinx. I've heard a lot of good things about it. I've never actually worked with it myself, but I know of a site whose developers I kept up with, and it works pretty well for them.

I got a system working with it, where I have a database containing a few million records of text content. Using keywords, I can search the index for the records that match; these are then selected from the database to produce new content. This takes less than a second to return results. If you tried to use a database query to do the same thing, you would be waiting for minutes.

