Jump to content

New Search Engine


johnathanamber

Recommended Posts

I don't understand what the difference is. Or would do.

This is your requirement yes?

I have an auction site that the client wants the search engine to be able to search for items within a given mileage radius.

To find cities, towns, train stations, mcdonalds, whatever, within a given distance of a set point you use geo co-ordinates (latitude and longitude). It seems that you have this data in your database. So, once you have the lat, lon of your start point you can use what is called the haversine formula to find the nearest...

You are using this in the code, yes, but you do not have to. Use in the actual query to obtain the results required. The query that you are currently using is highly inefficient.

Link to comment
Share on other sites

it doesn't display also the originating ZIP

You are specifically excluding the search zip code with these two lines:

AND (latitude != $lat1 
AND longitude != $lon1)

since $lat1 and $lon1 are the values you retrieved from the database for the search zipcode, you are telling the query NOT to return that record.

 

I'm not sure why it would be including 74133.  Check the latitude and longitude for that record, make sure there is only one record for that zip, echo the sql and look to see what is causing it to be included.

 

 

Link to comment
Share on other sites

@Neil,

 

So what would it look like if you used this as a basis:

            $query = "SELECT  id, title, shortdescription, description, picture, BIN_price, auction_type, start_date, end_date, hits, cat, auction_nr, nr_items_left, reserve_price, zipcode, weight, zip, city, state
                  FROM jos_bid_auctions JOIN zip_codes ON zipcode = zip
                  WHERE description OR shortdescription like '%$term%'
                  AND(latitude <= $latN
                     AND latitude >= $latS
                     AND longitude <= $lonE
                     AND longitude >= $lonW)
                  ORDER BY zipcode";
            }

Link to comment
Share on other sites

@Neil

I do not see how your query is more efficient than the one I submitted.  In fact, it looks less efficient.  Without the limitations on the latitude and longitude, this query will have to calculate the distance for every auction in the table. 

 

SELECT a.description, a.zipcode, z.city, z.state, 
SQRT(POW(69.1 * (z.latitude - ".$latitude."), 2) + POW(69.1 * (".$longitude." - z.longitude) * COS(latitude / 57.3) , 2)) AS distance 
FROM jos_bid_auctions a INNER JOIN zipcodes z 
ON (a.zipcode=z.zipcode) 
HAVING distance  < ".$distance." 
ORDER BY distance ASC

 

 

The query I submitted, on the other hand, can take advantage of indexes on both tables to reduce the number of rows to be processed.  If Latitude and/or Longitude is indexed on the zipcode table, then that will limit the number of zipcodes to be checked.  Once a zipcode is found that fits the limits, an index on zipcode in the aution table can be used to retrieve the record.

SELECT A.description, A.zipcode, Z.city, Z.state
FROM jos_bid_auctions A JOIN zipcodes Z ON A.zipcode = Z.zipcode
WHERE A.description like '%auction%'
AND Z.latitude BETWEEN 36.286982858078 AND 35.997537141922
AND Z.longitude BETWEEN -95.699526863467 AND -96.058611136533
ORDER BY Z.state, Z.city, Z.zipcode$

 

 

Granted, having the distance calculation in the query could be handy.  But that could be done in PHP against the resultset.  I'm not sure which process there might be more efficient, I'm not sure it would make much difference.  And I will admit I do not know the internals of mySql; but I do know that a table scan looses to an indexed read in almost every case.  Also, I have never used a float for an index, but I don't see how it could cause significant problems.  In fact, I might personally be inclinded to make the lat and long DECIMAL columns with 5 or 6 digits on the right.

 

@johnathan

Use BETWEEN instead of <= AND >=.  I think the server can make better use of the indexes since it will know both limits in one clause.  At any rate, it should not be less efficient.  Just keep in mind that BETWEEN is inclusive (both limits also satisfy the condition).

Link to comment
Share on other sites

In fact, it looks less efficient.  Without the limitations on the latitude and longitude, this query will have to calculate the distance for every auction in the table.

This is not an issue. Having indexes on latitude, longitude will be fast.

 

Indexes should also exist for zipcode as there is a join on the field.

The returned result set should not be large as you are limiting to a given distance.

 

Granted, having the distance calculation in the query could be handy.  But that could be done in PHP against the resultset.

You should not use code to manipulate data that can be done from an initial query and return the correct data. More code, more parse time and more issues when it comes to updating the program. This is good practice for any language.

In fact, I might personally be inclinded to make the lat and long DECIMAL
Geo data is best stored as a float.

 

I have sites running maps using these queries from millions of records without issue. Read up on geocoding with mysql. It is very efficient.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.