

dal_oscar

stats - spiders


I am storing the IP address and user agent of every visitor to my website in a MySQL table, and I want to filter out the spiders and crawlers when I display the totals for web stats.

It's a really long list: 557 user agents. I get about 6000 hits a day, and it's impractical to query a separate user-agent table or perform a simple text search (I have tried both) for each hit, especially since I need to display the hits per day for 4 weeks.

Any suggestions?

How can you tell which one is a spider and which is not?

If that's easy, why not write the spiders to a separate stats table?

[quote name='Desdinova' date='Apr 6 2006, 08:24 AM' post='362205']
How can you tell which one is a spider and which is not?

If that's easy, why not write the spiders to a separate stats table?
[/quote]


That's the problem: checking which one is a spider.

I have a list of spider user agents. When I get a new hit, I currently just dump it into a different table. When displaying the stats, I retrieve each hit and compare its user agent against the spider list.

I could do this comparison when the hit is first recorded, but I don't want to slow things down for the visitor.

What I am really looking for is an efficient way to check whether a hit comes from a spider. A text search or a MySQL query per hit is very slow.

Suggestions?
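
For what it's worth, here is a minimal sketch of how the check could be made cheap enough to run when the hit is recorded, assuming the spider list holds exact user-agent strings. It is written in Python purely for illustration, since the thread does not name a language. The names `SPIDER_AGENTS`, `is_spider` and `record_hit`, and the sample user agents, are all hypothetical. The idea is to load the list once into a hash set, so each lookup no longer scans all 557 entries.

```python
# Hypothetical stand-in for the real list of 557 spider user agents.
SPIDER_AGENTS = frozenset(
    ua.lower()
    for ua in [
        "ExampleBot/1.0 (+http://example.com/bot)",
        "OtherCrawler/2.3",
    ]
)


def is_spider(user_agent: str) -> bool:
    """Return True if the user agent exactly matches a known spider.

    Matching is case-insensitive and ignores surrounding whitespace.
    """
    return user_agent.strip().lower() in SPIDER_AGENTS


def record_hit(ip: str, user_agent: str) -> dict:
    """Build the row to store, with the spider flag decided once, up front."""
    return {"ip": ip, "useragent": user_agent, "is_spider": is_spider(user_agent)}


print(record_hit("10.0.0.1", "OtherCrawler/2.3"))
```

If the flag is stored with the hit, the stats page only has to count rows where it is false. This only works for exact matches; substring matching would need a different approach.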

Maybe you could speed up the query by putting the spiders in categories.

Say you create a table Spiders.
In this table you create columns ID, A, B, C, D through Z.

Every spider gets written into the column that matches the first character of its user agent.

So basically, your query doesn't search all fields when checking for a spider; it only checks the column that matches the first character.

I think this should at least decrease the load and thus the waiting time.


But whether it's a good solution, I don't really know.
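
The first-character idea above can be sketched in memory like this. It is a rough Python illustration only, not the table layout itself; `build_buckets`, `is_spider` and the sample user agents are hypothetical names made up for the example.

```python
from collections import defaultdict


def build_buckets(spider_agents):
    """Group spider user agents by their (lowercased) first character."""
    buckets = defaultdict(set)
    for ua in spider_agents:
        ua = ua.strip().lower()
        if ua:  # skip empty entries, which have no first character
            buckets[ua[0]].add(ua)
    return buckets


def is_spider(user_agent, buckets):
    """Check only the bucket that matches the first character."""
    ua = user_agent.strip().lower()
    if not ua:
        return False
    # .get avoids creating empty buckets for unseen first characters
    return ua in buckets.get(ua[0], set())


# Hypothetical sample data; the real list has 557 entries.
buckets = build_buckets(["ExampleBot/1.0", "OtherCrawler/2.3", "EchoSpider/0.9"])
print(sorted(buckets))                        # the bucket keys in use
print(is_spider("echospider/0.9", buckets))   # looked up in the "e" bucket only
```

In a database, keeping one row per spider with an index on the user-agent column would probably narrow the search in a similar way without needing 26 columns, but that is a guess about this setup and not something I have measured.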

[quote name='Desdinova' date='Apr 6 2006, 08:55 AM' post='362215']
Maybe you could speed up the query by putting the spiders in categories.

Say you create a table Spiders.
In this table you create columns ID, A, B, C, D through Z.

Every spider gets written into the column that matches the first character of its user agent.

So basically, your query doesn't search all fields when checking for a spider; it only checks the column that matches the first character.

I think this should at least decrease the load and thus the waiting time.
But whether it's a good solution, I don't really know.
[/quote]

Hmm....that sounds better. I'll implement this immediately.

But I'll be happy to hear of other methods if someone knows any.

Thanks Desdinova

[quote name='dal_oscar' date='Apr 6 2006, 08:59 AM' post='362216']
Hmm....that sounds better. I'll implement this immediately.

But I'll be happy to hear of other methods if someone knows any.

Thanks Desdinova
[/quote]


Hi,
I have done that, but the stats page still takes ages to load!
Any other ideas?
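
One thing that might be worth trying, sketched here as an assumption and not as a tested fix: let the database exclude the spiders and count the hits per day in a single query, instead of fetching every hit and comparing it in application code. The example uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table and column names (`hits`, `spiders`, `useragent`, `hit_date`) are hypothetical.

```python
import sqlite3


def daily_human_hits(conn):
    """Count hits per day, skipping rows whose user agent is a known spider."""
    return conn.execute(
        """
        SELECT h.hit_date, COUNT(*) AS hits
        FROM hits AS h
        LEFT JOIN spiders AS s ON s.useragent = h.useragent
        WHERE s.useragent IS NULL      -- no match in the spider list
        GROUP BY h.hit_date
        ORDER BY h.hit_date
        """
    ).fetchall()


# In-memory demo data; the schema is an assumption, not the poster's.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hits (ip TEXT, useragent TEXT, hit_date TEXT);
    CREATE TABLE spiders (useragent TEXT PRIMARY KEY);  -- primary key is indexed
    """
)
conn.executemany("INSERT INTO spiders VALUES (?)", [("ExampleBot/1.0",)])
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("10.0.0.1", "Mozilla/5.0 (hypothetical browser)", "2006-04-05"),
        ("10.0.0.2", "ExampleBot/1.0", "2006-04-05"),
        ("10.0.0.3", "Mozilla/5.0 (hypothetical browser)", "2006-04-06"),
        ("10.0.0.1", "Mozilla/5.0 (hypothetical browser)", "2006-04-06"),
    ],
)

print(daily_human_hits(conn))  # [('2006-04-05', 1), ('2006-04-06', 2)]
```

With an index on the spider user-agent column, the join does one indexed lookup per hit, and the 4 weeks of daily totals come back as at most 28 rows. Whether this is fast enough on the real data would need to be checked there.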

