stats - spiders


5 replies to this topic

#1 dal_oscar

dal_oscar
  • New Members
  • Pip
  • Newbie
  • 6 posts

Posted 06 April 2006 - 11:25 AM

I am storing the IP address and user agent of every visitor to my website in a MySQL table, and I want to filter out the spiders and crawlers when I am displaying the totals for web stats.

It's a really long list: 557 user agents. I get about 6000 hits a day, and it's impossible to query a separate user-agent table or perform a simple text search (I have tried both) for each hit, especially since I need to display the hits per day for 4 weeks.

Any suggestions?

#2 Desdinova

Desdinova
  • Members
  • PipPipPip
  • Advanced Member
  • 41 posts

Posted 06 April 2006 - 01:24 PM

How can you tell which one is a spider and which is not?

If that's easy, why not write the spiders to another stats table?

#3 dal_oscar

dal_oscar
  • New Members
  • Pip
  • Newbie
  • 6 posts

Posted 06 April 2006 - 01:31 PM

[quote name='Desdinova' post='362205' date='Apr 6 2006, 08:24 AM']
How can you tell which one is a spider and which is not?

If that's easy, why not write the spiders to another stats table?
[/quote]


That's the problem - checking which is a spider.

I have a list of user agents, and when I get a new hit, I am currently just dumping it in a different table. During stats display, I retrieve each hit and compare its user agent to the list of spider user agents.

Now, I could do this comparison when the hit record is first dumped, but I don't want to slow things down for the visitor.

What I am really looking for is an optimal way to check whether a hit is by a spider. A text search or a MySQL query are both very slow.

Suggestions?
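
For reference, the display-time comparison described in this post could be sketched roughly as below. This is only an illustration under assumptions, not the code actually in use: the names `daily_human_hits`, `hits`, and `spider_agents` are hypothetical, hits are assumed to arrive as (day, user agent) pairs already read from the hits table, and a spider is assumed to match a list entry exactly. The spider list is loaded once into an in-memory set instead of being queried once per hit.

```python
from collections import Counter


def daily_human_hits(hits, spider_agents):
    """Count hits per day, skipping any hit whose user agent is a known spider.

    hits: iterable of (day, user_agent) pairs (hypothetical shape).
    spider_agents: iterable of spider user-agent strings (exact matches only).
    """
    spiders = set(spider_agents)  # built once, not once per hit
    totals = Counter()
    for day, user_agent in hits:
        if user_agent not in spiders:
            totals[day] += 1
    return dict(totals)
```

With the list held in a set, each hit costs one lookup, so the 557 entries are not scanned again for every row.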


#4 Desdinova

Desdinova
  • Members
  • PipPipPip
  • Advanced Member
  • 41 posts

Posted 06 April 2006 - 01:55 PM

Maybe you could speed up the query by putting the spiders in categories.

Say you create a table Spiders.
In this table you create columns ID, A, B, C, D through Z.

Every spider gets written down in the column which matches the first character of the spider's user agent.

So basically, your query doesn't search all fields when checking for a spider, but only checks the column which matches the first character.

I think this should at least decrease the load and thus the waiting time.


But whether it's a good solution, I don't really know.
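
A minimal sketch of the first-character idea, written in Python and kept in memory rather than as table columns A through Z. The sample user agents and the names `build_buckets` and `is_spider` are made up for illustration, and an exact match against the list is assumed.

```python
from collections import defaultdict

# Hypothetical sample; the real list would hold all 557 spider user agents.
SPIDER_AGENTS = [
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]


def build_buckets(agents):
    """Group spider user agents by their upper-cased first character."""
    buckets = defaultdict(set)
    for agent in agents:
        if agent:  # skip empty strings, which have no first character
            buckets[agent[0].upper()].add(agent)
    return buckets


def is_spider(user_agent, buckets):
    """Check only the bucket that matches the user agent's first character."""
    if not user_agent:
        return False
    return user_agent in buckets.get(user_agent[0].upper(), set())
```

One thing to keep in mind with this grouping: many browsers and some spiders send a user agent that starts with "Mozilla", so the M group may end up holding a large share of the entries.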

#5 dal_oscar

dal_oscar
  • New Members
  • Pip
  • Newbie
  • 6 posts

Posted 06 April 2006 - 01:59 PM

[quote name='Desdinova' post='362215' date='Apr 6 2006, 08:55 AM']
Maybe you could speed up the query by putting the spiders in categories.

Say you create a table Spiders.
In this table you create columns ID, A, B, C, D through Z.

Every spider gets written down in the column which matches the first character of the spider's user agent.

So basically, your query doesn't search all fields when checking for a spider, but only checks the column which matches the first character.

I think this should at least decrease the load and thus the waiting time.
But whether it's a good solution, I don't really know.
[/quote]

Hmm....that sounds better. I'll implement this immediately.

But I'll be happy to hear of other methods if someone knows any.

Thanks Desdinova

#6 dal_oscar

dal_oscar
  • New Members
  • Pip
  • Newbie
  • 6 posts

Posted 07 April 2006 - 10:53 AM

[quote name='dal_oscar' post='362216' date='Apr 6 2006, 08:59 AM']
Hmm....that sounds better. I'll implement this immediately.

But I'll be happy to hear of other methods if someone knows any.

Thanks Desdinova
[/quote]


Hi,
I have done that, but it still takes ages to load!!!
Any other ideas?





