


User Agent Question


7 replies to this topic

#1 blacksnday

blacksnday
  • Members
  • Member
  • 12 posts

Posted 07 September 2006 - 02:31 PM

Normally when viewing User Agents for Browsers, it would look similar to:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

My question is: if the logged string itself begins with "User-Agent:", such as:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
would you believe it was real or fake? Or do certain clients really put
"User-Agent:" inside the UA string?

My reason for asking is to decide whether to block or allow it.
Within the past few days I have been receiving hits with that type of UA,
and the few different IPs showing that UA have basically been
loading up to 10 pages per second,
which makes me think it is not a valid UA.

#2 ober

ober
  • Staff Alumni
  • Advanced Member
  • 5,337 posts
  • Location: East Coast, USA

Posted 07 September 2006 - 02:35 PM

Not sure. Keep in mind that some browsers (Opera) can easily spoof the UA. It's a shame, but some sites will not work because the webmaster actually goes out of his way to block UAs other than IE. I run around half the time identifying Opera as IE because of this.

Info: PHP Manual


#3 blacksnday

blacksnday
  • Members
  • Member
  • 12 posts

Posted 07 September 2006 - 02:42 PM

Yeah, FF also offers a plugin to change the UA.
I use my IP/UA blocker with extreme caution, hence the question.
I block UAs based on keywords such as:
WebWhacker, WWWCopy, BackStreet Browser, etc.

A lot of programs also use a fixed UA that cannot be changed, such as:
Java and
WebCapture (Adobe Acrobat web grabbing for preserving as a PDF), etc.

I don't actually ban by full UA, just by keywords that shouldn't be in the UA and that
help show it's bad, which still allows other stuff like
RssFwd to work with no probs :P
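
The keyword approach described above could be sketched roughly like this in Python. This is only an illustration, not the poster's actual blocker: the keyword list is taken from the examples in the post, and the function name `matched_keyword` is made up.

```python
# Hypothetical keyword list, copied from the examples named in the post.
BAD_UA_KEYWORDS = ("WebWhacker", "WWWCopy", "BackStreet Browser")


def matched_keyword(user_agent, keywords=BAD_UA_KEYWORDS):
    """Return the first keyword found in the UA (case-insensitive), or None."""
    ua = user_agent.lower()
    for keyword in keywords:
        if keyword.lower() in ua:
            return keyword
    return None


# A normal browser UA contains none of the keywords, so it is not flagged.
browser_hit = matched_keyword("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")
# A UA that merely contains a keyword is flagged, whatever else it says.
grabber_hit = matched_keyword("Mozilla/4.0 (compatible; webwhacker 5.0)")
```

Returning the keyword rather than a plain yes/no makes it easy to log which rule caused a ban.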

Which brings me back to figuring out whether "User-Agent:" is good or bad,
because why would a UA string tell you it's a User-Agent when it is already known to be?
Hmm...

#4 ober

ober
  • Staff Alumni
  • Advanced Member
  • 5,337 posts
  • Location: East Coast, USA

Posted 07 September 2006 - 02:58 PM

Good question.  One I certainly don't have an answer for.



#5 shoz

shoz
  • Staff Alumni
  • Advanced Member
  • 600 posts

Posted 07 September 2006 - 03:12 PM

Normally when viewing User Agents for Browsers, it would look similar to:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

My question is: if the logged string itself begins with "User-Agent:", such as:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
would you believe it was real or fake? Or do certain clients really put
"User-Agent:" inside the UA string?

My reason for asking is to decide whether to block or allow it.
Within the past few days I have been receiving hits with that type of UA,
and the few different IPs showing that UA have basically been
loading up to 10 pages per second,
which makes me think it is not a valid UA.


It's probably an error in the bot that's being used to access the page. The "User-Agent" string should be part of the header's field name, not part of its value. The application is probably sending the header as shown below.
User-Agent: User-Agent: ...
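
To make the field name vs. value distinction concrete, here is a small Python sketch. It assumes the bot really does send the doubled header line shown above, which is a guess; the helper names `parse_header_line` and `has_doubled_field_name` are made up for illustration.

```python
def parse_header_line(line):
    """Split a raw 'Name: value' header line at the first colon."""
    name, _, value = line.partition(":")
    return name.strip(), value.strip()


def has_doubled_field_name(name, value):
    """True if the value itself starts with '<name>:' (case-insensitive)."""
    return value.lower().startswith(name.lower() + ":")


# What a buggy client might put on the wire (an assumption about the bot).
raw = "User-Agent: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
name, value = parse_header_line(raw)

# The server only logs the value, so the log shows the UA
# starting with a second "User-Agent:".
doubled = has_doubled_field_name(name, value)
```

A well-formed request would leave `value` starting with `Mozilla/4.0`, and the check would come back false.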


#6 blacksnday

blacksnday
  • Members
  • Member
  • 12 posts

Posted 07 September 2006 - 03:28 PM

It's probably an error in the bot that's being used to access the page. The "User-Agent" string should be part of the header's field name, not part of its value. The application is probably sending the header as shown below.

User-Agent: User-Agent: ...


So basically I would be safe to assume it is a bot when it appears this way?
The logs show it sure acts like a bot when loading 10+ pages in less than 1 second.

#7 shoz

shoz
  • Staff Alumni
  • Advanced Member
  • 600 posts

Posted 07 September 2006 - 03:47 PM

You can't really know that it's a bot based on the User-Agent string, but it can be a good guess. Some individuals download premade bots that have hardcoded/default User-Agent strings and never bother (or don't know how) to change them.

I don't know of any bugs in any apps, MSIE or otherwise, that put the "User-Agent" string in the header value, but it's not impossible. So I wouldn't block on the User-Agent string you posted, but you'll have to decide for yourself.

the few different IPs showing that UA have basically been
loading up to 10 pages per second


The behaviour that you describe above would be a better reason to decide that it's a bot.

If you're going to block anything you'll have to decide why you're doing it in the first place. Is there any real reason to try to block the bots from accessing the page?

Blocking based on a rule is OK: for example, if an IP accesses more than X pages within Y seconds, block it for Z minutes. You'll probably want to make exceptions for Google, Yahoo, etc. (by IP/User-Agent or some other method). Depending on why you're doing it, it may not be worth the trouble.
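
A rate rule like the one described above might look something like this in Python. It is a minimal in-memory sketch under several assumptions: the class name `RateLimiter`, its parameters, and the allowlisted address are all hypothetical, and a real site would need shared storage and a sturdier way to recognise search-engine crawlers.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Block an IP for a while once it exceeds max_hits within the window."""

    def __init__(self, max_hits, window_seconds, block_seconds, allowlist=()):
        self.max_hits = max_hits
        self.window = window_seconds
        self.block = block_seconds
        self.allowlist = set(allowlist)   # e.g. known crawler IPs (assumed)
        self.hits = defaultdict(deque)    # ip -> timestamps of recent hits
        self.blocked_until = {}           # ip -> time the block expires

    def allow(self, ip, now):
        """Record a hit at time `now` (seconds); return False if blocked."""
        if ip in self.allowlist:
            return True
        if self.blocked_until.get(ip, 0) > now:
            return False
        recent = self.hits[ip]
        recent.append(now)
        # Drop hits that have fallen out of the time window.
        while recent and recent[0] <= now - self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.blocked_until[ip] = now + self.block
            recent.clear()
            return False
        return True


# Example: more than 5 pages within 1 second earns a 60-second block.
limiter = RateLimiter(max_hits=5, window_seconds=1.0, block_seconds=60,
                      allowlist={"192.0.2.10"})
```

The caller passes the timestamp in, which keeps the sketch easy to test; in practice `now` would come from something like `time.time()`.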

There may be other ways to do something about it but nothing comes to mind at the moment.

#8 blacksnday

blacksnday
  • Members
  • Member
  • 12 posts

Posted 07 September 2006 - 03:57 PM

Blocking based on a rule is OK: for example, if an IP accesses more than X pages within Y seconds, block it for Z minutes. You'll probably want to make exceptions for Google, Yahoo, etc. (by IP/User-Agent or some other method). Depending on why you're doing it, it may not be worth the trouble.


Currently I don't block/ban based on
an IP accessing more than a set number of pages within a set time
(even though I have some alpha-stage code coming for that soon).

At this time I will probably block the UA containing "User-Agent:",
and since I track all bans (who was blocked by which ban, what UA they had, etc.),
I will be able to better determine at a later date whether 'good' people are being wrongly banned.

Thanks for the help!



