
getting a visiting bot to automatically run a program called from index


ocpaul20


I want to get a bot such as Sogou to run a program when it visits my index page, but for some reason it doesn't work.

 

Maybe bots are programmed not to follow header('location .... statements, so how would I go about this?

 

can it be done?

Yes and no. If you look at a list of bot identifiers, there are lots of them. There are also so many browser user agents that keeping track of them all is pretty hard. What you could do is this.

Grab the user agent (UA) of every visitor that hits your index page. Take the list of browsers and bots from here:

http://www.useragentstring.com/pages/useragentstring.php

Parse the UA against the list of browsers; if there is no match, parse it against the list of bots. You could just stop after the browser parse, but if you parse against the bot list and don't get a hit, you can log the UA and manually add it to the list it belongs in, which will improve your future results.

If the parse hits a bot, you can then trigger your bot script. If it's a complete miss, you can find the UA in your logs, and next time it will either pass as a browser or trigger the bot script.

This won't catch something that sends a valid browser UA, but good bots don't do that.
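The matching steps above can be sketched roughly as follows. This is only an illustration in Python rather than PHP; the pattern lists and the function name `classify_user_agent` are made up for the example, and a real list would be built from a UA catalogue such as the one linked above.

```python
import re

# Hypothetical sample patterns, not a complete or authoritative list.
BROWSER_PATTERNS = [r"Firefox/", r"Chrome/", r"Safari/", r"MSIE ", r"Opera"]
BOT_PATTERNS = [r"Sogou", r"Googlebot", r"bingbot", r"Baiduspider"]


def matches_any(user_agent, patterns):
    """Return True if any pattern is found in the UA (case-insensitive)."""
    return any(re.search(p, user_agent, re.IGNORECASE) for p in patterns)


def classify_user_agent(user_agent, unknown_log):
    """Classify a UA as 'browser', 'bot', or 'unknown'.

    Browsers are checked first, then bots, as described in the post.
    A UA that matches neither list is appended to unknown_log so it can
    be reviewed and added to the right list by hand.
    """
    if matches_any(user_agent, BROWSER_PATTERNS):
        return "browser"
    if matches_any(user_agent, BOT_PATTERNS):
        return "bot"
    unknown_log.append(user_agent)
    return "unknown"


if __name__ == "__main__":
    misses = []
    print(classify_user_agent("Sogou web spider/4.0", misses))
    print(classify_user_agent("SomeNewCrawler/1.0", misses))
    print("to review:", misses)
```

One thing to watch: some crawlers include browser tokens such as `Chrome/` in their UA string, so with a browser-first order they would be classed as browsers. Checking the bot list first may work better in that case.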

 

 

HTH

Teamatomic

I have done the programming and it works when I call index from another program using CURL and spoofing the user agent, so the logic works OK.

 

It is just that the real bot (for example, sogou.com) does not seem to trigger the running of the program, so it appears that it does not follow the header(.... statement the way my curl program does. So maybe redirects are not handled? (That may be the answer.)

 

The thing is, I had to program the curl program to handle redirects before it would run effectively.
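To illustrate the difference between a client that follows redirects and one that does not, here is a small self-contained sketch in Python (not PHP). The paths `/` and `/task`, the `hits` list and the `run_demo` function are all invented for the demo. It starts a throwaway local server whose index answers with a 302 and a Location header, then requests it with two clients.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

hits = []  # paths requested by clients, recorded by the server


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        hits.append(self.path)
        if self.path == "/":
            # Similar in spirit to sending a Location header from the index page.
            self.send_response(302)
            self.send_header("Location", "/task")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"task ran"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client 1: http.client does not follow redirects on its own.
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")
        status = conn.getresponse().status
        conn.close()
        hits_without_follow = list(hits)

        # Client 2: urllib.request follows the 302 by default.
        hits.clear()
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
            body = resp.read()
        hits_with_follow = list(hits)
    finally:
        server.shutdown()
        server.server_close()
    return status, hits_without_follow, hits_with_follow, body


if __name__ == "__main__":
    print(run_demo())
```

The first client only ever requests `/`, so the target of the redirect is never hit. A crawler that behaves like this would never trigger the program, no matter how the index page is written.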

 

I guess that bots are just looking at the text of the one page and not actually following links and re-directs like a browser would do.

 

That's a shame, as I was hoping to automate something without having to set up a cron job.

You can make it work. Once your script on the index page has checked the visitor's UA, you could use JavaScript to submit a form whose action is the script you want to run, if you don't want to include the code in your check script. You could even use GET to send the script parameters. The trick is that you have to launch it all yourself. As you have found, the bot won't launch your script for you.
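The "launch it yourself" idea can also be done entirely on the server: instead of redirecting the bot and hoping it follows, the index handler calls the task directly once the UA check says it is a bot. A minimal sketch in Python (not PHP), where `is_bot`, `run_bot_task`, `handle_index` and the token list are all hypothetical names for illustration:

```python
# Hypothetical sample tokens, not a complete list of crawler identifiers.
BOT_TOKENS = ("sogou", "googlebot", "bingbot")


def is_bot(user_agent):
    """Very rough check: does the UA contain a known bot token?"""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def run_bot_task(user_agent, task_log):
    """Stand-in for the program to run; here it only records the visit."""
    task_log.append(user_agent)


def handle_index(user_agent, task_log):
    """Serve the index page, running the task in-process for bots.

    No redirect is sent, so nothing depends on the client following one.
    """
    if is_bot(user_agent):
        run_bot_task(user_agent, task_log)
    return "<html><body>Welcome</body></html>"


if __name__ == "__main__":
    log = []
    handle_index("Sogou web spider/4.0", log)
    handle_index("Mozilla/5.0 Firefox/115.0", log)
    print(log)
```

Every visitor gets the same page back; only the side effect differs, and it happens during the bot's own request.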

 

Just curious: are you trying to poison a bot? If you are, there are much easier ways to go about it.

 

HTH

Teamatomic

Thanks for your reply.

 

No, not trying to poison a bot, but thinking about how to get bots to 'register' themselves by placing a link in a place that cannot be found or 'seen' by a normal browser visitor.
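A rough sketch of that hidden-link idea, in Python rather than PHP: the index page contains a link that is hidden from human visitors with CSS, and any client that requests the link's target gets recorded. The path `/bot-register` and the function names are invented for the example, and this assumes the bot actually follows links found in the page source.

```python
TRAP_PATH = "/bot-register"  # hypothetical URL that only the hidden link points to


def render_index():
    """Return an index page with a link hidden from normal browser visitors."""
    return (
        "<html><body><h1>Welcome</h1>"
        f'<a href="{TRAP_PATH}" style="display:none">register</a>'
        "</body></html>"
    )


def handle_request(path, user_agent, registry):
    """Record the UA of anything that requests the trap path.

    registry maps a UA string to the number of times it hit the trap.
    """
    if path == TRAP_PATH:
        registry[user_agent] = registry.get(user_agent, 0) + 1
        return "registered"
    return render_index()


if __name__ == "__main__":
    seen = {}
    handle_request("/", "Mozilla/5.0 Firefox/115.0", seen)
    handle_request(TRAP_PATH, "Sogou web spider/4.0", seen)
    print(seen)
```

A human using a browser never sees the link, so in principle only clients that read the raw HTML and follow its links end up in the registry.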

 

I know you can generally see by the bot's action in the logs that it is a bot, but anyway....

 

I am curious how this idea of yours would work in practice, as I did not think that bots ran JavaScript. I thought that was why most statistics code is written in JavaScript.

 

I assume it would use document.write and then submit the form.

Care to suggest a short example for me and others please? :-)

 
