imperium2335 Posted April 1, 2009
Is there a way to make a PHP script browse Google search results (or those of any other search engine) and explore the listed websites looking for a keyword? Or is there another language better suited to this? I think part of it would be to strip the source code of every page and search it for the word, but I'm not sure how to automate the exploring itself. Regards, Tom.
thebadbad Posted April 1, 2009
That should be perfectly doable in PHP. You will need to scrape a search results page for the result URLs, and then scrape those pages. Have a look at my posts in some earlier threads: http://www.phpfreaks.com/forums/index.php/topic,224108.0.html http://www.phpfreaks.com/forums/index.php/topic,221824.0.html
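As a rough illustration of the two scraping steps described above (collect URLs from a results page, then check each page for the keyword), here is a minimal sketch. It is not the code from the linked threads. The function names `fetchPage`, `extractLinks` and `pageContainsKeyword` and the `ExampleBot/0.1` user agent are made up for this example, and it assumes `allow_url_fopen` is enabled and that the results page exposes its hits as plain absolute `http(s)` links, which will not be true of every search engine.

```php
<?php
// Sketch only: all function names here are hypothetical, not from the thread.

/** Download a page, or return null on failure (assumes allow_url_fopen is on). */
function fetchPage(string $url): ?string
{
    $context = stream_context_create([
        'http' => ['timeout' => 10, 'user_agent' => 'ExampleBot/0.1'],
    ]);
    $html = @file_get_contents($url, false, $context);
    return $html === false ? null : $html;
}

/** Return the unique absolute http(s) URLs found in <a href="..."> attributes. */
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if (preg_match('#^https?://#i', $href)) {
            $links[$href] = true; // array keys de-duplicate
        }
    }
    return array_keys($links);
}

/** Case-insensitive check for a keyword in the visible text of a page. */
function pageContainsKeyword(string $html, string $keyword): bool
{
    if ($keyword === '') {
        return false;
    }
    // strip_tags() keeps the text inside <script>/<style>, so drop those first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html) ?? $html;
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return stripos($text, $keyword) !== false;
}
```

To automate the exploring, you would call `fetchPage()` on the results URL, loop over `extractLinks()` of that HTML, fetch each link in turn and keep the ones where `pageContainsKeyword()` returns true.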
Yacoby Posted April 1, 2009
If you are going to do anything more complex than just look for words, there is a DOM library with jQuery-like selectors, which for some things is (IMHO) far better than regex. It depends on how much you like regex, I suppose. http://simplehtmldom.sourceforge.net/
imperium2335 Posted April 1, 2009 Author
Thanks for your replies. Ultimately, what I want to make is something that scrapes the SE results for a given keyword, e.g. "blog", then explores all those URLs, checking the PageRank of the posts and whether they are dofollow. If they are, I want them saved to a database.
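For the dofollow part, one possible approach is to treat a link as "dofollow" when its `rel` attribute does not contain the `nofollow` token. The sketch below does that with PHP's built-in DOM extension and DOMXPath rather than the Simple HTML DOM library mentioned earlier. The names `classifyLinks` and `saveUrl`, and the `dofollow_pages` table with its `url` column, are hypothetical. It does not look at page-level `<meta name="robots">` tags, and it does not cover PageRank at all, since the thread gives no way of looking that up.

```php
<?php
// Sketch only: function, table and column names are hypothetical.

/**
 * Split the links of a page into "followed" and "nofollow" groups,
 * based on whether the rel attribute contains the nofollow token.
 */
function classifyLinks(string $html): array
{
    $result = ['followed' => [], 'nofollow' => []];
    if (trim($html) === '') {
        return $result;
    }

    $doc = new DOMDocument();
    // Collect parser warnings quietly; scraped HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//a[@href]') as $anchor) {
        // rel is a space-separated token list, e.g. rel="external nofollow".
        $rel = strtolower($anchor->getAttribute('rel'));
        $tokens = preg_split('/\s+/', $rel, -1, PREG_SPLIT_NO_EMPTY);
        $group = in_array('nofollow', $tokens, true) ? 'nofollow' : 'followed';
        $result[$group][] = $anchor->getAttribute('href');
    }
    return $result;
}

/** Store a URL using a prepared statement (assumes the table already exists). */
function saveUrl(PDO $db, string $url): void
{
    $stmt = $db->prepare('INSERT INTO dofollow_pages (url) VALUES (:url)');
    $stmt->execute([':url' => $url]);
}
```

A crawler could then call `classifyLinks()` on each fetched page and pass the page URL to `saveUrl()` whenever the links it cares about end up in the `followed` group.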
This topic is now archived and is closed to further replies.