jasonxxx102 Posted January 8, 2013

I have a basic PHP web crawler script and I need to expand its functionality. The problem is I'm a total noob at PHP and my knowledge is very basic, so I'm coming here for some help.

My goal is to have a basic user input (a text box). When the user types in a phrase, let's say "Red Apples", and hits the enter button, the script should start crawling the web for the phrase "Red Apples" and store the plain-text results, along with the URL they originated from, in a database.

Here is what I've got so far:

    error_reporting( E_ERROR );
    define( "CRAWL_LIMIT_PER_DOMAIN", 50 );

    $domains = array(); // host => number of pages crawled on that host
    $urls = array();    // every URL crawled so far

    function crawl( $url ) {
        global $domains, $urls;
        echo "Crawling $url... ";
        $parse = parse_url( $url );
        $domains[ $parse['host'] ]++;
        $urls[] = $url;
        $content = file_get_contents( $url );
        if ( $content === FALSE ) {
            echo "Error.\n";
            return;
        }
        // keep everything from the first occurrence of "body" onwards
        $content = stristr( $content, "body" );
        preg_match_all( '/http:\/\/[^ "\']+/', $content, $matches );
        echo 'Found ' . count( $matches[0] ) . " urls.\n";
        foreach( $matches[0] as $crawled_url ) {
            $parse = parse_url( $crawled_url );
            // skip domains that have hit the limit and URLs already crawled
            if ( $domains[ $parse['host'] ] < CRAWL_LIMIT_PER_DOMAIN && !in_array( $crawled_url, $urls ) ) {
                sleep( 1 );
                crawl( $crawled_url );
            }
        }
    }

If anybody could point me in the right direction, that would be awesome.
cpd Posted January 8, 2013

I see no specific problem and no offer of payment for work, so what exactly are you looking for? You're sure as hell not going to get someone to write the code for you.
jasonxxx102 (Author) Posted January 9, 2013 (edited)

Did I ask for somebody to write the code for me? I asked for somebody to point me in the right direction. If you're not going to be constructive, just save your time and don't post.
haku Posted January 9, 2013

What are you asking? You've shown us code, but you didn't tell us what the problem is or what issues you are facing.
gizmola Posted January 9, 2013

Without really looking at the code you have, it's clear that there are two obvious elements to your question:

1. Accept input from a text box
How about an HTML form? Code that up, and have the form post to your crawler script. The phrase will be available in the $_POST superglobal.

2. Store the results in a database
Pick a database. There are many to choose from, including NoSQL databases like MongoDB. You'll have to design an appropriate schema. It's not clear what the structure should be, or what the purpose of storing the data is in the first place.
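A rough sketch of those two elements, in case it helps. None of this comes from the original script: the field name search_phrase, the table crawl_results, the helper functions, and the choice of SQLite through PDO are all assumptions made up for illustration, and the schema is only a guess since the thread never settles on one.

```php
<?php
// Sketch only. search_phrase, crawl_results and every function below are
// hypothetical names; SQLite via PDO is an assumed database choice.

// Element 1: read the phrase posted by a form such as
//   <form method="post" action="crawler.php">
//     <input type="text" name="search_phrase"> <input type="submit">
//   </form>
// Returns the trimmed phrase, or null if nothing usable was submitted.
function clean_phrase(array $post): ?string
{
    $phrase = $post['search_phrase'] ?? '';
    $phrase = is_string($phrase) ? trim($phrase) : '';
    return $phrase === '' ? null : $phrase;
}

// Returns the page's plain text if it contains the phrase
// (case-insensitive), otherwise null. strip_tags() removes the tags only;
// the contents of <script> and <style> elements are left in the text.
function matching_text(string $html, string $phrase): ?string
{
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($html)));
    return stripos($text, $phrase) !== false ? $text : null;
}

// Element 2: open the database and create a guessed schema if needed.
function open_database(string $dsn = 'sqlite::memory:'): PDO
{
    $db = new PDO($dsn);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE IF NOT EXISTS crawl_results (
        id      INTEGER PRIMARY KEY,
        phrase  TEXT NOT NULL,
        url     TEXT NOT NULL,
        content TEXT NOT NULL
    )');
    return $db;
}

// Store one match; the prepared statement keeps page text out of the SQL.
function store_result(PDO $db, string $phrase, string $url, string $text): void
{
    $stmt = $db->prepare(
        'INSERT INTO crawl_results (phrase, url, content) VALUES (?, ?, ?)'
    );
    $stmt->execute([$phrase, $url, $text]);
}
```

Inside crawl(), right after file_get_contents() succeeds, you could call matching_text( $content, $phrase ) and pass any non-null result to store_result() together with $url. The phrase itself would come from clean_phrase( $_POST ) before the first crawl() call.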