rhock_95 Posted September 1, 2007

Can someone explain how to use this class? It is a scraping tool that uses cURL. I am familiar with basic PHP, but cURL is entirely new to me and I don't have a clue how to use this class. Does this file get edited, or do I need a separate script that calls the class the way it is? Any help or insights are appreciated.

<?php
/*******************************************************************************
 * grabber.php
 * by voyager, 2003
 *
 * A class which is useful for grabbing any information from any site over the net.
 * It can return a single value (text) or an array (texts) with given 'markup'
 * strings. The class uses the PHP cURL functions, so you need cURL installed on
 * your server.
 *******************************************************************************/
class Grabber
{
    var $content;
    var $content_array;
    var $noURL;     // boolean which marks whether to open the URL or not
    var $text;      // the text without the unneeded starting and ending parts
    var $searchar;  // the searched array
    var $searchtxt; // the searched text

    // The constructor opens a URL and writes it to the given file and dir.
    // If the 4th argument is given, it checks whether this file already exists
    // and when it was last modified; if it is older, it opens the URL,
    // else it opens the file. If $ifmod=0 it always opens the URL.
    // It defaults to 24 hours.
    // Note: as written, $tmpdir, $tmpfile and $ifmod are never used;
    // the URL is fetched on every call.
    function __construct($url, $tmpdir = 'tmp/', $tmpfile = 'tmp.txt', $ifmod = 86400)
    {
        $this->content = "";
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        $useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040206 Firefox/0.8";
        curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
        $this->content = curl_exec($ch);
        curl_close($ch);
    }

    // This grabs only a piece of text.
    // Note: the +1 skips one additional character after the $start marker.
    function grab_unit($start, $end)
    {
        // cut from start to end
        $this->text = substr($this->content, strpos($this->content, $start) + strlen($start) + 1);
        $this->text = substr($this->text, 0, strpos($this->text, $end));
        $this->searchtxt = $this->text;
    }

    // It gets a start delimiter, an end delimiter and an array of pieces to
    // put out. Returns the array of needed information.
    // $delimstart and $delimend are around the pieces of searched data.
    function grab_array($start, $delimstart, $delimend, $end)
    {
        // cut from start to end
        $this->text = substr($this->content, strpos($this->content, $start) + strlen($start) + 1);
        $this->text = substr($this->text, 0, strpos($this->text, $end));
        // getting out the unneeded parts and pushing the rest into the array
        $this->searchar = preg_split("@$delimstart|$delimend@", $this->text);
    }

    // The elements of the array still aren't what we need?
    function refine_array($start, $end, $clear_html = 0)
    {
        for ($i = 0; $i < sizeof($this->searchar); $i++) {
            $this->searchar[$i] = substr($this->searchar[$i],
                strpos($this->searchar[$i], $start) + strlen($start));
            if (!empty($end)) {
                $this->searchar[$i] = substr($this->searchar[$i],
                    0, strpos($this->searchar[$i], $end));
            }
            if ($clear_html) {
                $this->searchar[$i] = strip_tags($this->searchar[$i]);
            }
        }
    }

    // You still have some irregular data which makes everything bad?
    // Remove the trash, giving an array of it.
    function remove_trash($trash)
    {
        for ($i = 0; $i < sizeof($trash); $i++) {
            for ($j = 0; $j < sizeof($this->searchar); $j++) {
                $this->searchar[$j] = str_replace($trash[$i], "", $this->searchar[$j]);
            }
            $this->searchtxt = str_replace($trash[$i], "", $this->searchtxt);
        }
    }

    // This function does not work with the members of the grabber.
    // It just takes start and end limits and the content ($word)
    // to return what's inside. You can easily debug it by passing 1 as $testvar.
    function cut($start, $end, $word, $testvar = 0)
    {
        $word = substr($word, strpos($word, $start) + strlen($start));
        if ($testvar) die($word);
        $word = substr($word, 0, strpos($word, $end));
        return $word;
    }

    // Sends $vars (an array of key => value pairs) to $url as an HTTP POST.
    function send_post($vars, $url)
    {
        // build the URL-encoded request body from the key/value pairs
        $strRequestBody = "";
        foreach ($vars as $key => $val) {
            if ($strRequestBody != "") $strRequestBody .= "&";
            $strRequestBody .= urlencode($key) . "=" . urlencode($val);
        }
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $strRequestBody);
        $return_string = curl_exec($ch);
        curl_close($ch);
        if ($return_string == "") {
            $message = "Error: Could not post to remote system.";
            return $message;
        }
        return $return_string;
    } // End function
}
?>
dsaba Posted September 1, 2007

Here's a function that can grab anything off the internet: file_get_contents()

cURL is an extra extension you can compile into PHP using the libcurl library:
http://curl.haxx.se/
http://us.php.net/curl-setopt

Basically, once PHP is compiled with cURL (PHP 5 usually comes with it already compiled, so you just have to remove the semicolon in front of its extension line in php.ini to activate it), you have a whole set of curl_*() functions available to use. cURL is used for making and imitating HTTP requests, FTP and a whole bunch more. A common use is to send an HTTP POST to a website, to fill out forms automatically, etc. It is not a class but a set of functions in PHP once you enable it, just like GD or any other library.

Your post above is indeed a class someone made using the cURL functions and other stuff. It is in no way "universal" for grabbing things online. Sure, it might work for most sites, but usually the HTTP requests and algorithms are specific to each website. For example, filling out the form to sign up for a new email account at Yahoo is different from doing a POST request to access PayPal services or logging in. Use Live HTTP Headers or Wireshark to investigate how your client PC interacts with the server or the remote script, then mimic it with cURL.

Don't know how to use cURL? Read the docs and see the examples. Don't ask how to use it all, though; ask how to do a specific thing. Good luck.
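For readers new to the cURL functions mentioned in the reply above, the GET and POST requests it describes might look roughly like the following. This is only a minimal sketch: `http_get()`, `http_post()` and `build_post_body()` are illustrative names invented here, not part of PHP or of the Grabber class, and real sites will usually need extra options (cookies, headers, redirects) that you find by inspecting the actual traffic.

```php
<?php
// Minimal sketch of GET and POST requests with PHP's cURL functions.
// All three function names are hypothetical helpers for illustration.

// Turn an array of form fields into a URL-encoded request body.
function build_post_body(array $fields)
{
    return http_build_query($fields);
}

// Fetch a page and return its body as a string (false on failure).
function http_get($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);          // give up after 60 seconds
    return curl_exec($ch);                          // handle is freed when $ch goes out of scope
}

// Submit form fields with an HTTP POST and return the response body (false on failure).
function http_post($url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);                          // switch the request method to POST
    curl_setopt($ch, CURLOPT_POSTFIELDS, build_post_body($fields)); // the encoded form data
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    return curl_exec($ch);
}
```

A caller would then write something like `$html = http_get('http://example.com/');` and check the result against `false` before parsing it.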
rhock_95 Posted September 1, 2007 Author

I already have cURL installed.

Quote: "don't ask how to use it all though, ask how to do a specific thing.."

All I asked at this time was: "Does this file get edited?" or "Do I need to write a separate script that calls this class (without any editing)?"
rhock_95 Posted September 1, 2007 Author

Can anyone offer some insights on how to use this class?
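As a possible answer to the question as asked: a class file like this is normally left unedited and called from a separate script. The sketch below assumes the class from the first post is saved as `grabber.php` in the same directory; `grab_between()` and the file name `use_grabber.php` are hypothetical names chosen for illustration. It also assumes the class constructor actually runs on your PHP version (constructors named after the class were removed in PHP 8.0, where the constructor must be called `__construct`).

```php
<?php
// Hypothetical caller script, e.g. saved as use_grabber.php next to grabber.php.
// Assumption: grabber.php holds the Grabber class from the first post.

/**
 * Fetch $url with Grabber and return the text found between $start and $end.
 * grab_between() is an illustrative helper, not part of the original class.
 */
function grab_between($url, $start, $end)
{
    if (!class_exists('Grabber')) {
        require_once __DIR__ . '/grabber.php'; // load the class; the file itself is not edited
    }

    $grabber = new Grabber($url);        // the constructor downloads the page via cURL
    $grabber->grab_unit($start, $end);   // keep only the text between the two markers
    return trim($grabber->searchtxt);    // grab_unit() stores its result in $searchtxt
}

// Run only from the command line with a URL argument, for example:
//   php use_grabber.php http://example.com/
// grab_unit() skips one extra character after the start marker, so '<title'
// is passed without its closing '>' to land right after '<title>'.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    echo grab_between($argv[1], '<title', '</title>'), "\n";
}
```

The other methods follow the same pattern: call `grab_array()` and `refine_array()` on the object, then read the results from `$grabber->searchar`.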