monkeytooth Posted October 28, 2008

OK, I know this is possible; it's been done by others. I know there's no simple answer to this (if there is, lay it on me). So is cURL what I would use to get information off a MySpace or Facebook profile? I'm mostly aiming at MySpace at the moment. I'm not looking to log in or change anything right now, just to get things off any given profile and display them on another site. What should I be looking for? Are there any keywords I can google for a tutorial to get me started, or is there anything anyone here can do to help me get started on this?

Link to comment https://forums.phpfreaks.com/topic/130501-curl-scrape-how-what/
DarkWater Posted October 28, 2008

Well, you'd have to log in unless they had a public profile, but otherwise cURL and regex could probably do whatever you're asking for.
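For the cURL half of that suggestion, a minimal sketch of fetching a public page might look like the following. The function name `fetch_page`, the timeout value and the user-agent string are placeholders chosen for illustration, not anything from the thread, and the sketch assumes PHP's cURL extension is installed.

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL with cURL and
// return the response body, or null if the request fails.
function fetch_page(string $url, int $timeout = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,  // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout,  // seconds allowed for the whole request
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; example-scraper)', // placeholder
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and HTTP error statuses as failure.
    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

Calling `fetch_page("http://www.myspace.com/tom")` would then give you the page HTML as a string (or null), which is what the regex step works on.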
monkeytooth (Author) Posted October 28, 2008

Well, for my purposes at the moment I'm assuming public profiles. Do you know of any good tutorials on using regex and cURL, other than php.net? I don't even care whether they deal with MySpace profiles (though that would be nice), just something on the more advanced side for either or both.
aseaofflames Posted October 29, 2008

A quick function (it will work until MySpace changes the profile layout):

function get_myspace_info($url)
{
    // get the contents of the page
    $page = file_get_contents($url);

    // get the HTML of the info table
    preg_match('#<table id="Table2"(.*?)</table>#msi', $page, $i);
    preg_match('#</td></td>(.*?)</td>#msi', $i[1], $in);

    // get profile picture
    preg_match('#<img border="0"(.*?)</a>#msi', $i[1], $pic);

    // get mood
    preg_match('#<span class="searchMonkey-mood">(.*?)</span>#msi', $i[1], $cm);

    $mood = $cm[1];
    $picture = str_replace("</a>", "", $pic[0]);

    // split the info cell into one entry per <br /> line
    $info = explode("<br />", strip_tags(trim($in[1]), "<br><img>"));
    $last_login = str_replace("Last Login: ", "", $info[9]);

    // build info array
    $myspace = array(
        "url"        => $url,
        "slogan"     => $info[0],
        "sex"        => $info[2],
        "age"        => $info[3],
        "location"   => $info[4],
        "country"    => $info[5],
        "last_login" => $last_login,
        "picture"    => $picture,
        "mood"       => $mood
    );
    return $myspace;
}

Call example:

$myspace_info = get_myspace_info("http://www.myspace.com/tom");

$myspace_info will contain an array similar to this:

Array
(
    [url] => http://www.myspace.com/tom
    [slogan] => ":-)"
    [sex] => Male
    [age] => 33 years old
    [location] => Santa Monica, CALIFORNIA
    [country] => United States
    [last_login] => 10/28/2008
    [picture] => <img border="0" alt="" src="http://b2.ac-images.myspacecdn.com/00000/20/52/2502_m.jpg" />
    [mood] => pinkfloyd
)

Test: http://backup.aseaofflames.com/myspace.php

Hope this can get you started.
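The function above indexes `$i[1]`, `$in[1]` and so on without checking that each preg_match actually matched, so a layout change would produce notices and empty values. A minimal sketch of a more defensive approach, assuming a helper called `first_match` (hypothetical, not from the thread) and a made-up HTML fragment standing in for a real profile page:

```php
<?php
// Hypothetical helper: return the first capture group of $pattern in
// $subject (trimmed), or null when the pattern does not match.
function first_match(string $pattern, string $subject): ?string
{
    if (preg_match($pattern, $subject, $m) === 1 && isset($m[1])) {
        return trim($m[1]);
    }
    return null;
}

// Made-up fragment for illustration only; a real page would come from
// file_get_contents() or cURL.
$page = '<table id="Table2"><tr><td>'
      . '<span class="searchMonkey-mood">pinkfloyd</span>'
      . '</td></tr></table>';

// Narrow down to the info table first, then search inside it.
$table = first_match('#<table id="Table2"(.*?)</table>#msi', $page);

$mood = $table === null
    ? null
    : first_match('#<span class="searchMonkey-mood">(.*?)</span>#msi', $table);

// A pattern that is not on the page yields null instead of a notice.
$missing = first_match('#<span class="no-such-class">(.*?)</span>#msi', $page);

var_dump($mood);     // string(9) "pinkfloyd"
var_dump($missing);  // NULL
```

Each field can then be checked for null before it goes into the result array, which makes it easier to see which pattern broke when the page changes.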
monkeytooth (Author) Posted October 29, 2008

That is definitely a good start. I'm curious, though: #msi is seen throughout; is that something MySpace-specific? Also, is this reading the meta tags somehow, or reading the actual page and breaking it down? Lastly, is there any way I can get raw output, so I can work on getting other information if possible?
aseaofflames Posted October 30, 2008

The #msi is some sort of preg_match option. Someone else could probably explain it better. The first example I saw for preg_match used it, and the pattern doesn't work without it, so I use it.

The code reads information from the actual page. It reads the page, gets the HTML of the info table (the one with the picture), then splits it up and pushes it into an array.

To get more information from the page, you would have to find a unique element in the HTML and then use a preg_match to retrieve it. It's really guess and check. For example, to get the number of comments:

preg_match('#<span class="redtext">([^<]*)</span>comments#msi', $page, $comments);

Then trim($comments[1]) would contain the number of comments on the page. This works because the HTML near the number of comments is:

<b>Displaying<span class="redtext"> 50 </span>of<span class="redtext"> 792602 </span>comments

Note the [^<]* in the capture group: with (.*?) the match would begin at the first redtext span, and $comments[1] would also pick up the " 50 </span>of<span ...>" markup in front of the number.
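On the #msi question: the `#` is just the pattern delimiter (used here in place of the more common `/`), and the letters after the closing `#` are PCRE modifiers: `i` makes the match case-insensitive, `s` lets `.` match newlines, and `m` makes `^` and `$` match at every line boundary. The sketch below uses a made-up fragment modelled on the HTML quoted above (with upper-case tags and a line break added on purpose) to show which modifier matters, and how a lazy `(.*?)` group behaves when two similar spans sit next to each other.

```php
<?php
// Made-up fragment modelled on the quoted HTML. The second span uses
// upper-case tags and contains a newline to exercise "i" and "s".
$html = "<b>Displaying<span class=\"redtext\"> 50 </span>of"
      . "<SPAN class=\"redtext\">\n 792602 </SPAN>comments";

// Lazy (.*?) starts at the FIRST redtext span and keeps going until it
// reaches "</span>comments", so the capture includes markup in between.
preg_match('#<span class="redtext">(.*?)</span>comments#msi', $html, $lazy);

// Without "s", "." cannot cross the newline, so the same pattern
// finds no match at all (preg_match returns 0).
$withoutS = preg_match('#<span class="redtext">(.*?)</span>comments#i', $html, $unused);

// [^<]* cannot cross a tag, so only the span directly in front of
// "comments" can match. "m" is dropped because no pattern here uses ^ or $.
preg_match('#<span class="redtext">([^<]*)</span>comments#i', $html, $comments);
$count = (int) trim($comments[1]);

var_dump($withoutS);  // int(0)
var_dump($count);     // int(792602)
```

So the likely reason the patterns in this thread "don't work without it" is the `s` modifier: profile HTML spans many lines, and without `s` a `.` stops at the first line break.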