acidglitter Posted December 15, 2007

I've been googling screen-scraping code and tutorials, but it's overwhelmingly confusing. Could anyone help by showing me code that grabs everything between the <a> tags on a page and displays those links on my page?

roopurt18 Posted December 15, 2007

Are you familiar with regular expressions? They will make this task much easier.

    // UNTESTED
    // Assumes $content holds the page content you wish to scrape
    $regexp_a = '/<a\s[^>]*>.*?<\/a>/is'; // lazy match, so each link is captured separately
    // $regexp_a = '/(<a[^>]*>(.*?)<\/a>)/is'; // try this one if the above fails
    preg_match_all($regexp_a, $content, $matches);
    echo '<pre style="text-align: left;">' . print_r($matches, true) . '</pre>';

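For the original question (listing every link on a page), a minimal sketch with preg_match_all might look like the following. The extract_links() helper name and the sample $content string are made up for illustration, and pages with unusual markup may be better served by a real HTML parser.

```php
<?php
// Hypothetical helper: returns an href/text pair for every <a> tag found.
function extract_links(string $content): array
{
    // Lazy quantifiers (.*?) stop at the first closing </a>, so each link is
    // matched on its own. Flags: i = case-insensitive, s = dot matches newlines.
    $pattern = '/<a\s[^>]*href=(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        // $m[2] is the href value, $m[3] is everything between <a> and </a>.
        $links[] = ['href' => $m[2], 'text' => $m[3]];
    }
    return $links;
}

// Made-up sample input, standing in for the scraped page.
$content = '<p><a href="http://example.com/one">One</a> and '
         . '<a class="x" href="http://example.com/two">Two</a></p>';

foreach (extract_links($content) as $link) {
    echo htmlspecialchars($link['href']) . ' => ' . htmlspecialchars($link['text']) . "\n";
}
```

PREG_SET_ORDER groups the results per link, which tends to be easier to loop over than the default per-group layout.
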
acidglitter Posted December 15, 2007

Thanks, that helped a little. I'm not very good at regular expressions, but I'm trying to get the default image off my MySpace page. This is what I have so far:

    $data = file_get_contents('http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=000');
    $regexp_a = '/<a.*[^id="ctl00_Main_ctl00_UserBasicInformation1_hlDefaultImage"$].*[^><img].* \/><\/a>/';
    preg_match($regexp_a, $data, $matches);
    echo $matches[0];

But instead of getting the default picture, it gets the send-message link:

    <a href="http://messaging.myspace.com/index.cfm?fuseaction=mail.message&friendID=000" id="ctl00_Main_ctl00_UserContactLinks1_MailLink"><img src="http://x.myspace.com/images/profile/mail_1.gif" border="0" align="middle" /></a>

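A likely reason the pattern in this post grabs the wrong link: square brackets in a regular expression define a character class, so [^id="..."$] means "one character that is not in this set", not "a tag with this id". The pattern therefore never requires the id at all. Here is a small sketch of the difference, using a made-up two-link sample and a shortened id ("wanted") rather than the real MySpace markup:

```php
<?php
// Made-up sample with two links; only the second has the id we want.
$html = '<a id="other" href="/mail"><img src="mail.gif" /></a>' . "\n"
      . '<a id="wanted" href="/pics"><img src="me.jpg" /></a>';

// [^id="wanted"] is a negated character class: it matches ONE character that is
// not i, d, =, ", w, a, n, t or e. It does not require the id to be present.
preg_match('/<a.*[^id="wanted"].*<\/a>/', $html, $loose);

// Writing the id as literal text makes the pattern require it.
preg_match('/<a[^>]*id="wanted"[^>]*>.*?<\/a>/', $html, $strict);

echo $loose[0] . "\n";   // the first link, which is not the one we want
echo $strict[0] . "\n";  // the link whose id is "wanted"
```
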
acidglitter Posted December 15, 2007

I tried testing different things, so now I have this:

    $regexp_a = '/<a type=.*><img.* \/><\/a>/';

I would think it would work, but for some reason it doesn't. If I change it to "a href..." then it works, but it just shows the messaging link again.

roopurt18 Posted December 15, 2007

How about pasting the text you want to capture from and telling us which part you want to capture?

acidglitter Posted December 15, 2007

Okay, this is the entire code from my page:

    <a type="text/javascript" id="ctl00_Main_ctl00_UserBasicInformation1_hlDefaultImage" href="http://viewmorepics.myspace.com/index.cfm?fuseaction=user.viewAlbums&friendID=0"><img border="0" alt="" src="http://a963.ac-images.myspacecdn.com/images01/64/m_dfd895b94371623d5059281421c137da.jpg" /></a>

I want to be able to pull this out to show just my default picture on my site. Every time I change my picture, the address in the above code will change too:

    http://a963.ac-images.myspacecdn.com/images01/64/m_dfd895b94371623d5059281421c137da.jpg

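Given the markup in this post, one way to pull out just the image address is to match the anchor by its id and capture the src attribute of the <img> inside it. This is only a sketch: the default_image_url() helper name is hypothetical, the sample string is a copy of the posted markup, and it assumes the page keeps the id and tag layout shown in the post.

```php
<?php
// Hypothetical helper: returns the src of the <img> inside the anchor with the
// default-image id, or null when the markup does not match.
function default_image_url(string $html): ?string
{
    $id = 'ctl00_Main_ctl00_UserBasicInformation1_hlDefaultImage';
    // Match the id as literal text, then capture whatever sits inside src="...".
    $pattern = '/<a\b[^>]*\bid="' . preg_quote($id, '/') . '"[^>]*>\s*'
             . '<img\b[^>]*\bsrc="([^"]+)"/i';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Copy of the markup from the post, standing in for the downloaded page.
$sample = '<a type="text/javascript" id="ctl00_Main_ctl00_UserBasicInformation1_hlDefaultImage" '
        . 'href="http://viewmorepics.myspace.com/index.cfm?fuseaction=user.viewAlbums&friendID=0">'
        . '<img border="0" alt="" src="http://a963.ac-images.myspacecdn.com/images01/64/m_dfd895b94371623d5059281421c137da.jpg" /></a>';

$url = default_image_url($sample);
if ($url !== null) {
    // Rebuild a clean <img> tag instead of echoing the scraped HTML as-is.
    echo '<img src="' . htmlspecialchars($url) . '" alt="Default picture" />';
}
```

Because only the captured group is used, the result stays correct when the picture address changes, as long as the id stays the same.
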
acidglitter Posted December 15, 2007

I finally got it to work. I looked up more code, changed a couple of things, and now have this:

    <?php
    // Fetch the profile page with cURL
    $ch = curl_init() or die('Could not initialise cURL');
    curl_setopt($ch, CURLOPT_URL, "http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=0");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page instead of printing it
    $data1 = curl_exec($ch) or die(curl_error($ch));

    // Print the first <a type...> tag found on a single line
    $okay = "/<a type.*>/";
    if (preg_match($okay, $data1, $matches)) {
        echo $matches[0];
    }
    echo curl_error($ch);
    curl_close($ch);
    ?>
