denoteone Posted March 23, 2009
How can I get what is between this tag?
<b class="yfi-price-change-down">122.42</b></span> <span id="yfs
I need to find the number in the middle of the above string. I think it needs a regex, but I am not able to figure it out.
shadiadiph Posted March 23, 2009
Where is the number 122.42 originating from? Does it change? If you are trying to copy it from a website, it's impossible without knowing the string that caused it.
Yesideez Posted March 23, 2009
It's not impossible. I'm just trying to access my web space so I can get some regex code...
EDIT: Just remembered I no longer have the code on my web space. I'm at work and can't access my home PC from here.
syed Posted March 23, 2009
Are you creating a data scraper?
shadiadiph Posted March 23, 2009
How can you scrape data from a webpage? I spent two hours trying earlier.
denoteone Posted March 23, 2009 (Author)
I am making a data scraper, and no, the number will not always be the same.
thebadbad Posted March 23, 2009
Here ya go:
<?php
$str = '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs';
// Only echo when the pattern actually matched, so $matches[1] is defined
if (preg_match('~class="yfi-price-change-down">([0-9.,]+)~i', $str, $matches)) {
    echo $matches[1];
}
?>
It will match one or more (the plus) digits, dots or commas (character class: [0-9.,]) right after the marker you specified.
thebadbad Posted March 23, 2009
Note: If class="yfi-price-change-down"> appears in more than one place in the source code, the script will only match the number following its first occurrence. So just tell me if it needs tweaking.
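If the marker does turn up more than once, a rough sketch of collecting every number with preg_match_all() might look like this. The sample string, the second value 98.10, and the helper name all_prices() are made up for illustration and are not from the original script.

```php
<?php
// Hypothetical sample markup; the second value is invented for illustration.
$html = '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs'
      . '<b class="yfi-price-change-down">98.10</b>';

// Hypothetical helper: returns every number that follows the class marker.
function all_prices($source) {
    preg_match_all('~class="yfi-price-change-down">([0-9.,]+)~i', $source, $matches);
    return $matches[1]; // capture group 1 of each match, in document order
}

print_r(all_prices($html));
```

From there you can pick the occurrence you need by its index in the returned array.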
denoteone Posted March 23, 2009 (Author)
"ChangeDown" => array("pattern" => "/<b class=\"yfi-price-change-down\">(.+?)<\/b>/", "value" => "N/A"),
This is what I had, but it is not working. Later in the script it checks the data, and if nothing is returned I get "N/A".
thebadbad Posted March 23, 2009
That regex looks fine, so post all the relevant code so we can see where it's failing.
denoteone Posted March 23, 2009 (Author)
OK, it looks as if <b class="yfi-price-change-down">GET THIS DATA</b> appears more than once in the HTML. But if I look for <b class="yfi-price-change-down">GET THIS DATA</b></span> <span id=> that only appears once. What would the regex be for that?
thebadbad Posted March 23, 2009
'~<b class="yfi-price-change-down">([0-9.,]+?)</b></span> <span id=~i'
If you want to match any characters instead of only digits, dots and commas, change ([0-9.,]+?) to (.+?). But if it's not working, show us the part of your script where the match is actually searched for.
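As a quick way to check that pattern in isolation, here is a minimal sketch that wraps it in a function and falls back to "N/A" when nothing matches. The helper name change_down() is hypothetical, and the second test string is made up to show the fallback.

```php
<?php
// Hypothetical helper: returns the captured number, or 'N/A' when nothing matches.
function change_down($source) {
    $pattern = '~<b class="yfi-price-change-down">([0-9.,]+?)</b></span> <span id=~i';
    if (preg_match($pattern, $source, $matches)) {
        return $matches[1];
    }
    return 'N/A';
}

// First string is the one from this thread; the second lacks the trailing
// </span> <span id= context, so the pattern should not match it.
echo change_down('<b class="yfi-price-change-down">122.42</b></span> <span id="yfs'), "\n";
echo change_down('<b class="yfi-price-change-down">122.42</b>'), "\n";
```

If the first call prints the number but your real page still gives "N/A", the problem is more likely in how the page source reaches the regex than in the regex itself.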
shadiadiph Posted March 23, 2009
Is this for getting data from another site?
denoteone Posted March 23, 2009 (Author)
Yes, it is for getting data from another site.
thebadbad Posted March 23, 2009
shadiadiph wrote: "how can you scrape data from a webpage? I spent two hours trying earlier ??"
Although you're topic hijacking: You can use file_get_contents() or cURL to store an external site's source code in a variable. Or use the function below; it uses cURL if installed, else file_get_contents() if allow_url_fopen is enabled on the server:
<?php
//store page in variable using cURL if available, else using file_get_contents()

//set user agent string
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; da; rv:1.9) Gecko/2008052906 Firefox/3.0');

function store_page($url) {
    if (function_exists('curl_init')) {
        $c = curl_init();
        //return the page as a string instead of printing it
        curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($c, CURLOPT_URL, $url);
        $contents = curl_exec($c);
        curl_close($c);
        return $contents;
    } elseif (ini_get('allow_url_fopen') == 1 && $contents = @file_get_contents($url)) {
        return $contents;
    } else {
        die('Unable to crawl page. Be sure to have cURL installed or to have \'allow_url_fopen\' set to \'On\' in php.ini.');
    }
}
?>
Then scrape the content you need by using regular expressions (via the preg_match() or preg_match_all() function). Writing complex regexes takes a lot of practice/experience.
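To make the scraping step a bit more concrete, here is a minimal sketch that runs a table of patterns with defaults (shaped like the "ChangeDown" entry posted earlier in this thread) over page source that has already been fetched. The $page string stands in for the fetched source so the example runs without a network connection; the "Missing" entry and the helper name scrape_fields() are hypothetical, and this is only a guess at how such a table might be processed, not the original script.

```php
<?php
// Stand-in for the page source; in practice this would come from a fetch step.
$page = '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs';

// Hypothetical field table: each entry has a regex and a default value.
$fields = array(
    'ChangeDown' => array(
        'pattern' => '~<b class="yfi-price-change-down">(.+?)</b>~i',
        'value'   => 'N/A',
    ),
    'Missing' => array(
        'pattern' => '~<b class="no-such-class">(.+?)</b>~i',
        'value'   => 'N/A',
    ),
);

// Hypothetical helper: use the first capture when the pattern matches,
// otherwise keep the entry's default value.
function scrape_fields($source, array $fields) {
    $result = array();
    foreach ($fields as $name => $field) {
        $result[$name] = preg_match($field['pattern'], $source, $m)
            ? $m[1]
            : $field['value'];
    }
    return $result;
}

print_r(scrape_fields($page, $fields));
```

Keeping the default next to each pattern means a field that fails to match still shows up in the result, which makes it easier to spot which regex needs tweaking.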