
help finding string


denoteone


Here ya go:

 

<?php
$str = '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs';
preg_match('~class="yfi-price-change-down">([0-9.,]+)~i', $str, $matches);
echo $matches[1];
?>

 

This matches one or more (the plus) digits, dots, or commas (the character class [0-9.,]) immediately after the text you specified.
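
As a rough sketch of the same pattern with a guard: preg_match() returns 1 when the pattern matches and 0 when it doesn't, so checking the return value before reading $matches[1] avoids a notice on pages where the tag is missing. The helper name extract_price() is made up for illustration.

```php
<?php
// Hypothetical helper: returns the captured number as a string, or null if nothing matched.
function extract_price($html) {
    if (preg_match('~class="yfi-price-change-down">([0-9.,]+)~i', $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

$str = '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs';
var_dump(extract_price($str));           // string(6) "122.42"
var_dump(extract_price('<b>none</b>'));  // NULL
```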


OK, it looks as if

<b class="yfi-price-change-down">GET THIS DATA</b>

 

appears more than once in the HTML.

 

but if I look for

 

<b class="yfi-price-change-down">GET THIS DATA</b></span> <span id=>

 

that only appears once. What would the regex be for that?


'~<b class="yfi-price-change-down">([0-9.,]+?)</b></span> <span id=~i'

 

If you want to match any characters instead of only digits, dots and commas, change ([0-9.,]+?) to (.+?).

But if it's not working, show us the part of your script where the match is actually searched for.
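
Here is a small sketch of why the longer pattern helps, assuming a made-up HTML snippet where the tag appears twice and only the second occurrence is followed by `</span> <span id=`. The sample strings are invented, not taken from the real page.

```php
<?php
// Made-up sample: the tag appears twice; only the second is followed by '</span> <span id='.
$html = '<b class="yfi-price-change-down">1.10</b></div>'
      . '<b class="yfi-price-change-down">122.42</b></span> <span id="yfs">';

// Short pattern: stops at the first occurrence of the class.
preg_match('~class="yfi-price-change-down">([0-9.,]+)~i', $html, $first);

// Longer pattern: requires the trailing context, so only the unique occurrence matches.
preg_match('~<b class="yfi-price-change-down">([0-9.,]+?)</b></span> <span id=~i', $html, $unique);

// Same idea with (.+?) to capture any characters instead of only digits, dots and commas.
$text = '<b class="yfi-price-change-down">GET THIS DATA</b></span> <span id=>';
preg_match('~<b class="yfi-price-change-down">(.+?)</b></span> <span id=~i', $text, $any);

echo $first[1], "\n";   // 1.10
echo $unique[1], "\n";  // 122.42
echo $any[1], "\n";     // GET THIS DATA
```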


How can you scrape data from a webpage? I spent two hours trying earlier.

 

Although you're topic hijacking: You can use file_get_contents() or cURL to store an external site's source code in a variable. Or use the function below; it uses cURL if installed, else file_get_contents() if allow_url_fopen is enabled on the server:

 

<?php
// Store a page's source in a variable using cURL if available, else file_get_contents().
// Set the user agent string; file_get_contents() reads it from this ini setting.
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; da; rv:1.9) Gecko/2008052906 Firefox/3.0');

function store_page($url) {
    if (function_exists('curl_init')) {
        $c = curl_init();
        curl_setopt($c, CURLOPT_RETURNTRANSFER, true); // return the page instead of printing it
        curl_setopt($c, CURLOPT_URL, $url);
        // cURL does not read the ini setting, so pass the user agent explicitly
        curl_setopt($c, CURLOPT_USERAGENT, ini_get('user_agent'));
        $contents = curl_exec($c); // false if the request failed
        curl_close($c);
        return $contents;
    } elseif (ini_get('allow_url_fopen') == 1 && ($contents = @file_get_contents($url))) {
        return $contents;
    } else {
        die('Unable to crawl page. Be sure to have cURL installed or to have \'allow_url_fopen\' set to \'On\' in php.ini.');
    }
}
?>

 

Then scrape the content you need by using regular expressions (via the preg_match() or preg_match_all() function). Writing complex regexes takes a lot of practice/experience.
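
For instance, preg_match_all() collects every occurrence rather than just the first. A minimal sketch, where the hard-coded string is only a stand-in for whatever store_page() would return:

```php
<?php
// Stand-in for the page source; in practice this would come from store_page($url).
$html = '<b class="yfi-price-change-down">1.10</b> <b class="yfi-price-change-down">122.42</b>';

// preg_match_all() returns the number of full matches;
// with the default flags, $matches[1] holds every capture of group 1.
$count = preg_match_all('~<b class="yfi-price-change-down">([0-9.,]+)</b>~i', $html, $matches);

echo $count, "\n";                      // 2
echo implode(', ', $matches[1]), "\n";  // 1.10, 122.42
```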

