
Ah I suck at Regex


jamesbrauman


Lol  ;)

 

I'm trying to extract the data between "Page:" and "Next Page" from a web page... no idea why this isn't working, but it isn't. Page source looks like this:

<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>

...And my regex code looks like this:

$pattern = "/^Page:(.*)Next Page$/";
preg_match($pattern, $buffer, $matches);
var_dump($matches);

 

It's not matching though  :o

https://forums.phpfreaks.com/topic/129208-ah-i-suck-at-regex/

No need for regex here; strpos and substr are enough:

$string = "<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>";
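// Start just after "Page:" so the marker itself is not part of the result.
$startpos = strpos($string, "Page:") + strlen("Page:");
$endpos = strpos($string, "Next Page");
echo substr($string, $startpos, $endpos - $startpos);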
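
The strpos/substr approach above can be wrapped in a small helper that also copes with a missing marker (strpos returns false in that case). This is only a sketch: the function name extract_between and the shortened sample HTML are made up for illustration, and the nullable type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the thread): returns the trimmed text between
// two markers, or null when either marker is missing.
function extract_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;                       // start marker not found
    }
    $from += strlen($start);               // skip past the start marker itself
    $to = strpos($haystack, $end, $from);  // look for the end marker after it
    if ($to === false) {
        return null;                       // end marker not found
    }
    return trim(substr($haystack, $from, $to - $from));
}

// Shortened, made-up sample input.
$html = "<p>Page: <b>1</b> <a href='/cat.php?num=2'> Next Page >> </a></p>";
var_dump(extract_between($html, "Page:", "Next Page"));
var_dump(extract_between($html, "Page:", "Previous Page"));
```

Searching for the end marker from $from onwards means an earlier occurrence of "Next Page" before "Page:" cannot produce a negative length.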


Be careful when using .* as it is usually very inefficient: it is greedy, so it grabs as much as it can and then has to backtrack.

In cases like this, I think I would use a lazy quantifier instead (.+?):

 

$str = <<<EOT
<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>
EOT;

preg_match('#Page:(.+?)Next Page >>#', $str, $match);
echo $match[1];

 

Output:

<div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> 

 

Spaces are captured if they are found at the beginning or end; you can simply throw a trim() on $match[1].

If there are multiple instances of Page: ... Next Page >>, you can use preg_match_all. If there are newlines, you can use the s modifier so that the dot matches them too (I kept my example simple).
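
Here is a rough sketch of the preg_match_all plus s modifier variant, with trim applied to each capture. The two-line $buffer is made up for illustration and is not the real page. The sketch also runs the original anchored pattern for comparison: ^ and $ tie the match to the very start and end of the subject, and the subject begins with "<p>", so that pattern cannot match.

```php
<?php
// Illustrative input (made up for this sketch): two pager blocks, each
// spread over two lines.
$buffer = "<p>Page: <b>1</b>\n<a href='?num=2'> Next Page >> </a></p>\n"
        . "<p>Page: <b>2</b>\n<a href='?num=3'> Next Page >> </a></p>";

// The original pattern from the question. The subject starts with "<p>",
// not "Page:", so the ^ anchor prevents any match and preg_match returns 0.
$anchored = preg_match('/^Page:(.*)Next Page$/', $buffer);

// Lazy quantifier plus the s modifier, so the dot also matches newlines.
$count = preg_match_all('#Page:(.+?)Next Page >>#s', $buffer, $matches);

// $matches[1] holds the first capture group of every match; trim each one.
$pages = array_map('trim', $matches[1]);

var_dump($anchored, $count, $pages);
```

Without the s modifier the dot stops at each newline, so neither block in this sample would match.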

