jamesbrauman
Posted October 20, 2008

Lol, I'm trying to extract the data between "Page:" and "Next Page" from a web page... no idea why this isn't working, but it isn't. The page source looks like this:

<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>

...and my regex code looks like this:

$pattern = "/^Page:(.*)Next Page$/";
preg_match($pattern, $buffer, $matches);
var_dump($matches);

It's not matching, though.

Link to comment: https://forums.phpfreaks.com/topic/129208-ah-i-suck-at-regex/

jamesbrauman (Author)
Posted October 20, 2008

Don't worry, figured it out. (.*) wasn't matching newline characters.
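
For anyone landing here later, a minimal sketch of what that fix might look like. The thread never shows the final pattern, so this is an assumption: it adds the s (PCRE_DOTALL) modifier so that . also matches newlines, and it drops the ^ and $ anchors, since the posted source starts with <p > rather than Page:. The $buffer contents are a shortened, hypothetical stand-in for the real page.

```php
<?php
// Hypothetical, shortened stand-in for the page source, with a newline
// between the two markers (the thread does not show the real line breaks).
$buffer = "<p >Page: <div id=page-no><b>1</b>\n</div><a href='/cat.php'> Next Page >> </a>";

// Without the s modifier, "." stops at "\n", so this finds nothing.
$withoutS = preg_match('/Page:(.*)Next Page/', $buffer, $m1);

// With the s modifier (PCRE_DOTALL), "." also matches newlines.
$withS = preg_match('/Page:(.*)Next Page/s', $buffer, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
echo trim($m2[1]), "\n";
```

The captured text keeps the spaces around it, hence the trim() on the last line.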

ghostdog74
Posted October 20, 2008

No need for regex:

$string = "<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>";
$startpos = strpos($string,"Page:");
$endpos = strpos($string,"Next Page");
echo substr($string,$startpos,$endpos-$startpos);
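
One thing to watch with the strpos snippet above: substr starts at $startpos, so the output begins with Page: itself, and strpos returns false when a marker is missing. Below is a rough sketch of a variant that skips the opening marker and handles the not-found case. extractBetween is a made-up helper name, not something from the thread or from PHP, and the sample string is shortened for illustration.

```php
<?php
// Hypothetical helper: returns the text strictly between $start and $end,
// or null if either marker is missing.
function extractBetween(string $haystack, string $start, string $end): ?string
{
    $startPos = strpos($haystack, $start);
    if ($startPos === false) {
        return null;
    }
    $startPos += strlen($start); // skip past the opening marker

    // Look for the closing marker only after the opening one.
    $endPos = strpos($haystack, $end, $startPos);
    if ($endPos === false) {
        return null;
    }
    return substr($haystack, $startPos, $endPos - $startPos);
}

// Shortened, made-up sample input.
$string = "<p >Page: <b>1</b> <a href='/cat.php'> Next Page >> </a>";
echo trim(extractBetween($string, "Page:", "Next Page")), "\n";
```

Comparing against false with === matters here, because a marker at position 0 would otherwise be treated as "not found".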

nrg_alpha
Posted October 20, 2008

Be careful when using .* as this is usually very inefficient. In cases like this, I think I would use a lazy quantifier instead (.+?):

$str = <<<EOT
<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>
EOT;
preg_match('#Page:(.+?)Next Page >>#', $str, $match);
echo $match[1];

Output:

 <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> 

Spaces are captured if they are found at the beginning / end; you can simply throw a trim on $match[1]. If there are multiple instances of Page: ... Next Page >>, you can use preg_match_all. If there are newlines, you can use the s modifier (I kept my example simple).
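
To make that last paragraph concrete, here is a small sketch combining preg_match_all, the s modifier and trim. The input is invented for illustration (two short Page: ... Next Page >> blocks, one of them spanning a newline) and is not taken from the real page.

```php
<?php
// Invented sample: two "Page: ... Next Page >>" blocks, the second one
// containing a newline.
$str = "Page: <b>1</b> Next Page >> filler Page: <b>2</b>\n<i>x</i> Next Page >>";

// s lets "." match newlines; the lazy .+? stops at the first "Next Page >>"
// after each "Page:" instead of running on to the last one.
preg_match_all('#Page:(.+?)Next Page >>#s', $str, $matches);

// $matches[1] holds the captured group of every match; trim each one.
$pages = array_map('trim', $matches[1]);
print_r($pages);
```

With a greedy (.+) and the same input, the single match would run from the first Page: to the last Next Page >>, which is why the lazy form is the safer choice when the markers repeat.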
Archived
This topic is now archived and is closed to further replies.