jamesbrauman Posted October 20, 2008

I'm trying to extract the data between "Page:" and "Next Page" from a web page. I have no idea why this isn't working, but it isn't. The page source looks like this:

<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>

And my regex code looks like this:

$pattern = "/^Page:(.*)Next Page$/";
preg_match($pattern, $buffer, $matches);
var_dump($matches);

It's not matching, though.
jamesbrauman Posted October 20, 2008

Don't worry, figured it out. (.*) wasn't matching newline characters.
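For anyone landing here later, a minimal sketch of the newline issue. The input string is a made-up, shortened version of the markup with line breaks added, not the real page. The `^` and `$` anchors from the original pattern are dropped here as well, since this sample neither starts with "Page:" nor ends with "Next Page".

```php
<?php
// Hypothetical sample input: shortened markup with line breaks added.
$buffer = "<p >Page:\n<div id=page-no><b>1</b></div>\n<a href='/cat.php?cat_id=46'> Next Page >> </a>";

// Without the s modifier, "." does not match "\n", so the pattern cannot span lines.
$withoutS = preg_match('/Page:(.*)Next Page/', $buffer, $m1);

// With the s modifier (PCRE_DOTALL), "." also matches newline characters.
$withS = preg_match('/Page:(.*)Next Page/s', $buffer, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```

With the s modifier, `$m2[1]` holds everything between the two markers, line breaks included.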
ghostdog74 Posted October 20, 2008

No need for regex:

$string = "<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>";
// Start just after the "Page:" marker so the marker itself is not included in the output.
$startpos = strpos($string,"Page:") + strlen("Page:");
$endpos = strpos($string,"Next Page");
echo substr($string,$startpos,$endpos-$startpos);
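A slightly more defensive sketch of the same strpos/substr idea. `extractBetween` is a hypothetical helper name, not something from the thread or from PHP itself, and the nullable return type assumes PHP 7.1 or later. It skips past the start marker and returns null when either marker is missing, instead of silently producing a wrong substring.

```php
<?php
// Hypothetical helper: returns the text between two markers, or null if either is missing.
function extractBetween(string $haystack, string $start, string $end): ?string
{
    $startPos = strpos($haystack, $start);
    if ($startPos === false) {
        return null; // start marker not found
    }
    $startPos += strlen($start); // skip past the start marker itself

    // Look for the end marker only after the start marker.
    $endPos = strpos($haystack, $end, $startPos);
    if ($endPos === false) {
        return null; // end marker not found
    }
    return substr($haystack, $startPos, $endPos - $startPos);
}

// Made-up, shortened sample input.
$string = "<p >Page: <b>1</b> <a href='/cat.php?cat_id=46'> Next Page >> </a>";
$between = extractBetween($string, "Page:", "Next Page");
if ($between !== null) {
    echo trim($between);
}
```

Note the strict `=== false` checks: strpos can legitimately return 0 when the marker sits at the very start of the string, which a loose comparison would treat as "not found".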
nrg_alpha Posted October 20, 2008

Be careful when using .*, as it is usually very inefficient: being greedy, it first consumes as much as it can and then has to backtrack. In cases like this, I think I would use a lazy quantifier instead (.+?):

$str = <<<EOT
<p >Page: <div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'> Next Page >> </a></div></div>
EOT;
preg_match('#Page:(.+?)Next Page >>#', $str, $match);
echo $match[1];

Output:

<div style='width:300px;'><div id=page-no><a href='/cat.php?cat_id=46&begin=0&num=1&numBegin=1'></a><font color='#ff3366'><b>1</b></font><a></a> </div><div style='float:left; clear:left;'><a href='/cat.php?cat_id=46&numBegin=1&num=2&begin=14'>

Spaces are captured if they are found at the beginning or end; you can simply throw a trim() on $match[1]. If there are multiple instances of Page: ... Next Page >>, you can use preg_match_all. If there are newlines, you can use the s modifier (I kept my example simple).
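The three suggestions at the end of this post (trim, preg_match_all, and the s modifier) could be combined roughly like this. It is only a sketch: the input is a made-up string with two pager blocks and embedded newlines, not the real page.

```php
<?php
// Hypothetical input: two "Page: ... Next Page >>" blocks with newlines inside.
$str = "Page: <b>1</b>\n Next Page >> ... Page: <b>2</b>\n Next Page >>";

// Lazy (.+?) stops at the first "Next Page >>" after each "Page:";
// the s modifier lets "." match across newlines.
preg_match_all('#Page:(.+?)Next Page >>#s', $str, $matches);

// $matches[1] holds the text captured by the first group for every match;
// trim() strips the leading and trailing whitespace from each one.
$pages = array_map('trim', $matches[1]);

print_r($pages);
```

Here `$pages` ends up with two entries, `<b>1</b>` and `<b>2</b>`. With a greedy `(.+)` instead, the single match would run from the first "Page:" to the last "Next Page >>".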