alexdemers Posted August 18, 2008

Hi,

I just want to extract ALL the content from a specific tag with a specific ID. Here's a sample of the HTML. There are a lot of id+integer divs on the page.

<div id="id1">all the tags and text in here</div>
<div id="id2">all the tags and text in here</div>
<div id="id3">all the tags and text in here</div>

Here's my preg_match_all call:

preg_match_all ('/<div id=\"id(?:[\d]+)\">(.*)<\/div>/ism', $str, $r);
echo '<pre>';
print_r ($r);

The first subpattern matches every id followed by an integer (i.e. id1, id2, id3, ...). No trouble there. Then I want to get ALL the text and HTML between the "div" tags. For some reason, the match doesn't stop at the first </div>; it keeps going until it reaches THE LAST </div>.

I've been stuck on this for the past 2 hours and can't get it to work. Can anyone help?

Thanks.

alexdemers Posted August 18, 2008

To be specific, I need the content of every div tag whose id matches the pattern I described. Example:

Array
(
    [0] => "content of <div id="id1"></div>"
    [1] => "content of <div id="id2"></div>"
    [2] => "content of <div id="id3"></div>"
    [3] => "content of <div id="id4"></div>"
)

Thanks again.

corbin Posted August 18, 2008

Change

preg_match_all ('/<div id=\"id(?:[\d]+)\">(.*)<\/div>/ism', $str, $r);

to

preg_match_all ('/<div id=\"id(?:[\d]+)\">(.*?)<\/div>/ism', $str, $r);

Look into the greediness of regular expressions: (.*) is greedy and matches as much as it can, while (.*?) is lazy and stops at the first </div> it finds.

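To see the greedy/lazy difference side by side, here is a minimal sketch. The sample string in $str and the variable names $greedy and $lazy are made up for illustration; the two patterns are the ones from the thread, assuming they start with a / delimiter.

```php
<?php
// Hypothetical sample input modelled on the question.
$str = '<div id="id1">first</div> <div id="id2">second <b>bold</b></div> <div id="id3">third</div>';

// Greedy: (.*) consumes the rest of the input, then backtracks
// only as far as the LAST </div>, so everything lands in one match.
preg_match_all('/<div id=\"id(?:[\d]+)\">(.*)<\/div>/ism', $str, $greedy);

// Lazy: (.*?) grows one character at a time and stops at the FIRST
// </div>, so each div produces its own match.
preg_match_all('/<div id=\"id(?:[\d]+)\">(.*?)<\/div>/ism', $str, $lazy);

echo "Greedy captures: ", count($greedy[1]), "\n";
echo "Lazy captures: ", count($lazy[1]), "\n";
print_r($lazy[1]);
```

Index 1 of the result array holds the captured content, which is the list the question asks for. Note that the lazy version stops at the first </div> it meets, so a div nested inside the content would cut the match short.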
alexdemers Posted August 18, 2008

Wow, I can't thank you enough.

corbin Posted August 19, 2008

No problem.

effigy Posted August 19, 2008

There's no need for the non-capturing parentheses or for wrapping \d in a character class. Using % as the delimiter also means the / in </div> no longer needs escaping:

%<div id="id\d+">(.*?)</div>%ism

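As a quick check that the shorter pattern behaves like the longer one, here is a small sketch. The sample string in $html and the names $verbose and $short are assumptions for illustration, and the longer pattern is again assumed to start with a / delimiter.

```php
<?php
// Hypothetical sample input; the ids and content are made up.
$html = '<div id="id1">one</div> <div id="id22">two <em>words</em></div>';

// Lazy pattern from earlier in the thread, with / delimiters and escapes.
preg_match_all('/<div id=\"id(?:[\d]+)\">(.*?)<\/div>/ism', $html, $verbose);

// Simplified pattern: % delimiters, no escaping, plain \d+.
preg_match_all('%<div id="id\d+">(.*?)</div>%ism', $html, $short);

// Both patterns should capture the same content for this input.
var_dump($verbose[1] === $short[1]);
print_r($short[1]);
```

Dropping (?:...) changes nothing here because the group only wrapped a single quantified token, and [\d] is equivalent to \d on its own.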