dsaba Posted December 13, 2007

(Yes, that's the most descriptive title I could come up with, other than "regex help".)

haystack 1:

<?php
$haystack = '<td class="panelsurround" align="center"> <div class="panel"> <div align="right"> <div class="fieldset"> <div style="padding:3px"> תאריך הצטרפות: <strong>04-28-2007</strong> </div> </div> <fieldset class="fieldset"> <legend>הודעות</legend> <table cellpadding="0" cellspacing="3" border="0"> <tr> <td> סך הכל הודעות: <strong>3,117</strong> (13.63 הודעות בכל יום) </td> </tr> <tr> <td><a href="search.php?do=finduser&u=3095" rel="nofollow">מצא את כל ההודעות על ידי ThaiB0X</a></td> </tr> <tr> <td><a href="search.php?do=process&showposts=0&starteronly=1&exactname=1&searchuser=ThaiB0X" rel="nofollow">מצא את כל הנושאים שנכתבו על ידי ThaiB0X</a></td> </tr> </table> </fieldset>';
?>

Edit: haystack 1 is encoded in UTF-8 and the language is Hebrew. The code box converted it to HTML entities, so I also posted the same markup in a quote box.

haystack 2:

<?php
$haystack = '<tr> <td class="tcat" width="50%">Around the world</td> <td class="tcat" width="50%">Bla bla bla</td></tr><tr valign="top"> <td class="panelsurround" align="center"> <div class="panel"> <div align="left"> <div class="fieldset"> <div style="padding:3px"> Coffee Donuts: <strong>06 Jan 2002</strong> </div> </div> <fieldset class="fieldset"> <legend>Books</legend> <table cellpadding="0" cellspacing="3" border="0"> <tr> <td> Cheap Electronics: <strong>15,300</strong> (7.06 scratch your head) </td> </tr> <tr> <td><a href="url" rel="nofollow">Sunshine is good for the body</a></td> </tr> <tr> <td><a href="url.php/whatever/ok.php" rel="nofollow">The great corral reef is in Australia</a></td> </tr> </table> </fieldset>';
?>

<?php
$pat = '/\<td\>[ ]+?[^ ].+: \<strong\>([0-9,]+?)\<\/strong\> \(([0-9\.]+?)[ ].+?\)[ ]+?\<\/td\>/';
preg_match_all($pat, $haystack, $out);
print_r($out);
?>

The two substrings I would like to match are shown in the simplified example below, enclosed in { } brackets. The actual haystack does not have these brackets:

<td> סך הכל הודעות: <strong>{3,116}</strong> ({13.63} הודעות בכל יום) </td>

The haystacks will be encoded in UTF-8. Can you fix my regex, or tell me what I'm doing wrong? The regex above comes up with no matches at all.

Thank you.
rajivgonsalves Posted December 13, 2007

Try:

$pat = '~<strong>([0-9,]+)</strong> \(([0-9\.]+)~';
dsaba (Author) Posted December 13, 2007

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

    [2] => Array
        (
        )

)

That's my result with that change, even after I added the extra ) parenthesis you forgot. I tried:

$pat = '~<strong>([0-9,]+)</strong> \(([0-9\.]+)\)~';
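An empty result from a pattern that looks right on screen often comes down to characters that cannot be seen in the post. The sketch below is one way to check which bytes actually follow </strong>. The sample string is hypothetical and assumes a line break where the thread shows a space; the real page may differ.

```php
<?php
// Hypothetical haystack: assumes a newline, not a space, after </strong>.
$haystack = "<strong>3,117</strong>\n(13.63 posts per day)";

$needle = '</strong>';
$pos = strpos($haystack, $needle);

$hex = '';
if ($pos !== false) {
    // Take the 4 bytes that follow the closing tag and show them as hex.
    // 20 = space, 0a = newline, 0d = carriage return, 09 = tab.
    $after = substr($haystack, $pos + strlen($needle), 4);
    $hex = bin2hex($after);
}

echo $hex, "\n"; // prints 0a283133 for this sample
```

If the first byte is anything other than 20, a pattern with a literal space between </strong> and \( cannot match at that position.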
rajivgonsalves Posted December 13, 2007

The extra parenthesis was not supposed to be added.
dsaba (Author) Posted December 13, 2007

I tried it verbatim as you first suggested and it didn't work, then I added the extra parenthesis. Still didn't work. Now I tried this:

$pat = '~: \<strong\>([0-9,]+?)\</strong\> \(([0-9\.]+?) .~';

Doesn't work either. So then I tried this (took the question marks out):

$pat = '~: \<strong\>([0-9,]+)\</strong\> \(([0-9\.]+) .~';

It also didn't work.
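Every pattern tried so far requires exactly one literal space between </strong> and the opening parenthesis. One possible explanation for the empty results, offered as a guess and not a confirmed diagnosis, is that the real page has tabs or line breaks there which the forum flattened into single spaces. A minimal sketch under that assumption, with hypothetical haystacks that add such whitespace to the markup from the thread:

```php
<?php
// Hypothetical haystacks: the thread's markup with tabs and newlines
// inserted where the real page is assumed to have them.
$haystacks = [
    "<td>\n\tסך הכל הודעות: <strong>3,117</strong>\n\t(13.63 הודעות בכל יום)\n</td>",
    "<td>\n\tCheap Electronics: <strong>15,300</strong>\n\t(7.06 scratch your head)\n</td>",
];

// \s* accepts any run of spaces, tabs or newlines, including none.
// The u modifier tells PCRE to treat pattern and subject as UTF-8.
$pat = '~<strong>\s*([0-9,]+)\s*</strong>\s*\(\s*([0-9.]+)~u';

$results = [];
foreach ($haystacks as $haystack) {
    if (preg_match($pat, $haystack, $m)) {
        // $m[1] is the post count, $m[2] is the per-day figure.
        $results[] = [$m[1], $m[2]];
    }
}

print_r($results);
```

The date fields (04-28-2007 and 06 Jan 2002) are skipped because [0-9,]+ must run right up to </strong>. With the u modifier, preg_match returns false instead of 0 when the subject is not valid UTF-8, so it is worth checking the return value with === false if nothing matches.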