daydreamer Posted October 5, 2009

Hi,

<th scope="row">
some words in here
<span class="thisisonecrazyclass"></span>
</th>
<td>get all in here</td>

"some words in here" will be the same all the time. "get all in here" changes, and I want to store it using preg_match. This is what I am trying, with no results:

<?php
preg_match("~some\swords\sin\shere[\n.\s]*<td>(.*)</td>~i", $xxx, $matches);
?>

Where am I going wrong? Thanks.

cags Posted October 5, 2009

Is there only going to be one set of <td> </td> tags in the source, or will there be multiple? If there are multiple, do you want all the values or just the first? I assume the information between "some words in here" and <td> will vary?

daydreamer (Author) Posted October 5, 2009

There will be multiple <td> </td> tags, but I only need whatever is inside of them after the text "some words in here". Yes, the information between these two will vary, but not a lot. The class of the span might change, or the span might not be there. "some words in here" will be the same.

cags Posted October 5, 2009

This should work, but I'm sure somebody could come up with a better solution...

preg_match("~some words in here.+?<td>(.+)?</td>~s", $src, $out);
echo $out[1];

nrg_alpha Posted October 6, 2009

My take on it (using DOM / XPath):

Example:

$html = <<<EOF
<table>
<th scope="row">
some words in here
<span class="thisisonecrazyclass"></span>
</th>
<td>get all in here, because I'm 1st!</td>
<td>Some garbage...</td>
<th scope="row">
some words in here
<span class="thisisonecrazyclass"></span>
</th>
<td>Get it all in here too! 2nd, yo!</td>
<a href="blah">text</a>
<h2>this is a header</h2>
</table>
EOF;

$dom = new DOMDocument;
// Change loadHTML to loadHTMLFile and put a legit URL in quotes within the
// parentheses if you want to apply this to a live site.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Change "some words in here" to the actual words in question.
$tdTag = $xpath->query('//th[@scope="row"]/text()[contains(.,"some words in here")]/../following-sibling::td[1]');

foreach ($tdTag as $val) {
    echo $val->nodeValue . "<br />\n";
}

Output:

get all in here, because I'm 1st!
Get it all in here too! 2nd, yo!

This all makes some assumptions:
a) It is a th tag that precedes the desired td tag in question.
b) The th tag needs to have the attribute scope with the value "row". If this part is not required, you can simply delete the first predicate ([@scope="row"]) from the query.

Obviously, since "some words in here" is going to be the same (and is thus used to determine which th is the right one), use the actual words in place of that in the XPath query.

nrg_alpha Posted October 6, 2009

To elaborate on assumption a), the code will fetch the first td tag it finds after the correct th tag is found (so in other words, there could be more tags between the th and the first td afterwards).

daydreamer (Author) Posted October 8, 2009

Thanks for the suggestion, nrg_alpha; I'll have a look into the XPath way of getting data. cags, that code didn't work, but I got it to work by using a different expression.
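
The thread never shows the expression that eventually worked. As a rough sketch only: one likely reason the posted pattern misbehaves when several <td> tags are present is that the group (.+)? is greedy, so with the s modifier it runs on to the last </td> in the source rather than the first. Moving the question mark inside the group, (.*?), makes the capture lazy. The sample input and the variable names $src, $pattern and $captured below are assumptions for illustration, not taken from the asker's real page.

```php
<?php
// Hypothetical sample input modelled on the markup in the first post,
// with an extra <td> added to show the multiple-cell case.
$src = <<<EOF
<th scope="row">
some words in here
<span class="thisisonecrazyclass"></span>
</th>
<td>get all in here</td>
<td>Some garbage...</td>
EOF;

// .*?   lazily skips whatever sits between the fixed text and the next <td>.
// (.*?) lazily captures up to the first </td> that follows, not the last one.
// The s modifier lets . match newlines as well.
$pattern = '~some words in here.*?<td>(.*?)</td>~s';

$captured = null;
if (preg_match($pattern, $src, $matches)) {
    $captured = $matches[1];
}

echo $captured, "\n"; // prints: get all in here
```

This only captures the first matching cell; preg_match_all with the same pattern would be the usual route if every occurrence is needed. For anything beyond small, predictable snippets, the DOM / XPath approach shown earlier in the thread is likely to be more robust.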