[SOLVED] How do I match this?


daydreamer

Hi,

 

<th scope="row">

some words in here

<span class="thisisonecrazyclass"></span>

</th>

<td>get all in here</td>

 

"some words in here" will be the same all the time.

 

"get all in here" changes and I want to store this using a preg_match.

 

This is what I am trying to do with no results:

 

<?php

preg_match("~some\swords\sin\shere[\n.\s]*<td>(.*)</td>~i", $xxx, $matches);

?>

 

Where am I going wrong?

 

Thanks.

 

 


Is there only going to be one set of <td> </td> tags in the source, or will there be multiple? If there are multiple, do you want all the values or just the first? I assume the information between "some words in here" and <td> will vary?


There will be multiple <td> </td> tags.

 

But I only need whatever is inside the one that comes after the text "some words in here".

 

Yes, the information between these two will vary, but not a lot. The class of the span might change, or the span might not be there. "some words in here" will be the same.


My take on it (using DOM / XPath):

 

Example:

$html = <<<EOF
<table>
<th scope="row">

some words in here

<span class="thisisonecrazyclass"></span>
</th>

<td>get all in here, because I'm 1st!</td>
<td>Some garbage...</td>
<th scope="row">

some words in here

<span class="thisisonecrazyclass"></span>

</th>

<td>Get it all in here too! 2nd, yo!</td>
<a href="blah">text</a>
<h2>this is a header</h2>
</table>
EOF;

libxml_use_internal_errors(true); // collect parse warnings from sloppy markup instead of hiding them with @
$dom = new DOMDocument;
$dom->loadHTML($html); // to apply this to a live site, change loadHTML to loadHTMLFile and put a legit URL in quotes within the parentheses
libxml_clear_errors();
$xpath = new DOMXPath($dom);
$tdTag = $xpath->query('//th[@scope="row"]/text()[contains(.,"some words in here")]/../following-sibling::td[1]'); // change "some words in here" to the actual words in question

foreach ($tdTag as $val) {
    echo $val->nodeValue . "<br />\n";
}

 

Output:

get all in here, because I'm 1st!
Get it all in here too! 2nd, yo!

 

This all makes some assumptions:

a) It is a th tag that precedes the desired td tag in question

b) The th tag needs to have the attribute scope with the value "row". If this part is not required, you can simply delete the first predicate ([@scope="row"]) from the query.
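
To illustrate assumption b), here is a minimal sketch of the same query with the scope predicate dropped. The sample markup (including the tr rows and the cell contents) and the $values array are made up for illustration; they are not from the original page.

```php
<?php
// Hypothetical sample markup: one matching th has no scope attribute at all.
$html = <<<EOF
<table>
<tr><th>some words in here <span class="x"></span></th><td>first value</td></tr>
<tr><th scope="row">some words in here</th><td>second value</td></tr>
<tr><th>other heading</th><td>ignored</td></tr>
</table>
EOF;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Same query as above, minus the [@scope="row"] predicate.
$cells = $xpath->query('//th/text()[contains(.,"some words in here")]/../following-sibling::td[1]');

$values = [];
foreach ($cells as $cell) {
    $values[] = trim($cell->nodeValue);
}
print_r($values);
```

Both th cells containing the fixed text should be picked up here, whether or not they carry scope="row", while the "other heading" row is skipped.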

 

Since "some words in here" is always the same, and is what identifies the right th, replace it in the XPath query with the actual words.
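
If you would rather stay with preg_match, the likely reason the original pattern fails is the [\n.\s]* part: inside a character class the dot is a literal dot, so that class only matches whitespace and periods and can never get past the <span> and </th> tags. Below is a rough sketch of one way to adjust it, assuming $xxx holds markup like the sample from the question (the second td is added for illustration). Regex against HTML stays fragile, so treat this as a fallback rather than a replacement for the DOM approach.

```php
<?php
// Sample input modelled on the question; the second <td> is made up.
$xxx = <<<EOF
<th scope="row">

some words in here

<span class="thisisonecrazyclass"></span>

</th>

<td>get all in here</td>
<td>some other cell</td>
EOF;

$value = null;
// s modifier: the dot also matches newlines, so .*? can cross the tags and line breaks.
// Lazy quantifiers (.*?) stop at the first <td> and the first </td> after the fixed text.
if (preg_match('~some\s+words\s+in\s+here.*?<td>(.*?)</td>~is', $xxx, $matches)) {
    $value = $matches[1];
    echo $value, "\n";
}
```

The lazy (.*?) capture matters once there are several td tags: a greedy (.*) combined with the s modifier would run on to the last </td> in the source.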

