Help with extracting innerHTML of an element

quuxbazer · January 9, 2011

Hi,

I just read a quick tutorial about regex but couldn't make what I wanted to work...sorry if this is something frequently asked.

So, I want to extract the HTML code between <td> and </td> in a string that I used cURL to get. I know what comes right before the <td> tag:

...
<strong>Speed:</strong>
"  Faster"
</div>
</td>
<td>STRING I WANT TO EXTRACT</td>
<td class="align-center" ......

I hope I'm being somewhat clear on this; any help would be much appreciated!

Zurev · January 9, 2011

You ask and you shall receive!

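function getTD($string)
{
    $matchArray = array();
    // Anchored with ^ and $, so the whole string must be a single <td>...</td> cell.
    $pattern = "~^<td>(.*?)</td>$~";
    if (preg_match($pattern, $string, $matchArray)) {
        // Index 0 is the full match including the tags; index 1 is the captured text.
        return $matchArray[1];
    }
    return null; // no match
}

Just run the function like so:

echo getTD("<td>this string</td>");
// prints: this string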

quuxbazer · January 9, 2011

Hi Zurev,

Thanks for the response...but for my purposes, I need to find the correct <td> tag based on what comes right before it and right after. In the HTML document stored in the variable, there are a lot of <td> elements, and I need a specific one based on the pattern as I have shown in my first post.

Thanks!

By the way, why do you write (.*?) with the question mark? I know .* means 0 or more of any character, but why the ? afterwards?

Zurev · January 9, 2011

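As for the question mark: it makes the quantifier non-greedy, so .*? matches as few characters as possible, whereas plain .* matches as many as it can. Since my pattern is anchored with ^ and $, I probably didn't need it there. More on that here:

Greedy vs Non-Greedy

http://www.itworld.com/nl/perl/01112001

Simply put, the ? matters when you extract from a longer string: .*? stops at the first </td> it finds, while .* runs on to the last one.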
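
To make the greedy versus non-greedy difference concrete, here is a small sketch. The sample string and the unanchored patterns are assumptions for illustration, not taken from the page being scraped.

```php
<?php
// Two cells in one string; the patterns have no ^ or $ anchors.
$html = "<td>first</td><td>second</td>";

// Greedy: .* runs on to the LAST </td> in the string.
preg_match("~<td>(.*)</td>~", $html, $greedy);

// Non-greedy: .*? stops at the FIRST </td>.
preg_match("~<td>(.*?)</td>~", $html, $lazy);

echo $greedy[1], "\n"; // first</td><td>second
echo $lazy[1], "\n";   // first
```

With only one cell in the string both patterns capture the same text, which is why the difference is easy to miss in a quick test.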

So can you tell me more about what comes before and after and what the different variations are? Perhaps what code you have right now that you're stuck on?
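
Going only by the snippet in the first post, a pattern keyed on the Speed: label might look like the sketch below. The sample markup, the function name getCellAfterSpeed, and the assumption that the wanted cell is a bare <td> directly after the label's cell are all guesses that would need checking against the real page.

```php
<?php
// Sample input modelled on the snippet from the first post (assumed shape).
$html = <<<HTML
<td><div>
<strong>Speed:</strong>
"  Faster"
</div>
</td>
<td>STRING I WANT TO EXTRACT</td>
<td class="align-center">other</td>
HTML;

// Hypothetical helper: returns the cell that follows the "Speed:" cell, or null.
function getCellAfterSpeed($html)
{
    // s flag: "." also matches newlines, since the markup spans several lines.
    // The first .*? skips to the end of the label's cell; the group captures the next cell.
    $pattern = '~<strong>Speed:</strong>.*?</td>\s*<td>(.*?)</td>~s';
    if (preg_match($pattern, $html, $m)) {
        return $m[1];
    }
    return null;
}

echo getCellAfterSpeed($html), "\n"; // STRING I WANT TO EXTRACT
```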

quuxbazer · January 9, 2011

Now that I think about it...a combination of the strpos and substr functions would probably work better for my case, since the "pattern" I'm looking for is always the same characters.

I just wanted to use regex because it seemed so cool! But I'm sure I'll have to look at it again when I'm validating the form data.

Thanks again for the input.
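
For anyone finding this thread later, the strpos and substr idea could be sketched roughly as follows. The helper name between and the sample string are hypothetical; the marker text is the label from the first post.

```php
<?php
// Hypothetical helper: returns the text between the first $open found after
// $marker and the next $close, or null if any of the three is missing.
function between($html, $marker, $open, $close)
{
    $pos = strpos($html, $marker);
    if ($pos === false) {
        return null;
    }
    // Look for the opening tag only after the end of the marker.
    $start = strpos($html, $open, $pos + strlen($marker));
    if ($start === false) {
        return null;
    }
    $start += strlen($open); // skip past the opening tag itself
    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null;
    }
    return substr($html, $start, $end - $start);
}

// Assumed sample input, shortened from the snippet in the first post.
$html = "<td><strong>Speed:</strong> Faster</td>\n<td>STRING I WANT TO EXTRACT</td>";
echo between($html, "<strong>Speed:</strong>", "<td>", "</td>"), "\n";
```

Note the strict === false checks: strpos can legitimately return 0, which a loose comparison would treat as "not found".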
