quuxbazer Posted January 9, 2011
Hi, I just read a quick tutorial about regex but couldn't get what I wanted to work. Sorry if this is frequently asked. I want to extract the HTML between <td> and </td> in a string that I fetched with cURL. I know what comes right before the <td> tag:

... <strong>Speed:</strong> " Faster" </div> </td> <td>STRING I WANT TO EXTRACT</td> <td class="align-center" ......

I hope I'm being reasonably clear; any help would be much appreciated!
Zurev Posted January 9, 2011
Ask and you shall receive!

function getTD($string) {
    $matchArray = array();
    // Anchored pattern: $string must be exactly one <td>...</td> element.
    $pattern = "~^<td>(.*?)</td>$~";
    if (!preg_match($pattern, $string, $matchArray)) {
        return null; // no match
    }
    // Group 1 is the text between the tags; index 0 would be the whole match.
    return $matchArray[1];
}

Just run the function like so:

echo getTD("<td>this string</td>"); // prints: this string
quuxbazer Posted January 9, 2011
Hi Zurev, thanks for the response. For my purposes, though, I need to find the correct <td> tag based on what comes right before and right after it. The HTML document stored in the variable contains a lot of <td> elements, and I need one specific element, identified by the pattern shown in my first post. Thanks!

By the way, why do you write (.*?) with the question mark? I know .* means zero or more of any character, but why the ? afterwards?
Zurev Posted January 9, 2011
As for the question mark: it makes the quantifier non-greedy. Interestingly enough, I probably didn't need it in that pattern, since the ^ and $ anchors force the match to span the whole string either way. More on that here: Greedy vs Non-Greedy http://www.itworld.com/nl/perl/01112001

Simply put, .* matches as much as it can and .*? matches as little as it can. The difference matters when you extract text between tags from a longer string.

Can you tell me more about what comes before and after, and what the different variations are? Perhaps also post the code you're stuck on right now?
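To make the greedy vs non-greedy distinction concrete, here is a minimal sketch. The sample string and variable names are made up for illustration, and the patterns are deliberately unanchored so the two quantifiers behave differently.

```php
<?php
// Hypothetical sample input: two adjacent cells on one line.
$html = '<td>first</td><td>second</td>';

// Greedy: .* runs as far as it can, so the match ends at the LAST </td>.
preg_match('~<td>(.*)</td>~', $html, $greedy);

// Non-greedy: .*? stops as early as it can, at the FIRST </td>.
preg_match('~<td>(.*?)</td>~', $html, $lazy);

echo $greedy[1], "\n"; // first</td><td>second
echo $lazy[1], "\n";   // first
```

Note that . does not match newlines by default, so a cell spanning several lines would also need the s modifier (for example '~<td>(.*?)</td>~s').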
quuxbazer Posted January 9, 2011
Now that I think about it, a combination of the strpos and substr functions would probably work better in my case, since the "pattern" I'm looking for is always the same characters. I just wanted to use regex because it seemed so cool! I'm sure I'll have to look at it again when I'm validating the form data, though. Thanks again for the input.
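The strpos and substr approach mentioned above could look roughly like the sketch below. The helper name extractCellAfter and the sample string are hypothetical, the marker is taken from the first post, and the nullable return type assumes PHP 7.1 or later. It also assumes the wanted cell opens with a plain <td> with no attributes.

```php
<?php
// Hypothetical helper: returns the contents of the first plain <td> that
// follows $marker in $html, or null if the marker or the cell is missing.
function extractCellAfter(string $html, string $marker): ?string
{
    $markerPos = strpos($html, $marker);
    if ($markerPos === false) {
        return null;
    }

    // Look for the opening tag only after the end of the marker.
    $open = '<td>';
    $openPos = strpos($html, $open, $markerPos + strlen($marker));
    if ($openPos === false) {
        return null;
    }

    // The cell content runs from just after <td> up to the next </td>.
    $start = $openPos + strlen($open);
    $end = strpos($html, '</td>', $start);
    if ($end === false) {
        return null;
    }

    return substr($html, $start, $end - $start);
}

// Sample input modelled on the fragment in the first post (left truncated).
$html = '<strong>Speed:</strong> " Faster" </div> </td> '
      . '<td>STRING I WANT TO EXTRACT</td> <td class="align-center"';

echo extractCellAfter($html, '<strong>Speed:</strong>'), "\n"; // STRING I WANT TO EXTRACT
```

Comparing against false with === matters here, because strpos can legitimately return 0 when the marker sits at the very start of the string.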