tibberous, posted August 6, 2007

I have some HTML that looks like this:

<body> <td id="top" align="center"> <div id="topNav">Rate PeopleMeet PeopleBest OfMeet Jim and James</div> <div id="userPanel"> <td> Login</td> <td>Join<br> HOTorNOT</td> </body>

This HTML will always have matching opening and closing tags. Basically, I just need the content from the inner cells. If a tag is nested, like <div>Outer div<div>Hello</div>text</div>, then I want the inner div to get the value 'Hello' and the outer div to get the value 'Outer div text'. I'm not sure whether I should use the XML functions for this, or explode, or preg_match and preg_replace. Does anyone know a good way to do this?
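For the nested-tag question, one option is PHP's DOM extension rather than explode or regular expressions. The following is only a minimal sketch under stated assumptions: `collectOwnText` is a hypothetical helper name, the markup is assumed to be loadable by `DOMDocument::loadHTML()`, and joining an element's separate text pieces with a single space is a choice made here to match the 'Outer div text' example.

```php
<?php
// Hypothetical helper: record each element's own (direct) text,
// leaving the text of nested elements to those elements.
function collectOwnText(DOMElement $el, array &$out): void
{
    $pieces = [];
    $children = [];
    foreach ($el->childNodes as $child) {
        if ($child instanceof DOMText) {
            $pieces[] = $child->nodeValue;   // text sitting directly in $el
        } elseif ($child instanceof DOMElement) {
            $children[] = $child;            // visited after this element
        }
    }
    // Join the direct text pieces and collapse runs of whitespace.
    $text = trim(preg_replace('/\s+/', ' ', implode(' ', $pieces)));
    if ($text !== '') {
        $out[] = [$el->nodeName, $text];
    }
    foreach ($children as $childEl) {
        collectOwnText($childEl, $out);
    }
}

$html = '<div>Outer div<div>Hello</div>text</div>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);              // wraps the fragment in <html><body>
libxml_clear_errors();

$result = [];
collectOwnText($doc->documentElement, $result);
print_r($result);
```

With the nested div example, the outer div ends up with 'Outer div text' and the inner div with 'Hello'. The wrapping html and body elements have no direct text, so they are skipped.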
tibberous (author), posted August 6, 2007

I almost have this working. I clean the markup with Tidy, clean it further with str_replace and preg_replace, wrap it in a root node, and then parse it as XML. The only thing still giving me a problem is HTML entities (&lt;, &nbsp;, &gt;, etc.). I replaced &nbsp; with ' ', but I can't do the same thing for &lt; because it will break my XML. Is there some way to have the PHP XML parser ignore HTML entities? I thought about replacing & with a random string and then replacing it back, but that seems inefficient and tacked together.
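On the entity problem: XML predefines only &lt;, &gt;, &amp;, &quot; and &apos;, so those already parse as they are. It is the HTML-only names such as &nbsp; that an XML parser rejects. One possible workaround, sketched here under assumptions, is to rewrite every named entity before parsing: decode it to its UTF-8 character, then re-escape the result so the XML-special characters stay escaped. `htmlEntitiesToXmlSafe` is a hypothetical helper name, and the markup is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: make HTML named entities safe for an XML parser.
function htmlEntitiesToXmlSafe(string $markup): string
{
    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',          // named entities only
        static function (array $m): string {
            // &nbsp; becomes U+00A0, &copy; becomes the copyright sign, etc.
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            // Re-escape so '<', '>', '&' and quotes remain valid XML;
            // an unknown name like &foo; comes out as &amp;foo;.
            return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
        },
        $markup
    );
}

$fragment = '<td>a&nbsp;&lt;&nbsp;b &copy; 2007</td>';
$xml = '<root>' . htmlEntitiesToXmlSafe($fragment) . '</root>';

$doc = new DOMDocument();
$doc->loadXML($xml);                         // parses without entity errors
$text = $doc->documentElement->textContent;
echo $text, "\n";
```

This avoids the placeholder-string trick: tags are left untouched because the pattern only matches entity references, and &lt; survives the round trip as &lt; rather than turning into a raw '<'.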