tibberous Posted August 6, 2007

I have some HTML that looks like this:

<body>
<td id="top" align="center">
<div id="topNav">Rate PeopleMeet PeopleBest OfMeet Jim and James</div>
<div id="userPanel">
<td> Login</td>
<td>Join<br> HOTorNOT</td>
</body>

This HTML will always have balanced opening and closing tags. Basically, I just need the content from the inner cells. If a tag is nested, like <div>Outer div<div>Hello</div>text</div>, then I want the inner div to get the value 'Hello' and the outer div to get the value 'Outer div text'. I'm not sure whether I should use the XML functions for this, or explode, or preg_match and replace. Does anyone know a good way to do this?
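One possible approach to the nested-tag requirement, sketched here under assumptions rather than offered as the definitive answer, is to let PHP's DOM extension parse the markup and then read only each element's direct text-node children. The helper name ownText and the id values outer and inner are hypothetical, added only so the two divs can be told apart; the sample string is the nested example from the post.

```php
<?php
// Sketch: read an element's "own" text, i.e. only its direct text-node
// children, so text inside nested elements is not mixed into the parent.
// ownText() and the ids "outer"/"inner" are hypothetical names for this example.
function ownText(DOMNode $node): string
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $parts[] = $text;
            }
        }
    }
    // Join the separate text runs with a single space.
    return implode(' ', $parts);
}

$html = '<div id="outer">Outer div<div id="inner">Hello</div>text</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);              // the HTML parser tolerates fragments
libxml_clear_errors();

$result = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    $result[$div->getAttribute('id')] = ownText($div);
}

print_r($result);
```

Because loadHTML uses an HTML parser rather than a strict XML one, this route may also remove the need to clean the markup into well-formed XML first. For the real page, the same loop could run over td elements or over getElementsByTagName('*').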
tibberous Posted August 6, 2007

I almost have this working. Basically, I am cleaning the HTML with Tidy, cleaning it further with str_replace and preg_replace, wrapping it in a root node, and then parsing it as XML. The only thing still giving me a problem is HTML entities (&lt;, &nbsp;, &gt;, etc.). I replaced &nbsp; with ' ', but I can't do the same thing for &lt; because it will break my XML. Is there some way to have the PHP XML parser ignore HTML entities? I thought about replacing & with a random string and then replacing it back, but that seems inefficient and tacked together.
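For the entity problem, it may help that XML itself predefines five entities (&lt;, &gt;, &amp;, &quot; and &apos;), so those can stay in the markup and the parser will decode them in the text values. Only HTML-specific names such as &nbsp; are undefined in plain XML. The sketch below, which assumes UTF-8 input and PHP 5.4 or later for ENT_HTML5 and ENT_XML1, decodes every named entity and then re-escapes whatever is special in XML. The function name htmlEntitiesToXml and the sample fragment are hypothetical.

```php
<?php
// Sketch: rewrite named HTML entities so an XML parser accepts them.
// htmlEntitiesToXml() is a hypothetical helper name; UTF-8 input is assumed.
function htmlEntitiesToXml(string $markup): string
{
    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',
        function (array $m): string {
            // Decode the named entity to its UTF-8 character(s).
            // Unknown names are returned unchanged by html_entity_decode().
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            // Re-escape anything XML treats as special, so &lt; stays &lt;
            // and an unknown &foo; becomes the literal text &amp;foo;.
            return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
        },
        $markup
    );
}

$fragment = '<td>a&nbsp;&lt;&nbsp;b &copy; 2007</td>';

// Wrap in a root node, as described in the post, then parse as XML.
$xml  = simplexml_load_string('<root>' . htmlEntitiesToXml($fragment) . '</root>');
$text = (string) $xml->td;

var_dump($text);
```

Numeric references such as &#160; do not match the pattern and are already valid XML, so they pass through untouched. This avoids the placeholder-string trick, at the cost of one regular-expression pass over the markup.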