bhikkhu

Parsing non-well-formed XML

I'm new here, and I've been trying to find the answer to my question without posting, but I have been unsuccessful. If this has already been covered, please point me to a link and accept my apology for asking the same question again. Once I'm using these forums more often, it won't happen.

I am trying to parse out an XML file, but the file itself is not well formed. Here is an example snippet.

[code]
<root>
   <book id="foo">
      <chapter id="1">
         <sentence id="1" />This is the 'CDATA' that I need..
         <sentence id="2" />Another sentence example...
      </chapter>
   </book>
</root>
[/code]

I need to pull out the data from the sentence, but it isn't wrapped in <sentence>data</sentence> format. The sentence element is closed immediately.

I know this is a bit of an XML question, but I can't change the XML, I have to parse it as it is, and I'm using PHP to do it.

Any help is greatly appreciated. I don't have my code in front of me, but there isn't much of it anyway, and I'm really just looking for direction.

Thanks again.

For XML to be well-formed, every element MUST be closed!

As far as I am aware, that particular file is not valid XML and should not be parsed by any compliant app. Only a super-duper error-friendly one MAY still do it; as far as I am concerned, that example should fail hands down.

You could make it valid by parsing the content of the file and looking for elements with no closing tag and give them one (no entendre!!!!).
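
One thing worth checking before rewriting the file: `<sentence id="1" />` is an empty-element tag, which XML permits, so a standard parser may well accept the snippet as posted. In that case the sentence text is simply a text node sitting right after each empty `sentence` element. Below is a minimal sketch of reading it that way, assuming PHP's DOM extension is available and that the real file has the same shape as the snippet. The variable names and the id-keyed result array are illustrative choices, not anything from the original post.

```php
<?php
// Sample input copied from the snippet in the question.
$xml = <<<XML
<root>
   <book id="foo">
      <chapter id="1">
         <sentence id="1" />This is the 'CDATA' that I need..
         <sentence id="2" />Another sentence example...
      </chapter>
   </book>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$sentences = [];
foreach ($doc->getElementsByTagName('sentence') as $marker) {
    $text = '';
    // Walk the siblings after the empty <sentence/> marker and collect
    // text until the next element (or the end of the parent) is reached.
    for ($node = $marker->nextSibling;
         $node !== null && $node->nodeType !== XML_ELEMENT_NODE;
         $node = $node->nextSibling) {
        if ($node->nodeType === XML_TEXT_NODE
            || $node->nodeType === XML_CDATA_SECTION_NODE) {
            $text .= $node->nodeValue;
        }
    }
    // Key the result by the sentence id (an assumption: ids are unique
    // within the document being read).
    $sentences[$marker->getAttribute('id')] = trim($text);
}

print_r($sentences);
```

If sentence ids repeat across chapters, the array key would need to include the chapter id as well; the sketch does not handle that.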

[quote]
QUOTE(ToonMariner @ Jun 19 2006, 11:35 AM)
You could make it valid by parsing the content of the file and looking for elements with no closing tag and give them one (no entendre!!!!).
[/quote]
Thanks for the reply. I thought about this... but, ironically... how do I parse it to add the closing tag?!?!

:unsure: I'm really out of luck here aren't I?... :wink: :wink:

Thanks again. :smile:
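
On the question of how to add the closing tags without an XML parser: one option is to treat the file as a plain string first. The following is a rough sketch, assuming every sentence is an empty `<sentence ... />` tag followed by plain text with no nested markup. `wrapSentences` is a hypothetical helper name, and the sample string is made up for illustration.

```php
<?php
// Hypothetical helper: rewrites `<sentence ... />text` into
// `<sentence ...>text</sentence>` using a regular expression.
function wrapSentences(string $xml): string
{
    $result = preg_replace_callback(
        // Group 1: the attributes. Group 2: the text up to the next tag.
        '~<sentence\b([^>]*?)\s*/>([^<]*)~',
        function (array $m): string {
            return '<sentence' . $m[1] . '>' . trim($m[2]) . '</sentence>';
        },
        $xml
    );

    // preg_replace_callback returns null on a regex error; fall back to
    // the untouched input in that case.
    return $result ?? $xml;
}

// Made-up sample input in the same shape as the snippet in the question.
$raw = '<chapter id="1"><sentence id="1" />First one. '
     . '<sentence id="2" />Second one.</chapter>';

$fixed = wrapSentences($raw);

// Once the text is wrapped, SimpleXML can read each sentence directly.
$chapter = simplexml_load_string($fixed);
foreach ($chapter->sentence as $s) {
    echo $s['id'], ': ', $s, PHP_EOL;
}
```

The regular expression stops at the next `<`, so this approach would break if a sentence ever contained inline elements; it also trims the whitespace around each sentence.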
