e11rof Posted October 2, 2009

I am sure somebody will tell me that I am in the wrong forum. Anyway, I wonder if somebody can help me. I have an RSS feed fed from a MySQL table. I want to display the same feed as an HTML page. The feed has some embedded tags and hence has a CDATA tag (as shown). I have read the MySQL record and have tried to strip the CDATA start and end tags, but failed. Any ideas? The string is shown below.

<![CDATA[<p>We would like to announce that our speaker for the October dinner will be Mr Jones. Mr Jones is an outward bound leader and a teacher.</p> <p>Would anybody not coming please inform Steve.</p>]]>

How can I strip the CDATA tag and the end tag?
cags Posted October 2, 2009

Assuming I understood your intention correctly, you could do it with regular expressions like so...

preg_match("/^<!\[CDATA\[(.*)\]\]>$/s", $src, $out);
$output = $out[1];
nrg_alpha Posted October 2, 2009

If I understand correctly, you want <![CDATA[ and ]]> gone, but leave the rest in place? If so, perhaps something along the lines of:

$str = <<<EOF
<![CDATA[<p>We would like to announce that our speaker for the October dinner will be Mr Jones. Mr Jones is an outward bound leader and a teacher.</p> <p>Would anybody not coming please inform Steve.</p>]]>
EOF;
$str = preg_replace('#<!\[CDATA\[(.+?)\]\]>#s', '$1', $str);
echo $str;

Hopefully I understand the end desired goal here. If not, my apologies.

@cags.. be careful with the use of .*, as you *might* run into trouble if there are multiple instances of <![CDATA[ ... ]]> in the source code / string. You can read up about why stuff like .* and .+ are 'generally' bad ideas here (post #11 and #14). If there is only one chunk of CDATA, then it's all good.. but if not, you might end up wiping out more than you bargained for.
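To make the warning about .* concrete, here is a minimal sketch comparing the greedy and lazy patterns on a string with two CDATA sections. The input string and the variable names are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical input with two CDATA sections (an assumption, not the poster's data).
$src = '<![CDATA[<p>first</p>]]> and <![CDATA[<p>second</p>]]>';

// Greedy: (.*) runs on to the LAST "]]>", so the markers in the middle survive.
$greedy = preg_replace('#<!\[CDATA\[(.*)\]\]>#s', '$1', $src);

// Lazy: (.+?) stops at the FIRST "]]>", so each section is unwrapped separately.
$lazy = preg_replace('#<!\[CDATA\[(.+?)\]\]>#s', '$1', $src);

echo $greedy, PHP_EOL; // <p>first</p>]]> and <![CDATA[<p>second</p>
echo $lazy, PHP_EOL;   // <p>first</p> and <p>second</p>
```

With a single CDATA section, as in the original question, both patterns give the same result.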
cags Posted October 2, 2009

Thanks for the info nrg_alpha, I'll try to keep that in mind. I only started learning regular expressions at the start of the week, and it took me a while to work out that I needed the s at the end, as the darn string had a newline char in it. I must admit I did pretty much assume that there would only be one instance of <![CDATA, and nearly suggested simply using substr to grab everything but the start and end tag. With that in mind I think I got the regex to do what I wanted it to do, which is a minor miracle in itself.
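The substr idea mentioned above could look roughly like the sketch below. It assumes the string holds exactly one CDATA wrapper around the whole value; the function name stripCdataWrapper is hypothetical.

```php
<?php
// Sketch only: strips one CDATA wrapper that encloses the whole string.
// stripCdataWrapper is a made-up helper name, not from the thread.
function stripCdataWrapper(string $src): string
{
    $open = '<![CDATA[';
    $close = ']]>';
    $trimmed = trim($src); // ignore surrounding whitespace and newlines

    $startsWithOpen = strncmp($trimmed, $open, strlen($open)) === 0;
    $endsWithClose = substr($trimmed, -strlen($close)) === $close;

    if ($startsWithOpen && $endsWithClose) {
        // Drop the opening marker from the front and the closing marker from the end.
        return substr($trimmed, strlen($open), -strlen($close));
    }

    return $src; // no wrapper found, return the input unchanged
}

echo stripCdataWrapper('<![CDATA[<p>Would anybody not coming please inform Steve.</p>]]>'), PHP_EOL;
```

This avoids regular expressions entirely, but it does nothing useful if the string contains several CDATA sections or text outside the wrapper.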
e11rof Posted October 3, 2009

Thanks for that. I did not have the $1 in the preg_match statement.
salathe Posted October 3, 2009

I know this topic is marked as SOLVED already, and that manually playing with the XML will get the job done. However, when working with XML documents, it would be advisable to use a proper XML parser (there are a number of different approaches in PHP). Using one would make this CDATA problem a non-issue since the parsers will properly handle that type of XML node.
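As one possible illustration of the parser approach, the sketch below uses PHP's SimpleXML extension on a small feed. The feed layout (rss/channel/item/description) and the variable names are assumptions; the thread does not show the actual feed, and the original poster was reading the value from the database, not from the XML.

```php
<?php
// Hypothetical feed: the element layout is assumed, not taken from the thread.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <item>
      <title>October dinner</title>
      <description><![CDATA[<p>Would anybody not coming please inform Steve.</p>]]></description>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($xml);
if ($feed === false) {
    exit("Could not parse the feed\n");
}

$descriptions = [];
foreach ($feed->channel->item as $item) {
    // Casting the element to string yields its text content,
    // with the CDATA markers already removed by the parser.
    $descriptions[] = (string) $item->description;
}

echo implode(PHP_EOL, $descriptions), PHP_EOL;
```

The parser handles any number of CDATA sections, so no pattern matching on the markers is needed.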