JimBob00 Posted April 23, 2012

Hi, I'm very new to regex and I'm not having an easy time with it. I'm trying to remove tags, along with their contents, from an XML page. I use the start tag, then a wildcard, and then the end tag. This works fine on tags that sit on one line, but when the contents span multiple lines, it seems to break. It breaks because the tag is used multiple times: instead of stopping at the nearest end tag, the match runs to the last one and wipes out all the data in between.

Here's an example of the original data...

<?xml version="1.0" encoding="utf-8"?>
<album>
  <album info>
    <artist></artist>
    <title></title>
    <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed iaculis nibh sed nisi dapibus tempus. Sed congue quam at magna tempor dictum. Quisque tempor, elit at lobortis pellentesque, ligula quam malesuada magna, in sagittis diam arcu vel sapien. Donec nibh nulla, fermentum eu feugiat et, imperdiet imperdiet risus. Fusce vestibulum lacus non nibh rhoncus faucibus. Morbi pellentesque dolor a ligula gravida ac iaculis velit dapibus. In viverra, odio vitae congue pharetra, lorem tellus tincidunt est, quis malesuada risus urna quis dolor. Donec ut nulla ligula, non vulputate magna. Phasellus hendrerit neque nulla, eget scelerisque elit. Aenean nec massa in turpis tempor accumsan a vel augue.
    </description>
  </album info>
  <tracks>
    <track 1>
      <artist></artist>
      <title></title>
      <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</description>
    </track 1>
    <track 2>
      <artist></artist>
      <title></title>
      <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</description>
    </track 2>
    <track 3>
      <artist></artist>
      <title></title>
      <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</description>
    </track 3>
    <track 4>
      <artist></artist>
      <title></title>
      <description>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</description>
    </track 4>
  </tracks>
</album>

Here is what I would like to end up with...

<?xml version="1.0" encoding="utf-8"?>
<album>
  <album info>
    <artist></artist>
    <title></title>
  </album info>
  <tracks>
    <track 1>
      <artist></artist>
      <title></title>
    </track 1>
    <track 2>
      <artist></artist>
      <title></title>
    </track 2>
    <track 3>
      <artist></artist>
      <title></title>
    </track 3>
    <track 4>
      <artist></artist>
      <title></title>
    </track 4>
  </tracks>
</album>

But this is what I've been getting...

<?xml version="1.0" encoding="utf-8"?>
<album>
  <album info>
    <artist></artist>
    <title></title>
    </track 4>
  </tracks>
</album>

The code I've been using (which works fine on single-line tags) is...

preg_replace('#<created(.*)</created>#', '', $a)

I also tried changing things around, like using / instead of # as the delimiter, and adding ^ to the start and $ to the end. I also tried escaping the < and the / in the end tag. I had a look around some forums and tried things others have done, but I still end up with the same problem.

EDIT - Oops, I forgot to also say that I tried adding pattern modifiers to the end. I tried s, m and i.

If anyone could help, I would be very grateful.
xyph Posted April 23, 2012

Where the heck are you getting this feed? XML tag names should never have spaces in them :|

Use the 's' flag in your RegEx. It allows dots to match newline characters as well.

#<created(.*)</created>#s

Ideally, you'd use SimpleXML for something like this, but with spaces in the tags, a proper parser will throw errors all day.
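A minimal sketch of the parser route mentioned above, assuming well-formed input (no spaces in tag names). It uses PHP's DOM extension rather than SimpleXML, because removing nodes is more direct there. The sample document and the variable names are made up for illustration; only the `description` element name comes from the thread.

```php
<?php
// Hypothetical sample input; a nowdoc keeps the XML free of PHP interpolation.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<album>
  <title>Example</title>
  <description>First line
second line</description>
  <tracks>
    <track><title>One</title><description>Track text</description></track>
  </tracks>
</album>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// getElementsByTagName() returns a live list, so copy it into an array
// first; removing nodes while iterating the live list would skip entries.
$nodes = iterator_to_array($doc->getElementsByTagName('description'));
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}

$result = $doc->saveXML();
echo $result;
```

Because the parser works on elements rather than on text, it does not matter whether a description spans one line or several.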
JimBob00 Posted April 23, 2012

Thanks for your help, but it seems to give me the same effect. The spaces were my fault. I just made a small, quick example, because the real file was quite big. The file comes from SoundCloud, and here is another one I found that is smaller.

<?xml version="1.0" encoding="UTF-8"?> <playlist> <kind>playlist</kind> <id type="integer">1679848</id> <created-at type="datetime">2012-02-28T14:40:57Z</created-at> <user-id type="integer">1312104</user-id> <duration type="integer">2624890</duration> <sharing>public</sharing> <tag-list>house</tag-list> <permalink>house-house-house-house</permalink> <description>Bootlegs for your DJing</description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>house</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id nil="true"></label-id> <label-name></label-name> <type>other</type> <playlist-type>other</playlist-type> <ean></ean> <title>house house house</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/playlists/1679848</uri> <permalink-url>http://soundcloud.com/emporium-mc/sets/house-house-house-house</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000019325311-ismj3u-large.jpg?639a060</artwork-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <tracks type="array"> <track> <kind>track</kind> <id type="integer">8386487</id> <created-at type="datetime">2010-12-24T13:25:07Z</created-at> <user-id
type="integer">1312104</user-id> <duration type="integer">580060</duration> <commentable type="boolean">true</commentable> <state>finished</state> <original-content-size type="integer">23171580</original-content-size> <sharing>public</sharing> <tag-list>house progressive tech techno trance</tag-list> <permalink>jk-bootleg_get-funky-vs-open-your-eyes</permalink> <description>This bootleg is made by JK. Very tough progressive. "GET FUNKY(Cyberx&Lilas) VS OPEN YOUR EYES" You can download and drop to the floor this weekend !! Enjoy !! </description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>progressive</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id nil="true"></label-id> <label-name></label-name> <isrc></isrc> <video-url nil="true"></video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG_GET FUNKY VS OPEN YOUR EYES(downloadable)</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <original-format>mp3</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/8386487</uri> <permalink-url>http://soundcloud.com/emporium-mc/jk-bootleg_get-funky-vs-open-your-eyes</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000003767803-9ilo8k-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/PEzs5CTgZHbL_m.png</waveform-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <stream-url>http://api.soundcloud.com/tracks/8386487/stream</stream-url> 
<download-url>http://api.soundcloud.com/tracks/8386487/download</download-url> <playback-count type="integer">7277</playback-count> <download-count type="integer">621</download-count> <favoritings-count type="integer">21</favoritings-count> <comment-count type="integer">15</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/8386487/attachments</attachments-uri> </track> <track> <kind>track</kind> <id type="integer">8387039</id> <created-at type="datetime">2010-12-24T13:49:38Z</created-at> <user-id type="integer">1312104</user-id> <duration type="integer">555443</duration> <commentable type="boolean">true</commentable> <state>finished</state> <original-content-size type="integer">22188132</original-content-size> <sharing>public</sharing> <tag-list>house progressive tech techno trance electro</tag-list> <permalink>jk-bootleg_summer-voyage-vs-yeah-ha-saeed-younan</permalink> <description>If you like this, just download and drop to the floor !!! And feel free to leave your nice comments !!! 
</description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>progressive</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id nil="true"></label-id> <label-name></label-name> <isrc></isrc> <video-url nil="true"></video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG_SUMMER VOYAGE VS YEAH HA(Saeed younan)_(downloadable)</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <original-format>mp3</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/8387039</uri> <permalink-url>http://soundcloud.com/emporium-mc/jk-bootleg_summer-voyage-vs-yeah-ha-saeed-younan</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000003768042-ykc0z8-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/WXihq88WwRMf_m.png</waveform-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <stream-url>http://api.soundcloud.com/tracks/8387039/stream</stream-url> <download-url>http://api.soundcloud.com/tracks/8387039/download</download-url> <playback-count type="integer">13530</playback-count> <download-count type="integer">946</download-count> <favoritings-count type="integer">48</favoritings-count> <comment-count type="integer">52</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/8387039/attachments</attachments-uri> </track> <track> <kind>track</kind> <id type="integer">8386719</id> <created-at type="datetime">2010-12-24T13:35:05Z</created-at> <user-id 
type="integer">1312104</user-id> <duration type="integer">455303</duration> <commentable type="boolean">true</commentable> <state>finished</state> <original-content-size type="integer">18187524</original-content-size> <sharing>public</sharing> <tag-list>progressive tech house electro techno trance</tag-list> <permalink>jk-bootleg_get-messy-vs-1000-lords-popof-remix-downloadable</permalink> <description>This is made by JK. You can download and drop this on the floor this weekend!! Feel free to leave your comments !!!</description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>progressive</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id nil="true"></label-id> <label-name></label-name> <isrc></isrc> <video-url nil="true"></video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG_GET MESSY VS 1000 LORDS(POPOF REMIX)_(downloadable)</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <original-format>mp3</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/8386719</uri> <permalink-url>http://soundcloud.com/emporium-mc/jk-bootleg_get-messy-vs-1000-lords-popof-remix-downloadable</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000003767883-tg7z0p-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/BRYrmpnYvwtP_m.png</waveform-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <stream-url>http://api.soundcloud.com/tracks/8386719/stream</stream-url> 
<download-url>http://api.soundcloud.com/tracks/8386719/download</download-url> <playback-count type="integer">5783</playback-count> <download-count type="integer">591</download-count> <favoritings-count type="integer">33</favoritings-count> <comment-count type="integer">7</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/8386719/attachments</attachments-uri> </track> <track> <kind>track</kind> <id type="integer">31963476</id> <created-at type="datetime">2011-12-30T11:00:20Z</created-at> <user-id type="integer">1312104</user-id> <duration type="integer">502369</duration> <commentable type="boolean">true</commentable> <state>finished</state> <original-content-size type="integer">88601052</original-content-size> <sharing>public</sharing> <tag-list>techno house progressive tech electro trance</tag-list> <permalink>jk-bootleg-for-2012</permalink> <description>Hi soundcloud friends ! This is a nu year present bootleg from Justinkase ! Enjoy !</description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>techno</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id nil="true"></label-id> <label-name></label-name> <isrc></isrc> <video-url nil="true"></video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG FOR 2012 COUNTDOWN</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <original-format>wav</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/31963476</uri> <permalink-url>http://soundcloud.com/emporium-mc/jk-bootleg-for-2012</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000016050977-xvg71p-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/9hF19TPZEc53_m.png</waveform-url> <user> <id type="integer">1312104</id> 
<kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <stream-url>http://api.soundcloud.com/tracks/31963476/stream</stream-url> <download-url>http://api.soundcloud.com/tracks/31963476/download</download-url> <playback-count type="integer">4853</playback-count> <download-count type="integer">476</download-count> <favoritings-count type="integer">20</favoritings-count> <comment-count type="integer">4</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/31963476/attachments</attachments-uri> </track> <track> <kind>track</kind> <id type="integer">4418481</id> <created-at type="datetime">2010-08-05T14:48:49Z</created-at> <user-id type="integer">1312104</user-id> <duration type="integer">531715</duration> <commentable type="boolean">true</commentable> <state>finished</state> <original-content-size type="integer">21241984</original-content-size> <sharing>public</sharing> <tag-list>SUKU EMPORIUM DIGWEED DEADMAU5 AQUATONIC JUSTIN KASE</tag-list> <permalink>aquatonic-digweed-vs-deadmau5</permalink> <description>Thanks for checking our stuff </description> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>Electrohouse</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id type="integer">26859</label-id> <label-name>Justin Kase</label-name> <isrc></isrc> <video-url>http://www.youtube.com/watch?v=4egyX4OGqEw</video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG_DIGWEED VS DEADMAU5_(downloadable)</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> 
<original-format>mp3</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/4418481</uri> <permalink-url>http://soundcloud.com/emporium-mc/aquatonic-digweed-vs-deadmau5</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000002106159-77lp22-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/FACDax6amkKm_m.png</waveform-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <label> <id type="integer">26859</id> <kind>user</kind> <permalink>justin-kase</permalink> <username>Justin Kase</username> <uri>http://api.soundcloud.com/users/26859</uri> <permalink-url>http://soundcloud.com/justin-kase</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000010775662-l6c6jp-large.jpg?639a060</avatar-url> </label> <stream-url>http://api.soundcloud.com/tracks/4418481/stream</stream-url> <download-url>http://api.soundcloud.com/tracks/4418481/download</download-url> <playback-count type="integer">8030</playback-count> <download-count type="integer">729</download-count> <favoritings-count type="integer">27</favoritings-count> <comment-count type="integer">34</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/4418481/attachments</attachments-uri> </track> </tracks> </playlist>

There is a lot of information in the file that I don't need. I'm OK when there is only one occurrence of the tag, or when the tag's end is on the same line, but I'm trying to get rid of this...

<description>This bootleg is made by JK. Very tough progressive. "GET FUNKY(Cyberx&Lilas) VS OPEN YOUR EYES" You can download and drop to the floor this weekend !! Enjoy !! </description>

But instead I end up stripping away a lot of what I want to keep.

I end up with this... <?xml version="1.0" encoding="UTF-8"?> <playlist> <kind>playlist</kind> <id type="integer">1679848</id> <created-at type="datetime">2012-02-28T14:40:57Z</created-at> <user-id type="integer">1312104</user-id> <duration type="integer">2624890</duration> <sharing>public</sharing> <tag-list>house</tag-list> <permalink>house-house-house-house</permalink> <streamable type="boolean">true</streamable> <downloadable type="boolean">true</downloadable> <genre>Electrohouse</genre> <release></release> <purchase-url nil="true"></purchase-url> <purchase-title nil="true"></purchase-title> <label-id type="integer">26859</label-id> <label-name>Justin Kase</label-name> <isrc></isrc> <video-url>http://www.youtube.com/watch?v=4egyX4OGqEw</video-url> <track-type>other</track-type> <key-signature></key-signature> <bpm nil="true"></bpm> <title>House JK BOOTLEG_DIGWEED VS DEADMAU5_(downloadable)</title> <release-year nil="true"></release-year> <release-month nil="true"></release-month> <release-day nil="true"></release-day> <original-format>mp3</original-format> <license>all-rights-reserved</license> <uri>http://api.soundcloud.com/tracks/4418481</uri> <permalink-url>http://soundcloud.com/emporium-mc/aquatonic-digweed-vs-deadmau5</permalink-url> <artwork-url>http://i1.sndcdn.com/artworks-000002106159-77lp22-large.jpg?639a060</artwork-url> <waveform-url>http://w1.sndcdn.com/FACDax6amkKm_m.png</waveform-url> <user> <id type="integer">1312104</id> <kind>user</kind> <permalink>emporium-mc</permalink> <username>HoUSE</username> <uri>http://api.soundcloud.com/users/1312104</uri> <permalink-url>http://soundcloud.com/emporium-mc</permalink-url> <avatar-url>http://i1.sndcdn.com/avatars-000001557667-grxcch-large.jpg?639a060</avatar-url> </user> <label> <id type="integer">26859</id> <kind>user</kind> <permalink>justin-kase</permalink> <username>Justin Kase</username> <uri>http://api.soundcloud.com/users/26859</uri> <permalink-url>http://soundcloud.com/justin-kase</permalink-url> 
<avatar-url>http://i1.sndcdn.com/avatars-000010775662-l6c6jp-large.jpg?639a060</avatar-url> </label> <stream-url>http://api.soundcloud.com/tracks/4418481/stream</stream-url> <download-url>http://api.soundcloud.com/tracks/4418481/download</download-url> <playback-count type="integer">8030</playback-count> <download-count type="integer">729</download-count> <favoritings-count type="integer">27</favoritings-count> <comment-count type="integer">34</comment-count> <attachments-uri>http://api.soundcloud.com/tracks/4418481/attachments</attachments-uri> </track> </tracks> </playlist>

Thanks
JimBob00 Posted April 24, 2012

Never mind, adding a ? after the wildcard (.*? instead of .*) seemed to do the trick. Funny, because I did try it with the ? earlier. Maybe I still didn't have the right combination when I tried before. Anyway, thanks for helping, and thanks for the links to the regex information.
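To illustrate why the extra ? matters: with the s flag alone, .* is greedy and runs to the last closing tag, while .*? is lazy and stops at the nearest one. A small sketch, assuming the goal is to drop every description element; the sample data and variable names are hypothetical.

```php
<?php
// Hypothetical sample: two tracks, the first description spans two lines.
$xml = <<<'XML'
<tracks>
<track><title>One</title><description>First
text</description></track>
<track><title>Two</title><description>Second text</description></track>
</tracks>
XML;

// Greedy: with the s flag, .* also matches newlines and runs from the
// first <description> to the LAST </description>, taking track Two with it.
$greedy = preg_replace('#<description>.*</description>#s', '', $xml);

// Lazy: .*? stops at the nearest </description>, so each element is
// removed on its own and everything in between survives.
$lazy = preg_replace('#<description>.*?</description>#s', '', $xml);

echo $lazy;
```

This pattern assumes the opening tag carries no attributes and that description elements are never nested; for anything less regular, a parser is the safer choice.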