How can I sort Problematic Characters in PHP?


loo9162

Basically, I am writing a script in PHP which can take YouTube videos from playlists, items from RSS feeds and podcasts, and individual YouTube videos and files, and place them into an XML document, so they can be browsed and kept in one place. I also have a script which removes these items, if the user wants.

 

The problem I'm facing is with characters. Because I can't control what users will name their videos/files, or how they're named in the feed, the titles could contain quotes, brackets, ampersands, hashes, etc. This causes problems when the items are being removed: because I'm using XPath (which can be temperamental at the best of times) in the remove script, any items whose titles contain these characters won't get removed.

 

Here's my remove code:

 

<?php

 

$q = $_GET["q"];

$q = stripslashes($q);
 
$q = explode('|^', $q);
 
$counts = count($q);
 
unset($q[$counts-1]);
 
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
 
$dom->Load("../$userid.xml");
 
$xpath = new DOMXPath($dom);
 
foreach ($q as $r) {
 
$r = preg_replace("|&|", '&amp;', $r);
$r = preg_replace('|"|', '&quot;', $r);
 
$query1 = 'channel/item[title='.$r.']/title';
$query2 = 'channel/item[title='.$r.']/media:content';
$query3 = 'channel/item[title='.$r.']';
 
$entries = $xpath->query($query1);
$entries2 = $xpath->query($query2);
$entries3 = $xpath->query($query3);
 
foreach ($entries as $entry) {
foreach ($entries2 as $entry2) {
foreach ($entries3 as $entry3) {
 
$oldchapter = $entry->parentNode->removeChild($entry);
$oldchapter2 = $entry2->parentNode->removeChild($entry2);
$oldchapter3 = $entry3->parentNode->removeChild($entry3);
 
$dom->preserveWhiteSpace = false;
 
}
}
}
}
 
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
 
$dom->save("../$userid.xml")
 
?>
 
How it works: when the user selects the items they want to remove, using a select box, the selections are put into the URL. My code extracts the titles from the URL, where each one is followed by "|^" (for example, title1|^title2|^title3|^). Because "|^" is appended to the end of every title, splitting leaves an empty value at the end of the array, which I have to remove. Then I create a new DOMDocument, load my existing XML document into it, and look up the titles from the URL. I want the code to remove each whole item (title, URL and the item itself) whose title matches one of the titles in the URL, and then save the document. But because some of the titles can contain &, ", * or #, those items don't get removed.
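The splitting step described above can be sketched on its own. This is only a minimal illustration: `parseTitles` is a hypothetical helper name, not something from the original script, and it assumes that no title contains the "|^" sequence itself.

```php
<?php
// Hypothetical helper: split "title1|^title2|^title3|^" into a list of titles.
// Assumes no title contains the "|^" delimiter itself.
function parseTitles(string $q): array
{
    $parts = explode('|^', $q);

    // The trailing delimiter leaves an empty string at the end; drop any
    // empty entries and reindex the array from 0.
    return array_values(array_filter($parts, function ($title) {
        return $title !== '';
    }));
}

$titles = parseTitles('title1|^title2|^title3|^');
// $titles now holds the three titles without the empty trailing entry.
```

Filtering out empty strings, rather than unsetting the last index, also copes with a query string that is missing its final delimiter.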
 
Is there a way that I can maybe screen, and change the characters to get it to work (I tried this with "preg_replace", but it didn't work), or even change them before they're saved to the XML in the first place?
 
Any advice?
I think the best advice would be to build this application around a proper database, which would allow you to handle stuff like this with ease. That would also allow you to drop the triple-nested loops, which are really bad for performance, especially if you get a lot of data.

 

Then, if you need the data as XML, write a script that does just that: export. ;)
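To make "export" concrete, here is a rough sketch of such a script. It assumes the items have already been fetched from the database as an array of rows; the function name `exportPlaylist`, the `title` and `url` keys, and the rss/channel/item/link layout are all illustrative assumptions rather than the actual format the player expects. The point it shows is that text added with `createTextNode()` is escaped by DOM when the document is serialized, so no manual replacing of & is needed.

```php
<?php
// Sketch: build an XML playlist from rows that stand in for database results.
// Element names and array keys are assumptions made for this example.
function exportPlaylist(array $rows): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->formatOutput = true;

    $rss = $dom->appendChild($dom->createElement('rss'));
    $channel = $rss->appendChild($dom->createElement('channel'));

    foreach ($rows as $row) {
        $item = $channel->appendChild($dom->createElement('item'));

        // Text nodes are escaped on output, so "&" is written as "&amp;".
        $title = $item->appendChild($dom->createElement('title'));
        $title->appendChild($dom->createTextNode($row['title']));

        $link = $item->appendChild($dom->createElement('link'));
        $link->appendChild($dom->createTextNode($row['url']));
    }

    return $dom->saveXML();
}

$xml = exportPlaylist([
    ['title' => 'Tom & Jerry "Best of" #1', 'url' => 'http://example.com/a?x=1&y=2'],
]);
```

If the XML were built by string concatenation instead, `htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8')` would be the built-in function to reach for.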

 

Also, any reason why you're escaping XML characters manually, as opposed to using the proper functions for it?
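For the removal itself, a sketch along these lines may help. Two things matter here. First, once the file is loaded the parser has already decoded entities, so a title stored as `Tom &amp; Jerry` is `Tom & Jerry` in the DOM and should be compared in that raw form, not re-escaped. Second, an XPath string literal has to be wrapped in quotes, and XPath 1.0 has no escape syntax inside a literal, so a title containing both quote types has to be assembled with `concat()`. The names `xpathLiteral` and `removeItemsByTitle` are made up for this example, and the `//channel/item` path is an assumption about the document layout.

```php
<?php
// Hypothetical helper: turn an arbitrary string into a valid XPath 1.0 literal.
function xpathLiteral(string $s): string
{
    if (strpos($s, "'") === false) {
        return "'" . $s . "'";
    }
    if (strpos($s, '"') === false) {
        return '"' . $s . '"';
    }
    // Both quote types present: split on ' and glue the parts with concat().
    $pieces = [];
    foreach (explode("'", $s) as $i => $part) {
        if ($i > 0) {
            $pieces[] = '"\'"'; // a lone single quote, wrapped in double quotes
        }
        if ($part !== '') {
            $pieces[] = "'" . $part . "'";
        }
    }
    return 'concat(' . implode(', ', $pieces) . ')';
}

// Removes every <item> whose <title> equals one of the given titles.
// Removing the item also removes its children, so one query per title is enough.
function removeItemsByTitle(DOMDocument $dom, array $titles): int
{
    $xpath = new DOMXPath($dom);
    $removed = 0;
    foreach ($titles as $title) {
        $nodes = $xpath->query('//channel/item[title=' . xpathLiteral($title) . ']');
        // Copy the matches to an array before changing the tree.
        foreach (iterator_to_array($nodes) as $item) {
            $item->parentNode->removeChild($item);
            $removed++;
        }
    }
    return $removed;
}

$dom = new DOMDocument();
$dom->loadXML('<rss><channel>'
    . '<item><title>Tom &amp; Jerry "Best of" #1</title></item>'
    . '<item><title>It\'s a "test"</title></item>'
    . '<item><title>Keep me</title></item>'
    . '</channel></rss>');

$count = removeItemsByTitle($dom, ['Tom & Jerry "Best of" #1', 'It\'s a "test"']);
```

Because each matching item is removed together with its title and media children, the separate queries for those children and the nested loops around them are not needed in this version.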

I can't use a database, because the video player I'm using only supports XML and RSS playlists. I don't understand what you mean by export to XML (sorry). Also, what are the proper functions I should be using? (Sorry, complete noob.)
