For some reason PHP's XML parser is splitting the character data within my elements when the data contains Unicode characters. I put the file paths in quotes hoping that would help, but no luck. My code is listed below, but first here is what the output looks like:

[code]
<FileData>
** **
<File>
** "/home/user/sandbox/ **
** 田比首走雨.txt" **
** **
<File>
** "/home/user/sandbox/English.txt" **
** **
[/code]

XML file:

[code]
<?xml version="1.0" encoding="UTF-8"?>
<FileData>
    <File>"/home/user/sandbox/田比首走雨.txt"</File>
    <File>"/home/user/sandbox/English.txt"</File>
</FileData>
[/code]

PHP code:

[code]
<?php
// Parse the XML document containing the data for all of the file extensions.
$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "UTF-8");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "charData");

if (!($fp = fopen("/home/admin/sandbox/test.xml", "r"))) {
    die("Could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }
}

xml_parser_free($xml_parser);
fclose($fp);

function startElement($parser, $name, $attrs)
{
    echo "<$name><br />";
    foreach ($attrs as $k => $v) {
        echo " $k - $v<br />";
    }
}

function endElement($parser, $name)
{
}

function charData($parser, $data)
{
    echo "** $data **<br />";
}
?>
[/code]

If anyone has seen this before or has any suggestions, I'd be very happy to hear them. This has been driving me nuts. I'd like to get my Unicode paths in one piece, just like I can with the English ones.
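For what it's worth, the character data handler is not guaranteed to receive an element's text in a single call; the parser may deliver it in several chunks, which would match the output above. Below is a minimal sketch of one possible workaround, assuming that is what is happening here: collect the chunks in a buffer and only use the text once the closing tag arrives. The names `parseFilePaths`, `$buffer`, and `$paths` are made up for illustration, and the XML is passed in as a string to keep the example self-contained.

```php
<?php
// Sketch only: buffer character data per element instead of echoing each chunk.
// parseFilePaths(), $buffer and $paths are hypothetical names, not from the original code.

function parseFilePaths(string $xml): array
{
    $paths  = [];   // completed <File> contents
    $buffer = '';   // text collected for the element currently open

    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Start tag: begin with an empty buffer.
        function ($p, $name, $attrs) use (&$buffer) {
            $buffer = '';
        },
        // End tag: the buffer now holds the whole text, however it was chunked.
        function ($p, $name) use (&$paths, &$buffer) {
            if ($name === 'File') {
                $paths[] = trim($buffer);
            }
            $buffer = '';
        }
    );

    // May be called several times for one element, so append rather than echo.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$buffer) {
            $buffer .= $data;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $paths;
}

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<FileData>
    <File>"/home/user/sandbox/田比首走雨.txt"</File>
    <File>"/home/user/sandbox/English.txt"</File>
</FileData>
XML;

foreach (parseFilePaths($xml) as $path) {
    echo $path, "\n";
}
```

The same buffering idea should carry over to the `fread` loop in the original code, since the buffer lives outside any single `xml_parse` call. The surrounding quotes are kept here because they are part of the element text in the XML file.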