jaesun, posted May 27, 2022

PHP Version: 7.0.10, Windows

So I have this function that grabs all the files within a directory:

    function getDirectoryListing($folder) {
        $aryListing = array();
        $dir = new RecursiveDirectoryIterator($folder, FilesystemIterator::SKIP_DOTS);
        // Flatten the recursive iterator, folders come before their files
        $it = new RecursiveIteratorIterator($dir, RecursiveIteratorIterator::SELF_FIRST);
        foreach ($it as $fileinfo) {
            if ($fileinfo->isFile()) {
                $f = array();
                $f['file'] = $fileinfo->getFilename();
                $f['dir'] = "\\" . $it->getSubPath();
                $f['pathfile'] = $it->getSubPathName();
                $f['size'] = $fileinfo->getSize();
                $f['size_human'] = bytesToHuman($fileinfo->getSize());
                $f['time_mod'] = $fileinfo->getMTime();
                $f['time_mod_full'] = date('F j, Y, g:i a', $fileinfo->getMTime());
                $aryListing[] = $f;
            } elseif ($fileinfo->isDir()) {
                //print($fileinfo->__toString() . PHP_EOL); // directory
            } else {
                // echo $fileinfo->getFilename(); // not file or directory?
            }
        }
        return $aryListing;
    }

But with certain accented characters such as ğ or ě (https://en.wikipedia.org/wiki/Ğ and https://en.wikipedia.org/wiki/Ě respectively), the name comes back with the plain letters g and e instead of the accented characters. So a file like Dağ_Piě.txt is returned as Dag_Pie.txt. As a result, the code above never adds it to the array: it is skipped because Dag_Pie.txt is neither a file nor a directory. This doesn't happen with all accented characters; files such as café.txt are fine.

Sure, I could manually rename all the files I find on the server, but I'd prefer a solution that reads the filenames correctly (and then renames them with the script if I choose to). I don't want to go through every single filename every so often. scandir returns the same thing.

Any help would be appreciated.

Link to topic: https://forums.phpfreaks.com/topic/314845-getting-the-directory-listing-of-files-with-some-accented-characters/

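One possible explanation for why café.txt survives while Dağ_Piě.txt does not, offered here as an assumption rather than something confirmed in this thread: PHP releases before 7.1 on Windows read filenames through the system's legacy (ANSI) code page, and on a Western European system that code page is typically Windows-1252, which contains é but not ğ or ě. The sketch below only tests whether a UTF-8 name can be represented in a given code page. The function name `isRepresentableInCodePage` and the default `CP1252` are illustrative choices, and the `iconv` extension is assumed to be available.

```php
<?php
/**
 * Returns true when every character of a UTF-8 string survives a round trip
 * through the given single-byte code page (assumption: CP1252 by default).
 */
function isRepresentableInCodePage(string $utf8Name, string $codePage = 'CP1252'): bool
{
    // Some iconv builds fail on unmappable characters, others substitute a
    // placeholder; "@" silences the notice emitted in the failing case.
    $legacy = @iconv('UTF-8', $codePage, $utf8Name);
    if ($legacy === false) {
        return false;
    }
    // A substituted character will not convert back to the original text.
    return @iconv($codePage, 'UTF-8', $legacy) === $utf8Name;
}

// Names taken from the question above.
foreach (['café.txt', 'test1.txt', 'Dağ_Piě.txt'] as $name) {
    printf("%s: %s\n", $name, isRepresentableInCodePage($name) ? 'fits' : 'does not fit');
}
```

A name that does not fit would be consistent with the g/e substitution described in the question, but it does not by itself prove the cause.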
ginerjm, posted May 27, 2022

I'm gonna guess that this is a character set issue. What are you using when you display the output?

jaesun (author), posted May 28, 2022

14 hours ago, ginerjm said: "I'm gonna guess that this is a character set issue. What are you using when you display the output?"

I am just displaying it in the browser to test it. I have also tried writing the directory listing to a file, but ğ/ě still come back as g/e. Other filenames, like café, show up fine in the file (and I can change the encoding while viewing it in Notepad++ and see them). But it's as if the g/e are just that, the plain letters g and e, as if the iterator/scandir functions returned them to me that way.

ginerjm, posted May 28, 2022

I do believe that if you are not referencing a proper charset, your displays will show what you describe rather than what you want. As you noted with Notepad++, when you change the encoding it shows up fine. You have to do the same with your browser and probably your script.

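Declaring the charset for browser output, as suggested in this reply, could look like the following minimal sketch. It assumes the names are already valid UTF-8 strings; `renderListingPage` is a hypothetical helper, not something from the thread.

```php
<?php
/**
 * Builds a small HTML page listing the given names, with the charset
 * declared in the markup so the browser does not have to guess.
 */
function renderListingPage(array $names): string
{
    $items = '';
    foreach ($names as $name) {
        // Escape for HTML; the third argument tells PHP the input is UTF-8.
        $items .= '<li>' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return "<!DOCTYPE html>\n<html>\n<head><meta charset=\"utf-8\"><title>Listing</title></head>\n"
        . "<body>\n<ul>\n" . $items . "</ul>\n</body>\n</html>\n";
}

// When served over HTTP, the same charset can also be sent as a header.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/html; charset=utf-8');
}
echo renderListingPage(['café.txt', 'test1.txt']);
```

This only affects how bytes are rendered. It cannot restore characters that were already replaced before the script received the name.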
jaesun (author), posted May 28, 2022

1 hour ago, ginerjm said: "I do believe that if you are not referencing a proper charset your displays will be showing as you describe - not what you want. As you have noted in your use of notepad++, when you change the encoding it shows up fine. Have to do that with your browser and probably your script."

When I mention that Notepad++ shows it fine when changing the encoding, I am referring to café, not the characters ğ/ě. Notepad++ shows ğ/ě as g/e, as if PHP or the iterator has replaced them with the letters g/e. Even if I change the encoding to the correct one, they still show as g/e.

And I am not worried about the display, since in the end I am not using this for display (it's more for backend stuff). The bigger problem for me is that I am not even able to reference the file or know that it is in the directory. When I run the function on a directory that contains 5 files, one of which has ğ/ě in its name, it returns an array of 4 files. For that file, $fileinfo->isFile() returns false and $fileinfo->isDir() returns false, so the rest of the code after the function call acts as if the file does not exist at all. For a directory with café.txt, test1.txt and Dağ_Piě.txt, the function returns array(café.txt, test1.txt).

The iterator that grabs all the files does pick up the file; it just returns false when checking whether it's a file or a directory. If I uncomment the echo in the else {} statement, it displays the filename, but with the letters ğ/ě replaced by g/e. Since the two letters really seem to have been replaced, and $fileinfo->isFile()/$fileinfo->isDir() return false no matter what, I'm led to believe that the encoding I use for display won't matter.

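The behaviour described in this post (entries that the iterator yields but that are neither files nor directories) can be isolated with a small diagnostic sketch. `findUnresolvableEntries` is a hypothetical helper name; the hex dump is added so the raw bytes of each name can be inspected independently of any display encoding.

```php
<?php
/**
 * Walks a folder recursively and returns every entry for which both
 * isFile() and isDir() report false, together with the raw bytes of its name.
 */
function findUnresolvableEntries(string $folder): array
{
    $unresolved = [];
    $dir = new RecursiveDirectoryIterator($folder, FilesystemIterator::SKIP_DOTS);
    $it = new RecursiveIteratorIterator($dir, RecursiveIteratorIterator::SELF_FIRST);
    foreach ($it as $fileinfo) {
        if (!$fileinfo->isFile() && !$fileinfo->isDir()) {
            $unresolved[] = [
                'name' => $fileinfo->getFilename(),
                // Hex of the name exactly as PHP received it.
                'hex'  => bin2hex($fileinfo->getFilename()),
            ];
        }
    }
    return $unresolved;
}

// Usage from the command line: php script.php <folder>
$target = $argv[1] ?? null;
if ($target !== null && is_dir($target)) {
    foreach (findUnresolvableEntries($target) as $entry) {
        printf("%s (bytes: %s)\n", $entry['name'], $entry['hex']);
    }
}
```

If the hex column shows only single ASCII bytes such as 67 (g) and 65 (e) where ğ and ě should be, the substitution happened before the script received the name, not at display time.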
ginerjm, posted May 28, 2022

Guess I can't help you.