readDir / scanDir - formats of filenames?

Hi all

I hope I have come to the right place.

My system reads files stored on a drive and lists them to users through plain HTML. I made it 11 years ago, so I have to refresh my memory.

My problem is that filenames seem to come in different formats, so how to decode/encode them is an issue...

My users use Scandinavian letters (æøåäöüõ), and it seems like one filename is in one format and another in another format. There is no logic to which format a filename comes in. All files are OK and download as they should; they just don't list well.
I tried downloading one with Æ in its name and uploading it again with a -2 suffix, and the new copy lists differently from the original.

Any idea how I can handle this issue?



Make sure you're doing everything with the same encoding - preferably UTF-8.

What operating system? IIRC Windows and Linux deal with it differently.
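
To make "the same encoding" concrete, here is a minimal sketch in Python of normalising names as they are read from disk. It assumes the drive holds a mix of UTF-8 names and old single-byte names, and that the old ones are ISO-8859-1; that fallback is a guess until we know more about the server. The names decode_filename and list_names are made up for illustration. If the app is PHP (as readdir/scandir suggests), mb_check_encoding and mb_convert_encoding cover the same two steps.

```python
import os

def decode_filename(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode raw filename bytes: strict UTF-8 first, then the assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # With the default ISO-8859-1 every byte maps to a character,
        # so this branch cannot fail (it can still guess wrong).
        return raw.decode(fallback)

def list_names(directory: str) -> list:
    # Passing a bytes path makes os.listdir return the names as raw bytes.
    return sorted(decode_filename(name) for name in os.listdir(os.fsencode(directory)))
```

This is a heuristic, not a guarantee: a legacy name whose bytes happen to form valid UTF-8 would be read as UTF-8. Whatever comes out should then be sent to the browser as UTF-8 with a matching charset declaration.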

11 years ago, it was unusual for people to use UTF-8. That was also before HTML5 became the de facto standard, and back then people often declared a specific character set, so check your app to see which meta charset, if any, it sets.


<meta charset="UTF-8">

<!-- or possibly -->
<meta charset="ISO-8859-1">

In the past it was not uncommon for people in the West to use ISO-8859-1, as it covers English and a lot of the European languages, including Finnish and Swedish. There is also ISO-8859-4, which is described as supporting "Scandinavia/Baltic". The two overlap to a fair degree, but some characters differ.
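
A small sketch in Python of why the charset matters, using the letters from the original post. The function names are made up for illustration; the codec names are the ones in Python's standard library. It shows that the same letters are stored as different bytes in UTF-8 than in the ISO-8859 sets, and what a UTF-8 name looks like when the page treats it as ISO-8859-1.

```python
LETTERS = "æøåäöüõ"

def encoded_forms(text: str) -> dict:
    """Return the bytes for text in each of the encodings discussed above."""
    return {enc: text.encode(enc) for enc in ("utf-8", "iso8859_1", "iso8859_4")}

def misread_utf8_as_latin1(text: str) -> str:
    """Simulate a UTF-8 filename being displayed as if it were ISO-8859-1."""
    return text.encode("utf-8").decode("iso8859_1")

forms = encoded_forms(LETTERS)
# One byte per letter in the ISO-8859 sets, two bytes per letter in UTF-8.
print(len(forms["iso8859_1"]), len(forms["utf-8"]))
# A UTF-8 "æ" shown as ISO-8859-1 turns into two unrelated characters.
print(misread_utf8_as_latin1("æ"))
# The two ISO sets agree on these letters but not on every byte, e.g. 0xF1.
print(b"\xf1".decode("iso8859_1"), b"\xf1".decode("iso8859_4"))
```

For these particular letters the two ISO sets produce identical bytes, so the mismatch in the listing is more likely UTF-8 versus a single-byte set than ISO-8859-1 versus ISO-8859-4, though that is only a guess without seeing the actual bytes.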

As requinix stated, we really need more info on the OS of the server. Again, going back 11 years, Windows servers were still possibly using a code page rather than Unicode. There are also some differences between operating systems in their support, or lack thereof, for case-sensitive filenames.

It would also be helpful if you could provide a specific example of a file that has one name on the filesystem and displays as garbage or something else in your app.
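
One way to produce such an example is to dump the raw bytes of each name, since the bytes show which encoding a name was stored in. A minimal sketch in Python, with made-up function names; it assumes the script can read the directory directly on the server:

```python
import os

def describe_name(raw: bytes) -> str:
    """Return the name's bytes as hex, plus whether they form valid UTF-8."""
    try:
        raw.decode("utf-8")
        verdict = "valid UTF-8"
    except UnicodeDecodeError:
        verdict = "not UTF-8"
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    return f"{hex_bytes}  ({verdict})"

def dump_directory(directory: str) -> list:
    # A bytes path makes os.listdir return the names exactly as stored.
    return [describe_name(name) for name in sorted(os.listdir(os.fsencode(directory)))]
```

For a file called Æ.txt, a UTF-8 name starts with the bytes c3 86, while an ISO-8859-1 name starts with the single byte c6. Posting that output for one file that lists well and one that does not would tell us a lot.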

