
Import text files into SQL - Character conversion needed?


slowfib


I'm importing some text files from my Windows servers into a SQL database, but I'm running into what I think is an encoding issue: either a special character, or something to do with UTF-8 or UTF-16. I haven't dealt with this before, so I'm really not sure.

 

I read in the file as such:

$handler = fopen($file, "r");
$Data = fread($handler, filesize($file));
fclose($handler);

 

The text file itself contains the data formatted like this:

 

Unable to deliver this message because the follow error was encountered: "This message is a delivery status notification that cannot be delivered.".

The specific error code was 0xC00402C7.

 

The message sender was <>.

 

The message was intended for the following recipients.

[email protected]

 

If I simply echo out this data:

echo $Data;

 

In Firefox it comes out like the following. Firefox chooses to display it as Western (ISO-8859-1), but if I switch the encoding to Unicode (UTF-16), the data displays correctly.

U�n�a�b�l�e� �t�o� �d�e�l�i�v�e�r� �t�h�i�s� �m�e�s�s�a�g�e� �b�e�c�a�u�s�e� �t�h�e� �f�o�l�l�o�w� �e�r�r�o�r� �w�a�s� �e�n�c�o�u�n�t�e�r�e�d�:� �"�T�h�i�s� �m�e�s�s�a�g�e� �i�s� �a� �d�e�l�i�v�e�r�y� �s�t�a�t�u�s� �n�o�t�i�f�i�c�a�t�i�o�n� �t�h�a�t� �c�a�n�n�o�t� �b�e� �d�e�l�i�v�e�r�e�d�.�"�.� � � � �T�h�e� �s�p�e�c�i�f�i�c� �e�r�r�o�r� �c�o�d�e� �w�a�s� �0�x�C�0�0�4�0�2�C�7�.� � � � � � �T�h�e� �m�e�s�s�a�g�e� �s�e�n�d�e�r� �w�a�s� �<�>�.� � � � � � �T�h�e� �m�e�s�s�a�g�e� �w�a�s� �i�n�t�e�n�d�e�d� �f�o�r� �t�h�e� �f�o�l�l�o�w�i�n�g� �r�e�c�i�p�i�e�n�t�s�.� � � �O�n�l�i�n�e�H�e�l�p�@�E�l�i�t�e�R�a�c�i�n�g�.�c�o�m� � �
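The extra byte after every character, together with the fact that the UTF-16 view displays correctly, suggests the file is stored as UTF-16 rather than ASCII or UTF-8. A minimal sketch of one way to check this from PHP, by looking at the byte order mark and at NUL bytes, is below. The function name guessEncoding is hypothetical (not part of PHP or of the original script), the type declarations assume PHP 7 or later, and the NUL-byte fallback is only a heuristic that holds for mostly-ASCII text like these bounce messages.

```php
<?php
// Hypothetical helper (an assumption, not from the original script):
// guess the encoding of raw file bytes from the byte order mark (BOM),
// falling back to a NUL-byte heuristic when no BOM is present.
function guessEncoding(string $bytes): string
{
    // Note: FF FE is also the start of a UTF-32LE BOM; that case is ignored here.
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';
    }
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    // No BOM: NUL bytes inside plain English text point to UTF-16.
    if (strpos($bytes, "\0") !== false) {
        // For ASCII-range characters, UTF-16LE stores the NUL byte second.
        return (strlen($bytes) > 1 && $bytes[1] === "\0") ? 'UTF-16LE' : 'UTF-16BE';
    }
    return 'unknown';
}
```

Calling guessEncoding($Data) right after the fread() would show which case applies before deciding how to convert.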

 

And finally, if I try to insert the data into SQL, the data looks like this:

 

INSERT INTO Badmail VALUES(
2
, '00360053425643112201000000004.BDR'
, 'U n a b l e  t o  d e l i v e r  t h i s  m e s s a g e  b e c a u s e  t h e  f o l l o w  e r r o r  w a s  e n c o u n t e r e d :  " T h i s  m e s s a g e  i s  a  d e l i v e r y  s t a t u s  n o t i f i c a t i o n  t h a t  c a n n o t  b e  d e l i v e r e d . " .

T h e  s p e c i f i c  e r r o r  c o d e  w a s  0 x C 0 0 4 0 2 C 7 .

T h e  m e s s a g e  s e n d e r  w a s  < > .

T h e  m e s s a g e  w a s  i n t e n d e d  f o r  t h e  f o l l o w i n g  r e c i p i e n t s .

O n l i n e H e l p @ S o m e d o m a i n . c o m
'
, '2010-11-30 07:17:05'
, GETDATE())

 

 

So basically, I'm not really sure whether I'm supposed to convert the ASCII to UTF-8, or whether I just need to do a bunch of str_replace calls to correct the data before inserting.
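For comparison, here is a minimal sketch of the conversion route. It assumes the files are UTF-16LE (consistent with the output shown earlier, but not confirmed), that the mbstring extension is loaded, and that PHP 7 or later is in use. The function name utf16leToUtf8 is hypothetical. Stripping NUL bytes with str_replace would happen to work for pure ASCII text, but it would corrupt any character outside that range, so a real conversion looks like the safer option.

```php
<?php
// Sketch under stated assumptions: the input is UTF-16LE and mbstring is available.
// utf16leToUtf8 is a hypothetical helper, not part of the original script.
function utf16leToUtf8(string $bytes): string
{
    // Drop a leading UTF-16LE byte order mark (FF FE) if present,
    // so it does not end up as a stray character in the database.
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        $bytes = (string) substr($bytes, 2);
    }
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');
}

// Example with a short hand-built UTF-16LE string standing in for the file contents.
$raw  = "\xFF\xFE" . "U\0n\0a\0b\0l\0e\0";
$text = utf16leToUtf8($raw);
```

In the import script this would be applied to $Data after the fread() call and before the value is placed in the INSERT statement.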

I'd appreciate any feedback or suggestions anyone has.

 

Thank you.

