gmcmudder, posted July 7, 2008:

Being this is my first post, I'll try to keep the question short.

I've got a text document that I'm using to add information to a database. The text file is usually pretty large (by line count) and comes from a news print company. They use the text file to feed information into their news print computer and wanted to use that same file to upload their news articles to their website.

Anyway, long story short, the task seemed easy enough. The text file is passed through a couple of filters to remove any unwanted characters, broken down into individual lines, categorized, and then entered into the database. Everything works great until my PHP function hits a special character that I've never seen before.

I wrote another quick PHP program to print the questionable line on screen, thinking that if I could see the character I might be able to identify it. On screen it looks like an & in a circle. This symbol is driving me nuts. I know I've seen it once before, but I can't remember how I removed it. I've googled for special characters and looked through countless sites. Still no luck.

It's usually at the end of a line, and my PHP function breaks each entry down using the line break at the end of a sentence. When the function hits this character it simply includes it in the line and continues, so that the two lines are entered as one. For example:

this is the first line
this is the second line

becomes:

this is the first line(symbol)this is the second line

I also opened the text document in DOS (yeah, I said DOS), thinking that maybe I could identify it that way. In DOS it looks like the number 6 with a ^ on top, all one character. I've opened the text document with countless programs, and the character simply isn't there when I view the file in any editing program.

Oh yeah, I almost forgot. The original text file is created on a Mac and the web server is Windows-based, so maybe that has something to do with it. Anyone got any suggestions on how to remove this character?
DarkWater, posted July 8, 2008:

Wow, I'd hate to see a long question. =P Anyway, can you open the text in a binary or hex editor and find me the hex representation of the character please?
gmcmudder (author), posted July 8, 2008:

Sorry for that 'long' short question. Funny thing is that it doesn't show up in my hex/binary editor. I've shortened the text document down to just a few words so that the character is still printed in the browser, but I still don't see it in the hex editor.
DarkWater, posted July 8, 2008:

Can you link me to an example please?
gmcmudder (author), posted July 8, 2008:

Give me a few minutes to set it up on my server and I'll send you a link.
gmcmudder (author), posted July 8, 2008:

OK, I found it in (I think) both binary and hex form:

binary - 00001011
hex - 0x0000000b

It was showing up in the hex/binary editor as a period, which is why I missed it.
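For anyone chasing a similar invisible character, here is a minimal sketch of finding it from PHP itself instead of a hex editor. The function name find_control_bytes and the sample string are hypothetical, made up for illustration; the sketch assumes the file is plain single-byte text in which tab, line feed and carriage return are the only control characters you expect.

```php
<?php
// Hypothetical helper (not from the thread): report every unexpected
// control byte in a string together with its offset.
function find_control_bytes(string $text): array
{
    $found = [];
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($text[$i]);
        // Tab (0x09), line feed (0x0A) and carriage return (0x0D) are
        // normal in a text file, so they are not reported.
        $isExpected = in_array($byte, [0x09, 0x0A, 0x0D], true);
        if (($byte < 0x20 && !$isExpected) || $byte === 0x7F) {
            $found[] = ['offset' => $i, 'hex' => sprintf('0x%02X', $byte)];
        }
    }
    return $found;
}

// Sample line with a vertical tab (0x0B) where a line break was expected.
$sample = "this is the first line\x0Bthis is the second line";
foreach (find_control_bytes($sample) as $hit) {
    echo "offset {$hit['offset']}: {$hit['hex']}\n";
}
```

Running something like this over the contents of the uploaded file (for example the result of file_get_contents) would list each suspicious byte, which is easier than spotting a period in a hex editor.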
DarkWater, posted July 8, 2008:

That's a vertical tab....O_O I don't know why that would be in there. Try doing:

$string = str_replace(chr(0x0B), '', $string);

Note that str_replace returns the cleaned string rather than changing $string in place, so the result has to be assigned.
gmcmudder (author), posted July 8, 2008:

Sorry for the delay, I wanted to check it using the big text file (2,235 lines). That got it, thanks so much DarkWater. I couldn't figure out what it was doing in the file either. Sure gave me H*** getting it out though. You made my day, thanks again.
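A slightly broader variant of the same fix, as a rough sketch: strip every stray control character in one pass, then split on whichever line ending the file uses. The helper name clean_lines is hypothetical. It assumes each vertical tab is followed by a real line ending, as the fix above suggests; if the vertical tab were the only separator, replacing it with "\n" instead of removing it would be the safer choice.

```php
<?php
// Hypothetical helper (not from the thread): remove stray control
// characters, then split the text into non-empty, trimmed lines.
function clean_lines(string $text): array
{
    // Remove ASCII control characters except tab (\x09), LF (\x0A) and CR (\x0D).
    // This covers the vertical tab (\x0B) found in this thread.
    $text = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $text);
    // Split on Windows (\r\n), old Mac (\r) or Unix (\n) line endings.
    $lines = preg_split('/\r\n|\r|\n/', $text);
    // Trim each line and drop the empty ones.
    return array_values(array_filter(array_map('trim', $lines), 'strlen'));
}

// Sample input: a vertical tab before a Mac-style line ending.
$raw = "this is the first line\x0B\rthis is the second line\r\n";
print_r(clean_lines($raw));
```

Splitting on all three line-ending styles matters here because the file is created on a Mac and processed on a Windows server, so the line endings may not match what the server-side code expects.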
DarkWater, posted July 8, 2008:

No problem. =) Mark this topic as solved please. I'd be curious to know why those were in there if you ever find out though. =P
gmcmudder (author), posted July 8, 2008:

I sure will let you know if I find out. I am going to note the problem and solution in my personal notebook though. That character really messed up my function, to the point that about 30% of the articles were not being entered into the database.

Another side note: MySQL didn't like that character either. It would simply stop inserting the data when it found the character. Maybe I shouldn't have noted that here, but it might be easy enough to let that one character slip through some user input.
DarkWater, posted July 8, 2008:

Are you escaping the input with mysql_real_escape_string before entering it? I don't think I've ever experienced that problem before... o-O
gmcmudder (author), posted July 8, 2008:

No, I wasn't, because I thought the input from the text file was from a trusted source. It never dawned on me to use that function when inserting the data into the database. But I am adjusting my PHP function to filter all input and use mysql_real_escape_string before inserting anything into the database. I should have remembered 'Trust No One' when it comes to input when I started this project.
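For readers on current PHP: mysql_real_escape_string belongs to the old mysql extension, which was removed in PHP 7.0. A swapped-in technique that serves the same purpose is a prepared statement, sketched below with PDO. This is not what was used in the thread. The function name insert_article, the articles table and its columns are hypothetical, and the demo uses an in-memory SQLite database (assuming the pdo_sqlite driver is available) only so it runs without a server; on the real site a MySQL DSN would be used instead.

```php
<?php
// Hypothetical helper (not from the thread): insert one article using a
// prepared statement instead of manual escaping.
function insert_article(PDO $pdo, string $category, string $body): void
{
    // Placeholders keep the data separate from the SQL text,
    // so quotes in the article cannot break the query.
    $stmt = $pdo->prepare(
        'INSERT INTO articles (category, body) VALUES (:category, :body)'
    );
    $stmt->execute([':category' => $category, ':body' => $body]);
}

// Demo setup. Assumption: pdo_sqlite is installed; the table layout is made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE articles (category TEXT, body TEXT)');

insert_article($pdo, 'news', "O'Brien's article");
```

Prepared statements handle quoting, but they do not remove unwanted control characters, so filtering the text before the insert is still worth doing.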