

Deleting hidden control chars (\n, \r, \t, etc.) from an RTF stream [FIXED]


#1 tox_yray

  • Members
  • Pip
  • Newbie
  • 7 posts

Posted 11 August 2006 - 01:53 PM


I have a large chunk of code that converts RTF into HTML, and I run into problems with special characters when there is "something" (other than a space) between the special character and the rest of the word. For example, the stream "\'c9cole Polytechnique de Montr\'e9 al \par." appears to have only a space between the last "é" and "al", and between "al" and "\par". However, when I mark the control words, run nl2br() and explode() the string, here is what is printed on screen:

2078: $$!!CONTROL++--++SPACECHARc9
2079: cole
2080: Polytechnique
2081: de
2082: Montr
2083: $$!!CONTROL++--++CHARe9

2085: $$!!CONTROL++--par
As you can see, <br>s were introduced before and after "al", which means there is a \n and/or \r character there that I can't see when I print the stream with echo.

My goal now is to detect those characters to delete them from the RTF stream I have to interpret. How to do it is where I hit a wall.

Any ideas how to do this?
Thanks a lot for any input,
Bruno M-A.

#2 effigy

  • Staff Alumni
  • Advanced Member
  • 3,600 posts
  • LocationIL

Posted 11 August 2006 - 01:58 PM

Pass "\r" and "\n" as the search strings to str_replace().
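
A minimal sketch of that suggestion, assuming the RTF is already loaded into a string. The sample value in $rtf is made up for illustration (it imitates the "Montr\'e9 al \par" fragment from the first post with a hidden CR/LF pair around "al"); it is not the poster's actual data.

```php
<?php
// Hypothetical sample: an RTF fragment with hidden CR/LF pairs around "al".
// Inside double quotes, "\\" is one literal backslash and "\r\n" is CR + LF.
$rtf = "Montr\\'e9\r\nal\r\n\\par";

// Optional: make the hidden characters visible before removing them.
// addcslashes() rewrites the listed characters as C-style escapes.
echo addcslashes($rtf, "\r\n"), PHP_EOL;

// Remove every carriage return and line feed from the stream.
$clean = str_replace(array("\r", "\n"), '', $rtf);

echo $clean, PHP_EOL;
```

Other control characters, such as "\t", can be added to the search array in the same way if they also need to go.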

#3 tox_yray

  • Members
  • Pip
  • Newbie
  • 7 posts

Posted 11 August 2006 - 03:12 PM

Thanks a lot, it works like a charm. However, for future reference, you have to make sure you write your search array like this:

$search = array("\r", "\n");
Escaping the backslash ("\\r") makes the script search for the literal two-character string \r instead of the carriage-return character.
Single quotes have the same effect, because PHP does not interpret \r or \n inside single-quoted strings, so double quotes are required here.
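
To illustrate the quoting difference, here is a small sketch. The variable names $text, $wrong and $right are only for this example; strlen() is used to show what each literal really contains.

```php
<?php
// What each literal actually holds:
var_dump(strlen("\r"));   // int(1): a single carriage-return character
var_dump(strlen('\r'));   // int(2): a backslash followed by the letter r
var_dump(strlen("\\r"));  // int(2): the same two characters as '\r'

$text = "al\r\n";

// Single-quoted search strings never match the control characters,
// so the string comes back unchanged.
$wrong = str_replace(array('\r', '\n'), '', $text);

// Double-quoted search strings match and remove CR and LF.
$right = str_replace(array("\r", "\n"), '', $text);

var_dump(strlen($wrong)); // int(4)
var_dump($right);         // string(2) "al"
```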

Hoping this will help more people around here sometime,
Bruno M-A.
