Parsing RTF: space problems

#1 tox_yray

Posted 12 August 2006 - 03:56 AM


I have a little problem converting a Word-generated RTF document to HTML via PHP:

Here is a part of a test file:
{\insrsid14898988 \par Ceci, il n\rquote y en a }{\insrsid13044712 p}{\insrsid14898988 as.\par }
should output: Ceci, il n'y en a pas.
outputs: Ceci, il n'y en a p as.

Reason: "pas" is split into two groups because it was changed from "Pas" to "pas" between one save and the next, and there is a space between "insrsid" and "as" that ISN'T MEANT to be there.

Now here is another part:
{\insrsid12341801 avec sans barque}{\insrsid6517974 okok}
should output: avec sans barque okok
outputs: avec sans barque okok

Reason: Just normal behavior. The space between "insrsid" and "okok" is MEANT to be there.

Now, my question is: has anyone worked with RTF enough to know a way of telling whether that space should be in the output? Does anyone know how Microsoft Word processes it (that might help me process it myself)?
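
One rule from the RTF specification bears on this: a control word such as \insrsid13044712 ends at its delimiter, and when that delimiter is a single space, the space belongs to the control word and is not document text. Any further spaces after it are ordinary text. Below is a minimal sketch of that rule in Python rather than PHP, purely for illustration. The function name rtf_to_text and the small TEXT_FOR table are hypothetical, and the sketch ignores destinations, \'hh escapes, \uN and everything else a real converter needs.

```python
import re

# A control word: backslash, letters, an optional signed numeric parameter,
# and at most ONE trailing space. That single space is the delimiter and is
# consumed with the control word instead of being emitted as text.
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?")

# Hypothetical, deliberately tiny table of control words that produce text.
TEXT_FOR = {"par": "\n", "rquote": "'", "lquote": "'", "tab": "\t"}


def rtf_to_text(rtf):
    """Return the plain text of a simple RTF fragment (illustrative only)."""
    out = []
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch in "{}":
            # Group braces carry no text of their own.
            i += 1
        elif ch == "\\":
            m = CONTROL_WORD.match(rtf, i)
            if m:
                # Unknown control words (e.g. insrsid) contribute nothing.
                out.append(TEXT_FOR.get(m.group(1), ""))
                i = m.end()  # skips the delimiter space, if there was one
            else:
                # Control symbol: keep escaped braces/backslash, drop the rest.
                if i + 1 < len(rtf) and rtf[i + 1] in "{}\\":
                    out.append(rtf[i + 1])
                i += 2
        elif ch in "\r\n":
            # Raw line breaks in the RTF source are not document text.
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = (r"{\insrsid14898988 \par Ceci, il n\rquote y en a }"
          r"{\insrsid13044712 p}{\insrsid14898988 as.\par }")
print(repr(rtf_to_text(sample)))
```

With the delimiter space consumed, the first test fragment yields "Ceci, il n'y en a pas." between two line breaks. Under this rule, a space reaches the output only when it sits inside a text run (for instance at the end of the previous group) or when a second space follows the delimiter.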

Thanks a whole lot for any input,
Bruno M-A.
