Parsing RTF: space problems

I have a small problem converting a Word-generated RTF document to HTML with PHP.

Here is a part of a test file:
[code]{\insrsid14898988 \par Ceci, il n\rquote y en a }{\insrsid13044712 p}{\insrsid14898988 as.\par }[/code]
should output: Ceci, il n'y en a pas.
outputs: Ceci, il n'y en a p as.

Reason: "pas" is split across two groups because it was changed from "Pas" to "pas" between one save and the next, and there is a space between "insrsid" and "as" that ISN'T MEANT to be there.

Now here is another part:
[code]{\insrsid12341801 avec sans barque}{\insrsid6517974 okok}[/code]
should output: avec sans barque okok
outputs: avec sans barque okok

Reason: just normal behavior. The space between "insrsid" and "okok" is MEANT to be there.

Now, my question is: has anyone worked with RTF enough to know a way of telling whether that space should be in the output? Does anyone know how Microsoft Word processes it (that might help me process it myself)?

Thanks a whole lot for any input,
Bruno M-A.
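
A possible lead, going by the RTF specification: a control word such as \insrsid14898988 ends at its delimiter, and when that delimiter is a single space, the space belongs to the control word and is not document text. Under that rule the space after an \insrsid is never output on its own; a space that is meant to appear has to exist as a real text character, for example a trailing space before the closing brace or a second space after the control word. The sketch below illustrates only that rule. It is written in Python for brevity (the regular expression should port to PHP's preg functions), the names TOKEN, SPECIAL and rtf_to_text are hypothetical, and it ignores hex escapes, Unicode escapes and skippable destinations such as font tables, so it is not a complete converter.

```python
import re

# One alternative per token type:
#   1-2: control word = backslash, letters, optional signed number,
#        then at most ONE space, which is consumed as the delimiter
#   3:   control symbol = backslash followed by a single non-letter
#   4:   group brace
#   5:   run of plain text
TOKEN = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?|\\([^a-zA-Z])|([{}])|([^\\{}]+)")

# Hypothetical, deliberately tiny mapping of control words to output text.
# \rquote is mapped to an ASCII apostrophe for simplicity.
SPECIAL = {"par": "\n", "line": "\n", "tab": "\t", "rquote": "'", "lquote": "'"}


def rtf_to_text(rtf):
    """Extract plain text from an RTF fragment, dropping delimiter spaces."""
    out = []
    for match in TOKEN.finditer(rtf):
        word, _number, symbol, _brace, text = match.groups()
        if word is not None:
            # Unknown control words (e.g. insrsid) produce no output.
            out.append(SPECIAL.get(word, ""))
        elif symbol is not None:
            # Escaped backslash and braces are literal characters;
            # other control symbols are ignored in this sketch.
            if symbol in "\\{}":
                out.append(symbol)
        elif text is not None:
            # Raw line breaks in the RTF source are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
        # Group braces carry no text of their own.
    return "".join(out)


first = r"{\insrsid14898988 \par Ceci, il n\rquote y en a }{\insrsid13044712 p}{\insrsid14898988 as.\par }"
print(repr(rtf_to_text(first)))
```

With this rule the first sample comes out with "pas" in one piece. Fed the second sample exactly as it appears above, the same function joins "barque" and "okok" with no space, which suggests that the source file carries an extra literal space somewhere that the forum's whitespace handling may have hidden; that is a guess and not something the samples here can confirm.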
