
I could use help with RSS/XML, charsets / encodings, and copy/paste from Word.


Kinsbane


So, I've had this problem for quite a while now and after extensively searching Google I haven't found anyone else with this problem who has posted a solution.

 

I'm trying to build a valid RSS feed for my company's different types of press releases. When I view the raw RSS feed in Firefox, the press releases have no line breaks, unlike the way each PR appears on the normal webpage.
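One possible cause, assuming the releases are stored as plain text with newline characters, is that feed readers render the description as HTML and collapse bare newlines. A minimal Python sketch of turning newlines into `<br />` tags before writing an item follows. The function name `build_item` and the sample text are hypothetical and not taken from the original post.

```python
import html
import xml.etree.ElementTree as ET


def build_item(title: str, body: str) -> ET.Element:
    """Build an RSS <item> whose description keeps the body's line breaks."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # Escape the plain text first, then mark each line break with <br />.
    normalized = body.replace("\r\n", "\n")
    body_html = html.escape(normalized).replace("\n", "<br />\n")
    # ElementTree escapes the markup again when serializing, so the feed
    # stays well-formed and readers receive the HTML as escaped text.
    ET.SubElement(item, "description").text = body_html
    return item


item = build_item("Sample release", "Line one\nLine two")
xml_bytes = ET.tostring(item, encoding="utf-8")
```

Serializing through an XML library rather than concatenating strings also means characters such as `&` and `<` in the release text are escaped automatically.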

 

I also ran the RSS feed through the validator and got numerous errors, most of which pertain to illegal characters or entities, like this:

'utf8' codec can't decode byte 0x84 in position 25415: unexpected code byte (maybe a high-bit character?)
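For context, byte 0x84 cannot begin a character in UTF-8, but in Windows-1252 (the encoding Word text typically arrives in on Windows) it is the low double quotation mark „. A minimal Python sketch of one way to handle such input, assuming the stored bytes are either valid UTF-8 or Windows-1252, is shown here. The function name `decode_pasted_text` is hypothetical.

```python
def decode_pasted_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Windows-1252 on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 came from a
        # Windows-1252 source such as text pasted from Word. The few bytes
        # that cp1252 leaves undefined become U+FFFD instead of raising.
        return raw.decode("cp1252", errors="replace")


# The resulting string can then be written to the feed as real UTF-8.
fixed = decode_pasted_text(b"\x84Quoted\x93").encode("utf-8")
```

This is a heuristic rather than true detection: it works because Windows-1252 text containing high-bit characters is very rarely valid UTF-8 by accident.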

 

What functions are available to me to fix this? Keep in mind, these PRs are copied and pasted directly from Word files into the webpage form and then saved to the database. I have asked and asked and asked and asked that our PR firm NOT do this when posting PRs to the website, but my requests get ignored, so I need to be able to do this automatically. What encoding should the database table fields use to keep character encoding consistent at every level?
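As an illustration of the kind of automatic cleanup being asked about, here is a minimal Python sketch that removes characters XML 1.0 does not allow (mostly stray control characters) from text that has already been decoded to Unicode. The name `strip_invalid_xml_chars` is hypothetical, and the sketch assumes decoding has been handled separately.

```python
import re

# Matches any character outside the XML 1.0 "Char" range: tab, newline,
# carriage return, U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Drop characters that may not appear in an XML 1.0 document."""
    return _INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("Press\x00 release\x0b text\n")
```

Typographic characters from Word, such as curly quotes and dashes, are legal in XML and pass through unchanged. They only need to be stored and served in one consistent encoding, for example UTF-8 in the form, the database, and the feed.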

 

Thus far I have not been able to find anything on the web that tells how to deal with text copy/pasted from Word, or how to go about making sure my feed validates. It's as if everyone's got inside knowledge of this except for me, and I honestly don't know where to begin looking for answers.

 

What kind of solutions has everyone else developed? How have you handled character sets and encodings? Do you use UTF-8? ISO-8859-1? Thanks in advance for any advice.

