webhead2 Posted March 23, 2009

Hello, and thanks in advance. This is becoming quite the pesky problem!

I am developing a script that scrapes some HTML from partner websites (I have no control over the formatting). The content gets inserted into a WordPress database.

The problem I am having is when a carriage return is placed in the middle of a tag, like so:

<body
style="color: rgb(0, 0, 0); background-color: rgb(79, 105, 59);" alink="#000099" link="#000099" vlink="#990099">

When displayed in WordPress, it actually prints out "style="color: rgb(0, 0, 0); background-color: rgb(79, 105, 59);" alink="#000099" link="#000099" vlink="#990099">" because of the carriage return.

I've tried this code to remove them:

$body = str_replace(chr(13),' ',$body);
$body = str_replace("\r"," ",$body);
$body = str_replace("\n"," ",$body);

The above does not seem to work. Thanks for looking at my problem!

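For reference, the line-break stripping described above can also be written as a single str_replace call with an array of search strings. This is only a minimal sketch: the function name collapse_line_breaks and the sample input are hypothetical and not taken from the original script.

```php
<?php
// Hypothetical helper (not from the original script): replace every
// CRLF, CR, or LF with a single space.
function collapse_line_breaks(string $html): string
{
    // "\r\n" is listed first so a Windows line ending becomes one space, not two.
    return str_replace(["\r\n", "\r", "\n"], ' ', $html);
}

// Sample input with a line break in the middle of a tag.
$body = "<body\r\nstyle=\"color: rgb(0, 0, 0);\">";
echo collapse_line_breaks($body), PHP_EOL;
```

Note that chr(13) and "\r" are the same character, so the first two calls in the original snippet do the same work.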
Floydian Posted March 23, 2009

Have you tried stripping both the \r and the \n, or just one or the other?

webhead2 Posted March 23, 2009

Floydian wrote: "Have you tried stripping both the \r and the \n, or just one or the other?"

I've tried both, individually and together. It doesn't seem to pick up breaks within HTML tags.

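If the goal is to collapse line breaks only inside tags while leaving the rest of the text alone, one possible approach is preg_replace_callback. The sketch below is an assumption-laden illustration, not the poster's code: the function name collapse_breaks_in_tags is hypothetical, and the simple pattern treats anything between "<" and the next ">" as a tag, so it will misbehave on attribute values that contain ">".

```php
<?php
// Hypothetical helper (not from the thread): collapse line breaks that
// occur inside HTML tags, leaving breaks in the text content untouched.
function collapse_breaks_in_tags(string $html): string
{
    // Naive tag pattern: "<", then anything that is not ">", then ">".
    return preg_replace_callback(
        '/<[^>]*>/',
        function (array $m): string {
            // Within the matched tag, turn each run of line breaks
            // (plus any surrounding whitespace) into a single space.
            return preg_replace('/\s*[\r\n]+\s*/', ' ', $m[0]);
        },
        $html
    );
}

$body = "<body\r\nstyle=\"color: rgb(0, 0, 0);\">\nSome text\n</body>";
echo collapse_breaks_in_tags($body), PHP_EOL;
```

For arbitrary scraped markup, a real HTML parser is likely to be more robust than a regular expression.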
webhead2 Posted March 23, 2009

It seems I had a problem with my crontab firing the old version of the cron script. My code above seems to be working, for now!