strip_tags doesn't really remove all html?


Gutspiller

I'm very new to php. I have the following code:

 

$nohtmltitle = strip_tags($row['title']);

 

Yet when I look at the processed output, I still get crap like &#8217; that represents characters. I either want to TRULY strip out all special characters that are encoded with these short number codes, or I want them converted to what they actually are, so that if $row['title'] is supposed to have ", then when I view the source there is a " and not something like &#8217;

 

Can somebody fix my code line above so that it works like this?


I have other code that cuts the title off at 47 characters. With things like &#8217; it sometimes cuts in the middle of the code, and when that happens it leaves ugly squares in the line of text because the &#8217; didn't get completed. I tried to get help on this problem here ( http://www.phpfreaks.com/forums/index.php?topic=336357.0 ), but for some reason none of the code recommendations worked in my code. Below is what I currently have. If someone could give me one complete block of code to replace the code below, doing all of the same stuff, maybe it would finally run without giving an error. I kinda gave up on that previous thread, since it never seemed to work right, but if someone wants to give me a single group of code that does the same stuff, I'll try it again.

 

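<?php

// Remove HTML tags, then drop every character outside the allowed set.
$nohtmltitle = strip_tags($row['title']);
$newtitle = preg_replace('/[^a-zA-Z0-9\&;\,#.\-\$ ]/', '', $nohtmltitle);

// Convert standalone words of three or more capitals to "Title" case.
// A closure is used here because create_function() was removed in PHP 8.
$nocaptitle = preg_replace_callback(
    '/\b\p{Lu}{3,}\b/u',
    function ($matches) {
        return ucwords(strtolower($matches[0]));
    },
    $newtitle
);

?>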

 

I'm new to php, but I still can't believe there isn't a simple command that just cleans text up, so that it looks decent. I didn't think I was asking for so much: no all caps, no ugly characters, no cutting off in the middle of a word, a limit of 47 characters, yet still smart enough to leave abbreviations like EA alone while changing things like inFAMOUS to Infamous.... I guess that's asking too much for just a quick command or short line of code. Anyway, give it a go if you can. Seeing special characters and broken codes like the ones mentioned above is as annoying as ever.


"I'm new to php, but I still can't believe there isn't a simple command that just cleans text up, so that it looks decent."

Your "cleaned up" isn't the same as everybody else's.

You are right: it is too much to ask for some magic code that does everything you want. Instead, PHP provides a set of functions that you can piece together to get the results you want, such as the html_entity_decode() + preg_replace() I posted.
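For anyone reading along, here is a rough sketch of how html_entity_decode() and preg_replace() might be pieced together. It is not the exact code from the other thread. The function name clean_title() is made up for illustration, the titles are assumed to be UTF-8, and the set of allowed characters is only a guess at what "ugly" means here.

```php
<?php
// Hypothetical helper (not from the original thread).
// Assumes $raw is UTF-8 and may contain numeric entities such as &#8217;.
function clean_title(string $raw): string
{
    // Remove HTML tags first, then turn entities into the real characters.
    $text = strip_tags($raw);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Keep letters, digits, spaces and a little punctuation; drop the rest.
    // The allowed set is an assumption, adjust it to your own rules.
    $text = preg_replace('/[^\p{L}\p{N} .,\'"&$\-\x{2019}]/u', '', $text) ?? '';

    return trim($text);
}

echo clean_title('<b>Devil&#8217;s Third</b> &amp; more'), "\n";
```

Because the entities are decoded before anything else happens, a later length cut cannot land in the middle of a code like &#8217;. When the result is echoed into an HTML page it should still go through htmlspecialchars().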

 

If you want to come up with a list of rules that the text should follow then it's certainly programmable. But you still have a lot of legwork to do: a list of abbreviations to keep, some criteria for determining when someone is SHOUTING SOMETHING versus legitimately capitalizing, rules for what constitutes "ugly characters"...
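To give an idea of what such rules could look like in code, here is a minimal sketch. Everything in it is an assumption made for illustration: the KEEP_UPPER list, the "three or more capitals in a row means shouting" criterion, and the function names calm_caps() and truncate_words(). It also assumes the mbstring extension is available and that entities have already been decoded.

```php
<?php
// Hypothetical whitelist: words that must stay fully upper-case.
const KEEP_UPPER = ['DLC', 'RPG', 'FIFA'];

// Title-case any word containing 3+ consecutive capitals, unless whitelisted.
// Two-letter abbreviations such as EA never match, so they are left alone.
function calm_caps(string $text): string
{
    return preg_replace_callback(
        '/\p{L}*\p{Lu}{3,}\p{L}*/u',
        function (array $m): string {
            if (in_array($m[0], KEEP_UPPER, true)) {
                return $m[0];
            }
            return mb_convert_case($m[0], MB_CASE_TITLE, 'UTF-8');
        },
        $text
    ) ?? $text;
}

// Cut to at most $max characters without splitting a word.
function truncate_words(string $text, int $max = 47): string
{
    if (mb_strlen($text, 'UTF-8') <= $max) {
        return $text;
    }
    // Look one character past the limit so a word ending exactly at it survives.
    $cut = mb_substr($text, 0, $max + 1, 'UTF-8');
    $pos = mb_strrpos($cut, ' ', 0, 'UTF-8');
    if ($pos === false) {
        // No space at all: fall back to a hard cut.
        return mb_substr($text, 0, $max, 'UTF-8');
    }
    return rtrim(mb_substr($cut, 0, $pos, 'UTF-8'));
}

echo calm_caps('inFAMOUS 2 DLC from EA OUT NOW'), "\n";
echo truncate_words('The quick brown fox', 10), "\n";
```

The order matters: clean and decode first, fix the capitals second, and truncate last, so the 47-character limit is counted in real characters.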


Are you asking for me to give you specs and you'll write the code, or were you only pointing out that I think PHP should be awesome automatically?

 

I'll take the time to write it all down, if you are willing to help. :) I do have one question: does it take PHP longer to process, or put more strain on the server, if you simply have code that is similar to this:

 

blah blah code

code that offers a spot for special characters I don't want

blah blah code

code that has spot for matches of things that should always be caps

blah blah code

etc.

 

What I mean is: is there a longer execution time for a single PHP command that strips out special characters, versus an "array" of sorts that simply lets me list all the characters I don't want displayed?


But PHP is awesome automatically...

 

I can help, but if you continue using this thread then you'll probably get this all resolved sooner and with more input from others.

 

There are different ways of approaching slightly different problems, each with advantages (like compactness) and disadvantages (like slow speed). The best thing you can do now is come up with those criteria and rules, and then we can tell you how to implement them.
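As one illustration of that trade-off, the same "characters I don't want" list can drive either a plain str_replace() or a single regular expression. This is only a sketch: the $unwanted list and the sample title are invented. For short titles the speed difference is probably too small to matter, but timing both with microtime(true) on real data would settle it.

```php
<?php
// Hypothetical list of characters to drop from a title.
$unwanted = ['*', '~', '^', '|'];

$title = 'Big ~Sale~ * now | on^';

// Option 1: a plain list, easy to read and extend.
$a = str_replace($unwanted, '', $title);

// Option 2: one regular expression built from the same list.
// preg_quote() escapes characters that are special inside a pattern.
$pattern = '/[' . preg_quote(implode('', $unwanted), '/') . ']/u';
$b = preg_replace($pattern, '', $title);

var_dump($a === $b); // both options remove the same characters
```

The list version is easier to maintain by hand, while the regular expression is more compact once the rule becomes "everything except these characters".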

