[SOLVED] preg_replace() help


ryy705

Hello,

 

I am trying to strip out junk tags from HTML generated by MS Office. It generates code like the following:

<o:SmartTagType
=namespaceuri=3D"urn:schemas-microsoft-com:office:smarttags"
name=3D"place"/>

 

Sometimes there is a line break inside the tag and sometimes there isn't. The following is my preg_replace call:

preg_replace("<\w:.*>", '', $str);

 

It returns:

<
=namespaceuri=3D"ur
name=3D"place"/>

I'm trying my best to learn regex, but it seems as though I should qualify for a four-year degree by the time I finish. Please help. I don't think I can do this by myself. Thank you in advance.


Thanks, it works. What is the purpose of the #?

 

A little more help, please. I need to replace each pair of line breaks with a single line break. So something like the following needs to be replaced by three line breaks.

 

<br />
<br />

<br />
<br />
<br />
<br />
<br />

 

So I need to replace every pair of line breaks with a single line break, as long as they are separated only by a newline, a blank space, or nothing at all. The following is what I'm trying to use:

$str = preg_replace("#<br />*\s$*<br />#", '<br />', $str)

This returns an error saying 'Compilation failed', and no string is returned. Kindly help me out.



No problem. As for your question, the # is a delimiter. It doesn't have to be a # character (most commonly, it is the / character). I suggest you read up on preg expressions and get up to speed on things. It really is worth taking the time to familiarize yourself with regex and learn the basics. While it is the easy route to ask for solutions on these forums, you are still left without really understanding the mechanics of it all. If you learn regex, you'll be far more efficient at solving such problems in the future (I mean no offense by any of this of course).
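To make the delimiter point concrete, here is a minimal sketch. The sample string is copied from the first post; the `#<\w:[^>]*>#` pattern is only my assumption about a workable fix, not necessarily the one suggested earlier in this thread. It also hints at why the original call misbehaved: PHP accepts bracket pairs such as `<` and `>` as delimiters, so in `"<\w:.*>"` the angle brackets were consumed as delimiters and the pattern that actually ran was just `\w:.*`.

```php
<?php
// Sample input modeled on the Office markup quoted in the first post.
$input = "<o:SmartTagType\n=namespaceuri=3D\"urn:schemas-microsoft-com:office:smarttags\"\nname=3D\"place\"/>";

// Original attempt: "<" and ">" act as delimiters here, so the effective
// pattern is \w:.* (a word character, a colon, then the rest of the line).
$broken = preg_replace("<\w:.*>", '', $input);

// With # as the delimiter, the angle brackets are matched literally.
// [^>]* (unlike .*) also crosses line breaks and stops at the first ">".
$fixed = preg_replace('#<\w:[^>]*>#', '', $input);

// $broken reproduces the leftover fragments from the first post;
// $fixed is an empty string because the whole tag was removed.
```

Any character that is not alphanumeric, a backslash, or whitespace can serve as the delimiter; # is handy for HTML because the pattern then needs no escaping of / characters.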

 


If I understand correctly, perhaps this is what you are looking for? (not sure if I got it right or not though).

$str = <<<DATA
<br />
<br />

<br />
<br />
<br />
<br />
<br />
DATA;
$str = preg_replace('#<br />(\r\n|\x20)<br />#', '<br />', $str);
echo $str;

 

You would have to right-click and view the source in your browser to see the replacement. I use \r\n (a carriage return followed by a newline) and \x20 (the hex value for an explicit space).
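A possibly more forgiving variant, offered as a sketch rather than a definitive answer: the pattern above only matches when exactly one \r\n or exactly one space sits between the two tags, so it would miss plain \n line endings, blank lines, or tags with nothing between them. Using `\s*` (zero or more whitespace characters) should cover all of those cases. The sample string below is an assumption standing in for the heredoc, written with \n line endings.

```php
<?php
// Seven <br /> tags separated by newlines, with one blank line after the second tag.
$str = "<br />\n<br />\n\n<br />\n<br />\n<br />\n<br />\n<br />";

// \s* allows any amount of whitespace (including none) between the two tags.
$collapsed = preg_replace('#<br />\s*<br />#', '<br />', $str);

// Matches do not overlap, so three pairs collapse and the seventh tag is
// left without a partner: four <br /> tags remain.
echo $collapsed;
```

If the goal is instead to squeeze a whole run of tags down to a single one, a pattern along the lines of `'#<br />(?:\s*<br />)+#'` with the same replacement would be one way to try it.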

 

Cheers,

 

NRG
