[SOLVED] preg_replace


anon4104

Hi.

 

I have been trying to learn how to use the preg_replace() function but I am getting nowhere. I have already read over the function at php.net but it doesn't help much. I am having trouble understanding the formatting of the function.

 

For example (just some random string I found):

 

$output = preg_replace("/<a href=\"(.*)\">(.*)<\/a>/i", "\\2 (\\1)", $source);

 

I understand the basic preg_replace($pattern, $replacement, $subject). But I am confused when it comes to all the ([^\"]*) and (.*?) and \\1 \\2 and /i garble.

 

I understand that (.*) represents a value which could be a combination of any characters. And that for the $pattern part of the function, you use the / / to declare what you want to find. In this case it's <a href=\"(.*)\">(.*)<\/a> but then what is with the /i after it? And as for what you want it to be replaced with, what does \\2 (\\1) mean? I also know the \ is used to escape characters such as the " or the / which would screw with the function and produce errors.

 

I hate to ask and I know I shouldn't ask for code, but for the sake of trying to understand this, what would $clean be in each example to generate the result $output?

 

Example 1:

$string = "The cat ran up <a href=\"http://www.google.ca\">Google's</a> dead oak tree.";
$clean = preg_replace(???);
$output = "The cat ran up dead oak tree.";

 

Example 2:

$string = "Click to see the <a href=\"http://www.google.ca\"><img src=\"images/cat.jpg\" alt=\"cat\" /></a> cat!";
$clean = preg_replace(???);
$output = "Click to see the <a href=\"http://www.google.ca\">furry</a> cat!";

 

I would greatly appreciate any help or "preg_replace for dummies" links. Thank you.


OK, preg_replace uses something called regex (short for regular expressions), in case you couldn't find any reference material on it. Regex was basically created by the suckers who used to code in binary (not really, lol). I just said that because most of the time it makes no sense, but you get used to it after banging your head against the wall for a few years.

 

I understand that (.*) represents a value which could be a combination of any characters. And that for the $pattern part of the function, you used the / / to declare what you want to find. In this case its <a href=\"(.*)\">(.*)<\/a> but then what is with the /i after it? And as for what you want it to be replaced with, what does \\2 (\\1) mean? I also know the \ is used to escape characters such as the " or the / which would screw with the function and produce errors.

 

To answer some of your questions, let's start with the above block. First off, the i after the closing / is a modifier that makes the match case insensitive, so letter case does not matter when searching. There are several other modifiers, and I still have to look them up when using them even though I've been doing this for a while, so don't feel bad if you can't remember all of regex. You probably never will; that's what reference material was created for.
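
As a quick illustration of the i modifier, here is a minimal sketch. The sample string is made up for the example; only the modifier differs between the two calls.

```php
<?php
// Made-up input with an uppercase tag and attribute name.
$html = '<A HREF="http://www.google.ca">Google</A>';

// Without the i modifier, the lowercase pattern does not match the uppercase tag.
$caseSensitive = preg_match('/<a href=/', $html);

// With the i modifier, letter case is ignored and the pattern matches.
$caseInsensitive = preg_match('/<a href=/i', $html);

var_dump($caseSensitive, $caseInsensitive);
```

preg_match() returns 1 when the pattern matches and 0 when it does not, so the two values differ only because of the modifier.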

 

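As for what \\2 (\\1) means: these are back references. \\1 stands for whatever the first pair of parentheses in the pattern matched and \\2 for whatever the second pair matched, so here the replacement is the link text followed by the URL in parentheses. The backslash is doubled because \\ inside a PHP string produces a single literal backslash, so the regex engine actually sees \2 (\1). Escaping also matters in the pattern itself: when you declare a regex between two / delimiters, every other / inside it needs to be escaped as \/, as do single or double quotes, depending on which kind you wrapped the string in.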
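
To see what \\2 (\\1) actually produces, here is the line from the question run against a sample string. The value of $source is an assumption made up for this sketch; the pattern and replacement are the ones quoted above.

```php
<?php
// Hypothetical input: any string containing a single link would do.
$source = '<a href="http://www.google.ca">Google</a>';

// \\1 is the text captured by the first (.*), \\2 the text captured by the second.
$output = preg_replace("/<a href=\"(.*)\">(.*)<\/a>/i", "\\2 (\\1)", $source);

echo $output; // Google (http://www.google.ca)
```

The link text comes out first and the URL follows in parentheses, because the replacement lists the second capture before the first.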

 

Also, characters like the period, the * and the ? should be escaped with a backslash if you are searching for them literally, because you can think of them as commands: . matches any single character and * means "zero or more of the previous thing", so .* grabs everything, etc. You can also group items with () and []. Using () groups part of the pattern and captures what it matched, so (.*) means grab everything and remember it. Using [] defines a set of allowed characters and matches exactly one of them, so [0-9] matches a single digit from 0 to 9.
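
A small sketch of those points, using made-up strings: a character class, an unescaped dot, and an escaped dot.

```php
<?php
// [0-9] matches exactly one digit, 0 through 9; each digit is replaced separately.
$digits = preg_replace('/[0-9]/', '#', 'room 101');

// An unescaped dot matches any character, so every character is replaced.
$anyChar = preg_replace('/./', '!', 'a.b');

// An escaped dot matches only a literal period.
$literalDot = preg_replace('/\./', '!', 'a.b');

echo $digits, "\n";     // room ###
echo $anyChar, "\n";    // !!!
echo $literalDot, "\n"; // a!b
```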

 

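$string = "The cat ran up <a href=\"http://www.google.ca\">Google's</a> dead oak tree.";
// preg_match() (not preg_replace()) hands back what the pattern matched;
// $matches[0] is the whole match and $matches[1] is the first parenthesized group.
preg_match('/<a\shref=\"(.*)\"/', $string, $matches);
$output = $matches[0];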

 

OK, in the above example the output would be <a href="http://www.google.ca", after which you would clean up the rest by removing the <a href=" and the " like this:

 

$replace = array('<a href="' => '', '"' => '');
$url = strtr($output, $replace); // leaves just http://www.google.ca

 

Now let's examine the actual regex, which looks like this: <a\shref=\"(.*)\". The first part says match the string <a href=". A literal space would also work in the pattern, but \s matches any whitespace character (space, tab, newline), so it is a bit more forgiving. (Again, there are lots of these, like \w, but you need to look at reference materials to understand them all.) The next part is (.*), which captures everything between the two double quotes, which we also escape by placing a \ before them.
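
Here is a minimal sketch of how \s and a literal space behave, and of what the pattern above captures. The two input strings are assumptions made up for the comparison.

```php
<?php
// Two made-up inputs: one with a normal space, one with a tab after "<a".
$withSpace = '<a href="http://www.google.ca">';
$withTab   = "<a\thref=\"http://www.google.ca\">";

$spaceOnSpace = preg_match('/<a href=/', $withSpace); // a literal space matches a space
$spaceOnTab   = preg_match('/<a href=/', $withTab);   // a literal space does not match a tab
$classOnTab   = preg_match('/<a\shref=/', $withTab);  // \s matches any whitespace character

// The pattern discussed above: $m[0] is the whole match, $m[1] the captured URL.
preg_match('/<a\shref=\"(.*)\"/', $withSpace, $m);

echo $m[0], "\n"; // <a href="http://www.google.ca"
echo $m[1], "\n"; // http://www.google.ca
```

Since $m[1] already holds just the URL, reading the capture group directly is an alternative to cleaning up the whole match afterwards.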

 

I'll stop there. This little explanation may or may not have helped you, and I've referred to reference materials a lot in this description, but a simple search on Google for Regex, Regex Tutorials, or PHP Regex will bring up a lot more than I can ever offer on the subject.

 

Just remember that you don't have to learn it all because it really is a complicated mess but eventually you'll remember a lot of common things and you'll be able to do almost anything with Regex.


Well, I'm not first, but here are my explanations ;)

 

A character class (square brackets) starting with a ^ means "match any character except these", so [^a123] will match any character except "a", "1", "2" and "3".
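
A short sketch of a negated character class in action. The input strings are made up; the second call shows why ([^"]*) is a common way to capture an attribute value.

```php
<?php
// Remove every character that is NOT "a", "1", "2" or "3".
$kept = preg_replace('/[^a123]/', '', 'abc123xyz');

// [^"]* means "any run of characters that are not a double quote",
// so the capture stops at the first closing quote.
preg_match('/href="([^"]*)"/', '<a href="http://www.google.ca" title="Search">', $m);

echo $kept, "\n"; // a123
echo $m[1], "\n"; // http://www.google.ca
```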

 

(.*?) is often used, and roughly means "match anything the lazy way". To explain what is meant by lazy: In "<strong>(.*?)</strong>", the parenthesis will match anything up until the FIRST </strong> tag. Without the question mark, it would be greedy, and match anything up until the LAST </strong> tag.
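
The difference between lazy and greedy matching is easiest to see on a string with two tags. This is just a sketch with a made-up input, wrapping each match in square brackets.

```php
<?php
$html = '<strong>one</strong> and <strong>two</strong>';

// Lazy: each match stops at the FIRST closing tag, so both tags are handled separately.
$lazy = preg_replace('/<strong>(.*?)<\/strong>/', '[\\1]', $html);

// Greedy: a single match runs from the first opening tag to the LAST closing tag.
$greedy = preg_replace('/<strong>(.*)<\/strong>/', '[\\1]', $html);

echo $lazy, "\n";   // [one] and [two]
echo $greedy, "\n"; // [one</strong> and <strong>two]
```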

 

\\n, where n is a number, is a back reference to the n-th parenthesized match. So in "<a href=\"(.*?)\">(.*?)</a>", \\1 is the URL and \\2 is the link text.
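
Putting lazy matching, negated character classes and back references together, one possible answer to the two examples in the opening post might look like the sketch below. The patterns are my own suggestion, not the only way to do it, and they assume the link has exactly the href="..." form shown in the question.

```php
<?php
// Example 1: drop the whole link, its text, and the whitespace that follows it.
$string1 = "The cat ran up <a href=\"http://www.google.ca\">Google's</a> dead oak tree.";
$clean1 = preg_replace('/<a href="[^"]*">.*?<\/a>\s*/i', '', $string1);

// Example 2: keep the opening and closing tags (\\1 and \\2), swap what is between them.
$string2 = "Click to see the <a href=\"http://www.google.ca\"><img src=\"images/cat.jpg\" alt=\"cat\" /></a> cat!";
$clean2 = preg_replace('/(<a href="[^"]*">).*?(<\/a>)/i', '\\1furry\\2', $string2);

echo $clean1, "\n"; // The cat ran up dead oak tree.
echo $clean2, "\n"; // Click to see the <a href="http://www.google.ca">furry</a> cat!
```

Wrapping the patterns in single quotes means the double quotes inside them do not need escaping.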

 

A letter after the closing pattern delimiter (delimiters can be %, /, |, and lots of others) is a pattern modifier. The "i" modifier makes the search case IN-sensitive. Other modifiers: http://php.net/manual/en/reference.pcre.pattern.modifiers.php
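
A minimal sketch of delimiters and modifiers, with made-up inputs. It shows that choosing % as the delimiter saves escaping the / in closing tags, and what the s modifier changes.

```php
<?php
$html = 'x</A>y';

// Same pattern with two different delimiters; with % the slash needs no escaping.
$slashDelim   = preg_replace('/<\/a>/i', '', $html);
$percentDelim = preg_replace('%</a>%i', '', $html);

// By default the dot does not match a newline; the s modifier lets it.
$multiLine = "<b>line one\nline two</b>";
$withoutS = preg_match('%<b>(.*?)</b>%', $multiLine);
$withS    = preg_match('%<b>(.*?)</b>%s', $multiLine);

var_dump($slashDelim, $percentDelim, $withoutS, $withS);
```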

 

A good resource for RegEx: http://www.regular-expressions.info/


Ok preg_replace uses something called Regex in case you couldn't find any reference material on it, Regex was basically created by the suckers who used to code in Binary (Not Really lol) I just said that because most of the time it makes no sense but you get used to it after banging your head against the walls for a few years.

 

I am still laughing at that. lol. Man that was good. I needed the laugh. It's now 6:08 AM here and I have yet to go to bed. I feared for the hair on my head, my hand steadily hovering over in a plucking gesture, as I futilely searched Google on this subject. Simply putting a term or definition to all this stuff (Regex) has helped me MORE than you can possibly know. It's like trying to describe to a blind man (from birth) what the color blue looks like.

 

Then to take it even further and break it down a lot more is even a million times more helpful. Mind you, there were some typos in there which confused me a bit at times, but overall... my god. THANK YOU.

 

The same goes for you thebadbad. Your examples and links will help a lot!

 

I just cannot express how .. AHHH you know? lol. Anywho. Thank you both very much. I really appreciate you taking the time to help explain everything. I think now is a good time to get some rest and re-read this topic, follow those links and do some more research tomorrow... err. Well. Later today I should say.

 

Once again, THANK you both very much.

 

Take care.


Okay, this is driving me bonkers!

 

I have a string.

 

$string = 'Test image <a href="http://domain.com"><img src="http://domain.com/images/test.jpg" alt="test"></a> is a test.';

 

I have this line:

 

$clean = preg_replace('%(<a href=\"(.*?)\">(.*?)</a>)%sim', '', $string);

 

I want the final output to be:

 

Test image is a test.

 

Yet it does not seem to be working. I am using the outer parentheses to tell it to match the entire thing. I also use the s, i and m modifiers to make it case insensitive and to put it in single-line or multi-line mode (which, if I understand correctly, is in case the match contains a line break or something, correct?). I am unsure what the percentage symbols are for. (I am working from some example I found--they were in it, so I figured I best include them too.)

 

So, what am I doing wrong?

 

(Kill me now. For the love of humanity. Kill me now :()

 

Thanks.

Take care.


I would do this another way, like so:

 

$string = 'Test image <a href="http://domain.com"><img src="http://domain.com/images/test.jpg" alt="test"></a> is a test.';

$clean = strip_tags($string);

echo $clean;

 

I know you're trying to hack it out with Regex or learning, but sometimes it's easier not to do it with Regex.
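
One detail worth noting: strip_tags() removes the tags but leaves the spaces that were on either side of them. A small sketch, assuming you also want to collapse the leftover double space (the whitespace pattern is my own addition):

```php
<?php
$string = 'Test image <a href="http://domain.com"><img src="http://domain.com/images/test.jpg" alt="test"></a> is a test.';

// strip_tags() removes the <a> and <img> tags but keeps the spaces around them.
$noTags = strip_tags($string);

// Collapse any run of whitespace into a single space.
$clean = preg_replace('/\s+/', ' ', $noTags);

echo $clean; // Test image is a test.
```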


That was actually the first thing I tried (which doesn't work) as it won't show the rest of the text after where the image is. Which is why I resorted to attempting to hack my way through Regex and preg_replace without actually taking a good long proper amount of time to learn it. Mind you, I will keep reading and practicing and writing until I get the hang of it, but it will definitely take more than a day.

 

EDIT: Jeesh. I just realized after I posted this why it wasn't working before. Aside from cleaning tags and other formatting from the string, I also cut it down to 100 characters in length, as the string is just a preview of a longer string. Anyway, the problem is (was) that I had invoked strip_tags() AFTER substr(). So while it never showed the HTML, the HTML still counted towards the 100 characters. Blasted. To think, after all this time, all I had to do was reverse the lines.

 

while ($row = mysql_fetch_array($result)) {
    $date = $row['date'];
    $max_length = 100;
    $message = $row['post_content'];

    // Remove [youtube]...[/youtube] blocks, then strip the HTML, and only then
    // truncate, so that markup never counts towards the preview length.
    $clean_youtube = preg_replace('%(\[youtube\](.*?)\[/youtube\])%sim', '', $message);
    $clean_html = strip_tags($clean_youtube);
    $stripped = substr($clean_html, 0, $max_length);

    echo '<div class="blog-preview">';
    echo '<small><a href="blog/?p='.$row['id'].'">'.$row['post_title'].'</a> - '.$date.'</small><br />';
    echo '<small>'.$stripped.'...</small>';
    echo '</div>';
}
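
To make the ordering problem concrete, here is a tiny sketch with a made-up string and a preview length of 9 characters instead of 100.

```php
<?php
$html = '<b>Hi</b> there';
$max_length = 9;

// Wrong order: the tags use up the 9 characters before they are stripped.
$truncateFirst = strip_tags(substr($html, 0, $max_length));

// Right order: strip the tags first, then truncate the visible text.
$stripFirst = substr(strip_tags($html), 0, $max_length);

echo $truncateFirst, "\n"; // Hi
echo $stripFirst, "\n";    // Hi there
```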

 

Either way, I still appreciate your help.

 

Take care.
