Trying to parse quotes out of a block of text.

For example, someone leaves a comment such as

 

some text some text some text some text
[quote=username] some quote [/quote]
some textsome text

 

So what I need to do is get

 [quote=username] some quote [/quote]

out of there.

 

Note: 0's are used instead of o's inside the quote tag from here on, so the forum doesn't parse it.

 

Catch 1:

=username is optional. It could be just [qu0te] [/qu0te], which means

[qu0te] some quote [/qu0te] is valid, as well as [qu0te=username] some quote [/qu0te]

 

Catch 2:

If =username is used, is it possible to get what sits between = and ] (which would be the username), and then perhaps store it in a variable so I can use it later?

 

Catch 3:

I then need to get the text between [qu0te] and [/qu0te], and perhaps store it somewhere. Is that possible?

 

 

 

It's 3am, sorry if this isn't very clear. If anyone needs clarification, let me know.

 

 

I've only been able to come up with \[qu0te[^]]+\], which matches just the opening tag [qu0te=username]. So that isn't good enough.


preg_match_all('~\[quote(?:=([^\]]+))?\](.*?)\[/quote\]~is', $str, $matches);

If you have any rules for how the username can look, you can change

 

[^\]]+

 

to the appropriate regex (just ask if you need help). Usernames, if provided, will be stored in the $matches[1] array, and quote contents in the $matches[2] array. Note that this won't work with nested quotes.
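To make the layout of $matches concrete, here is a minimal sketch. The sample string and the username rule [\w-]+ (letters, digits, underscore, hyphen) are assumptions for illustration only; swap in whatever rule your usernames actually follow.

```php
<?php
// Hypothetical sample input; [\w-]+ is an assumed username rule, not a requirement.
$str = "intro [quote=some_user] first [/quote] middle [quote] second [/quote] end";

preg_match_all('~\[quote(?:=([\w-]+))?\](.*?)\[/quote\]~is', $str, $matches);

// With the default PREG_PATTERN_ORDER, $matches[1] and $matches[2] are parallel arrays:
// index 0 in both belongs to the first quote, index 1 to the second, and so on.
foreach ($matches[2] as $i => $text) {
    $user = $matches[1][$i]; // empty string when no =username was given
    echo ($user === '' ? '(no username)' : $user), ': ', trim($text), "\n";
}
```

So the username and the text of the same quote always share an index across the two arrays.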

While thebadbad's solution works, I think I would make some changes to better organize the data and to handle the case where a username isn't present (which may or may not be desirable). For the following example, I've amended the OP's sample text to include a second quote, this one without a username:

 

<?php
$str = <<<EOF
some text some text some text some text
[quote=username] some quote [/quOte]
some textsome text
[quote]yet another quote[/quote]
EOF;

// note: ] needs no escaping when it is the first character of a character class (right after the ^ here)
preg_match_all('#\[quote(?:=([^]]+))?\](.+?)\[/quote\]#is', $str, $matches, PREG_SET_ORDER);
$count = count($matches);
for ($a = 0; $a < $count ; $a++) {
    array_shift($matches[$a]); // ditch array element 0
    $matches[$a][0] = (empty($matches[$a][0]))? 'No Username': $matches[$a][0]; // set No Username if none is provided
    $matches[$a][1] = trim($matches[$a][1]); // trim off any initial / trailing spaces from quote text
}
echo '<pre>'.print_r($matches, true);
?>

 

Output:

Array
(
    [0] => Array
        (
            [0] => username
            [1] => some quote
        )

    [1] => Array
        (
            [0] => No Username
            [1] => yet another quote
        )
)

 

When a capture sits inside an optional group and that part is absent from the source string, we get a 'blank' entry for that capture in the corresponding array element. There are a couple of options here. We can leave the entry empty (by empty I mean 'blank', as in an empty string, not whatever PHP's empty() happens to evaluate as false) and simply check for that when looping through the array, or we can assign something to it. Personally, I'm not a fan of array elements that don't contain text, so I like to set those to something, but this is admittedly more a matter of taste.
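For the "leave it blank and check later" route, a rough sketch follows. The input string, the $quotes array and its 'user' / 'text' keys are all made up for illustration. It compares against '' rather than calling empty(), on the assumption that a username such as "0" should still count as a username.

```php
<?php
// Hypothetical input: one quote without a username, one whose username is "0".
$str = "[quote]anonymous[/quote] [quote=0]from user zero[/quote]";

preg_match_all('#\[quote(?:=([^]]+))?\](.+?)\[/quote\]#is', $str, $sets, PREG_SET_ORDER);

$quotes = [];
foreach ($sets as $set) {
    // $set[1] is '' when the optional =username group did not take part in the match.
    // Comparing against '' keeps a username of "0", which empty() would treat as missing.
    $quotes[] = [
        'user' => $set[1] === '' ? null : $set[1],
        'text' => trim($set[2]),
    ];
}
print_r($quotes);
```

Whether you store null, an empty string or a placeholder like 'No Username' is the same matter of taste; the only real difference is how the check is written.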

 

With the use of the PREG_SET_ORDER flag, we get all the related matches / captures grouped into each root array element in a nice and neat fashion.

This ought to be in the Code Repository when you're done.

 

Speaking of which, zanus, I was looking through that repository the other day (and of course I am referring to the regex repository, a.k.a. common expressions, not the main code one). I know the code snippets in there are old (from 2006, if I'm not mistaken). I think it might serve us well to do a serious overhaul in there. There are knowledgeable members here who can contribute modern, revamped and well thought out regex solutions.

 

Granted, only mods+ have access to this section (as it's locked). I know there were discussions a while back between guru+ members about having updated sections for code snippets and whatnot. Perhaps we can have a new subforum where members can pitch their solutions to common problems (perhaps called 'Solutions In Progress' or something to that effect, some title to denote that the contents are works in progress up for member discussion and evaluation). Other forum members would give feedback (as the OP's solution may not be as good as it can be), and once approved by the members at large, the snippet would be inserted into the common expressions section, so we know the content there is correct and up to date.

 

What are your thoughts?

Oh, I wasn't necessarily suggesting unlocking it. I meant more of a process of code submission, discussion / evaluation, then ultimately approval to be put into the common expressions section. Now if mods / admins don't want to be the sole members responsible for submitting the final code snippet posts, then sure, unlocking it would remove this responsibility. But it also potentially opens the door to just any member posting stuff, which might not be wise if the snippet being posted is not as good as it can be (in other words, it would circumvent the entire discussion / approval process outright).

 

So I suppose it all boils down to how the staff would like this structured ultimately (if they even want to bother with any of this at all to begin with). I was thinking more along the lines of somehow ensuring that the snippets that do make it into the repository are tried, tested, evaluated for their best possible solutions prior to being added.. this way, everyone knows it's reputable enough and pretty solid.

  • 2 weeks later...

Although I am having a tough time figuring out how to use this.

 

Basically I am trying to accomplish the same thing this forum does when you quote somebody.

 

It puts the quoted text in a darker div, with the username of the person you're quoting on top of that box.

 

How can I accomplish this?

Not sure if I should split the string where it matches and replace the matches, or just use a replace function... kinda lost here.

preg_replace should be able to do the job for you without too much hassle. Having said this, I'm not sure nrg_alpha's (perfectly elegant) pattern would be the way to go if you're doing this. You might be better off with multiple simpler matches. The reason I say this is that I'm not sure it's possible to do a conditional replacement. What I mean by this is that nrg_alpha's pattern matches both of the forms...

 

[qu0te] some quote [/qu0te]

 

and

 

[qu0te=username] some quote [/qu0te]

 

Depending on your choice of design, you could be replacing them with, let's say, blockquote tags for the quote part and a span for the username if required. Using the aforementioned pattern, with a replacement of...

 

"<blockquote><span>$1</span>$2</blockquote>";

 

So your output would look something like...

 

<blockquote>
<span>username</span>
some quote
</blockquote>

 

If no username is found you would have a pair of empty <span> tags, which you might not want. Another thing to take into account is if you wish to handle nested quotes, which I'd imagine you quite likely do. Prepare for headaches :)
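One possible way around the empty <span>, offered as a sketch rather than the definitive answer, is preg_replace_callback, which lets the replacement be decided per match. The function name render_quotes is made up, and the htmlspecialchars() calls assume the comment text has not already been escaped elsewhere.

```php
<?php
// Hypothetical helper name; the markup mirrors the blockquote/span idea above.
function render_quotes(string $str): string
{
    $result = preg_replace_callback(
        '#\[quote(?:=([^]]+))?\](.+?)\[/quote\]#is',
        function (array $m): string {
            // Only emit the <span> when a username was actually captured.
            $user = $m[1] !== ''
                ? '<span>' . htmlspecialchars($m[1]) . '</span>'
                : '';
            return '<blockquote>' . $user . htmlspecialchars(trim($m[2])) . '</blockquote>';
        },
        $str
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $str;
}

echo render_quotes("[quote=username] some quote [/quote]\n[quote]another[/quote]");
```

Like the pattern it borrows, this still does not cope with nested quotes.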

 

 

"Another thing to take into account is if you wish to handle nested quotes, which I'd imagine you quite likely do. Prepare for headaches :)"

 

You can quite easily support nested tags by dealing with the opening and closing tags separately. Relevant posts: http://www.phpfreaks.com/forums/index.php/topic,205904.msg934306.html#msg934306 and http://www.phpfreaks.com/forums/index.php/topic,266663.msg1259603.html#msg1259603 (seeing if the tags match up)
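A rough sketch of the separate opening / closing idea, not taken from the linked posts: the function name render_nested_quotes is made up, the balance test only compares tag counts (it does not verify ordering), and the input is assumed to have been HTML-escaped already.

```php
<?php
// Hypothetical helper: converts the tags only when opening and closing counts agree.
function render_nested_quotes(string $str): string
{
    $open  = '#\[quote(?:=([^]]+))?\]#i';
    $close = '#\[/quote\]#i';

    // Crude balance check: equal counts only, not a full check of tag order.
    if (preg_match_all($open, $str) !== preg_match_all($close, $str)) {
        return $str; // leave unbalanced input untouched
    }

    // Opening tags: the username group is the last group in this pattern, so it is
    // simply absent from $m when no =username was given, hence the isset() check.
    $converted = preg_replace_callback($open, function (array $m): string {
        $user = isset($m[1]) && $m[1] !== '' ? '<span>' . $m[1] . '</span>' : '';
        return '<blockquote>' . $user;
    }, $str) ?? $str;

    // Closing tags are a plain one-to-one replacement.
    return preg_replace($close, '</blockquote>', $converted) ?? $converted;
}

echo render_nested_quotes('[quote=a]outer [quote=b]inner[/quote] tail[/quote]');
```

Because each tag is replaced on its own, the nesting depth never matters to the regex; the price is that mismatched tags have to be caught by a separate check.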
