
Need regex help


asmith


Hello.

I have a string like this:

$content = 'word1 word2 "word3 word4" word5 "word6 word7"';

I need an array of the individual words, like explode(' ', $content) gives, but with the words inside double quotes treated as a single element:

array('word1', 'word2', 'word3 word4', 'word5', 'word6 word7');

Any ideas?

Thanks for your time

So far I've tried something like this:

/("((?.+)(?<!"))+?)"| )/  


Here's one way (of many!):

$words = preg_split('/\h+|"([^"]++)"/', $content, NULL, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
var_dump($words);

The logic is that we either split on whitespace or capture the phrase inside the quotes. That dual-purpose style (splitting while also capturing) can take a while to get used to, but it is very useful.
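
To make the split-and-capture idea concrete, here is a minimal sketch that wraps the same pattern and flags in a function. The name split_quoted is hypothetical (it is not from the thread), and -1 is passed as the limit, which means "no limit".

```php
<?php
// Hypothetical helper name; the pattern and flags are the ones from the post above.
function split_quoted(string $content): array
{
    // \h+         : whitespace separators, matched and thrown away.
    // "([^"]++)"  : a quoted phrase; PREG_SPLIT_DELIM_CAPTURE keeps the
    //               contents of the capture group in the result.
    // PREG_SPLIT_NO_EMPTY drops the empty pieces that appear between
    // two adjacent delimiters (for example a space followed by a quote).
    $words = preg_split(
        '/\h+|"([^"]++)"/',
        $content,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    // preg_split() returns false on failure; fall back to an empty array.
    return $words === false ? [] : $words;
}

$words = split_quoted('word1 word2 "word3 word4" word5 "word6 word7"');
print_r($words);
// Elements: word1, word2, "word3 word4" (without the quotes), word5, "word6 word7"
```

Without PREG_SPLIT_NO_EMPTY the result would also contain empty strings wherever a space sits directly next to a quoted phrase, so the two flags really do work as a pair here.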


@salathe

I'm not sure the possessive quantifier is necessary (put differently, a regular + quantifier yields the same results). Is there a special circumstance that would require ++ in this case? I also find the choice of horizontal whitespace odd (\s would suffice as well, no?).

I wasn't aware you could specify NULL for the count parameter (I'm used to simply using -1 instead).


Regarding the possessive quantifier, that's actually a typo, but either way will work. I guess on the off-chance that there is an opening double quote without a closing one, the possessive quantifier might come in handy by avoiding pointless backtracking, but truth be told, writing it was purely accidental. So it isn't required, but it wouldn't hurt to use it.

As for the \h: if there's only ever going to be a single space character, use that. I don't see why using \h is any more "odd" than using \s. Perhaps its use is less common, but that's no reason not to use something.
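
For anyone weighing \h against \s, a small sketch of where the two differ: \h matches only horizontal whitespace (spaces, tabs), while \s also matches newlines. The sample string below is an assumption chosen for illustration, not input from the thread.

```php
<?php
// Assumed sample input: a tab between "a" and "b", a newline between "b" and "c".
$text = "a\tb\nc";

// \h+ splits on the tab only; the newline stays inside the second element.
$horizontal = preg_split('/\h+/', $text);

// \s+ splits on both the tab and the newline.
$any = preg_split('/\s+/', $text);

var_dump($horizontal); // two elements: "a" and "b\nc"
var_dump($any);        // three elements: "a", "b", "c"
```

For the single-line string in the original question both classes behave the same, so the choice only matters if the input can contain line breaks.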

 

As for using NULL instead of 0 or -1 for the limit parameter (not count), that's just personal preference. NULL, 0 and -1 all end up as -1 eventually, and I like to use NULL in place of "no value", so in this case using NULL is fine (in my opinion). Also, a general convention for "skipping" parameters in order to reach later optional ones is to use NULL; as stated on the preg_split manual page, "use null [for the limit parameter] to skip to the flags parameter".
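
A quick sketch for checking the limit behaviour yourself. It compares -1 and 0, and also shows a named argument as another way to skip limit; that last call assumes PHP 8.0 or later. As far as I know, PHP 8.1 and later also emit a deprecation notice when null is passed to a non-nullable built-in parameter such as limit, so -1 or the named argument is probably the safer choice on current versions. The shortened input string is an assumption for brevity.

```php
<?php
$content = 'word1 word2 "word3 word4"';
$pattern = '/\h+|"([^"]+)"/';
$flags   = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;

// -1 and 0 both mean "no limit".
$withMinusOne = preg_split($pattern, $content, -1, $flags);
$withZero     = preg_split($pattern, $content, 0, $flags);

// Assumes PHP 8.0+: a named argument skips limit and keeps its default (-1).
$withNamed = preg_split($pattern, $content, flags: $flags);

var_dump($withMinusOne === $withZero);  // bool(true)
var_dump($withMinusOne === $withNamed); // bool(true)
```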
