Split by comma problem


hlstriker

I have a string that is in format like so:

 

head:body,head:body[0,1,2],head:body[x,y,z],head:body

 

I'm trying to split the string by the commas so it will be:

head:body

head:body[0,1,2]

head:body[x,y,z]

head:body

 

But what I have is splitting like this:

head:body

head:body[0

1

2]

head:body[x

y

z]

head:body

 

Does anyone know how I can filter out the commas inside of [] brackets?


How about something like...

 

$subject = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
$pattern = "#(?=[^\]]*(?:\[|$))\s*,\s*#s";
$fields = preg_split($pattern, $subject);

 

Edit: It's actually a pattern I wrote for somebody else. If there are no spaces in your string, you could remove the \s* parts of the pattern.
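
For anyone following along, here is a minimal sketch of how the lookahead in that pattern behaves on the sample string. It assumes there are no spaces around the commas, so the \s* parts and the s modifier are dropped; the variable names are only illustrative.

```php
<?php
// Sample string from the first post.
$subject = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";

// The lookahead succeeds only if, scanning forward from the comma, a "["
// or the end of the string can be reached without passing a "]".
// A comma inside a [] pair always hits "]" first, so it is not split on.
$pattern = '#(?=[^\]]*(?:\[|$)),#';

$fields = preg_split($pattern, $subject);
print_r($fields);
// Elements: head:body | head:body[0,1,2] | head:body[x,y,z] | head:body
```

Note that this only looks at the nearest bracket, so it is a sketch for flat (non-nested) bracket lists.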


Granted CV, your solution will not yield the results the OP is looking for. It would only split by comma if there is a ] before it (thus it won't split parts like the initial comma between head:body and head:body[0,1,2]).

 

My take on this:

$string = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
$string = preg_split('#,(?![\da-z][,\]])#',$string); // if there might be capital letters, add the i modifier after closing delimiter



This solution seems to work for the most part, but for some reason it still splits on commas inside of [] brackets sometimes. I tried to figure out why, but I really don't know much about regex as it is.


The "some reason" is probably that you have a wider range of input than you've cared to mention so far. Please give some examples of when the solution does not work; my bet would be that the comma-separated-values-within-square-braces can be longer than a single character.

 

P.S. Have you tried all of the other "solutions" or just nrg_alpha's?
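
To make that guess concrete, here is a rough sketch comparing nrg_alpha's pattern on single-character and multi-character values. The [one,two] input is a made-up example for illustration, not something taken from the OP's data.

```php
<?php
// nrg_alpha's pattern: split on a comma unless it is followed by exactly
// one digit or lowercase letter and then a "," or "]".
$pattern = '#,(?![\da-z][,\]])#';

// Single-character values: after each inner comma comes "1," or "2]",
// so the negative lookahead blocks the split.
$short = preg_split($pattern, "head:body[0,1,2],head:body");

// Hypothetical multi-character values: after the inner comma comes "tw",
// which is not "one character, then , or ]", so the comma is split on.
$long = preg_split($pattern, "head:body[one,two],head:body");

print_r($short); // head:body[0,1,2] | head:body
print_r($long);  // head:body[one | two] | head:body
```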


my bet would be the comma-separated-values-within-square-braces can be longer than a single character.

 

This is true, sorry for my bad example using [x,y,z].

 

P.S. Have you tried all of the other "solutions" or just nrg_alpha's?

 

Yes, they didn't seem to work with the brackets at all.


At least mine (I didn't test the others) worked with the brackets in the samples provided in this thread: so it should work at least a bit.  Again, as with my last post, please give us more representative samples of the input that our regular expressions will need to work with, including ones that "work" and "don't work" at the moment.

 


I tested cags' solution against the sample string, and it works fine. Cags, note that your pattern uses the s modifier. However, since your pattern doesn't contain the dot metacharacter (the only thing the s, or dot-all, modifier affects), that modifier has no effect and isn't necessary.

 

@hlstriker

You need to understand that regex is a very terse system. If the samples vary in formatting, we need to know this, so that a more accurate pattern may be crafted to accomplish what you seek. Otherwise, you'll be given explicit solutions to explicit problems that are rigid in format. So as salathe suggests, you should provide more sample strings.


Can I ask what was wrong with the solution I suggested?

 

Just double tested your solution and it works fine for the example I posted.

 

Sometimes it will be in this format too though (which is what was throwing my tests off):

 

head:body,head:body[[one,two],[one,two],[one,two]],head:body[one,two,three],head:body

 

Sorry about the complications.


However, since your pattern doesn't make use of the dot metacharacter, that modifier has no effect (thus not necessary).

Oops, good point.

 


That being the case, salathe's solution of '#(?=head:)#' seems like your best fit. Assuming, of course, you wish "head:body[[one,two],[one,two],[one,two]]" to be one part (which I assume you do, as that's what mine splits up).
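
For the nested format, here is a minimal sketch of two ways to get that result. The first is a variant of the lookahead idea with a leading comma added so the separator is consumed (my assumption; the pattern as quoted is zero-width), and it only works if every field really starts with the literal text "head:". The second is a hypothetical helper, splitTopLevel(), that does not use a regex at all: it tracks bracket depth and assumes the brackets are balanced.

```php
<?php
/**
 * Split $input on commas that are outside every [] pair.
 * Hypothetical helper for illustration; assumes balanced brackets.
 */
function splitTopLevel(string $input): array
{
    $parts = [];
    $current = '';
    $depth = 0; // how many "[" are currently open

    foreach (str_split($input) as $char) {
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']') {
            $depth--;
        }

        if ($char === ',' && $depth === 0) {
            // Top-level comma: close the current field and start a new one.
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current; // last field has no trailing comma

    return $parts;
}

$nested = "head:body,head:body[[one,two],[one,two],[one,two]],head:body[one,two,three],head:body";

// Lookahead variant: split on a comma only when "head:" follows it.
$byPrefix = preg_split('#,(?=head:)#', $nested);

// Depth-counting variant: independent of what the fields start with.
$byDepth = splitTopLevel($nested);

print_r($byDepth);
// Elements: head:body | head:body[[one,two],[one,two],[one,two]]
//           | head:body[one,two,three] | head:body
```

Both give the same four parts for this input. The depth counter is the safer choice if "head" is only a placeholder for varying names.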

