hlstriker Posted December 29, 2009

I have a string in a format like this:

    head:body,head:body[0,1,2],head:body[x,y,z],head:body

I'm trying to split the string on the commas so that it becomes:

    head:body
    head:body[0,1,2]
    head:body[x,y,z]
    head:body

But what I have splits it like this:

    head:body
    head:body[0
    1
    2]
    head:body[x
    y
    z]
    head:body

Does anyone know how I can ignore the commas inside the [] brackets?

cags Posted December 29, 2009

How about something like...

    $subject = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
    $pattern = "#(?=[^\]]*(?:\[|$))\s*,\s*#s";
    $fields = preg_split($pattern, $subject);

Edit: It's actually a pattern I wrote for somebody else. If there are no spaces in your string, you could remove the \s* parts of the pattern.

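For readers following along, here is a minimal sketch of how cags' pattern behaves on the sample string from the question. The lookahead only lets a comma match if, scanning forward, a "[" or the end of the string is reached before any "]". The s modifier is left off here because the pattern contains no dot; the variable names are just the ones used in the post.

```php
<?php
// Sample input from the original question.
$subject = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";

// Split on a comma only when a "[" or the end of the string comes
// before the next "]", i.e. the comma is not inside a bracket group.
// The optional \s* parts swallow whitespace around the comma.
$pattern = '#(?=[^\]]*(?:\[|$))\s*,\s*#';

$fields = preg_split($pattern, $subject);

// Prints four fields; the commas inside [0,1,2] and [x,y,z] are kept.
print_r($fields);
```

This assumes a single level of brackets. With nested brackets, a comma between two inner groups is followed by a "[" before any "]", so the pattern would split there as well.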
.josh Posted December 29, 2009

my approach

    $string = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
    $string = preg_split('~(?<=\]),~',$string);

edit: hmm, I obviously didn't consider fields with no brackets...

nrg_alpha Posted December 29, 2009

Granted, CV, your solution will not yield the results the OP is looking for. It only splits on a comma if there is a ] before it (thus it won't split parts like the initial comma between head:body and head:body[0,1,2]).

My take on this:

    $string = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
    $string = preg_split('#,(?![\da-z][,\]])#',$string); // if there might be capital letters, add the i modifier after the closing delimiter

salathe Posted December 30, 2009

A little late to the game, I know, but why not peek ahead to check whether the comma is a field delimiter or not? Like

    /,(?=head:)/

or

    /,(?=[a-z]+:)/i

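As a rough illustration of the peek-ahead idea, the sketch below runs both of salathe's patterns against the sample string from the question. It assumes every field really does start with a word followed by a colon, which is the assumption the patterns themselves make.

```php
<?php
// Sample input from the original question.
$subject = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";

// Split only on commas that are immediately followed by the literal "head:".
$byFixedPrefix = preg_split('/,(?=head:)/', $subject);

// Looser variant: split on commas followed by any run of letters and a colon.
$byAnyPrefix = preg_split('/,(?=[a-z]+:)/i', $subject);

print_r($byFixedPrefix);
print_r($byAnyPrefix);
```

Because the lookahead is zero-width, the "head:" text is not consumed and stays at the start of the next field. Neither pattern looks at the brackets at all, so the length of the bracketed values does not matter.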
.josh Posted December 30, 2009

i actually thought to do that but then i thought if i posted something like that, then the OP would come back and inform me that all that shit is arbitrary :/ cuz that's how it usually goes.

salathe Posted December 30, 2009

Well, I definitely don't fall into the psychic category, so until (inevitably) told otherwise let's assume it's always ",head:" :-) (And when told otherwise, don't help but instead yell at the OP for being annoying!)

hlstriker (Author) Posted December 31, 2009

> Granted, CV, your solution will not yield the results the OP is looking for. It only splits on a comma if there is a ] before it (thus it won't split parts like the initial comma between head:body and head:body[0,1,2]).
>
> My take on this:
>
>     $string = "head:body,head:body[0,1,2],head:body[x,y,z],head:body";
>     $string = preg_split('#,(?![\da-z][,\]])#',$string); // if there might be capital letters, add the i modifier after the closing delimiter

This solution seems to work for the most part, but for some reason it still sometimes splits on commas inside the [] brackets. I tried to figure out why, but I really don't know much about regex as it is.

salathe Posted December 31, 2009

The "some reason" is probably that you have a wider range of input than you've cared to mention so far. Please give some examples of when the solution does not work. My bet would be that the comma-separated values within square brackets can be longer than a single character.

P.S. Have you tried all of the other "solutions" or just nrg_alpha's?

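The bet above can be checked with a small sketch. nrg_alpha's negative lookahead refuses to split only when the comma is followed by exactly one digit or lowercase letter and then a "," or "]". The two test strings below are made up for illustration: one uses single-character values and one uses longer values.

```php
<?php
// nrg_alpha's pattern: split on a comma unless it is followed by a single
// digit/lowercase letter and then "," or "]".
$pattern = '#,(?![\da-z][,\]])#';

// Hypothetical inputs: single-character values vs. longer values.
$single = "head:body[x,y,z],head:body";
$multi  = "head:body[one,two,three],head:body";

$singleFields = preg_split($pattern, $single);
$multiFields  = preg_split($pattern, $multi);

print_r($singleFields); // two fields, brackets intact
print_r($multiFields);  // four pieces, the bracket group is broken apart
```

With longer values, the character after the first letter is another letter rather than "," or "]", so the lookahead no longer protects the commas inside the brackets.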
hlstriker (Author) Posted December 31, 2009

> my bet would be the comma-separated-values-within-square-braces can be longer than a single character.

This is true, sorry for my bad example using [x,y,z].

> P.S. Have you tried all of the other "solutions" or just nrg_alpha's?

Yes, they didn't seem to work with the brackets at all.

cags Posted January 1, 2010

Can I ask what was wrong with the solution I suggested?

salathe Posted January 1, 2010

At least mine (I didn't test the others) worked with the brackets in the samples provided in this thread, so it should work at least a bit. Again, as with my last post, please give us more representative samples of the input that our regular expressions will need to work with, including ones that "work" and "don't work" at the moment.

nrg_alpha Posted January 1, 2010

I tested cags' solution against the sample string, and it works fine. Cags, note that your pattern uses the s modifier. However, since your pattern doesn't contain the dot metacharacter, which is the only thing the s (dot-match-all) modifier affects, that modifier has no effect (thus it's not necessary).

@hlstriker You need to understand that regex is a very terse system. If the samples vary in formatting, we need to know this, so that a more accurate pattern can be crafted to accomplish what you seek. Otherwise, you'll be given explicit solutions to explicit problems that are rigid in format. So, as salathe suggests, you should provide more sample strings.

hlstriker (Author) Posted January 1, 2010

> Can I ask what was wrong with the solution I suggested?

Just double tested your solution and it works fine for the example I posted. Sometimes it will be in this format too though (which is what was throwing my tests off):

    head:body,head:body[[one,two],[one,two],[one,two]],head:body[one,two,three],head:body

Sorry about the complications.

cags Posted January 1, 2010

> However, since your pattern doesn't make use of the dot_match_all character, that modifier isn't used (thus not necessary).

Oops, good point.

> Just double tested your solution and it works fine for the example I posted. Sometimes it will be in this format too though (which is what was throwing my tests off):
>
>     head:body,head:body[[one,two],[one,two],[one,two]],head:body[one,two,three],head:body
>
> Sorry about the complications.

That being the case, salathe's solution of '#,(?=head:)#' seems like your best fit. Assuming, of course, that you wish "head:body[[one,two],[one,two],[one,two]]" to be one part (which I assume you do, since that is the part my pattern splits up).

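For the nested format, the sketch below compares the ",head:" lookahead with a different technique that nobody in the thread proposed: a plain character scan that tracks bracket depth and splits only at depth zero. The function name splitTopLevel is hypothetical, and the scan assumes the brackets are balanced.

```php
<?php
/**
 * Split $input on commas that sit outside every [...] group.
 * Hypothetical helper, not from the thread; assumes balanced brackets.
 */
function splitTopLevel(string $input): array
{
    $fields  = [];
    $current = '';
    $depth   = 0; // how many "[" are currently open

    foreach (str_split($input) as $char) {
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']') {
            $depth--;
        }

        // A comma at depth 0 ends the current field.
        if ($char === ',' && $depth === 0) {
            $fields[] = $current;
            $current  = '';
            continue;
        }
        $current .= $char;
    }
    $fields[] = $current; // last field has no trailing comma

    return $fields;
}

// Nested sample posted by hlstriker.
$nested = "head:body,head:body[[one,two],[one,two],[one,two]],head:body[one,two,three],head:body";

$byDepth     = splitTopLevel($nested);
$byLookahead = preg_split('/,(?=head:)/', $nested);

print_r($byDepth);
print_r($byLookahead);
```

Both approaches keep "head:body[[one,two],[one,two],[one,two]]" as one part on this input. The lookahead is shorter but depends on every field starting with "head:", while the depth scan makes no assumption about field names.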