MadTechie Posted April 19, 2007

OK, I have a good book, but I have only read two chapters and I am currently stumped on this (it's not in the book).

I have a sentence that I need to pull the weight out of. The weight can be 100g, 3kg, 77tons, etc. I have assumed that the weight will not end in p or P (I hope).

Anyway, here's the example text I have used:

here is some text that holds 100-200g but also 100g/200g or 100g\200g or 1g or even 100grams even just 100g but shouldn't capture text like my100 or even 100 or £1000 or 99p the regex i have tried is below

And this is the regex I have used:

([ ]+(\d+[^ pP]+))+([-\//]+\d+[^ pP])?

It seems to half work, but group 2 still finds 100 and 99 in this part of the text:

even 100 or £1000 or 99p the

If anyone could help me here I would really appreciate it. Also, if it's not too much to ask, a little explanation of how you got it to work would be great. I am trying to learn this, but it's a little above me at the moment.

If this doesn't make sense, just say so and I'll try a rewrite.

Thanks in advance
--MadTechie
c4onastick Posted April 19, 2007

At first glance I'd use this:

[\d\\/-g]+g

since everything in your example is followed by a 'g' and there's a lot of variability as to where it shows up (100-200g not 100g-200g, but 100g/200g not 100/200g). You could probably extend it with a word boundary, but it works without it for your example:

[\d\\/-g]+g\b

I suppose in the most basic interpretation, you're looking for a number followed by a specific suffix (lbs, g, tons, etc.). In that vein, all you'd need is something like this:

[\d,.]+(?:lbs|g|grams?|tons?)

Just a big alternation to get all the different options. (Plus I added ',' and '.' to the character class to account for numbers like 1,003.45lbs.)

Does this answer your question? That regex you have looks pretty complicated (kudos if you're only on the second chapter!). Generally I try to keep it as simple as possible. The more complicated it gets, the more specific the match (which is good), but it can be a pain to debug (or to get working in the first place).
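A minimal sketch of the alternation approach, assuming Python's `re` module rather than whatever engine the thread was using. The suffix list (with `kg` added) and the sample strings are assumptions for illustration. Two details are worth noting: alternation tries its branches left to right, so `grams?` is listed before `g` here to capture "100grams" whole instead of stopping at "100g"; and inside a character class, `/-g` is read as a range from `/` to `g`, which also covers letters such as `e`.

```python
import re

# Hypothetical suffix list; longer alternatives come first because the
# engine takes the first branch that matches.
WEIGHT = re.compile(r"[\d,.]+(?:grams?|tons?|lbs|kg|g)\b")

sample = "1g or even 100grams, 3kg, 77tons and 1,003.45lbs but not my100, 100 or 99p"
weights = WEIGHT.findall(sample)
print(weights)

# In the class [\d\\/-g], "/-g" is a range, so the letters a-g are included.
range_hits = re.findall(r"[\d\\/-g]+g", "the regex")
# Placing the hyphen last makes it a literal hyphen instead.
literal_hits = re.findall(r"[\d\\/g-]+g", "the regex")
print(range_hits, literal_hits)
```

If the hyphen is meant literally, moving it to the end of the class (or escaping it) avoids the accidental range.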
MadTechie Posted April 19, 2007

Thanks for helping, c4onastick. I think I have a working model; I used some of yours and some of mine. Yes, my version is long and your version worked, but I wanted values like 100-200g to be pulled in as one. It was mainly trial and error, but here's my end result:

([\d,.]+[^pP\ \d]+)+(([-/\/])[\d,.]+[^pP\ \d])?

Test data:

here is some text that holds 100-200g but also 100g/200g or 100g\200g or 1g or even 100grams even just 100g but shouldn't capture text like my100 or even 100 or £1000 or 99p the regex i have tried is below 200ton blar

I'm on the second chapter but had a small handle on them already; the book was for a push. While this is getting complex (maybe too complex for what I am doing), thanks for your help. Just reading the process helped a lot.

Oh, what does the ?: do again? I remember reading about it but don't think I grasped it.
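The thread never spells out why the first attempt still picked up the bare 100 and 99. A likely explanation, sketched here with Python's `re` module (an assumption; the thread does not name its engine): `[^ pP]` excludes spaces and p/P but not digits, so `\d+` can give back its last digit and let the negated class match it. Adding `\d` to the negated class, as in the pattern above, closes that path. The variable names are illustrative only.

```python
import re

text = "or even 100 or £1000 or 99p the"

# Digits are allowed by [^ pP], so "10" followed by "0" satisfies the pattern.
loose = re.findall(r"\d+[^ pP]+", text)

# With \d excluded as well, a bare number has nothing left for the class to match.
strict = re.findall(r"\d+[^ pP\d]+", text)

print(loose)
print(strict)
```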
Wildbug Posted April 19, 2007

What about:

(?:\d+g?[-\/\])?\d+g

The (?: is a handy construct that allows you to use parentheses for grouping purposes without capturing the subpattern (in $1, $2, etc.).

Oh, do you need to worry about thousands separators or decimals? It looks like you've included that in your pattern. If so, replace the \d with your class for that:

[\d.,]+
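To make the difference concrete, here is a small sketch in Python (the `re` module is an assumption, and the trailing character class is written closed as `[-/\\]`). The same pattern with a capturing and a non-capturing group matches the same text, but only the capturing version reports the group's contents.

```python
import re

sample = "100g/200g"

# Plain parentheses: the optional prefix is grouped and captured as group 1.
capturing = re.search(r"(\d+g?[-/\\])?\d+g", sample)

# (?:...) groups the same subpattern without creating a capture group.
non_capturing = re.search(r"(?:\d+g?[-/\\])?\d+g", sample)

print(capturing.group(0), capturing.groups())
print(non_capturing.group(0), non_capturing.groups())
```

In both cases it is the `?` after the closing parenthesis that makes the prefix optional; the `?:` inside only switches off capturing.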
MadTechie Posted April 19, 2007

I assume you mean:

(?:\d+g?[-\/\\])?\d+g

*It works well, except it only gets the g, when I would like it to get the letter or word after the digit (excluding p or P). I tried tweaking your regex and ended up with this:

(?:[\d.,]+[^pP\ \d]?[-\/\\])?[\d.,]+[^pP\ \d]+

It seems to work well.

About ?:, ahhh, so it's nothing to do with making it optional. OK cool, that makes more sense (unless I'm wrong).

*Funny thing: I tried the regex and it worked, but not exactly how I wanted it to, so I played with it but couldn't get it to work. Then I started writing this post, and when I wrote down what was wrong with it, I read what I wrote and added that rule to the regex, and it "seems to work" lol.

Thank you Wildbug & c4onastick, you have both been a BIG help.

If there are no more suggestions I'll click solved (as it pretty much is).
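A quick way to check the final pattern against the thread's example text, sketched in Python (an assumption; other engines may treat escapes differently). The pattern is copied from the post above, and the last call illustrates one limitation: the negated class accepts any character that is not p, P, a space or a digit, so tokens that are not weights can match too. The "room 101b" string is a made-up example.

```python
import re

# Pattern copied from the post; a raw string keeps the backslashes intact.
PATTERN = re.compile(r"(?:[\d.,]+[^pP\ \d]?[-\/\\])?[\d.,]+[^pP\ \d]+")

text = (
    r"here is some text that holds 100-200g but also 100g/200g or 100g\200g "
    r"or 1g or even 100grams even just 100g but shouldn't capture text like "
    r"my100 or even 100 or £1000 or 99p the regex i have tried is below"
)

matches = PATTERN.findall(text)
print(matches)

# Hypothetical non-weight input: any trailing letter other than p/P is accepted.
extra = PATTERN.findall("room 101b")
print(extra)
```

If stricter matching is needed, an explicit list of unit suffixes is one way to rule out such tokens.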
Wildbug Posted April 19, 2007

Yep, that's what I meant! Also, I misread your OP and didn't realize you were looking for more weight units than just "g".

Isn't regex fun?!
MadTechie Posted April 19, 2007

It's starting to be fun. I'm just getting past the ??? :'( WTF Huh.. and approaching the
c4onastick Posted April 19, 2007

"Isn't regex fun?!"

Regex IS fun! Good work MadTechie, glad to help!