Java Regex Help


I currently have this regex pattern, which is probably pretty bad: I'm not a regex pro, and I've been tweaking it to make it do what I need:

"^(^[a-z.{2}][^\\s][a-z|.{2}]*[.]*[a-z]*)([\\*]*)[\\[]*([^\\]]*)[\\]]*"

It works for most cases, but not all, and I'm having difficulty fixing the rest.

 

Examples of what I want to pass would be:

1) ".."
2) "blah"
3) "blah*"
4) "blah.blah"
5) "blah.blah*"
6) any of 2-5 with [@attrib=value] or [@attrib="value" and @attrib2="value2" and ...] appended to the end

 

Examples of what I don't want to pass would be:

1) ""
2) "."
3) " "
4) "..."
5) ".blah"
6) "-blah"

 

Basically, I want either ".."

or

a string of a-z that could include, but not start with, dots or dashes
- that is optionally followed by a *
- that is optionally followed by a list of attributes/values

with the groups being:

1) ".." or a string of a-z including, but not starting with, dots/dashes
2) "*", or StringUtils.EMPTY if no match
3) the group of attributes, or StringUtils.EMPTY if no match
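
On the "StringUtils.EMPTY if no match" requirement for groups 2 and 3: in java.util.regex, whether an unmatched optional group comes back as "" or as null depends on where the quantifier sits. The sketch below illustrates the difference with a deliberately simplified pattern (letters plus an optional star only). The class and method names are hypothetical, and the plain "" stands in for StringUtils.EMPTY so the example needs no extra library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the thread.
final class OptionalGroupDemo {

    // Quantifier INSIDE the group: the group always participates in the match,
    // so it captures "" when there is no star.
    static final Pattern STAR_INSIDE = Pattern.compile("^([a-z]+)(\\*?)$");

    // Quantifier OUTSIDE the group: the group can be skipped entirely,
    // in which case Matcher.group(2) returns null.
    static final Pattern STAR_OUTSIDE = Pattern.compile("^([a-z]+)(\\*)?$");

    /** Returns group n of a full match, or null if there is no match or the group was skipped. */
    static String group(Pattern p, String input, int n) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.group(n) : null;
    }

    /** Maps null to the empty string (what StringUtils.EMPTY holds). */
    static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    public static void main(String[] args) {
        System.out.println("[" + group(STAR_INSIDE, "blah", 2) + "]");            // prints []
        System.out.println(group(STAR_OUTSIDE, "blah", 2));                       // prints null
        System.out.println("[" + orEmpty(group(STAR_OUTSIDE, "blah", 2)) + "]");  // prints []
    }
}
```

Either form can satisfy the requirement: put the `?` inside the parentheses, or keep it outside and map null to "" after matching.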


By the way, at this point I'm not really concerned with the exact format of the attribute list (just capture anything within square brackets).

I redid the main part like this:

"^([a-z]+[a-z.\\-]*[a-z])([\\*]*)[\\[]*([^\\]]*)[\\]]*"

and it seems to work (besides the ".." case).

This is what I want to do to add the ".." case:

"^([\\.]{2})$ | ^([a-z]+[a-z.\\-]*[a-z])([\\*]*)[\\[]*([^\\]]*)[\\]]*"

but that doesn't work. Here is how I read it:

^([\\.]{2})$ - starts and ends with ".."
| - or
^([a-z]+[a-z.\\-]*[a-z]) - starts with at least one a-z char, can contain zero or more a-z chars, dots, or dashes, and has an a-z char at the end
([\\*]*) - can have an optional *
[\\[]*([^\\]]*)[\\]]* - can have optional [ ] with content between
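
One likely reason the combined pattern fails: unless Pattern.COMMENTS is set, the spaces on either side of the `|` are literal characters that must appear in the input. The first branch then needs a space after the end of the input, and the second needs a space before the start, so neither branch can succeed. The sketch below compares the pattern as posted with the same pattern minus those two spaces. The class name is hypothetical. Note that adding the ".." branch as its own group also shifts the numbering: the name becomes group 2 and the star group 3.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the posted pattern with a space-free copy.
final class AlternationDemo {

    // The pattern exactly as posted: " | " contains two literal spaces.
    static final Pattern WITH_SPACES = Pattern.compile(
        "^([\\.]{2})$ | ^([a-z]+[a-z.\\-]*[a-z])([\\*]*)[\\[]*([^\\]]*)[\\]]*");

    // The same pattern with only the two spaces removed.
    static final Pattern NO_SPACES = Pattern.compile(
        "^([\\.]{2})$|^([a-z]+[a-z.\\-]*[a-z])([\\*]*)[\\[]*([^\\]]*)[\\]]*");

    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found(WITH_SPACES, ".."));    // prints false
        System.out.println(found(WITH_SPACES, "blah"));  // prints false
        System.out.println(found(NO_SPACES, ".."));      // prints true

        Matcher m = NO_SPACES.matcher("blah*");
        if (m.find()) {
            // Group 1 belongs to the ".." branch, which did not participate here.
            System.out.println(m.group(1));  // prints null
            System.out.println(m.group(2));  // prints blah
            System.out.println(m.group(3));  // prints *
        }
    }
}
```

To keep the three-group layout from the first post, the alternation can instead go inside group 1, for example `(\\.\\.|[a-z]...)`, so that both branches share one capture group.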


I guess the "has an a-z char at the end" part isn't working either. I think I can see why (it's getting caught by the part before it), but I'm not sure how to fix it.

So right now, the best I have is "^([a-z]+[a-z.\\-]*)([\\*]*)[\\[]*([^\\]]*)[\\]]*"

but it doesn't find just "..",

and it accepts "blah.", for example, when it shouldn't.
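
Printing the captured groups helps show where the trailing dot ends up. With the earlier pattern (the one ending group 1 in `[a-z]`), group 1 does stop at "blah", but because the square brackets are optional, the catch-all `([^\\]]*)` in group 3 then swallows the leftover ".". With the current pattern, the dot lands in group 1 directly. The sketch below prints both cases. The class and constant names are hypothetical, and it uses `matches()`, which requires the whole input to match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing which group captures the trailing dot.
final class TrailingDotDemo {

    // Earlier attempt: group 1 must end in a letter.
    static final Pattern REQUIRES_FINAL_LETTER = Pattern.compile(
        "^([a-z]+[a-z.\\-]*[a-z])([\\*]*)[\\[]*([^\\]]*)[\\]]*");

    // Current best: the final-letter requirement was dropped.
    static final Pattern CURRENT_BEST = Pattern.compile(
        "^([a-z]+[a-z.\\-]*)([\\*]*)[\\[]*([^\\]]*)[\\]]*");

    /** Describes the three groups of a full match, or reports "no match". */
    static String describe(Pattern p, String input) {
        Matcher m = p.matcher(input);
        if (!m.matches()) {
            return "no match";
        }
        return "1=" + m.group(1) + " 2=" + m.group(2) + " 3=" + m.group(3);
    }

    public static void main(String[] args) {
        // The dot is rejected by group 1 but then accepted by group 3.
        System.out.println(describe(REQUIRES_FINAL_LETTER, "blah."));  // prints 1=blah 2= 3=.
        // The dot is accepted by group 1 itself.
        System.out.println(describe(CURRENT_BEST, "blah."));           // prints 1=blah. 2= 3=
        // Side effect of [a-z]+ ... [a-z]: at least two letters are required.
        System.out.println(describe(REQUIRES_FINAL_LETTER, "a"));      // prints no match
    }
}
```

So fixing group 1 alone is not enough: the attribute part also has to require its opening bracket before it is allowed to consume anything.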


I think you need to split your RegExp into its constituent parts and re-examine what you're actually asking of it. Right now, the minimum string required for it to pass is a single a-z letter, with everything else being optional. That means "blah." is indeed legal content, because you've allowed a period to be part of the first subgroup (as long as there's at least one letter in front of it).


That is fine; "blah" was on the "Examples of what I want to pass would be" list.

The problems I'm having are:

but it doesn't find just ".."
and it includes like "blah." for example, when it shouldn't

In the second line, "blah." has a dot at the end. I want to prevent that, as well as allow for just "..".
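
For what it's worth, here is one possible pattern that covers both remaining problems. It is a sketch under a few assumptions the thread leaves open: ".." is accepted only on its own (no star or attributes after it), a single letter such as "a" is a valid name, and the square brackets must appear as a pair. The class and method names are hypothetical, and "" stands in for StringUtils.EMPTY.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; one possible reading of the requirements, not a definitive answer.
final class SegmentPattern {

    // Group 1: ".." alone, OR a letter optionally followed by letters/dots/dashes
    //          that must end in a letter (so "blah." and "blah-" are rejected).
    // Group 2: an optional "*" (quantifier inside the group, so "" when absent).
    // Group 3: anything between a required pair of square brackets (null when absent).
    static final Pattern SEGMENT = Pattern.compile(
        "^(\\.\\.$|[a-z](?:[a-z.\\-]*[a-z])?)(\\*?)(?:\\[([^\\]]*)\\])?$");

    /** Returns {name, star, attributes}, or null if the input is rejected. */
    static String[] parse(String input) {
        Matcher m = SEGMENT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String attrs = m.group(3);  // null when there were no brackets
        return new String[] { m.group(1), m.group(2), attrs == null ? "" : attrs };
    }

    public static void main(String[] args) {
        String[] samples = { "..", "blah.blah*", "blah[@attrib=value]", "blah.", ".blah", "..." };
        for (String s : samples) {
            String[] g = parse(s);
            System.out.println(s + " -> " + (g == null ? "rejected" : Arrays.toString(g)));
        }
    }
}
```

The key changes from the patterns above are that the alternation lives inside group 1, so the group numbers stay at 1 to 3, and that the attribute part cannot match anything unless a literal `[` comes first, so a stray trailing dot has nowhere to go.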
