
Complex regular expression


Dareros
Solved by requinix


Hi, I am looking for a regular expression that grabs all printf occurrences, including the following scenarios:

   printf("something");
   printf("something", variables);
   printf("something", variables, 
       variables);
   printf("something);something",
       variables, variables);

To simplify the third and fourth cases, the regex should start from printf(" and go until it finds ); with a white space after it. But I couldn't work out how to do it. This trick should also work for the first and second cases.
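A minimal sketch of the rule described above (start at printf(" and stop at the first ); that is followed by whitespace). It is written in Python only so it can be run as-is; the thread does not say which regex engine is in use. The name PRINTF_SIMPLE and the sample text are made up for illustration. The same pattern should carry over to PCRE with the s modifier.

```python
import re

# Lazy match from printf(" up to the first ");" that is followed by
# whitespace or the end of the input. DOTALL lets "." cross line breaks.
PRINTF_SIMPLE = re.compile(r'\bprintf\(".*?\);(?=\s|$)', re.DOTALL)

# The four scenarios from the question (hypothetical sample text).
sample = '''printf("something");
printf("something", variables);
printf("something", variables,
    variables);
printf("something);something",
    variables, variables);
'''

matches = PRINTF_SIMPLE.findall(sample)
for m in matches:
    print(repr(m))
```

This covers the four cases listed, but it is only as good as the "); plus whitespace" rule: a format string that itself contains ); followed by a space would end the match too early.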

Thanks

Edited by Dareros

This cannot and should not be done with common regexes.

 

Like you already said, those are complex expressions. Your examples are actually still fairly simple. What about this:

printf(f(g() + h()), 3 * (2 + i()));

This is a full-blown language. Common regexes are way too primitive for this. They work fine for simple tasks like validating a date, but they're no all-powerful parsing tool.

 

What you want is an actual parser. You didn't say which language this is, but I'm pretty sure somebody has already written a parser for it in PHP. Use it.


  • Solution

This cannot and should not be done with common regexes.

I know you don't like regular expressions, Jacques, but this is one case where a regex definitely can do it. Strings and parenthesized expressions make it a bit trickier but it's still possible.

'/\bprintf\("([^"\\\\]|\\\\.)*"([^\'"()]*|"([^"\\\\]|\\\\.)*"|\'([^\'\\\\]|\\\\.)*\'|\((?2)*\))*\);/'
1. Word boundary

2. "printf("

3. An opening double quote, the contents of the string (accounting for escape sequences), and the closing double quote

4. Some amount of

a) stuff that isn't a quote or parenthesis,

b) a quoted string like in #3, or

c) parentheses that recurse back up to #4.

5. ");"
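For engines without recursive subpatterns, the five steps above can also be followed by hand. The sketch below is in Python, whose standard re module has no (?2)-style recursion, so a depth counter stands in for step 4c. The names find_printf_calls and _scan_call are invented for this illustration and are not part of the posted solution.

```python
import re

# Steps 1 and 2: word boundary, then the literal "printf(".
START = re.compile(r'\bprintf\(')

# Steps 3 and 4b: a quoted string, accounting for backslash escapes.
DOUBLE = r'"(?:[^"\\]|\\.)*"'
SINGLE = r"'(?:[^'\\]|\\.)*'"
STRING = re.compile(DOUBLE + "|" + SINGLE, re.DOTALL)


def _scan_call(source, i):
    """Scan from just after 'printf(' and return the index past ');', or None."""
    # Step 3: the first argument must be a double-quoted string.
    if not source.startswith('"', i):
        return None
    depth = 1  # we are inside the parenthesis opened by printf(
    while i < len(source):
        ch = source[i]
        if ch in '"\'':
            m = STRING.match(source, i)  # step 4b: skip a whole string
            if m is None:
                return None              # unterminated string
            i = m.end()
        elif ch == '(':
            depth += 1                   # step 4c: one level deeper
            i += 1
        elif ch == ')':
            depth -= 1
            i += 1
            if depth == 0:
                # Step 5: the call has to end with ");"
                return i + 1 if source.startswith(';', i) else None
        else:
            i += 1                       # step 4a: anything else
    return None


def find_printf_calls(source):
    """Return every printf("...", ...); statement found in source."""
    calls = []
    pos = 0
    while True:
        start = START.search(source, pos)
        if start is None:
            return calls
        end = _scan_call(source, start.end())
        if end is None:
            pos = start.end()            # not a match, keep searching
        else:
            calls.append(source[start.start():end])
            pos = end
```

For example, find_printf_calls('printf("a);b", f(g(1)), \')\');') returns the whole statement as a single match, because strings are skipped as units and parentheses are counted rather than pattern-matched.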

Edited by requinix

*sigh*

 

Before you try to be smart, please make sure you actually understand the position you're arguing against. Your problem is that you tend to stop reading after the first sentence, immediately construct a bunch of half-baked conclusions in your head and then proudly write down what you think is “the solution”. While this is obviously good enough to impress newbies on a PHP forum, I think a serious programmer should do a bit more than that.

 

Do you realize that I was talking about “common regexes”? Why did I use the word “common”? Was it just an accident? Or could that actually mean something?

 

If you had started PHP a year ago, I'd applaud you all day long for your suggestion. It proves that you know regexes, and that's not bad for the first year. However, I'm pretty sure you've been programming a bit longer, and that means you're expected to actually think about a problem and choose an appropriate solution from several options.

 

This is a parsing problem. Parsing problems usually don't just pop up from nowhere; they belong to a bigger context. That means the OP will most certainly have to make adjustments and solve similar problems in the long run. Can he do that with your regex hack? Hardly:

  • This is cryptic pseudo-Perl bullshit which is optimized for mental masturbation, not readability.
  • Good luck maintaining this and making adjustments, Dareros.
  • This cannot be generalized to anything. It's a throwaway hack for an overspecified problem.

If this is your best shot, I'd be worried.


One example of how this could be legit: maybe OP is looking for something to use in his editor's find box or w/ grep or something to find occurrences in files to help narrow down stuff he needs to search for.  

 

Maybe that is the case, maybe it is not; without context we do not know.  

 

Therefore it seems to me the best (first) response would be "What is this for? Provide some context."  And yet I don't see that from anybody.  

 

Failing that, it seems to me the best response would be to provide the asked-for solution, under the assumption that OP's reasons, while unknown, are legitimate.  

 

Failing that, I'd keep my mouth shut, because telling someone something isn't the right tool for the job when I don't know what the job is, sounds like a pretty fucking stupid thing to say. 


grep supports recursive subpatterns? Editors support recursive subpatterns?

 

That was a lame trolling attempt, even by your low standards. Next month, try to come up with something intelligent. Otherwise you'll lose the last bit of respect that you may still have.


I gave a random example to make the point that you don't know what the context is.  Instead, you deliberately try to sidestep the point in a vain attempt to tear me down.  You really do have issues, man.  As far as everybody in this community is concerned, you are the unwanted troll, and you know it.


You know that you've said something stupid when jazzman1 steps in to defend you.

 

I'll explain it one last time. If you still don't get it, just move along and take care of some other topic which you do understand.

 

You said that it always depends on the context, which is one of those blanket statements people like to pull out of their arse when they haven't really thought about a problem yet. To prove your “point”, you refer to use cases like the search-and-replace function of a text editor. What you haven't taken into account (or maybe you don't even know it) is that standard regex engines which are implemented in editors, egrep etc. aren't even capable of processing nested expressions. This requires recursive subpatterns which are a special extension of some full-blown programming languages.

 

In other words: In the cases where a regex might be valid, it's not powerful enough. And in the cases where it is powerful enough, there are better alternatives. So it's the wrong tool either way, which is exactly what I've been saying the whole time.
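To make the nesting limitation concrete, here is a small sketch using Python's standard re module, which is one engine without recursive subpatterns. The helper balanced is hypothetical. Without recursion, a pattern can only spell out a fixed number of nesting levels, so input that goes one level deeper no longer matches.

```python
import re


def balanced(depth):
    """Compile a pattern for one parenthesized group nested at most `depth` levels."""
    content = r"[^()]*"  # innermost level: no parentheses at all
    for _ in range(depth - 1):
        # Each pass wraps the previous level in one more pair of parentheses.
        content = r"(?:[^()]|\(" + content + r"\))*"
    return re.compile(r"\(" + content + r"\)")


two_levels = balanced(2)
print(bool(two_levels.fullmatch("(a(b)c)")))    # two levels deep
print(bool(two_levels.fullmatch("(a(b(c)))")))  # three levels deep
print(bool(balanced(3).fullmatch("(a(b(c)))")))
```

The depth-2 pattern accepts the first input and rejects the second; only a pattern built for three levels accepts it. Whether a given editor or grep build offers recursion is something to check in its own documentation.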

 

I know that it's hard for programmers to understand the limitations of regexes. Many people don't even understand the theoretical background, and the people who do have a basic understanding of theory tend to miss everything else. It takes a lot of time to reach the next level where you're actually able to think about your tools and realize when they're not appropriate. Obviously you aren't there yet.

Edited by Jacques1

This topic is now closed to further replies.