obsidian Posted September 2, 2008 Share Posted September 2, 2008 I suppose I would specifically request effigy to answer my question, but I'm open to any suggestions at all at this point. I am working on a generic query handler for an open source application I am writing, and I want to know if there is an easy way to recognize table names from within a query string and replace them with a prefix. Since I am adding the functionality to an already existing table structure and application core, I am a bit too late to have the queries themselves written with the prefix in them (prefix may change upon installation). So, my issue comes to this: I need a reliable way to replace all occurrences of table names within a query with the prefix + table name, but ONLY in the parts of the query where they are indeed table names. So, if the table name appears within a string, we DO NOT want to replace it. Any suggestions? I've been playing around with breaking the parser down into different handlers for the different query syntax, but I'm not sure this is the easiest or most efficient way to go. I can even find the tables themselves with a match like this: (FROM|INTO|LOW_PRIORITY|IGNORE)(.+?)(WHERE|GROUP|HAVING|ORDER|LIMIT|\() But, that doesn't take care of WHERE, ORDER BY or even HAVING clauses, so I was hoping there was a way I might just match all occurrences of the table names not inside strings. I think this would be generic enough for my purposes. Ideas? Quote Link to comment Share on other sites More sharing options...
obsidian Posted September 2, 2008 Author Share Posted September 2, 2008 OK, while possibly not the best solution, here is what I have settled on for the time being (until I come up with something better): Step #1: Replace all strings with temp vars using the following pattern: '([^'\\]*(\\.[^'\\]*)*)' Step #2: Replace all occurrences of the table names within the remaining query text Step #3: Replace the strings back into the query This works well, but I just feel that it may be overkill. Any additional thoughts would be greatly appreciated! Quote Link to comment Share on other sites More sharing options...
effigy Posted September 3, 2008 Share Posted September 3, 2008 Without knowing all of the intricate ins and outs of SQL, that's probably the best approach. My take is below. You may also want to have a look at Perl's SQL::Statement. SQL::Statement::Structure pulls out the table names for you. Update: There's also SQL_Parser for PHP; I'm not sure what it has to offer. <pre> <?php $tests = array( "A 'B C' D E F \"G\" H I 'J \"K\" L M' N O P \"Q 'R' S\" T U V 'W X Y' Z", 'A \'B C\' D E F "G" H I \'J "K" L M\' N O P "Q \'R\' S" T U V \'W\' X \'Y\' Z', ); foreach ($tests as $test) { $result = array(); ### Split the string based on quote pairs. $pieces = preg_split('/(([\'"]).*?\2)/', $test, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE); $num_pieces = count($pieces); for ($i = 0; $i < $num_pieces; $i++) { $piece = $pieces[$i]; ### Skip lone quotes. if ($piece == '"' || $piece == "'") { continue; } ### Only replace entries that are not quoted. ### Table name is "X" for sake of example. $first = substr($piece, 0, 1); $piece = $first == '"' || $first == "'" ? $piece : preg_replace('/\b(X)\b/', 'prefix_$1', $piece) ; ### Push. array_push($result, $piece); } print_r($result); } ?> </pre> Quote Link to comment Share on other sites More sharing options...
obsidian Posted September 3, 2008 Author Share Posted September 3, 2008 My take is below. Thanks for the insight, effigy. As always, you have a very well thought through method, and it seems to work well, too. As I suspected, there is really no easy way to approach this, so I'll check out the links you provided. I'm not excited about getting the Pear library involved in this (I had looked into that before), since this is for an open source project, and I would rather not be reliant on yet another library. Thanks again. I'll mark this as solved for now. Quote Link to comment Share on other sites More sharing options...
effigy Posted September 4, 2008 Share Posted September 4, 2008 Upon review, you'll want to update the split pattern if your data contains escaped quotes: /((?<!\\\)([\'"]).*?(?<!\\\)\2)/ Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.