Doctor_Cox Posted June 19, 2007

Gentlemen, I need to create what essentially amounts to an interpreter of Apache Rewrite Rules in PHP. I use Rewrite Rules extensively in my content management system, but they have become too unwieldy, and now I am working on something which basically routes all calls through a file handling script. This will take the requested filename, then figure out what needs to be included to present the requested file.

So I need it to be able to take a request filename of, say,

links1-1.html

Then, using the original rewrite rule pattern of

^links(.*)-(.*).html

conclude I need

index.html?page=links_manager/listings&Category_ID=1&Start=1

In other words, does the requested file meet any of the patterns (which I will store in a database with the "destination")? If so, isolate the variables for inserting into the destination to build the final, actual URL. Hope this makes sense, thanks.
effigy Posted June 19, 2007

It sounds like you want a library of REs to run against a string?
Wildbug Posted June 19, 2007

So what's your question? You want to convert Apache regex to PCRE or MySQL/POSIX regex?
Doctor_Cox (Author) Posted June 19, 2007

The values need to be available to PHP to decide what to do, what files to include. The means by which it gets them are of no consequence.
Wildbug Posted June 19, 2007

I still don't see a question mark.
Doctor_Cox (Author) Posted June 19, 2007

Wildbug - I am an experienced PHP programmer but a complete noob when it comes to Regex, bear with me OK?

My content management system is well established, and has grown in capabilities and sophistication as I have learned new things. Because it is so well established, I need to keep the internally linked filenames the same, as I do not have time to go back and apply this new technique to every site. It will only apply to new sites. The new sites will have an .htaccess like this...

RewriteEngine On
RewriteRule ^(.*) filehandler.php?%{QUERY_STRING}

filehandler.php will then figure out what to do with the request by checking the $_SERVER['REQUEST_URI'] variable. .shtml files, for example, are prepared by my page editor module, so it would include the file that gets and renders those pages.

Some preliminary tests using string hacking (substr, strpos et al) have proven the soundness of the basic theory. However, given my system contains about 2 dozen different modules, some with up to half a dozen rewrite rules, I do not want to have to write manual hacking code for each one. So I figured if I put the original expressions in the rewrite rules into a database, I could perform some kind of lookup. Obviously the expressions can be changed if need be. I suppose MySQL regex and PCRE are both needed.

Let's see if I can concoct a brief, simplified example code sequence of how I think it might/should work. Say we have a database row with two fields, Expression and Destination Filename. Imagine the following example row.

Expression - "^links(.*)-(.*).html"
Destination - "index.html?page=links_manager/listings&Category_ID=#1&Start=#2"

Now the following code. I'll spell out where I do not know what should happen, what functions to use etc. with caps and comments.
<?php
$Requested_File = "links1-1.html"; // this would ordinarily be derived from $_SERVER['REQUEST_URI'], simplifying here
$result = mysql_query("select * from rewrite_rules where Expression = REGEX COMPARISON TO $Requested_File HERE");
if ($row = mysql_fetch_array($result)) {
    // code to get variables from $Requested_File by applying $row['Expression'] here
    // preg_split probably the way to go
    $Variable_Counter = 1;
    $File_to_Include = $row['Destination'];
    foreach ($Preg_Split_Returned_Values as $Value) {
        $File_to_Include = str_replace("#{$Variable_Counter}", $Value, $File_to_Include);
        $Variable_Counter++;
    }
    // now do something with $File_to_Include to present content
} else {
    echo "File not found";
}
?>

My initial experiments with preg_split did not work. Looking at the documentation, it does not look like the regexes in the rewrite rules are directly compatible with the function. They can of course be changed. I hope this makes it clearer.
Wildbug Posted June 20, 2007

"Wildbug - I am an experienced PHP programmer but a complete noob when it comes to Regex, bear with me OK?"

Okay, but since you had so many declarative statements and no interrogative ones, we were having a little trouble deciding what it was you wanted our help with.

First of all, why do you think this is better than using Apache's mod_rewrite? It'll be very much the same, except more work, and I'm not sure performance will be any better. Are you using .htaccess files? If so, you know that the docs recommend using a central .conf file and turning off .htaccess, since .htaccess lookups give the server a performance hit; that would also make organizing your rules easier since they'd be in the same file, and not strewn about the filesystem.

If you still want to do this, you can use PHP completely; I'd put the regular expressions right in this file and save the trouble of using MySQL. You can use an array and just call preg_replace() once as in the example below. Using an if() or switch() block will probably be faster, though, since preg_replace() tests every expression in the array. Or, use a loop with the array(s) -- if a match is found, break or exit.

<?php
// filehandler.php
$patterns = array();
$replacement = array();

// I like to line them up like this; easier to read!
$patterns[]    = '/^links(\d+)-(\d+)\.html/i';
$replacement[] = 'index.html?page=links_manager/listings&Category_ID=$1&Start=$2';

$patterns[]    = '/^something_else\.php&(.*)/i'; // dot escaped so it only matches a literal "."
$replacement[] = 'other.php?page=$1';

if (isset($_SERVER['QUERY_STRING']) && $_SERVER['QUERY_STRING'])
    readfile('http://yoursite/' . preg_replace($patterns, $replacement, $_SERVER['QUERY_STRING']));
else
    header('Location: 404.html'); // or wherever
?>

You can use MySQL to look up patterns, but since you said some pages have several rewrite rules, how would you know to go to the next rule? You'll need to introduce some coding logic, again recreating mod_rewrite, sort of.
(You can use regular expressions in MySQL [assume "pattern" is a column containing a regex] for lookup with a query like "SELECT * FROM rewrite_rules WHERE '$filename' RLIKE pattern".) And you'll need two patterns -- one to match in MySQL, and another to match and extract in PHP.

As far as your pseudo-code goes, you probably want to use $_SERVER['QUERY_STRING'] to fetch the query string you tacked onto the script name to call it, and use preg_replace() to match, extract, and rewrite the important bits.

In summary, it's certainly do-able, and I'd recommend a pure PHP solution instead of involving MySQL.
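For what it's worth, the "loop with the array(s), break on the first match" idea mentioned above might look something like the sketch below. This is only an illustration under a few assumptions: route_request() is a made-up helper name, the rules are kept in one pattern => destination array, and the second rule (the Links Manager entry page) and its destination are invented for the example.

```php
<?php
// Hypothetical helper (not from the thread): returns the rewritten target for
// the first rule whose pattern matches, or null when no rule matches.
function route_request($requested_file, array $rules)
{
    foreach ($rules as $pattern => $destination) {
        if (preg_match($pattern, $requested_file)) {
            // Only this rule is applied; later rules are never tested.
            return preg_replace($pattern, $destination, $requested_file);
        }
    }
    return null; // the caller decides how to present "File not found"
}

$rules = array(
    // Pattern and destination taken from the example earlier in the thread.
    '/^links(\d+)-(\d+)\.html$/i' => 'index.html?page=links_manager/listings&Category_ID=$1&Start=$2',
    // Assumed second rule, for illustration only.
    '/^links\.html$/i'            => 'index.html?page=links_manager/index',
);

$target = route_request('links1-1.html', $rules);
```

Unlike a single preg_replace() call over the whole array, this makes a non-matching request distinguishable (null) from a rewritten one, so a 404 can be sent only when nothing matched.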
Doctor_Cox (Author) Posted June 21, 2007

Thanks Wildbug, this is a good piece of code with logic I can follow.

"First of all, why do you think this is better than using Apache's mod_rewrite? It'll be very much the same, except more work, and I'm not sure performance will be any better. Are you using .htaccess files? If so, you know that the docs recommend using a central .conf file and turning off htaccess since it gives the server a performance hit; that would also make organizing your rules easier since they'd be in the same file, and not strewn about the filesystem."

The problem with mod_rewrite is that if I make a change to a module that requires a filename change, I have to go and change every .htaccess in our many sites. And the problem with a central .conf is not only that I do not have access to it, but that not every site is configured exactly the same way. Some of them do not use index.html as a "container file" for the system's content, and indeed some use more than one container file. This is why I want PHP code level control of the whole thing, rather than manual stuffing around, which has cost me many hours as it is.

"If you still want to do this, you can use PHP completely; I'd put the regular expressions right in this file and save the trouble of using MySQL."

I agree a pure PHP solution would be faster, but MySQL would let me access and edit things easily. I suppose it makes no never mind really; it is not like I will be changing things every day, and having to maintain two different patterns does not thrill me. We have a pretty powerful server that doesn't even break a sweat with our current load, so performance is not that big of an issue.

"but since you said some pages have several rewrite rules, how would you know to go to the next rule?"

When I refer to a module I refer to a specific set of code - Shopping Cart, Photo Gallery, Links Manager, whatever - not an individual page.
So for example, the Links Manager uses 2 rules - one for the main entry page, one for the page that lists the links in the visitor's selected category. There are no "rules within rules" pages. Thanks again for your help.
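As a rough sketch of the row format described earlier in the thread (an Expression plus a Destination with #1, #2 placeholders), preg_match() rather than preg_split() is probably the function that hands back the captured values. The code below is an assumption-laden illustration, not a tested implementation: resolve_destination() is a made-up name, a plain array stands in for rows fetched from the rewrite_rules table, the expression is wrapped in "~" delimiters (so it must not itself contain "~"), and the dot before "html" is escaped, which the original rule did not do.

```php
<?php
// Stand-in for rows fetched from the rewrite_rules table; a plain array is
// used here so the sketch runs without a database.
$rows = array(
    array(
        'Expression'  => '^links(.*)-(.*)\.html',
        'Destination' => 'index.html?page=links_manager/listings&Category_ID=#1&Start=#2',
    ),
);

// Hypothetical helper: wraps an Apache-style expression in PCRE delimiters,
// then fills the #1, #2, ... placeholders from the captured groups.
function resolve_destination($requested_file, array $rows)
{
    foreach ($rows as $row) {
        // '~' is the delimiter, so slashes in an expression need no escaping.
        $pattern = '~' . $row['Expression'] . '~';
        if (preg_match($pattern, $requested_file, $matches)) {
            $destination = $row['Destination'];
            // Work from the highest group down so "#1" cannot clobber "#10".
            for ($i = count($matches) - 1; $i >= 1; $i--) {
                $destination = str_replace('#' . $i, $matches[$i], $destination);
            }
            return $destination;
        }
    }
    return null; // no row matched the requested file
}

$File_to_Include = resolve_destination('links1-1.html', $rows);
```

With this shape only one pattern per row is needed, since PHP does both the matching and the extraction; whether that is acceptable depends on how many rows have to be scanned per request.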