dStreSd, posted April 15, 2007

I'm using eregi to match tags and pull out their contents. It works fine if I have one tag, but as soon as I add more it matches them all as one. Here is the pattern:

    \{main\}(.*)\{/main\}

(This is just one of the tags; there will be more later.) It works fine with:

    {main} My Variable is: {my_var} {/main}

but as soon as I make it:

    {main} My Variable is: {my_var} {/main} {main} My Variable is: {my_var} {/main}

it gives me this as the result:

    My Variable is: {my_var} {/main} {main} My Variable is: {my_var}

Any ideas on how to fix it or stop it at the nearest "{/main}", or should I just do line-by-line parsing? Thanks guys.

--dStreSd

c4onastick, posted April 15, 2007

One thought would be to use the preg suite (it's generally better than eregi). Here's what you'd use in preg:

    preg_match_all('%\{main\}(.+?)\{/main\}%s', $text, $matches, PREG_SET_ORDER);

I don't think eregi supports lazy quantifiers (the +? in this case), though really I don't know, since I have always used preg. That would be why you're getting more than you asked for with your expression.

    .*? - shortest possible match
    .*  - longest possible match
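
To make the greedy/lazy difference concrete, here is a minimal sketch that runs both quantifiers over the two-block sample from the first post. The variable names ($text, $greedy, $lazy) are only illustrative.

```php
<?php
// Sample input: the two-block template from the original question.
$text = '{main} My Variable is: {my_var} {/main} {main} My Variable is: {my_var} {/main}';

// Greedy: (.*) runs on to the LAST {/main}, swallowing the tags in between.
preg_match('%\{main\}(.*)\{/main\}%s', $text, $greedy);

// Lazy: (.+?) stops at the NEAREST {/main}, so each block is matched separately.
preg_match_all('%\{main\}(.+?)\{/main\}%s', $text, $lazy, PREG_SET_ORDER);

echo 'greedy capture: ', trim($greedy[1]), "\n";
foreach ($lazy as $i => $set) {
    // With PREG_SET_ORDER, $set[0] is the full match and $set[1] the first group.
    echo "lazy block $i: ", trim($set[1]), "\n";
}
```

The greedy capture still contains a stray `{/main} {main}` pair, while the lazy version yields two separate blocks.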

dStreSd (author), posted April 18, 2007

Wow, thanks, works perfectly. Now I'm guessing there's no way to pull out what I need when the code is as follows:

    {main}
    this is some test code
    {isset:login}
      <br />You are logged in.
      {isset:admin_login}
        <br />You are logged in as an administrator.
      {/isset}
    {/isset}
    {notset:login}
      <br />You are not logged in.
    {/notset}
    {isset:another_var}
      <br />another_var is set to: {another_var}
    {/isset}
    {/main}

There's no way to easily handle those issets (or "repeater"s, "notset"s, "if"s, etc.) without additional parsing from me, huh?

c4onastick, posted April 18, 2007

Nested tags are a pain to get to with regex. It can be done, but not (easily) to an arbitrary depth. If they're largely un-nested, then you can use the same scheme as above, but this one...

    {isset:login}
      <br />You are logged in.
      {isset:admin_login}
        <br />You are logged in as an administrator.
      {/isset}
    {/isset}

will get you in trouble. Because the lazy match stops at the first {/isset}, it'll return:

    {isset:login}
      <br />You are logged in.
      {isset:admin_login}
        <br />You are logged in as an administrator.
      {/isset}

This looks XMLish; you might want to look into an XML parser to handle this. (You could easily replace the '{}'s with '<>'s if the PHP XML parser whines about that.)
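
The truncation described above can be reproduced with a short sketch. The pattern here is an assumed adaptation of the {main} pattern to {isset:...} tags, not something taken verbatim from the thread.

```php
<?php
// Sample input: the nested block from the post above, on one line.
$tpl = '{isset:login} <br />You are logged in. '
     . '{isset:admin_login} <br />You are logged in as an administrator. {/isset} {/isset}';

// Same lazy scheme that worked for {main}, applied to {isset:...} (assumed adaptation).
preg_match('%\{isset:[^}]+\}(.+?)\{/isset\}%s', $tpl, $m);

// The match ends at the FIRST {/isset}, which closes the inner block,
// so the result holds two opening tags but only one closing tag.
echo $m[0], "\n";
```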

dStreSd (author), posted April 18, 2007

Hmm, that might work, but it's a little harder for me to conceptualize. If nesting gets too complicated I might move in that direction. Sticking to this path, would it be easier in your opinion to do the largest return and sub-parse (recursively, of course), or to get the smaller pattern and expand as needed (again recursively)? Any difference in performance, or is either method superior for any other reason?

c4onastick, posted April 18, 2007

That's a tough call. Having not seen the rest of the data, and in the absence of a better solution (like an XML parser), if the example you posted above is pretty representative, I think it would be easier to sub-parse. (I use this technique all the time for HTML scraping, which structurally is pretty similar to what you've got.) In which case, you'd probably be best suited to write something like this:

    function match_xmlish_stuff($xml)
    {
        $parsed_data = array();

        // Pull all the mains. $mains[1] holds the text captured between the tags.
        preg_match_all('%\{main\}(.+?)\{/main\}%s', $xml, $mains);

        foreach ($mains[1] as $main) {
            // Pull out individual tags with a little c4 ingenuity:
            //   group 1 = tag name, group 2 = rest of the opening tag (e.g. ":login"),
            //   group 3 = content up to the next opening or closing tag of the same name.
            preg_match_all('%\{([a-z]+)([^}]+)\}(.+?)(?=\{/?\1)%s', $main, $set_vars, PREG_SET_ORDER);

            $sets = array();
            foreach ($set_vars as $set_var) {
                // Clean the content of HTML tags (optional)
                $set_var[3] = preg_replace('/<[^>]*>/', '', $set_var[3]);
                array_push($sets, array($set_var[1], $set_var[2], $set_var[3]));
            }
            array_push($parsed_data, $sets);
        }

        return $parsed_data;
        // I haven't had a chance to test this, so it's entirely possible that I screwed something up here.
    }

Actually, I had a little stroke of genius while I was writing that. You can easily parse the interior tags (this really only applies to situations very similar to your data)! This line:

    preg_match_all('%\{([a-z]+)([^}]+)\}(.+?)(?=\{/?\1)%s', $main, $set_vars, PREG_SET_ORDER);

will pull out all the sub tags (even if nested), based on the fact that each tag's content ends at either a '{isset' (the beginning of the next tag) or a '{/isset' (the closing tag). (This uses a positive look-ahead to work correctly.) Be warned that this will fail if you have unlike tags nested, such as:

    {isset:login}
      <br />You are logged in.
      {notset:login}
        <br />You are not logged in.
      {/notset}
      {isset:admin_login}
        <br />You are logged in as an administrator.
      {/isset}
    {/isset}

But judging from your example data, it should work.

Give that a try and see if that gets you close.
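
Here is a minimal, standalone sketch of the look-ahead pattern from the post above, run on the nested isset sample. It assumes the opening tag's argument is meant to sit in its own capture group; the $result layout and its keys ('tag', 'arg', 'text') are hypothetical.

```php
<?php
// Sample input: like-named tags nested one level deep.
$main = '{isset:login} <br />You are logged in. '
      . '{isset:admin_login} <br />You are logged in as an administrator. {/isset} {/isset}';

// Group 1: tag name. Group 2: rest of the opening tag (assumed grouping).
// Group 3: content, ended by a look-ahead for "{name" or "{/name".
$pattern = '%\{([a-z]+)([^}]+)\}(.+?)(?=\{/?\1)%s';
preg_match_all($pattern, $main, $tags, PREG_SET_ORDER);

$result = array();
foreach ($tags as $tag) {
    $result[] = array(
        'tag'  => $tag[1],
        'arg'  => ltrim($tag[2], ':'),                          // drop the leading colon
        'text' => trim(preg_replace('/<[^>]*>/', '', $tag[3])), // strip HTML tags
    );
}

print_r($result);
```

Note that the result is flat: both isset blocks come back as siblings, so the nesting relationship itself is not preserved.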

dStreSd (author), posted April 18, 2007

Hmm, well, that might become an issue, because there will most likely be an occasion of something like:

    {isset:username}
      {if:(username='architect')}
        Hey it's the architect!
      {else}
        Welcome, {username}.
      {/if}
    {/isset}
    {notset:username}
      You are not logged in.
    {/notset}

This is all purely conjectural of course, as I'm trying to build a general-purpose template engine for my library (currently consisting of a DB abstraction layer, session handler, file handler, microtimer, etc.), which I want to tie together for future projects, the most imminent being a simple forum system.

c4onastick, posted April 18, 2007

Ok. (Nice Matrix reference!) Of course, it'd be possible to 'patch' it to handle that, but now we're reaching the point of diminishing returns. You can write an uber-complicated regex to parse these, but it's going to be a _____ to maintain. If this is something you're planning on reusing, I may have taken the wrong approach above (shooting for a specific application, not really a 'one-size-fits-all'). In which case, you're going to end up writing your own parser to account for all the cases in your templates. Tell us a bit more about the project; maybe we can offer an alternate approach or some solutions that are a little less time consuming. (I started writing my own too, but there's so much out there, for free, that you can use. Then you don't have to maintain it, a big plus. Smarty is the first one that comes to mind.)

dStreSd (author), posted April 20, 2007

Well, here would be the closest to a real-world example I can think of:

base.tpl

    {mask:header}
    <html>
    <head>
    <title>{page_title}</title>
    </head>
    <body>
    <div id="wrapper">
    <div id="header">
    <h1>{page_title}</h1>
    <h2>{subtitle}</h2>
    </div>
    {/mask}
    {mask:footer}
    <div id="footer"> ©2007 dStreSd </div>
    </div>
    </body>
    </html>
    {/mask}

template.tpl

    {base:base.tpl}
    {main}
    {header}
    <div id="user">
    {isset:username}
      {if:(username='architect')}
        Hey it's the architect!
      {else}
        Welcome, {username}.
      {/if}
    {/isset}
    {notset:username}
      You are not logged in.
    {/notset}
    </div>
    <div id="content">
    Here is a sample table:
    <table id="sample">
    <tr> <th> Username </th> <th> Real Name </th> <th> Email </th> </tr>
    {repeater:sample_table}
      <tr> <td> {username} </td> <td> {real_name} </td> <td> {email} </td> </tr>
    {empty}
      <tr> <td colspan="3"> There are no users in the database. </td> </tr>
    {/repeater}
    </table>
    </div>
    {footer}
    {/main}

Is that good enough or do you need more examples? So far the (base|mask|main) tags all work, since the pattern you showed me covers them fine (they cannot nest).
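
As a rough sketch of how the non-nesting mask and main tags could be handled with the lazy pattern: the thread does not say how masks are applied, so replacing {header} and {footer} with the matching mask bodies is an assumption here, and the shortened templates and variable names are made up for illustration.

```php
<?php
// Shortened stand-ins for base.tpl and template.tpl (hypothetical content).
$base = "{mask:header}\n<h1>{page_title}</h1>\n{/mask}\n"
      . "{mask:footer}\n<div id=\"footer\">footer text</div>\n{/mask}";
$page = "{main}\n{header}\n<p>content</p>\n{footer}\n{/main}";

// Collect every mask body, keyed by the placeholder that refers to it.
preg_match_all('%\{mask:([a-z_]+)\}(.+?)\{/mask\}%s', $base, $found, PREG_SET_ORDER);
$masks = array();
foreach ($found as $m) {
    $masks['{' . $m[1] . '}'] = trim($m[2]);
}

// Pull the {main} block and swap in the masks (assumed behaviour).
preg_match('%\{main\}(.+?)\{/main\}%s', $page, $main);
$output = strtr(trim($main[1]), $masks);

echo $output, "\n";
```

Variables such as {page_title} are left untouched by this step; filling them in would be a separate pass.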

dStreSd (author), posted April 22, 2007

If you don't have a solution, I may try my hand at dynamic conversion to XML.

dStreSd (author), posted April 22, 2007

I've been messing around with it and decided I'm just going to do raw line-by-line parsing and force template makers to newline-delimit all their tags (as in all my examples). Thanks for the help with all the regex stuff, c4.
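
One way the line-by-line approach could look is a small stack-based parser. This is only a sketch under stated assumptions: every block tag sits on its own line, the set of block tags is isset, notset, if and repeater, and the function name parse_template_lines and the node layout are hypothetical, not dStreSd's actual code.

```php
<?php
// Hypothetical helper: builds a nested array from a newline-delimited template.
function parse_template_lines($template)
{
    // Each stack entry is an open block; the bottom entry is the document root.
    $stack = array(array('tag' => 'root', 'arg' => null, 'children' => array()));

    foreach (preg_split('/\r?\n/', $template) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (preg_match('%^\{/([a-z]+)\}$%', $line, $m)) {
            // Closing tag: pop the open block and attach it to its parent.
            $node = array_pop($stack);
            if ($node['tag'] !== $m[1] || count($stack) === 0) {
                throw new Exception("Unexpected closing tag: $line");
            }
            $stack[count($stack) - 1]['children'][] = $node;
        } elseif (preg_match('%^\{(isset|notset|if|repeater):([^}]+)\}$%', $line, $m)) {
            // Opening block tag: push a new open block.
            $stack[] = array('tag' => $m[1], 'arg' => $m[2], 'children' => array());
        } else {
            // Text, {else}, {empty} and {variables} are kept as plain strings.
            $stack[count($stack) - 1]['children'][] = $line;
        }
    }
    if (count($stack) !== 1) {
        throw new Exception('Unclosed tag: ' . $stack[count($stack) - 1]['tag']);
    }
    return $stack[0];
}

$tree = parse_template_lines(implode("\n", array(
    "{isset:username}",
    "{if:(username='architect')}",
    "Hey it's the architect!",
    "{else}",
    "Welcome, {username}.",
    "{/if}",
    "{/isset}",
    "{notset:username}",
    "You are not logged in.",
    "{/notset}",
)));
print_r($tree);
```

Because the stack tracks which block is open, unlike tags can nest to any depth, which is the case the regex approach struggled with.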

c4onastick, posted April 22, 2007

Not a problem. I think that's probably a good route to go, less of a headache down the road when you have to make changes. Good luck!

dStreSd (author), posted April 23, 2007

lol, thanks. I'll make sure to let you check out the code once I'm done and leave you credit in any projects that use the library.

c4onastick, posted April 23, 2007

Thanks. I'm flattered, but that's not really necessary. I'm just glad to help!