GoneNowBye (posted July 5, 2010):

Okay, I'm going to simplify things here, hence the lack of code. I've made a sort of markup language, but rather than try to explain it, I'll just use HTML. Let's say I'm trying to parse HTML and I have

    <div id="this"> </div> <div id="that"> </div>

and I tell regex to match

    (?:<div )([A-Za-z0-9= "_]*)>([\\[\\]=" ->a-zA-Z_]*)(?:</div>)

It works, but the match swallows </div> <div id="that"> instead of treating each div tag separately. How can I tell the regex not to match up to the last closing tag, but to match each pair in between?
GoneNowBye (posted July 5, 2010):

I just found a thread detailing how one can't parse HTML with regex, so I regret my example now. Nonetheless, can one match the first opening occurrence with the first closing occurrence, and the second with the second, or can the overlap never be avoided?
cags (posted July 5, 2010):

You can parse HTML with regex; it is just generally not the best way. It is normally far easier to use some kind of DOM parser. It sounds like what you are talking about is the difference between greedy and lazy pattern matching. By default, quantifiers in a PCRE pattern are greedy; you need to make them lazy (for example, *? instead of *).
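To illustrate the greedy versus lazy distinction, here is a minimal sketch. It uses Python's re module rather than PCRE (an assumption made only to keep the example runnable; the *? lazy-quantifier syntax is the same in both), and a simplified pattern rather than the original poster's. The names html, greedy, and lazy are illustrative only.

```python
import re

# Sample input taken from the first post in this thread.
html = '<div id="this"> </div> <div id="that"> </div>'

# Greedy: .* grabs as much as possible, so the match runs from the
# first <div ...> all the way to the LAST </div>.
greedy = re.findall(r'<div ([^>]*)>(.*)</div>', html)

# Lazy: .*? grabs as little as possible, so each <div ...> is paired
# with the NEAREST following </div>.
lazy = re.findall(r'<div ([^>]*)>(.*?)</div>', html)

print(greedy)  # one match spanning both divs
print(lazy)    # two matches, one per div
```

Note that the lazy version still breaks on nested divs, which is one reason a DOM parser is usually the safer choice for real HTML.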
salathe (posted July 5, 2010):

Can you give some examples more directly related to what you're trying to do? Your HTML example is not making much sense at all (i.e. what you claim it matches, it does not).