does not match pattern

GoneNowBye · July 5, 2010

Okay, I'm going to simplify things here, hence the lack of code. I've made a sort of markup language; rather than try to explain it, I'll just use HTML.

Let's say I'm trying to parse HTML and I have

</div>

</div>

and you tell regex to match

(?:<div )([A-Za-z0-9= "_]*)>([\\[\\]=" ->a-zA-Z_]*)(?:</div>)

It's fine, but it matches

</div>

not each div tag separately. How can I tell the regex not to run on to the last closing tag, but to match each of the pairs in between?

GoneNowBye · July 5, 2010

I just found a thread detailing how one can't parse HTML with regex; I regret my example now.

Nonetheless, can one match the first occurrence with the first occurrence, and the second with the second, or is the overlap unavoidable?

cags · July 5, 2010

You can parse HTML with regex; it is just generally not the best way. It is normally far easier to use some kind of DOM. It sounds like what you are describing is the difference between greedy and lazy pattern matching. By default a PCRE quantifier is greedy; you need to make it lazy, for example by appending ? to the quantifier (so * becomes *?).
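
To illustrate the greedy versus lazy difference, here is a rough sketch. It uses Python's re module rather than PCRE, on the assumption that the two treat these quantifiers the same way. The sample string and the simplified patterns are made up for the demonstration (the markup from the original post did not survive), so treat it as an illustration and not as the poster's actual code.

```python
import re

# Hypothetical sample input; not the markup from the original post.
html = '<div id="a">first</div><div id="b">second</div>'

# Greedy: .* grabs as much as it can, so the match runs to the LAST </div>
# and both tags collapse into a single match.
greedy = re.findall(r'<div ([^>]*)>(.*)</div>', html)

# Lazy: .*? grabs as little as it can, so each match stops at the FIRST
# </div> that follows, giving one match per tag.
lazy = re.findall(r'<div ([^>]*)>(.*?)</div>', html)

print(greedy)  # one match spanning both tags
print(lazy)    # two matches, one per tag
```

Note that the lazy version still does not cope with nested div tags, which is one reason a DOM parser tends to be the safer choice.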

salathe · July 5, 2010

Can you give some examples more directly related to what you're trying to do? Your HTML example is not making much sense at all (i.e. what you claim it matches, does not).
