physaux, posted January 19, 2010:

    $regex = '/<form.+>(.+?)<\/form/';
    preg_match($regex,$pagehtmlcode,$output);
    echo count($output).'</br>';
    foreach($output as $instanceoutput){
        echo "[".$instanceoutput."]</br>";
    }

As you can see, I am trying to print out my results, but nothing gets printed. My "counter" simply prints 0. I am trying to grab all the text between form tags, for example:

    <form property1="value1" ...>WANT THIS TEXT</form ...>

*There are line breaks in "WANT THIS TEXT", and there is more than one "<form>" pair in the code. I am sure of that, as I have printed it out.

So, is there a problem in my regex that I don't see? Thanks!!

cags, posted January 19, 2010:

If there are line breaks that you wish to match in your pattern, you will need to add the s modifier. You also state that there is more than one form element. Once you have added the s modifier, you will probably find that your pattern matches only one result, because you have used .+ in the opening tag without making it lazy, so it will match much more than you want. In place of your first .+ you would be much better off using [^>]* so that you only match up to the greater-than sign that closes the tag.

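A minimal sketch of what this advice amounts to, for anyone following along. The sample markup in `$html` and the variable names are made up for illustration; only the patterns come from the thread. It compares the original pattern, the same pattern with only the s modifier added, and the pattern with both changes.

```php
<?php
// Hypothetical sample input: two forms, each with line breaks inside.
$html = "<form action=\"a.php\">\nfirst\n</form>\n"
      . "<form action=\"b.php\">\nsecond\n</form>";

// Original pattern: without the s modifier, "." cannot cross a newline.
$original = '/<form.+>(.+?)<\/form/';

// Only the s modifier added: the greedy .+ in the opening tag now runs
// across both forms, so they collapse into a single match.
$greedy = '/<form.+>(.+?)<\/form/s';

// Both changes: [^>]* stops at the ">" that closes the opening tag, and
// the "s" after the closing delimiter lets "." match newlines.
$suggested = '/<form[^>]*>(.+?)<\/form/s';

$originalCount  = preg_match_all($original, $html, $m1);
$greedyCount    = preg_match_all($greedy, $html, $m2);
$suggestedCount = preg_match_all($suggested, $html, $m3);

echo "original: $originalCount, greedy+s: $greedyCount, suggested: $suggestedCount\n";
```

With this sample input only the last pattern finds both forms; the captured text for each is in `$m3[1]`.
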
physaux (author), posted January 19, 2010:

OK, so I'm not exactly sure what you said. How do I apply a modifier? Here are some changes I thought I should make:

- I tried to change the regex the way you said
- I changed "preg_match" to "preg_match_all"
- I added a delimiter, "PREG_SET_ORDER"

    $regex = '/<form.+>([^>]*?)<\/form/';
    preg_match_all($regex,$pagehtmlcode,$output, PREG_SET_ORDER);

What exactly do you mean by all those changes? I don't really understand how I should change my current regex.

**The form tag can't just match up to the next "<", because there are more tags inside the form tags, e.g. "<input", and so on.

cags, posted January 19, 2010:

Well, I didn't think the first change was too complicated: as I said, replace the first .+ with the section of pattern I provided. If you don't know how to add a modifier, you are very much fighting an uphill battle. I suggest you read through the official manual for PCRE; modifiers are covered fairly early on. A modifier is placed after the closing delimiter of the pattern.

physaux (author), posted January 20, 2010:

Aha, thank you for the great resource. I read some of it, then read what you said, then was confused again. I took a short break, read it again, made your changes, and now it is working perfectly! Thanks for your patience in helping me! :)

Here is what I did:

    $regex = '/<form[^>]*>(.+?)<\/form/s';

as well as switching to preg_match_all. Yay!!

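For later readers, a small sketch of how the working pattern behaves with preg_match_all. The three-form sample in `$pagehtmlcode` is invented for illustration. One thing worth noting: with the default result order, `count($output)` counts capture groups rather than matches, so either use the function's return value or pass PREG_SET_ORDER.

```php
<?php
// Hypothetical page source standing in for $pagehtmlcode.
$pagehtmlcode = "<form id=\"one\">\n<input name=\"a\" />\n</form>\n"
              . "<form id=\"two\">\n<input name=\"b\" />\n</form>\n"
              . "<form id=\"three\">\n<input name=\"c\" />\n</form>";

$regex = '/<form[^>]*>(.+?)<\/form/s';

// Default order (PREG_PATTERN_ORDER): $byPattern[0] holds every full match,
// $byPattern[1] holds capture group 1 of every match. The return value is
// the number of full matches.
$found = preg_match_all($regex, $pagehtmlcode, $byPattern);

// PREG_SET_ORDER: one array entry per match, each holding
// [0] => full match, [1] => capture group 1.
preg_match_all($regex, $pagehtmlcode, $bySet, PREG_SET_ORDER);

echo $found, " forms found\n";
foreach ($bySet as $match) {
    echo "[", trim($match[1]), "]\n";
}
```

Here `count($byPattern)` is 2 (full matches plus one group) even though three forms match, while `count($bySet)` equals the number of forms.
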
physaux (author), posted January 20, 2010:

Ah, OK, so I have run into another problem, albeit a small one. Now, instead of searching for <form> </form> tags, I need to search for <input ..../> tags. Here is my regex for that:

    $regex = '/<input[^>]*>(.+?)\/>/s';

It works great! But now I want to get more specific: I want it to include only input fields that do not contain the text type="hidden". Any idea how I can modify my regex to accomplish this? I am only doing this so that I can count the number of visible input fields per form.

Perhaps DOMDocument would be better for this? I don't know; does anyone have an opinion?

cags, posted January 20, 2010:

Generally speaking, using DOMDocument/DOMXPath is a more correct way of parsing HTML. I think you would need a negative lookahead assertion to achieve that aim. Completely untested, but...

    $regex = '/<input(?!type="hidden")[^>]*>(.+?)\/>/s';

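A hedged sketch of both suggestions, since the pattern above is marked untested. As written, its lookahead is checked immediately after `<input`, where the next character is normally a space, so it would likely exclude nothing; and because `[^>]*>` already consumes the tag's own `>`, the trailing `(.+?)\/>` would run on to a later `/>`. The variant below lets the lookahead scan the rest of the tag instead. The sample markup and variable names are invented for illustration, and the DOM part assumes PHP's DOM extension is available.

```php
<?php
// Hypothetical markup: one form with two visible inputs and one hidden input.
$html = '<form id="login">'
      . '<input type="text" name="user" />'
      . '<input type="hidden" name="token" value="x" />'
      . '<input type="password" name="pass" />'
      . '</form>';

// Regex variant: the lookahead scans ahead inside the tag ([^>]*) for
// type="hidden"; the tag itself is matched by [^>]*> with no capture group.
$regex = '/<input\b(?![^>]*\btype="hidden")[^>]*>/i';
$regexCount = preg_match_all($regex, $html, $matches);

// DOM variant: count the non-hidden inputs of each form with XPath.
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$visiblePerForm = [];
foreach ($xpath->query('//form') as $i => $form) {
    // Inputs with no type attribute also count as visible here.
    $inputs = $xpath->query('.//input[not(@type="hidden")]', $form);
    $visiblePerForm[$i] = $inputs->length;
}

echo "regex: $regexCount visible inputs\n";
print_r($visiblePerForm);
```

The regex variant only handles the exact double-quoted spelling `type="hidden"`; the DOM variant does not depend on quoting style or attribute order, which is one reason it tends to be the more robust choice for per-form counts.
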