trogster Posted October 26, 2009

Hi, I'm trying to parse iframe tags from HTML. I have this pattern so far:

'#<iframe[^>]*>.*?</iframe>#i'

This works OK for something like:

<iframe src='http://www.example.com' width='XXX' height='XXX'></iframe>

and

<iframe src='http://www.example.com' width='XXX' height='XXX'><a href="http://www.example.com">iframes not supported</a></iframe>

But I also want to parse iframe tags like this one:

<iframe src='http://www.example.com' width='XXX' height='XXX' />

Does anyone know how to match that last iframe form (and the previous ones) with the same pattern? Thanks.

cags Posted October 26, 2009

Well, until salathe gets around to answering... how about...

preg_match_all('#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</iframe>))#i', $input, $output);

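A minimal sketch of running that suggestion, assuming the intended pattern is `#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</iframe>))#i`, that is, a non-capturing group wrapping the two alternatives (self-closing tag, or opening tag up to the first `</iframe>`). The three sample tags are the ones from the first post; putting them in a single `$input` string, one per line, is an assumption made for illustration.

```php
<?php
// Either a self-closing tag ("/>") or an opening tag followed,
// lazily, by everything up to the first "</iframe>".
$pattern = '#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</iframe>))#i';

// Sample input: the three iframe forms from the original question.
$input = <<<'HTML'
<iframe src='http://www.example.com' width='XXX' height='XXX'></iframe>
<iframe src='http://www.example.com' width='XXX' height='XXX'><a href="http://www.example.com">iframes not supported</a></iframe>
<iframe src='http://www.example.com' width='XXX' height='XXX' />
HTML;

preg_match_all($pattern, $input, $output);

// $output[0] holds the full text of every match.
foreach ($output[0] as $match) {
    echo $match, PHP_EOL;
}
```

Without the `s` modifier, `.` does not match a newline, so each match here stays on its own line. If an iframe's content can span several lines, adding `s` (`#...#is`) would probably be needed.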
trogster (Author) Posted October 26, 2009

Cool, that one seems to work OK, thank you very much. I had also made a small change to my own pattern and I think it's working too:

'#<iframe[^>]*>.*?(</iframe>)?#i'

I tested both with 3 different iframe tags and they worked fine. I will add some \s* to it to handle spaces between < > and the tag text.

trogster (Author) Posted October 26, 2009

I've run some tests and I still can't get what I need with either pattern. Here are the iframes I am using for testing:

<iframe src='http://google.com' framespacing='0' frameborder='no' scrolling='no' width='160' height='600' />

<iframe src='http://google.com' framespacing='0' frameborder='no' scrolling='no' width='160' height='600'>

<iframe src='http://google.com' framespacing='0' frameborder='no' scrolling='no' width='160' height='600'><a href="">iframes not supported</a></iframe>

<iframe src='http://google.com' framespacing='0' frameborder='no' scrolling='no' width='160' height='600'></iframe>

This pattern:

#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</iframe>))#i

fails only on the second iframe (it doesn't match anything there).

And this pattern:

#<iframe[^>]*>.*?(</iframe>)?#i

fails on the third iframe (it matches only up to the > that closes the opening tag, just before <a href...).

UPDATE: I don't know how I didn't see that the first and second iframes are invalid HTML. I really thought iframe tags could be closed like img tags. I think cags' pattern will work OK. Thanks!
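The behaviour reported above can be reproduced with a short script. This is only a sketch: it assumes cags' pattern reads `(?:(?:/>)|(?:>.*?</iframe>))` in its second half, tests each iframe as a separate string, and uses hypothetical labels (`alternation`, `optional close`, `self-closing`, and so on) that are not from the thread.

```php
<?php
// The two patterns under discussion (labels are illustrative only).
$patterns = [
    'alternation'    => '#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</iframe>))#i',
    'optional close' => '#<iframe[^>]*>.*?(</iframe>)?#i',
];

// The four test iframes from the post, each tested on its own.
$attrs = "src='http://google.com' framespacing='0' frameborder='no' scrolling='no' width='160' height='600'";
$samples = [
    'self-closing'  => "<iframe $attrs />",
    'unclosed'      => "<iframe $attrs>",
    'with fallback' => "<iframe $attrs><a href=\"\">iframes not supported</a></iframe>",
    'empty'         => "<iframe $attrs></iframe>",
];

// Record the full match (or null) for every pattern/sample pair.
$results = [];
foreach ($patterns as $pname => $pattern) {
    foreach ($samples as $sname => $html) {
        $found = preg_match($pattern, $html, $m);
        $results[$pname][$sname] = $found ? $m[0] : null;
        printf("%-14s | %-13s | %s\n", $pname, $sname, $found ? $m[0] : '(no match)');
    }
}
```

The second pattern stops early on the third iframe because `.*?` is lazy and `(</iframe>)?` is optional: the engine is satisfied as soon as the opening tag's `>` has matched, so it never needs to consume the fallback content. The first pattern makes the closing tag mandatory in its second alternative, which is why it matches the full element but finds nothing in the unclosed tag.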