MarkusJ Posted August 14, 2013

Hi, I am still learning PHP, and given some HTML I am trying to extract all links and iframes from it and append them to a different string. I am not sure whether the way I am checking that the returned array has values (isset) is correct, or whether appending strings together (.) is correct.

The code that I have so far is:

    function GetLinksIFrames($content)
    {
        $innerContent ='';
        $regex_pattern_links = "/<a href=\"(.*)\">(.*)<\/a>/";
        preg_match_all($regex_pattern_links,$content,$matches);
        for ($i = 0; $i < count($matches); $i++)
        {
            if(isset($matches[0][$i]))// Is this correct?
            {
                $innerContent = $innerContent.$matches[0][$i]." "; // Is this how to append a result to an existing string?
            }
        }
        $regex_pattern_iframe = "/<iframe src=\"(.*)\">(.*)<\/iframe>/";
        preg_match_all($regex_pattern_iframe,$content,$matches);
        for ($i = 0; $i < count($matches); $i++)
        {
            if(isset($matches[0][$i]))
            {
                $innerContent = $innerContent.$matches[0][$i]." ";
            }
        }
        return $innerContent;
    }

Any help appreciated.

Thanks
Mark
Irate Posted August 14, 2013

If you **want**, you can use JavaScript's DOM methods for that. document.links and window.frames respectively do just that. If you are looking for the PHP version, look up non-greedy repetitions.

    <a href="some/path/link.php">A link here.</a><a href="another/path/link.php">Another link here.</a>

The whole line above would be matched by your RegExp as one match.
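To make the greedy versus non-greedy point concrete, here is a minimal sketch using the two-link line from the post above. The variable names are only illustrative, and it assumes preg_match_all's default ordering, where $matches[0] holds the full matches and $matches[1] holds the first capture group.

```php
<?php
// Illustrative input: the two links on one line from the post above.
$html = '<a href="some/path/link.php">A link here.</a>'
      . '<a href="another/path/link.php">Another link here.</a>';

// Greedy (.*) extends to the last possible "> and </a>,
// so both links are swallowed by a single match.
$greedy = '/<a href="(.*)">(.*)<\/a>/';

// Non-greedy (.*?) stops at the first possible "> and </a>.
$lazy = '/<a href="(.*?)">(.*?)<\/a>/';

$greedyCount = preg_match_all($greedy, $html, $greedyMatches);
$lazyCount   = preg_match_all($lazy, $html, $lazyMatches);

echo $greedyCount, "\n"; // 1
echo $lazyCount, "\n";   // 2

// Loop over the full matches in $lazyMatches[0] and append each one.
$innerContent = '';
foreach ($lazyMatches[0] as $match) {
    $innerContent .= $match . ' ';
}
echo $innerContent, "\n";
```

One thing worth noting: with the default ordering, count($matches) is the number of capture groups plus one (3 for these patterns), not the number of links found, so a loop bounded by it stops after three matches. Looping over $matches[0] itself, as in the sketch, visits every match.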
MarkusJ Posted August 14, 2013 (Author)

Thanks for the feedback. If I can ask a direct PHP question:

    if(isset($matches[0][$i]))// Is this correct?
    {
        $innerContent = $innerContent.$matches[0][$i]." "; // Is this how to append a result to an existing string?
    }

Is the above the best way to check for a null reference and to append a string to itself?

Thanks!
.josh Posted August 14, 2013 (Solution)

Technically you should check if $matches[0] exists before checking if $matches[0][$i] exists:

    if(isset($matches[0])&&isset($matches[0][$i]))

But this is just to avoid a warning (assuming your error level is set to report warnings); the logic itself would have worked as-is. And yes, that is how you append a string to another string (concatenation).

Sidenote: In your patterns, you use (.*) in several places. This is a greedy match, and it will yield unexpected results. You should use a non-greedy match instead: (.*?)

Also, I would like to point out that regex isn't really the best method for parsing HTML. For example, if the links and iframes have any other attributes, use single quotes instead of double quotes, are spaced differently, or differ in any number of other ways, your regex would fail, since it doesn't account for any of that. What you should use instead is a DOM parser.
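For the DOM parser suggestion, a rough sketch using PHP's bundled DOMDocument class might look like the following. The function name getLinksAndIframes is hypothetical (it is not from the thread), and the sketch assumes $content is a non-empty HTML string and a PHP version with scalar type declarations (7.0 or later). Note that it collects every a element, including ones without an href.

```php
<?php
// Hypothetical helper, not the original poster's function: collects every
// <a> and <iframe> element from $content and appends their markup to a string.
function getLinksAndIframes(string $content): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely perfectly valid; collect parser warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $innerContent = '';
    foreach (['a', 'iframe'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            // saveHTML($node) serializes only this element;
            // .= appends to the existing string.
            $innerContent .= $doc->saveHTML($node) . ' ';
        }
    }

    return $innerContent;
}

// Extra attributes and single quotes, which the regex patterns would miss.
$html = "<p>Intro <a class='x' href='one.php'>One</a></p>"
      . '<iframe width="300" src="frame.html"></iframe>';

echo getLinksAndIframes($html), "\n";
```

Because the parser handles attribute order, quoting style and spacing, there is no pattern to maintain, and the links still come out before the iframes, as in the original function.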
MarkusJ Posted August 14, 2013 (Author)

Thanks for the help