mark107 Posted April 28, 2014

I need some help with my PHP. I'm parsing href contents using DOMDocument. I have a list of URLs, and I want to extract only the website links while ignoring the `<a id="aTest" href="` tags.

Here is the input:

http://www.mysite.com/get-listing.php?channels=test 1&id=101
http://www.mysite.com/get-listing.php?channels=test 2&id=102
http://www.mysite.com/get-listing.php?channels=test 3&id=103
rtmp://$OPT:rtmp-raw=rtmp://ny.iguide.to/edge playpath=49f5xnbs2wra0ut swfUrl=http://player.ilive.to/player_ilive_2.swf pageUrl=http://www.ilive.to token=UYDk93k#09sdafjJDHJKAD873

Here is the output:

<a href='http://www.mysite.com/get-listing.php?channels=test 1&id=101'></a></p><a id="aTest" href="">Stream 1</a><br><br>
<a href='http://www.mysite.com/get-listing.php?channels=test 2&id=102'></a></p><a id="aTest" href="">Stream 1</a><br><br>
<a href='http://www.mysite.com/get-listing.php?channels=test 3&id=103'></a></p><a id="aTest" href="rtmp://$OPT:rtmp-raw=rtmp://ny.iguide.to/edge playpath=49f5xnbs2wra0ut swfUrl=http://player.ilive.to/player_ilive_2.swf pageUrl=http://www.ilive.to token=UYDk93k#09sdafjJDHJKAD873">Stream 1</a><br><br>

Here is the PHP:

<?php
ini_set('max_execution_time', 300);
$errmsg_arr = array();
$errflag = false;
$xml .= '<?xml version="1.0" encoding="UTF-8" ?>';
$xml .= ' <tv generator-info-name="www.mysite.com/xmltv">';
$baseUrl = file_get_contents('http://www.mysite.com/get-listing.php');
$domdoc = new DOMDocument();
$domdoc->strictErrorChecking = false;
$domdoc->recover=true;
//@$domdoc->loadHTMLFile($baseUrl);
@$domdoc->loadHTML($baseUrl);
$links = $domdoc->getElementsByTagName('a');
$data = array();
foreach($links as $link) {
    //echo $domdoc->saveXML($link);
    if($link->getAttribute('href')) {
        $url = str_replace(" ", "%20", $link->getAttribute('href'));
        $url = str_replace("rtmp://", "", $link->getAttribute('href'));
    }
}
?>

Can you please tell me how I can parse only the `<a href='http://www.mysite.com/get-listing.php` links while ignoring everything else, especially `<a id="aTest" href="`? Does anyone know how?
cyberRobot Posted April 28, 2014

You could add an if condition that checks whether the link has an "id" attribute, or whether its "id" attribute has a specific value ("aTest").
mark107 Posted April 28, 2014 (edited)

cyberRobot: yeah, so how can I get only the links with an href, without getting the ones with the aTest id?
cyberRobot Posted April 28, 2014

Note that the following code is untested, but you could try something like this:

<?php
if($link->getAttribute('href')) {
    // Skip any link whose id is "aTest"; links without an id are kept.
    if(!$link->hasAttribute('id') || $link->getAttribute('id')!='aTest') {
        $url = str_replace(" ", "%20", $link->getAttribute('href'));
        // Apply the second replacement to $url so the %20 encoding is not overwritten.
        $url = str_replace("rtmp://", "", $url);
    }
}
?>
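For anyone who wants to try the id check above in isolation, here is a minimal self-contained sketch. It is not the original poster's script: the sample markup is only modelled on the output shown in the question, the helper name `extractListingUrls` and its `$skipId` parameter are made up for illustration, and it assumes PHP 7 or later. It uses `libxml_use_internal_errors()` in place of the `@` operator to keep parser warnings quiet.

```php
<?php
// Sample markup modelled on the output in the question. This is an
// assumption about what get-listing.php returns, not its real response.
$html = <<<'HTML'
<p><a href='http://www.mysite.com/get-listing.php?channels=test 1&amp;id=101'></a></p>
<a id="aTest" href="">Stream 1</a><br><br>
<p><a href='http://www.mysite.com/get-listing.php?channels=test 2&amp;id=102'></a></p>
<a id="aTest" href="rtmp://ny.iguide.to/edge">Stream 2</a><br><br>
HTML;

/**
 * Hypothetical helper: collect the href of every <a> element that has a
 * non-empty href and whose id attribute is not $skipId.
 */
function extractListingUrls(string $html, string $skipId = 'aTest'): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of suppressing them with @.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');

        // getAttribute() returns an empty string when the attribute is
        // missing, so this one comparison covers links with no id as well.
        if ($href === '' || $link->getAttribute('id') === $skipId) {
            continue;
        }

        // Encode spaces so the URL can be requested as-is.
        $urls[] = str_replace(' ', '%20', $href);
    }

    return $urls;
}

print_r(extractListingUrls($html));
```

With the sample markup, only the two get-listing.php links should come back, with their spaces encoded as %20; both aTest links are skipped.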
mark107 Posted April 28, 2014 (edited)

Thank you very much for that, I can see it is working now. Cheers for the help!!!