Removing Links


ajay600

I have posted code below that identifies images repeated across 2 web pages and removes them from each page (by comparing the src attribute of the <img> tags).

 

Please help me find the repeated links in the web page and remove them.

 

<?php	
$doc = new DOMDocument(); // An instance of DOMDocument
@$doc->loadHTMLFile('http://www.a.net/page_a.htm/');

$doc2 = new DOMDocument(); // An instance of DOMDocument
@$doc2->loadHTMLFile('http://www.a.net/page_b.htm');

$xpath = new DOMXPath($doc); // An instance of DOMXPath attached to the first DOMDocument
$imgList = $xpath->query('//img'); // A DOMNodeList containing all <img> tags

$xpath2 = new DOMXPath($doc2); // An instance of DOMXPath attached to the second DOMDocument
$imgList2 = $xpath2->query('//img'); // A DOMNodeList containing all <img> tags

$srcList = array(); // An array to hold src attribute strings
foreach ($imgList as $img) 
{ // Loops through the DOMNodeList of images
    $srcList[] = $img->getAttribute('src'); // Stores the src attribute of each image
}

$srcList2 = array(); // An array to hold src attribute strings
foreach ($imgList2 as $img2)
{ // Loops through the DOMNodeList of images
    $srcList2[] = $img2->getAttribute('src'); // Stores the src attribute of each image
}

$srcBoth = array_unique(array_intersect($srcList, $srcList2)); // src strings present in both documents, each listed once

foreach ($srcBoth as $src) { // Loops through the src strings that are common to both documents
    // Note: the expression below is malformed if $src itself contains a double quote
    $imgs = $xpath->query('//img[@src="'.$src.'"]'); // A DOMNodeList of images with a matching src attribute
    foreach ($imgs as $img) { // Loops through the images
        $img->parentNode->removeChild($img); // Removes the image from the first document
        // $img->setAttribute('src', ''); // Modifies the src attribute
        // $img->setAttribute('alt', 'Deleted');
    }
}
foreach ($srcBoth as $src2) { // Loops through the src strings that are common to both documents
    // Note: the expression below is malformed if $src2 itself contains a double quote
    $imgs2 = $xpath2->query('//img[@src="'.$src2.'"]'); // A DOMNodeList of images with a matching src attribute
    foreach ($imgs2 as $img2) { // Loops through the images
        $img2->parentNode->removeChild($img2); // Removes the image from the second document
        // $img2->setAttribute('src', ''); // Modifies the src attribute
        // $img2->setAttribute('alt', 'Deleted');
    }
}
echo $doc->saveHTML();
echo $doc2->saveHTML();

?>
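
For the links, I think the same idea could be applied to <a> tags by comparing the href attribute instead of src. What follows is only a rough sketch under some assumptions: "repeated" is taken to mean an href that appears in both pages, the helper names collectHrefs and removeLinks are made up for illustration, and two small inline HTML strings stand in for the real pages so it can be run without network access. It checks each link against the shared list directly rather than building an XPath string, which sidesteps quoting problems in the href value.

```php
<?php
// Hypothetical helper: collect the href of every <a> element in a document.
function collectHrefs(DOMDocument $doc): array
{
    $hrefs = array();
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $hrefs[] = $a->getAttribute('href');
        }
    }
    return $hrefs;
}

// Hypothetical helper: remove every <a> element whose href is listed in $hrefs.
function removeLinks(DOMDocument $doc, array $hrefs): void
{
    // getElementsByTagName returns a live list, so copy it to an array
    // before removing nodes; otherwise the iteration can skip elements.
    $anchors = iterator_to_array($doc->getElementsByTagName('a'));
    foreach ($anchors as $a) {
        if ($a->hasAttribute('href') && in_array($a->getAttribute('href'), $hrefs, true)) {
            $a->parentNode->removeChild($a);
        }
    }
}

// Stand-in pages (assumed sample markup, not the real page_a / page_b).
$docA = new DOMDocument();
$docA->loadHTML('<html><body><a href="/home">Home</a><a href="/a">Only A</a></body></html>');

$docB = new DOMDocument();
$docB->loadHTML('<html><body><a href="/home">Home</a><a href="/b">Only B</a></body></html>');

// href strings that occur in both documents
$hrefBoth = array_intersect(collectHrefs($docA), collectHrefs($docB));

removeLinks($docA, $hrefBoth);
removeLinks($docB, $hrefBoth);

echo $docA->saveHTML();
echo $docB->saveHTML();
```

With real pages, the two loadHTML calls would be swapped for loadHTMLFile as in the image code above. Note that relative and absolute forms of the same address (for example /home and http://www.a.net/home) are treated as different links here, since only the raw attribute strings are compared.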

 

Link to comment
https://forums.phpfreaks.com/topic/189993-removing-links/#findComment-1002383
Archived

This topic is now archived and is closed to further replies.
