
Multi URL Metatag Scraper


young_coder


Dear all,

Can somebody help me to modify this script to scrape more than just one page?

Thank you very much in advance.

 

 

<?php
$url = "http://www.example.com";

// Read the page line by line into one string
$fp = fopen( $url, 'r' );
$content = "";
while( !feof( $fp ) ) {
    $buffer = trim( fgets( $fp, 4096 ) );
    $content .= $buffer;
}
fclose( $fp );

$start = '<title>';
$end = '<\/title>';

// Capture everything between the title tags
preg_match( "/$start(.*)$end/s", $content, $match );
$title = $match[ 1 ];

// get_meta_tags() fetches the URL again and returns the meta tags as an array
$metatagarray = get_meta_tags( $url );
$keywords = $metatagarray[ "keywords" ];
$description = $metatagarray[ "description" ];

echo "<div><strong>URL:</strong> $url</div>\n";
echo "<div><strong>Title:</strong> $title</div>\n";
echo "<div><strong>Description:</strong> $description</div>\n";
echo "<div><strong>Keywords:</strong> $keywords</div>\n";
?>


Put the URLs you want to scrape in an array and loop through the array. Also, if you use file_get_contents() instead of fopen() and fgets(), your script will look cleaner:

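<?php
$urls = array('http://www.example.com/','http://www.example.org/','http://www.example.net/');
$start = '<title>';
$end = '<\/title>';
foreach ($urls as $url) {
    // Skip URLs that cannot be fetched instead of parsing a false value
    $content = @file_get_contents($url);
    if ($content === false) {
        echo "<div><span style='font-weight:bold'>URL:</span> $url (could not be fetched)</div>\n";
        continue;
    }
    // Non-greedy, case-insensitive match; the title stays empty if none is found
    $title = preg_match( "/$start(.*?)$end/is", $content, $match ) ? $match[1] : '';
    $metatagarray = get_meta_tags($url);
    // Not every page defines these meta tags
    $keywords = isset($metatagarray['keywords']) ? $metatagarray['keywords'] : '';
    $description = isset($metatagarray['description']) ? $metatagarray['description'] : '';

    echo "<div><span style='font-weight:bold'>URL:</span> $url</div>\n";
    echo "<div><span style='font-weight:bold'>Title:</span> $title</div>\n";
    echo "<div><span style='font-weight:bold'>Description:</span> $description</div>\n";
    echo "<div><span style='font-weight:bold'>Keywords:</span> $keywords</div>\n";
}
?>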

 

Ken
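One possible refinement of the loop above, offered only as a rough sketch: get_meta_tags($url) downloads each page a second time, so the title and meta tags could instead be pulled from the string that file_get_contents() already returned. The function names extract_page_info() and scrape_urls() are made up for this illustration and are not part of the thread or of PHP itself. The sketch also assumes PHP 7 or later and that name="..." comes directly before content="..." inside each meta tag, which will not hold for every page.

```php
<?php
// Hypothetical helper (not from the thread): extract the title and two
// meta tags from an HTML string that has already been downloaded.
function extract_page_info(string $html): array
{
    $info = ['title' => '', 'description' => '', 'keywords' => ''];

    // Non-greedy, case-insensitive match so only the first <title> is captured.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $info['title'] = trim(html_entity_decode($m[1], ENT_QUOTES, 'UTF-8'));
    }

    // Assumption: the name attribute directly precedes the content attribute.
    // The backreference \1 requires the closing quote to match the opening one.
    foreach (['description', 'keywords'] as $name) {
        $pattern = '/<meta\s+name=["\']' . $name . '["\']\s+content=(["\'])(.*?)\1/is';
        if (preg_match($pattern, $html, $m)) {
            $info[$name] = trim($m[2]);
        }
    }

    return $info;
}

// Hypothetical wrapper: fetch each URL once and collect the parsed results.
// A URL that cannot be fetched maps to null instead of stopping the loop.
function scrape_urls(array $urls): array
{
    $results = [];
    foreach ($urls as $url) {
        $html = @file_get_contents($url);
        $results[$url] = ($html === false) ? null : extract_page_info($html);
    }
    return $results;
}
```

Calling scrape_urls() with the same $urls array as above would return one entry per URL, which can then be echoed in whatever markup is wanted. Keeping the parsing separate from the fetching also makes it possible to try extract_page_info() on a saved HTML file without touching the network.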

