
[SOLVED] Get a web page's description?


Salis


I'll tell ya, I'm always working on something. My latest project is reading the title and description for a given web page. But I hit a snag: I'm having a little problem reading the description...

 

My code for reading the Title:

<?php
$Handle = fopen('http://vx-fx.com/', 'r');
$File_Read = fread($Handle, 4096);

if( preg_match("@<title>(.*)</title>@",$File_Read, $mTitle) ) {
	echo $mTitle[1];
}
?>

 

Reading the title was easy, but as I thought, the description uses the same tag/value structure as the keywords. I'm assuming I need to preg_match for http-equiv="description" and then somehow get the value of content=""

 

Any ideas?

 

Thanks, everyone.


<?php

function textbetweenarray($s1,$s2,$s){
  $myarray=array();
  $s1=strtolower($s1);
  $s2=strtolower($s2);
  $L1=strlen($s1);
  $L2=strlen($s2);
  $scheck=strtolower($s);

  do{
    $pos1 = strpos($scheck,$s1);
    if($pos1!==false){
      $pos2 = strpos(substr($scheck,$pos1+$L1),$s2);
      if($pos2!==false){
        $myarray[]=substr($s,$pos1+$L1,$pos2);
        $s=substr($s,$pos1+$L1+$pos2+$L2);
        $scheck=strtolower($s);
      }
    }
  } while (($pos1!==false)and($pos2!==false));
  return $myarray;
}

$website = "http://www.google.com";

$content = file_get_contents($website);

list($title) = text_between_array("<title>", "</title>", $content);

?>

 

Easier way, no regular expressions :)

Well, first, vx-fx.com is my website that I'm building. My goal is to read the description meta tag. I need to get the description of that page for a URL Snipper tool that I'll be adding to my site. It works like this, or at least this is the idea: say a member has a few websites they like to go to, or wants to share stuff they found online with the community. They would copy the URL and fill out an alternative title and description, just in case the page didn't have one. Then, once they submit the URL, a PHP script would access that page and read the <title></title> and <meta name="description" content="" /> tags. As you can see, the script for reading the title works so far, but I'm not too great with preg_match at all and barely understand it (even after reading a few tutorials).

 

So I need to find <meta name="description" content="" /> and then read content=""

Actually I hit submit then read your code.

 

I get Fatal error: Call to undefined function: text_between_array() in *************** on line 29

 

Oops, underscores... OK, I tried the code, but it returns the title. I can already get that. Or can I change something in here to read the description?

$html = file_get_contents('http://vx-fx.com/');

// The "/" in </title> must be escaped because "/" is also the pattern delimiter
preg_match('/<title>(.*?)<\/title>/i',$html,$title_match);
// Matches a <meta> tag with name= and content= in either order
preg_match('/<meta\s+(name|content)="(.*?)"[^>]*(name|content)="(.*?)"[^>]*>/i',$html,$desc_match);

$title = isset($title_match[1]) && $title_match[1] ? $title_match[1] : 'No title available';
if (isset($desc_match) && count($desc_match) == 5) {
    // If name= came first, the content value is in group 4; otherwise it is in group 2
    $description = strtolower($desc_match[1]) == 'name' ? $desc_match[4] : $desc_match[2];
} else {
    $description = 'No description available';
}

 

Technically, with HTML, the attributes can come in any order -- name= can be before content=, and other bits can be in there, so I've tried to account for that in the regex.  You should also be using the /i modifier to account for different case since HTML isn't case sensitive.
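
As a side note on the attribute-order and case points: an HTML parser handles both for you. Here is a minimal sketch using PHP's DOMDocument, assuming the DOM extension is available. The function name meta_content_by_name and the sample markup are made up for illustration and are not from anyone's code in this thread.

```php
<?php
// Hypothetical helper (not from the thread): look up a <meta> tag by its
// name attribute with DOMDocument instead of a regex.
function meta_content_by_name($html, $name) {
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Compare the name value case-insensitively; attribute order is irrelevant here.
        if (strcasecmp($meta->getAttribute('name'), $name) === 0) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Sample markup (an assumption for the demo): upper-case tag, content= before name=.
$sample = '<html><head><META Content="Site summary" NAME="Description"></head><body></body></html>';
echo meta_content_by_name($sample, 'description');
```

With a live page you would pass in the string returned by file_get_contents(). The trade-off is a bit more code than a single preg_match call, but there is no pattern to maintain.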

I've tried your code and it works, though it's pulling the keywords from my site. Again, I really don't know much of anything when it comes to preg_match, but is there a way that if "description" matches, then it reads the content?

 

Also, any good tutorials on preg_match? Everything I find is not so helpful...

 

Thanks again!
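
One possible way to read the content only when the name is "description", sketched with preg_match: pull out every <meta> tag first, then check each tag's attributes separately so their order doesn't matter. The function name find_meta_description and the sample HTML are assumptions for illustration, and the pattern assumes quoted attribute values with no ">" inside them.

```php
<?php
// Hypothetical helper (not from the thread): returns the content of the
// description meta tag, or null when there isn't one.
function find_meta_description($html) {
    // Step 1: collect every <meta ...> tag, case-insensitively.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return null;
    }
    foreach ($tags[0] as $tag) {
        // Step 2: skip tags whose name attribute is not "description".
        if (!preg_match('/\bname\s*=\s*(["\'])description\1/i', $tag)) {
            continue;
        }
        // Step 3: read the content attribute (single or double quotes).
        if (preg_match('/\bcontent\s*=\s*(["\'])(.*?)\1/is', $tag, $m)) {
            return $m[2];
        }
    }
    return null;
}

// Sample markup (an assumption for the demo): keywords come first,
// and the description tag has content= before name=.
$html = '<meta name="keywords" content="php, regex">'
      . '<meta content="A short summary." name="Description">';
echo find_meta_description($html); // A short summary.
```

Splitting the work into small patterns keeps each one readable, and the keywords tag is skipped because its name value doesn't match.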

Archived

This topic is now archived and is closed to further replies.
