tonyr1988

Parsing a Remote HTML File

I need to be able to grab some information from a remote HTML file.

All I have so far is:

$content = file_get_contents($url);

The problem is, I have no idea what to do next. I need the contents of the <dd> </dd> tags, preferably in an array. I guess I could keep searching for each <dd> tag, discard everything up to and including it (the position plus 4 characters for the tag itself), and read up to the </dd> tag, but that seems really drawn out and confusing.

Can I use ereg to do this? I have never used it, so I have no clue.

Can someone please get me started?

I made this for you. :)

Just edit the settings. This code grabs the HTML and echoes it, and you also have the option of echoing the tags or not ;) Have fun!

In this example it gets all the bold HTML tags on the page and echoes each one on a separate line :) Then it removes the tags, leaving just the text in between them.

Here it is in action: http://www.business-tycoon.com/example.php

[code]<?php

$config['url']       = "http://www.business-tycoon.com"; // URL of the HTML to grab
$config['start_tag'] = "<b>"; // where you want to start grabbing
$config['end_tag']   = "</b>"; // where you want to stop grabbing
$config['show_tags'] = 0; // show the tags in the output? 1 = yes, 0 = no

class grabber
{
    public $error = '';
    public $html  = array(); // preg_match_all() result: [0] = matches with tags, [1] = text between the tags

    // Fetch $url and collect everything found between $start and $end.
    function grabhtml( $url, $start, $end )
    {
        $file = file_get_contents( $url );

        if( $file !== false )
        {
            // preg_quote() stops regex metacharacters in the tags from being read as pattern syntax
            $pattern = '#' . preg_quote( $start, '#' ) . '(.*?)' . preg_quote( $end, '#' ) . '#s';

            if( preg_match_all( $pattern, $file, $match ) )
            {
                $this->html = $match;
            }
            else
            {
                $this->error = "Tags cannot be found.";
            }
        }
        else
        {
            $this->error = "Site cannot be found!";
        }
    }

    // Remove the start and end tags from $html unless $show is set.
    function strip( $html, $show, $start, $end )
    {
        if( !$show )
        {
            $html = str_replace( $start, "", $html );
            $html = str_replace( $end, "", $html );
        }

        return $html;
    }
}

$grab = new grabber;
$grab->grabhtml( $config['url'], $config['start_tag'], $config['end_tag'] );

if( $grab->error )
{
    echo $grab->error;
}
else
{
    // Only loop when there are matches, so a failed fetch does not trigger a warning
    foreach( $grab->html[0] as $html )
    {
        echo htmlspecialchars( $grab->strip( $html, $config['show_tags'], $config['start_tag'], $config['end_tag'] ) ) . "<br>";
    }
}

?>[/code]
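
For the original question (the contents of every <dd> tag, in an array), an HTML parser may be more robust than a regex, since it copes with attributes, nested tags and uppercase tag names. Below is a minimal sketch, assuming PHP's DOM extension is enabled. The function name extractTagContents and the sample markup are made up for illustration; in practice the string would come from file_get_contents($url). As for ereg: it was deprecated in PHP 5.3 and removed in PHP 7, so the preg_* functions are the ones to use if you go the regex route.

```php
<?php
// Hypothetical helper (not from the post above): return the text of every
// element with the given tag name found in an HTML string.
function extractTagContents(string $html, string $tag): array
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html); // note: $html must not be an empty string
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $contents = [];
    foreach ($doc->getElementsByTagName($tag) as $node) {
        // textContent drops any nested markup and keeps only the text
        $contents[] = trim($node->textContent);
    }

    return $contents;
}

// Inline sample markup stands in for file_get_contents($url).
$sample = '<dl><dt>Name</dt><dd>Alice</dd><dt>City</dt><dd>Paris <b>FR</b></dd></dl>';
$items  = extractTagContents($sample, 'dd');

print_r($items);
```

If you need the inner HTML rather than plain text, the regex approach in the class above (using the capture group in $match[1]) is probably the simpler option.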
