Parsing a Remote HTML File


tonyr1988

I need to be able to grab some information from a remote HTML file.

All I have so far is:

$content = file_get_contents($url);

The problem is, I have no idea what to do next. I need the contents of the <dd> </dd> tags, preferably in an array. I guess I could keep searching for each <dd> tag, cut everything up to it plus 4 characters (the length of the tag), and then take the text up to the </dd> tag, but that seems really drawn out and confusing.

Can I use ereg to do this? I have never used it at all, so I have no clue.

Can someone please get me started?

I made this for you. :)

Just edit the settings. This code grabs the HTML and echoes it, and you have the option of echoing the tags as well ;) Have fun!

In this example it gets all the bold HTML tags on the page and echoes each one on a separate line :) Then it removes the tags, leaving just the info between them.

Here it is in action: http://www.business-tycoon.com/example.php

[code]<?php

$config['url']       = "http://www.business-tycoon.com"; // URL of the HTML to grab
$config['start_tag'] = "<b>";  // where you want to start grabbing
$config['end_tag']   = "</b>"; // where you want to stop grabbing
$config['show_tags'] = 0;      // show the tags in the output? 1 = yes, 0 = no

class grabber
{
    public $error = '';
    public $html  = array( array(), array() ); // [0] = matches with tags, [1] = contents only

    // Fetch $url and collect everything between $start and $end.
    function grabhtml( $url, $start, $end )
    {
        $file = file_get_contents( $url ); // returns false on failure

        if( $file !== false )
        {
            // preg_quote() escapes regex metacharacters in the tags, including the # delimiter
            $pattern = '#' . preg_quote( $start, '#' ) . '(.*?)' . preg_quote( $end, '#' ) . '#s';

            if( preg_match_all( $pattern, $file, $match ) )
            {
                $this->html = $match;
            }
            else
            {
                $this->error = "Tags cannot be found.";
            }
        }
        else
        {
            $this->error = "Site cannot be found!";
        }
    }

    // Remove the start and end tags from $html unless $show is set.
    function strip( $html, $show, $start, $end )
    {
        if( !$show )
        {
            $html = str_replace( array( $start, $end ), "", $html );
        }

        return $html;
    }
}

$grab = new grabber;
$grab->grabhtml( $config['url'], $config['start_tag'], $config['end_tag'] );

echo $grab->error;

// $grab->html[0] stays an empty array when nothing was grabbed, so this loop is safe
foreach( $grab->html[0] as $html )
{
    echo htmlspecialchars( $grab->strip( $html, $config['show_tags'], $config['start_tag'], $config['end_tag'] ) ) . "<br>";
}

?>[/code]
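
On the ereg question: the ereg functions were removed in PHP 7, so preg_match_all (as used above) is the regex route to take. For the original goal, the contents of every <dd> in an array, a DOM parser is usually less brittle than a regex when tags carry attributes or are nested. Here is a minimal sketch, assuming PHP's bundled DOM extension is enabled. The function name extract_dd_contents and the sample markup are made up for illustration and are not from the posts above.

```php
<?php

// Hypothetical helper: return the trimmed text of every <dd> element
// found in an HTML string, as a plain array.
function extract_dd_contents( $html )
{
    // Nothing to parse (for example, file_get_contents() failed and returned false).
    if( !is_string( $html ) || $html === '' )
    {
        return array();
    }

    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors( true );
    $loaded   = $doc->loadHTML( $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $items = array();

    if( !$loaded )
    {
        return $items;
    }

    foreach( $doc->getElementsByTagName( 'dd' ) as $node )
    {
        // textContent drops any markup nested inside the <dd>
        $items[] = trim( $node->textContent );
    }

    return $items;
}

// Inline sample markup so the sketch runs without a network request.
$sample = '<dl><dt>Name</dt><dd>Tony</dd><dt>Posts</dt><dd> 42 </dd></dl>';

print_r( extract_dd_contents( $sample ) );
```

To use it on the remote page, pass in the result of file_get_contents( $url ) instead of $sample. If you need the inner HTML rather than just the text, that would take a bit more work (for example, calling saveHTML() on each child node).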