Parsing links in a website


springo

Parsing means reading through the file and extracting only the specific content you want.

 

Here is one way to do it:

 

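<?php
// $siteurl must already hold the URL of the page to scan
$siteData = file_get_contents($siteurl); // cURL works too
$links = array();

// spliti() was removed in PHP 7, so use preg_split() with the /i flag.
// The lookahead stops tags such as <abbr> or <article> from matching.
$splitData = preg_split('/<a(?=[\s>])/i', $siteData);
array_shift($splitData); // drop the text before the first <a

foreach ($splitData as $data) {
       // keep only what comes before the closing </a>
       list($link) = preg_split('/<\/a>/i', $data, 2);
       $links[] = '<a' . $link . '</a>';
}

print_r($links);
?>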

 

That should get you started.

 

EDIT: made the split case-insensitive so that <A HREF=...> is matched too
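
Splitting on "<a" treats the page as plain text, so it can trip over odd markup. As an alternative, here is a minimal sketch using PHP's built-in DOMDocument class. The function name extractLinks and the sample HTML are made up for illustration and are not from the original post.

```php
<?php
// Hypothetical helper: returns the href and visible text of every <a href> in $html.
function extractLinks($html)
{
    $links = array();
    $doc = new DOMDocument();

    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser lowercases tag and attribute names, so <A HREF> is found too.
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = array(
                'href' => $anchor->getAttribute('href'),
                'text' => trim($anchor->textContent),
            );
        }
    }
    return $links;
}

// Made-up sample input
$sample = '<p><a href="/about.html">About</a> and <A HREF="http://example.com/">Example</A></p>';
print_r(extractLinks($sample));
```

In practice you would pass in the result of file_get_contents($siteurl) instead of the sample string.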


Thanks, I was struggling to do something with "preg_match_all" but I couldn't get it to work. I'm going to try to display the results properly, convert the site's relative paths to absolute URLs so they still work when linked from my site, and also remove the <a href> wrappers from images. If I get stuck on any of these, I'll come back and ask.

Thanks again!
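
For anyone following along, the three tasks mentioned above could be sketched roughly like this. It is only a starting point under stated assumptions: makeAbsolute, the sample HTML and the example.com base URL are all invented for illustration, and the path handling is deliberately simple (it ignores "../" segments, ports and protocol-relative links).

```php
<?php
// Hypothetical helper: turn a relative href into an absolute URL.
function makeAbsolute($href, $baseUrl)
{
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
        return $href; // already has a scheme (http:, mailto:, ...)
    }
    $parts = parse_url($baseUrl);
    $root = $parts['scheme'] . '://' . $parts['host'];
    if (substr($href, 0, 1) === '/') {
        return $root . $href; // root-relative path
    }
    // Document-relative path: resolve against the directory of the base URL.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;
}

// Made-up sample input and base URL
$html = '<a href="page.html">Page</a> <a href="/img/big.png"><img src="thumb.png"></a>';
$base = 'http://example.com/news/index.html';

// 1. Capture the href value (group 1) and inner HTML (group 2) of every anchor.
preg_match_all(
    '#<a\s[^>]*href=["\']([^"\']*)["\'][^>]*>(.*?)</a>#is',
    $html,
    $matches,
    PREG_SET_ORDER
);

// 2. Print each link as an absolute URL.
foreach ($matches as $m) {
    echo makeAbsolute($m[1], $base), "\n";
}

// 3. Strip the <a> wrapper from anchors that contain nothing but an image.
$unwrapped = preg_replace('#<a\s[^>]*>\s*(<img\b[^>]*>)\s*</a>#is', '$1', $html);
echo $unwrapped, "\n";
```

The regular expressions assume quoted href values and anchors that are not nested; for messier pages a DOM parser is the safer choice.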
