
Getting relative URLs from HTML code


ohitsme


Hi,

 

I grab the contents of a gallery page (e.g. http://www.example.com/galleries/gallery.htm) and then want to grab the images on it. Normally this is fine: I assume the images are in the same directory as the HTML file, so I just do file_get_contents(str_replace("gallery.htm", "", $url) . "1.jpg").

 

But sometimes the image tags look like this:

<img src="../otherdirectory/1.jpg">

or

<img src="http://static.example.com/gals/1.jpg">

 

How can I work out the correct URL to retrieve?

 

At the moment I have an array of the contents of every src="" attribute, e.g.:

1 => 1.jpg

2 => 2.jpg

 

or

1 => http://static.example.com/1.jpg

2 => http://static.example.com/2.jpg

 

or

1 => ../otherdir/1.jpg

2 => ../otherdir/2.jpg

 

(This isn't for any illegal scraping or anything like that. It just saves me a manual task I have to do every day.)

 

thanks
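
One possible approach, offered as a minimal sketch rather than a definitive answer: resolve each src value against the URL of the page it came from. The function name resolve_url() below is hypothetical, not a PHP built-in. It assumes the page URL is a plain http(s) URL with a host, that the page has no <base href> tag changing the base, and that the src values carry no query string or fragment containing dots or slashes. It covers the three cases in the question (same directory, ../ paths, and full URLs), plus src values starting with / or //.

```php
<?php
/**
 * Hypothetical helper: turn a src value into an absolute URL,
 * using the URL of the page it was found on as the base.
 */
function resolve_url(string $base, string $rel): string
{
    // Case 1: already absolute (starts with a scheme such as "http:").
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $rel)) {
        return $rel;
    }

    $parts = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $authority = $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $authority .= ':' . $parts['port'];
    }

    // Case 2: protocol-relative, e.g. //static.example.com/1.jpg
    if (strpos($rel, '//') === 0) {
        return $scheme . ':' . $rel;
    }

    if (strpos($rel, '/') === 0) {
        // Case 3: relative to the site root, e.g. /images/1.jpg
        $path = $rel;
    } else {
        // Case 4: relative to the page's directory, e.g. 1.jpg or ../dir/1.jpg
        $basePath = $parts['path'] ?? '/';
        // Keep everything up to and including the last slash (drops "gallery.htm").
        $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $rel;
    }

    // Collapse "." and ".." segments; ".." above the root is ignored.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($segments);
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }

    return $scheme . '://' . $authority . '/' . implode('/', $segments);
}
```

With this in place, each entry of the src array could be fetched with something like file_get_contents(resolve_url($url, $src)), where $url is the gallery page URL and $src is one array entry. For http://www.example.com/galleries/gallery.htm, the value 1.jpg would resolve into the galleries directory and ../otherdirectory/1.jpg would resolve one level up, while a full http://static.example.com/... URL is returned unchanged.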
