JoeF Posted December 14, 2006

Hey guys, I'm working on a spider/crawler for a specialist search engine. I'm trying to get my script to load the contents of a directory into a variable, similar to how you would use:
[code]$html = implode('', file($url));[/code]
to load the HTML of a remote page into a variable.

However, when $url is a directory, e.g. http://www.vobe.frihost.net/testing/, you can view source in your browser and see the HTML, but the script just hangs when I use file() to fetch it, obviously because it's not a file. I'm guessing the browser is generating the HTML for that directory.

What I need to know is how to let PHP get that HTML so I can parse the links out of the directory.

Thanks, hope someone can help!
hitman6003 Posted December 14, 2006

Even when you are looking at a directory listing like that, it's still HTML that the web server sends back. So you will need to fetch the HTML, then search it for links to files. Someone who's better with regex can help with that.

What I can tell you is that there is a better way to get the HTML than the one you are using:
[code]$remote_html = file_get_contents("http://www.vobe.frihost.net/testing/");[/code]
Alternatively, use sockets; there are some good examples on the fsockopen manual page: http://www.php.net/fsockopen
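For the "search for links" step, a rough sketch along these lines might be a starting point. It swaps the regex idea for PHP's DOMDocument parser, and it assumes the listing is ordinary server-generated HTML with one `<a>` per entry. The function names `extract_links` and `list_remote_directory`, the 10-second timeout, and the rules for skipping sort links and `../` are illustrative assumptions, not anything from the thread.

```php
<?php
// Hypothetical helper: return the href of every <a> in an HTML string.
function extract_links(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Directory listings are often sloppy HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // Assumption: skip empty hrefs, column-sort links such as
        // "?C=N;O=D", and the "../" parent-directory entry.
        if ($href === '' || $href[0] === '?' || $href === '../') {
            continue;
        }
        $links[] = $href;
    }
    return $links;
}

// Hypothetical wrapper: fetch a remote listing and return its links.
// Needs allow_url_fopen enabled; the timeout keeps the script from
// hanging indefinitely if the server never answers.
function list_remote_directory(string $url): array
{
    $context = stream_context_create(['http' => ['timeout' => 10]]);
    $html = @file_get_contents($url, false, $context);
    return $html === false ? [] : extract_links($html);
}
```

Calling `list_remote_directory("http://www.vobe.frihost.net/testing/")` would then return the raw href values. These are relative to the directory URL, so a crawler would still need to resolve them against it before following them.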