MaxMouseDLL

  1. It should go as deep as possible (i.e. index all links) because I'm going to lock it to the domain, and duplicates will be removed. I think I'm going to have to sit and stare at this one for a while. The idea is to cron job the script, or at least password-protect it so I can execute it whenever I see fit, and from the output generate a sitemap.xml; one will be generated daily from this. In the sitemap, the domain will be a filler (e.g. <domain>), and I'll use another script to change (str_replace) that filler to whatever domain I see fit. That way, wherever my site code is deployed, a sitemap will be readily available.
  2. It's function-based, so I pass it a URL (e.g. http://www.something.com) and it returns an array containing all the links on that page. Each one of those links will also need to be spidered for other links. So what I need to do is pass it a link, which returns all the links contained within that page, then begin "dynamic looping": if it returns index.html, hello.html and whatever.html, it'll need to spider those three for links, and whatever links it finds within those, and so on. The number of links returned is arbitrary and unknown, hence the loop needs to pay attention to the ever-growing array rather than just the upper boundary of the array when it began executing. If you would like to see the code, I can provide it. I'm not a hater, I have just never got on with recursion... I tend to try and visualise the whole thing, which is pretty much impossible and ends up bogging my brain down.
  3. My code is a PHP web spider, so it may or may not add an element (or more than one element) per loop. For example, http://yoursite.com may produce 300 extra elements, http://yoursite.com/map.php may produce (say) 10 elements, and http://yoursite.com/about.php may produce none. All variables are unknown... How do I avoid an infinite loop, yet keep processing an array whose upper boundary keeps changing until it is complete? I hate recursion lol!
  4. Say I have a foreach loop which is dependent on $arr. If I add other elements to $arr while still inside the foreach loop, will foreach process the newly created elements, or drop out at what the $arr upper boundary was before entering the foreach loop? If it doesn't process elements added while inside the foreach, then what would be a workaround for this?
  5. Hello all, I have acquired a piece of code which will extract URLs from a string (currently I'm using cURL to obtain that string); it outputs said URLs within an array. My question is: like a spider, how would I recurse this array, getting the contents of each element with cURL and adding new URL results to the same array only if they do not already exist in the array (wouldn't want to spider the same URL twice), and continue until there are no more elements left in the array to process?

     $urls = extract_html_urls( $strContent );
     foreach ($urls['a']['href'] as $key => $lnk) {
         // Code...
         echo $lnk . "<br>\n";
     }

     URL: http://chipstix.net/includes/self_spider.php

     I've always had problems visualising multi-dimensional arrays (not a problem here) or fully grasping recursive functions, especially those that depend upon an array whose size is changing as it executes. I normally end up with a headache, so this time I've decided to ask for help! Thanks in advance.
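
The questions in the posts above (a loop over an array that grows while it runs, no duplicate URLs, a crawl locked to one domain, and a sitemap with a <domain> filler) can probably be handled without recursion by treating the array as a work queue. As far as I know, a by-value foreach works on a snapshot of the array, so elements appended inside it are not visited; an index-based loop that re-checks count() on every pass does see them. What follows is only a minimal sketch under assumptions: crawl_site(), build_sitemap(), $getLinks and $maxPages are hypothetical names that do not appear in the posts, and $getLinks stands in for whatever wraps cURL and extract_html_urls() and returns absolute URLs.

```php
<?php
// Hypothetical helper (not from the original posts): crawl one host without recursion.
// $getLinks is any callable that takes a URL and returns an array of absolute
// URLs found on that page, e.g. a wrapper around cURL plus extract_html_urls().
function crawl_site(string $startUrl, callable $getLinks, int $maxPages = 1000): array
{
    $host  = parse_url($startUrl, PHP_URL_HOST); // lock the crawl to this host
    $queue = [$startUrl];                        // grows while the loop runs
    $seen  = [$startUrl => true];                // array keys give cheap duplicate checks

    // count($queue) is re-evaluated on every pass, so URLs appended inside the
    // loop are processed too. The loop stops when $i reaches the end of the
    // queue, or after $maxPages pages have been fetched (a safety cap).
    for ($i = 0; $i < count($queue) && $i < $maxPages; $i++) {
        foreach ($getLinks($queue[$i]) as $link) {
            if (parse_url($link, PHP_URL_HOST) !== $host) {
                continue; // off-site or relative link: skipped in this sketch
            }
            if (isset($seen[$link])) {
                continue; // already queued, so never spidered twice
            }
            $seen[$link] = true;
            $queue[]     = $link;
        }
    }
    return $queue;
}

// Hypothetical helper: render the URLs as sitemap XML, replacing the real host
// with the filler so a later str_replace() can put any domain back in.
function build_sitemap(array $urls, string $host, string $placeholder = '<domain>'): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        $loc  = htmlspecialchars($url, ENT_QUOTES | ENT_XML1); // escape &, <, > and quotes for XML
        $xml .= '  <url><loc>' . str_replace($host, $placeholder, $loc) . "</loc></url>\n";
    }
    return $xml . "</urlset>\n";
}

// Usage with an in-memory fake site, so the sketch runs without network access.
$pages = [
    'http://example.test/'       => ['http://example.test/a.html', 'http://example.test/b.html'],
    'http://example.test/a.html' => ['http://example.test/', 'http://example.test/c.html', 'http://other.test/x'],
    'http://example.test/b.html' => ['http://example.test/a.html'],
];
$fakeGetLinks = function (string $url) use ($pages): array {
    return $pages[$url] ?? []; // unknown pages simply have no links
};

$found   = crawl_site('http://example.test/', $fakeGetLinks);
$sitemap = build_sitemap($found, 'example.test');
echo $sitemap;
```

The loop terminates because every URL enters the queue at most once, so the queue can only grow as large as the number of distinct on-site URLs, and $maxPages bounds the number of fetches even if a site generates endless unique links. Note that the output is not valid XML until the <domain> filler has been replaced with a real host.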