Showing results for tags 'multi-threading'.

Found 2 results

  1. I built a "bulk importer" that takes a .zip file filled with images and a corresponding CSV file that holds attributes. I'm happily using some JavaScript to provide upload-progress feedback to the user, so if the .zip file is, say, 10 MB, they see its upload progress (I'm using AJAX). This is all working nicely, BUT... once the .zip hits the server I need to do A TON of processing. Each image has to be converted into 10 different sizes, cropped, etc. All entries must be entered into the database and admin text logs created. All of this actually works just fine for small files (<10 MB), and I'm sure it could work with bigger files by increasing the timeout and so on, BUT the browser "locks up" during processing and there is no real way to inform the user about the progress of their files being processed.

     I thought maybe I could be clever and create a "progress table" in the DB and use it like this: as soon as the .zip file is uploaded to the server, I create a row and an id. Next I send that id back to the browser (AJAX) and immediately start the laborious processing. The processing would continually update the DB with its progress. The JS would receive the id and keep polling the DB to check on the processing progress and ultimately report this back to the user.

     Well, my brilliant scheme doesn't seem to work and everything locks up regardless. I think I was trying to fake multi-threading, and I'm not sure how to solve this problem. My end goal is to crunch huge files and keep the user notified of their progress. Does anyone have good advice? (One possible approach, a progress table plus a separate worker process, is sketched below these results.)
  2. I have a web crawler I created with PHP, and now I want to alter the structure so that it crawls concurrently. Here is a copy of the class. I did not include the private functions, and there is only one public function.

     class ConcurrentSpider {

         private $startURL;
         private $max_penetration = 5;
         const DELAY = 1;
         const SLEEPTIME = 1;
         const ALLOW_OFFSITE = FALSE;
         private $maxChildren = 1;
         private $children = array();

         function __construct($url) {
             $this->concurrentSpider($url);
         }

         public function concurrentSpider($url) {
             // STEP 1:
             // Download the $url
             $pageData = http_get($url, $ref = '');

             if (!$this->checkIfSaved($url)) {
                 $this->save_link_to_db($url, $pageData);
             }

             // print_r($pageData);
             sleep(self::SLEEPTIME);

             // STEP 2:
             // extract all hyperlinks from this url's page data
             $linksOnThisPage = $this->harvest_links($url, $pageData);

             // STEP 3:
             // Check the links array from STEP 2 to see if this page has
             // already been saved or is excluded because of any other
             // logic from the excluded_link() function
             $filteredLinks = $this->filterLinks($linksOnThisPage);

             // STEP 4: loop through each of the links and
             // repeat the process
             foreach ($filteredLinks as $filteredLink) {
                 //print "Level $x: \n";
                 $pid = pcntl_fork();
                 switch ($pid) {
                     case -1:
                         print "Could not fork!\n";
                         exit(1);
                     case 0:
                         print "In child with PID: " . getmypid() . " processing $filteredLink \n";
                         $spider = new ConcurrentSpider($filteredLink);
                         sleep(2);
                         exit(1);
                     default:
                         // print "$pid In the parent\n";
                         // Add an element to the children array
                         $this->children[$pid] = $pid;
                         while (count($this->children) >= $this->maxChildren) {
                             print count($this->children) . " children \n";
                             $pid = pcntl_waitpid(0, $status);
                             unset($this->children[$pid]);
                         }
                 }
             }
         }

     You can see in STEP 4 I fork PHP and, in the child, create a new instance of my spider class. What I'm expecting to happen is that the first child, for example, will take the first element of my $filteredLinks array and begin to spider the links located at that particular URL. Then, of course, it loops, and I'm expecting it to fork off and spider the second element of the $filteredLinks array. However, what is actually happening is that each child tries to read the first link of the array over and over. You can see where I have a print statement in the child. Here is an example of what that prints out:

     In child with PID: 12583 processing http://example.com/
     In child with PID: 12584 processing http://example.com/
     In child with PID: 12585 processing http://example.com/

     So it's forking, but it keeps trying to read the first element of the $filteredLinks array over and over. This seems to be an infinite loop. Secondly, if I remove the while loop, then the print statement correctly prints each link that is on the page within its own child. However, it will not spider any of those links and the loop exits. Thoughts on what could be wrong with my logic? (A minimal fork-and-reap sketch follows below these results.)
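For the first post, here is a minimal sketch of the approach described there, split so the web request never does the heavy work: the upload request creates the progress row and launches a separate CLI process, and the AJAX polling endpoint just reads that row. The import_progress table, the worker.php script, the paths, and the credentials below are hypothetical placeholders, not taken from the post. One likely contributor to the "everything locks up" symptom is PHP's session file lock: if the processing request holds the session open, polling requests from the same browser queue behind it, so the sketch calls session_write_close() before anything slow starts.

    <?php
    // upload.php -- handles the AJAX upload request (illustrative sketch only).
    // Assumes a table like import_progress(id, total_steps, done_steps, status)
    // and a hypothetical CLI script worker.php that does the unzip/resize work
    // and updates import_progress as it goes.

    session_start();
    session_write_close();   // release the session lock so polling requests
                             // from the same browser are not blocked

    $pdo = new PDO('mysql:host=localhost;dbname=importer', 'user', 'pass');

    // 1. Create a progress row and get its id.
    $pdo->exec("INSERT INTO import_progress (total_steps, done_steps, status)
                VALUES (0, 0, 'queued')");
    $jobId = (int) $pdo->lastInsertId();

    // 2. Hand the heavy work to a separate PHP process so this request
    //    can return immediately.
    $zip = escapeshellarg('/tmp/uploads/import_' . $jobId . '.zip');
    exec('php /var/www/cli/worker.php ' . (int) $jobId . ' ' . $zip . ' > /dev/null 2>&1 &');

    // 3. Return the id; the JavaScript keeps polling a small progress endpoint
    //    that simply SELECTs done_steps and total_steps for this row.
    header('Content-Type: application/json');
    echo json_encode(['job_id' => $jobId]);

The point of the split is that the browser request finishes almost immediately; the resizing happens in a process with no connection to the browser, so nothing in the page can lock up, and the progress endpoint always sees the worker's latest done_steps value.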
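For the second post, here is a hedged sketch of the usual pcntl_fork() fan-out pattern, written as a standalone loop rather than as a fix to the poster's class: each child handles exactly one link and exits, and the parent tracks child PIDs, blocking with pcntl_waitpid() only while it is at its limit. crawl_one() is a hypothetical stand-in for the download/save/harvest steps, and the URLs and $maxChildren value are made up for illustration.

    <?php
    // Illustrative parent/child loop, not the poster's actual code.

    function crawl_one(string $url): void {
        // Stand-in for STEP 1-3: download $url, save it, harvest links, etc.
        print "In child with PID: " . getmypid() . " processing $url\n";
        sleep(1);
    }

    $filteredLinks = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'];
    $maxChildren   = 2;
    $children      = [];

    foreach ($filteredLinks as $filteredLink) {
        // At the limit? Block until one child exits before forking again.
        while (count($children) >= $maxChildren) {
            $pid = pcntl_waitpid(-1, $status);   // -1: wait for any child
            unset($children[$pid]);
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            exit("Could not fork!\n");
        } elseif ($pid === 0) {
            // Child: handle exactly one link, then exit so it never
            // falls back into the parent's foreach loop.
            crawl_one($filteredLink);
            exit(0);
        } else {
            // Parent: remember the child and move on to the next link.
            $children[$pid] = $filteredLink;
        }
    }

    // Reap any children that are still running.
    while ($children) {
        $pid = pcntl_waitpid(-1, $status);
        unset($children[$pid]);
    }

The main differences from the posted class are that the child's whole job is one call to crawl_one() followed by exit(), so it never re-enters the foreach loop or forks children of its own the way new ConcurrentSpider($filteredLink) does, and that the capacity check happens before the fork rather than inside the switch, which keeps the parent's bookkeeping in one place.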