
Hi everyone!!

I am doing a map-reduce in PHP. It's written entirely in PHP (no JS code generation).

I've been running my script on my local machine and the results were great, but trouble came when I had to run the script on the server machine. I think it's too overloaded with tasks, but first you have to read the description of my script to understand why I think so.

 

Firstly I delete the current records matching the criteria and then send a query ($collection->find($criteria);). Next, for each element returned by the Mongo cursor, I map the document from collection1 into the form of collection2, where the map-reduce result is saved. This works just fine (I think). Secondly, I grab the group keys from the array returned by map (let's call it MapResult) and put those keys as the criteria into another find, but this time it is findOne() executed on collection2 (findOne() because in my collection there are no records with the same keys; if the keys match, the values from MapResult['value'] are added to the current sums of the values found in collection2 with $collection2->findOne($criteria2), where $criteria2 is the group keys).

If the record already exists in collection2, as I wrote above, the values are added together (that's where the reduce function comes in handy, and it works just fine on the local machine as well). The next step is saving the MongoReduceResult into collection2.

I've done it in two ways: first using update(), and then, when update() was failing, I tried save().

Both methods were ineffective. The records were being added, but (in my opinion!!!) the PHP update wasn't waiting for the answer from Mongo. And in the next step of foreach ($mongo_cursor as $data), findOne() on the criteria returned null even though a previous record with the same keys should already have been updated!! And that's when the values from previous records get overwritten! :/

 

I used the update options 'upsert', '$set', 'fsync'... NOTHING worked! :( Any ideas??
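Roughly, the two variants I tried look like this (a simplified sketch with placeholder names, assuming the legacy MongoCollection API; the real criteria and values are built as described above):

// Variant 1: update() with an upsert, trying different combinations of the
// 'upsert', 'safe' and 'fsync' options ($criteria2 stands for the group keys).
$collection2->update(
    $criteria2,
    array('$set' => array('value' => $newValue)),
    array('upsert' => true, 'safe' => true, 'fsync' => true)
);

// Variant 2: save() on a whole document - it inserts when the _id is new and
// replaces the existing document when a record with that _id already exists.
$collection2->save(
    array('_id' => $criteria2, 'value' => $newValue),
    array('safe' => true)
);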

 


The map and reduce functions work fine, as I said on the phpfreaks forum, so I won't paste the code for them here.

$criteria is date only.

$mongoCursor = $collection->find($criteria);
foreach ($mongoCursor as $key => $value) {
    // map the collection1 document into the collection2 shape
    $mapResult = $this->map($value, $groupKey);
    $isInCollection2 = $collection2->findOne(array('_id' => $mapResult['_id']));
    if (!empty($isInCollection2)) { // if a record with this group key already exists, its values are summed with the current $mapResult['value']
        $mapResult = $this->reduce($mapResult, $city_id, $isInCollection2['value']);
    }
    // upsert the (possibly reduced) value back into collection2
    $collection2->update(
        array('_id' => $mapResult['_id']),
        array('$set' => array('value' => $mapResult['value'])),
        array('upsert' => true, 'safe' => true)
    );
}

Input structure ($collection1):

{
    "_id" : ObjectId("4e6dfc8a7ba176a952000000"),
    "date" : "2011-08-07",
    "something" : 0,
    "moresmthng" : 1,
    "city_id" : 33,
    "prog_id" : 1230,
    "some_text" : ""
}
Output structure ($collection2):
{
    "_id" : {
        "date" : "2011-08-07",
        "progid" : 1230
    },
    "value" : {
        "33" : {              // this is city_id
            "something" : 0,
            "some_text" : "",
            "moresmthng" : 1
        }
    }
}
If one of the next records has the same date and prog_id (the group keys) but a different city_id (for example 45), the output structure will look like:
{
    "_id" : {
        "date" : "2011-08-07",
        "progid" : 1230
    },
    "value" : {
        "33" : {              // this is city_id
            "something" : 0,
            "some_text" : "",
            "moresmthng" : 1
        },
        "45" : {              // this is city_id
            "something" : 12,
            "some_text" : "blah blah",
            "moresmthng" : 111
        }
    }
}
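For context only, here is a rough, hypothetical sketch of what my map() and reduce() boil down to, based on the structures above (signatures simplified; the handling of some_text in reduce is an assumption):

// Hypothetical sketch only - the real map()/reduce() live in my class.
// map(): turn a collection1 document into the collection2 shape, grouping
// by date + prog_id and nesting the values under the city_id.
function mapDoc(array $doc)
{
    return array(
        '_id' => array(
            'date'   => $doc['date'],
            'progid' => $doc['prog_id'],
        ),
        'value' => array(
            $doc['city_id'] => array(
                'something'  => $doc['something'],
                'some_text'  => $doc['some_text'],
                'moresmthng' => $doc['moresmthng'],
            ),
        ),
    );
}

// reduce(): merge the existing collection2 value into the freshly mapped one,
// summing the numeric fields per city_id (some_text is simply kept here).
function reduceDoc(array $mapResult, array $existingValue)
{
    foreach ($existingValue as $cityId => $fields) {
        if (!isset($mapResult['value'][$cityId])) {
            $mapResult['value'][$cityId] = $fields;
            continue;
        }
        $mapResult['value'][$cityId]['something']  += $fields['something'];
        $mapResult['value'][$cityId]['moresmthng'] += $fields['moresmthng'];
    }
    return $mapResult;
}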
   

offtopic : couldn't find the edit button:/

I am almost sure that this is because the foreach is not waiting for the response from the Mongo update!! :/ How to check it?? ;/
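One way to check it (a sketch, assuming the legacy Mongo PHP extension, where update() with a safe write requested returns the getLastError document instead of a plain boolean) is to capture the result of each update inside the loop and inspect it:

// With 'safe' => true the driver should block until the server acknowledges
// the write and return an array (the getLastError document) instead of a bool.
$result = $collection2->update(
    array('_id' => $mapResult['_id']),
    array('$set' => array('value' => $mapResult['value'])),
    array('upsert' => true, 'safe' => true)
);

// Field names below are the usual getLastError fields; var_dump($result)
// first to confirm what your driver version actually returns.
if (!is_array($result) || empty($result['ok'])) {
    error_log('update not acknowledged: ' . var_export($result, true));
} elseif (!empty($result['err'])) {
    error_log('update error: ' . $result['err']);
}

If $result is just a boolean, the write isn't being acknowledged at all; if it's an array, fields like 'updatedExisting' and 'upserted' should show whether the document really made it in before the next findOne().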

I've var_dump'ed the number of times when

 if (!empty($isInCollection2))

is true.

It was equal to 169586.

( // this is my log; the log on the server machine looks the same
    [input] => 179055          // number of records in collection1 matching the criteria
    [update_correct] => 179055 // this many times the foreach has run
    [update_failure] => 0      // ... useless for now ;P
    [Output] => 9452           // number of inserts into collection2 (the difference in $collection2 size before and after the whole operation)
)

As you can see, 179055 - 9452 comes out at roughly 169586, so the number of times the if (...) branch runs checks out. On the server machine I got a smaller value, so $collection2->findOne() was looking for records that hadn't been inserted yet but should have been! :/

 

On the server, collection1 has about 30 000 000 records and collection2 about 1 000 000. On my local machine I've only got about 600 000 records of collection1, copied from the server.

Oh come on! Don't you really know the answer??

It's simple!! In the update options, 'safe' should be set to the number of machines that our Mongo stands on. In my case: 2 machines (one master, one slave), so the option should look like: 'safe' => 2
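So the working version of the update call from the loop above looks like this (with 'safe' set to the number of servers that must acknowledge the write, 2 in my master/slave setup):

// 'safe' => 2 makes the driver wait until both the master and the slave have
// acknowledged the write, so the findOne() in the next iteration of the loop
// actually sees the document that was just upserted.
$collection2->update(
    array('_id' => $mapResult['_id']),
    array('$set' => array('value' => $mapResult['value'])),
    array('upsert' => true, 'safe' => 2)
);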

That's all.

Thanks for any effort.
