
Curious about explode()


monkeytooth


I have a list of items

 

item name|description for item|item number|item stock reference|item quantity|

item name|description for item|item number|item stock reference|item quantity|

item name|description for item|item number|item stock reference|item quantity|

item name|description for item|item number|item stock reference|item quantity|

item name|description for item|item number|item stock reference|item quantity|

item name|description for item|item number|item stock reference|item quantity|

 

I know I can use explode() to break that down by |, but I have a couple of issues. One is that one of the files with this CSV layout is literally a single continuous line with no line breaks. The other is that some of the lines look like "item name||item number|||"; the best way to put it is that some fields are blank.

 

On the upside, the CSV is consistent: every record has exactly the same number of fields, and each field holds what it is supposed to hold. So what I am wondering is whether it is possible to do something like

$blah = explode("|", $string, 5);

But have it do that for every 5 | characters, and if so, how? Would I build an array from the initial explode and somehow output it so that every 5 fields start a new line, or is there something other than explode() that I can use?
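
For reference, here is a minimal sketch of what the third argument to explode() actually does. The sample string is made up for illustration; it is not taken from the real file.

```php
<?php
// Hypothetical two-record sample in the layout described above.
$string = 'pen|blue ink|101|A1|5|cup||102|||';

// With a limit of 5, explode() returns at most 5 elements;
// the last element holds the entire unsplit remainder.
$blah = explode('|', $string, 5);

print_r($blah);
```

So the limit caps the total number of pieces rather than producing a new group every 5 fields, which is why some grouping step is still needed afterwards.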

Link to comment
Share on other sites

I think you are better off using a regular expression using preg_match_all

 

I am not the best at regex, and I am sure Salathe will have some criticism of this, but it works, so yeah.

 

$items = "item-name|description for item|12345|5432|5|item name|description for item|12345|5432|5|item name|description for item|12345|5432|5|item name|description for item|12345|5432|5|";

// item name|description for item|item number|item stock reference|item quantity|
preg_match_all('~([a-z0-9.\-_\'" ]+)\|(.*)\|([0-9]+)\|([0-9]+)\|([0-9]+)\|~U', $items, $matches);
echo "<pre>", print_r($matches, true), "</pre>";

 

That should give you an idea of how the returned multi-dimensional array is set up.
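
As a possible variation, passing the PREG_SET_ORDER flag makes preg_match_all() group the captures per record instead of per capture group, which may be easier to loop over. This is only a sketch; the sample string below is made up.

```php
<?php
// Hypothetical sample: two records, the second with a blank description.
$items = "item-name|description for item|12345|5432|5|other item||678|90|1|";

// Same pattern as above; PREG_SET_ORDER returns one array per record.
preg_match_all(
    '~([a-z0-9.\-_\'" ]+)\|(.*)\|([0-9]+)\|([0-9]+)\|([0-9]+)\|~U',
    $items,
    $matches,
    PREG_SET_ORDER
);

foreach ($matches as $m) {
    // $m[0] is the whole record; $m[1]..$m[5] are the five fields.
    echo $m[1], ' / qty ', $m[5], "\n";
}
```

Note that the numeric groups use [0-9]+, so a record with a blank number field would not match; [0-9]* would be needed if those fields can be empty.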

Link to comment
Share on other sites

OK, what I am trying to do overall is dump this data into a MySQL table, which is why I was looking at explode(). Below is an excerpt from the file I am working with.

What I am trying to do is break "$stringy" down into chunks of 5 fields, split on "|", pretty much.

Then I want to use foreach to take each piece and put it in its proper column for the insert query I'm going to build. My problem is that I'm not sure how to create a separate array for every 5 fields, or how to go about doing this properly. I'm kind of stumped.

 

<?php
$stringy = "
0020080956|USA00480508|BC|From an interoffice memo to a fifty-page proposal, this is the definitive guide to business writing. Anyone who has ever had to write any business document will find "The Elements of Business Writing" the single most effective tool for producing clear, concise, and persuasive prose. Equally useful to executives and support staff, it shows how to: write clearly and powerfully; rid writing of jargon and pompous language; organize material effectively; and avoid errors in spelling, grammar, and usage. (paper) |9780020080954|
0020080956|USA00480508|TC|Acknowledgments.Introduction.1. Principles of Communication. Rule 1: Use the Active Voice. Rule 2: Avoid Long Sentences. Rule 3. Use Simple Language. Rule 4: Delete Words, Sentences, and Phrases That Do Not Add to Your Meaning. Rule 5: Break Your Writing into Short Sections. Rule 6: Use Specific and Concrete Terms. Rule 7: Write in a Natural, Conversational Style. Rule 8: Keep Ideas Parallel.2. Principles of Organization. Rule 9: Organize Your Material According to the Way Your Reader Thinks about the Subject. Rule 10: Organize Your Material Logically. Rule 11: Delete the Warm-Up Paragraph. Rule 12: Use an Executive Summary. Rule 13: Separate Fact from Opinion. Rule 14: Delete Unnecessary Closings. Rule 15: Use Headings and Subheadings.3. Principles of Wording and Phrasing. Rule 16: Avoid Wordy and Redundant Phrases. Rule 17: Use Small Words. Rule 18: Avoid Sexist Language. Rule 19: Know the Proper Use of the Most Commonly Misused Words and Phrases. Rule 20: Substitute Modern Business Language for Antiquated Phrases. Rule 21: Substitute Original Language for Cliches. Rule 22: Avoid Jargon.4. Principles of Tone. Rule 23: Write to Express, Not to Impress. Rule 24: Prefer Informal to Formal Language. Rule 25: Prefer Positive Words to Negative Words. Rule 26: In a Sentence Containing Both Good and Bad News, Give the Bad News First. Rule 27: Write to Change Behavior, Not to Express Anger. Rule 28: Be Your Most Pleasant Self. Rule 29: Use Contractions to Warm Up Your Message. Rule 30: Avoid Unnecessary Hedging. Rule 31: Avoid Sarcasm.5. Principles of Persuasion. Rule 32: Gain Your Reader's Attention in an Appropriate Manner. Rule 33: Awaken a Need for an Idea before Presenting the Idea. Rule 34: Stress Benefits, Not Features. Rule 35: Use Facts, Opinions, and Statistics to Prove Your Case. Rule 36: Don't Get Bogged Down in Unnecessary Details or Arguments. Rule 37: Tell the Reader What to Do Next. 
Rule 38: Before Making a Request, Give the Reader a Reason to Respond. Rule 39: Do Not Assume the Readers Has Been Persuaded by Your Argument.6. Principles of Punctuation, Grammar, Abbreviation, Capitalization, and Spelling.Punctuation. Rule 40: Use Commas to Indicate a Brief Pause. Rule 41: Use a Semicolon to Separate Independent Clauses Not Joined by a Conjunction. Rule 42: Use a Colon to Introduce a List or Explanation. Rule 43: Add an Apostrophe and ansto Form the Possessive Case of a Singular Noun. Rule 44: Hyphenate Two Words Compounded to Form an Adjective Modifier if They Precede a Noun. Rule 45: Use an Ellipsis to Show Hesitation or Omission. Rule 46: Use Parentheses to Add Explanatory Material That's Not Part of the Main Thought. Rule 47: Use a Dash to Interrupt -- or Highlight -- a Thought. Rule 48: Avoid Slash Construction. Rule 49: Put Commas Inside Quotation Marks.Grammar. Rule 50: Avoid Subject and Verb Disagreement. Rule 51: Avoid Improper Use of Reflexive Pronouns. Rule 52: Avoid Sentence Fragments and Run-On Sentences. Rule 53: Avoid Dangling Modifiers. Rule 54: Avoid Misplaced Modifiers.Abbreviations. Rule 55: Use Too Few Abbreviations Rather Than Too Many. Rule 56: Do Not Use an Apostrophe When Writing the Plural of an Abbreviation.Capitalization. Rule 57: Do Not Capitalize Words to Emphasize Their Importance. Rule 58: Capitalize the Full Names of Corporation, Government Agencies, Divisions, Departments, and Organizations. Rule 59: Capitalize Trade Names.Spelling. Rule 60: Know the Basic Rules of Spelling, Rule 61: If there are Variant Spellings, Use the Preferred Spelling. Rule 62: Keep a List of the Words You Repeatedly Misspell.7. Principles of Format. Rule 63: Use Wide Margins to Aid Readability. Rule 64: Use|9780020080954|
0020351283|USA01098880|AA|Born in England in 1905, Ashley Montagu was educated at the University of London, University of Florence, and Columbia University. After serving as a research worker in natural history at the British Museum, he became curator of physical anthropology at the Welcome Medical Historical Museum in London. He has taught anthropology at Harvard, Princeton, and the University of California. He has been director of research for the New Jersey Committee on Growth and Development, and for many years served as chairman of the Anisfield-Wold Award Committee on Race Relations.|9780020351283|
0023001402|USA00612675|SD|Library of Liberal Arts title. |9780023001406|
";

$bookdata = explode("|", $stringy, 5);
foreach($bookdata as $bookitem){
echo "ISBN10 is: ".$bookitem[0]."<br />";
echo "Tupe is: ".$bookitem[1]."<br />";
echo "Anno is: ".$bookitem[3]."<br />";
echo "ISBN13 is: ".$bookitem[4]."<br />";

}
//var_dump($bookdata);
?>

Link to comment
Share on other sites

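$s = 'a|b|c|d|e|a|b|c|d|e|a|b|c|d|e';
$ss = explode("|", $s);
$array = array();
$i = 0; // index of the current group
$j = 0; // position within the current group
foreach ($ss as $t) {
    $array[$i][$j] = $t;
    if ($j == 4) {
        // group is full: start the next one
        $i++;
        $j = 0;
    } else {
        $j++;
    }
}
echo "<pre>";
print_r($array);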

Adjust the number that $j is compared against in the if statement to match the number of items in each group (remember that counting starts at 0).

 

 

HTH

Teamatomic
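
The same grouping can probably also be done with the built-in array_chunk(). The sketch below assumes the data ends with a trailing "|" (as in the sample lines earlier in the thread) and that blank fields should be kept; the sample string is made up.

```php
<?php
// Hypothetical sample: three records of five fields, ending with a trailing "|".
$s = 'a|b|c|d|e|a|b||d|e|a|b|c|d|e|';

$fields = explode('|', $s);

// The trailing "|" produces one empty element at the end; drop only that one
// so that genuinely blank fields inside a record are preserved.
if (end($fields) === '') {
    array_pop($fields);
}

// Split the flat list into groups of five.
$records = array_chunk($fields, 5);

print_r($records);
```

count($records) then gives the number of groups, and each $records[$n] is one row ready for an insert.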

Link to comment
Share on other sites

Well, I'm feeling off my game tonight. That last bit works wonders, but now I'm stuck trying to pull the data out array by array. I'm trying to figure out whether while, for, or foreach would work better, and how to write it so that it runs dynamically, as I have 5 files ranging from 127 MB to 1.25 GB in size to run through this function I am trying to build.

Now what I need to do is find out how many arrays are in the main array, and then loop over them to do my database inserts.

Link to comment
Share on other sites

Another thing I noticed, and I'm not sure if it's a consequence of the way the array is being built or of the actual string:

    [0] => Array
        (
            [0] => 
0020080956
            [1] => USA00480508
            [2] => BC
            [3] => From an interoffice memo to a fifty-page proposal, this is the definitive guide to business writing. Anyone who has ever had to write any business document will find "The Elements of Business Writing" the single most effective tool for producing clear, concise, and persuasive prose. Equally useful to executives and support staff, it shows how to: write clearly and powerfully; rid writing of jargon and pompous language; organize material effectively; and avoid errors in spelling, grammar, and usage. (paper) 
            [4] => 9780020080954
        )

    [1] => Array
        (
            [0] => 
0020080956
            [1] => USA00480508
            [2] => TC
            [3] => Acknowledgments.Introduction.1. Principles of Communication. Rule 1: Use the Active Voice. Rule 2: Avoid Long Sentences. Rule 3. Use Simple Language. Rule 4: Delete Words, Sentences, and Phrases That Do Not Add to Your Meaning. Rule 5: Break Your Writing into Short Sections. Rule 6: Use Specific and Concrete Terms. Rule 7: Write in a Natural, Conversational Style. Rule 8: Keep Ideas Parallel.2. Principles of Organization. Rule 9: Organize Your Material According to the Way Your Reader Thinks about the Subject. Rule 10: Organize Your Material Logically. Rule 11: Delete the Warm-Up Paragraph. Rule 12: Use an Executive Summary. Rule 13: Separate Fact from Opinion. Rule 14: Delete Unnecessary Closings. Rule 15: Use Headings and Subheadings.3. Principles of Wording and Phrasing. Rule 16: Avoid Wordy and Redundant Phrases. Rule 17: Use Small Words. Rule 18: Avoid Sexist Language. Rule 19: Know the Proper Use of the Most Commonly Misused Words and Phrases. Rule 20: Substitute Modern Business Language for Antiquated Phrases. Rule 21: Substitute Original Language for Cliches. Rule 22: Avoid Jargon.4. Principles of Tone. Rule 23: Write to Express, Not to Impress. Rule 24: Prefer Informal to Formal Language. Rule 25: Prefer Positive Words to Negative Words. Rule 26: In a Sentence Containing Both Good and Bad News, Give the Bad News First. Rule 27: Write to Change Behavior, Not to Express Anger. Rule 28: Be Your Most Pleasant Self. Rule 29: Use Contractions to Warm Up Your Message. Rule 30: Avoid Unnecessary Hedging. Rule 31: Avoid Sarcasm.5. Principles of Persuasion. Rule 32: Gain Your Reader's Attention in an Appropriate Manner. Rule 33: Awaken a Need for an Idea before Presenting the Idea. Rule 34: Stress Benefits, Not Features. Rule 35: Use Facts, Opinions, and Statistics to Prove Your Case. Rule 36: Don't Get Bogged Down in Unnecessary Details or Arguments. Rule 37: Tell the Reader What to Do Next. 
Rule 38: Before Making a Request, Give the Reader a Reason to Respond. Rule 39: Do Not Assume the Readers Has Been Persuaded by Your Argument.6. Principles of Punctuation, Grammar, Abbreviation, Capitalization, and Spelling.Punctuation. Rule 40: Use Commas to Indicate a Brief Pause. Rule 41: Use a Semicolon to Separate Independent Clauses Not Joined by a Conjunction. Rule 42: Use a Colon to Introduce a List or Explanation. Rule 43: Add an Apostrophe and ansto Form the Possessive Case of a Singular Noun. Rule 44: Hyphenate Two Words Compounded to Form an Adjective Modifier if They Precede a Noun. Rule 45: Use an Ellipsis to Show Hesitation or Omission. Rule 46: Use Parentheses to Add Explanatory Material That's Not Part of the Main Thought. Rule 47: Use a Dash to Interrupt -- or Highlight -- a Thought. Rule 48: Avoid Slash Construction. Rule 49: Put Commas Inside Quotation Marks.Grammar. Rule 50: Avoid Subject and Verb Disagreement. Rule 51: Avoid Improper Use of Reflexive Pronouns. Rule 52: Avoid Sentence Fragments and Run-On Sentences. Rule 53: Avoid Dangling Modifiers. Rule 54: Avoid Misplaced Modifiers.Abbreviations. Rule 55: Use Too Few Abbreviations Rather Than Too Many. Rule 56: Do Not Use an Apostrophe When Writing the Plural of an Abbreviation.Capitalization. Rule 57: Do Not Capitalize Words to Emphasize Their Importance. Rule 58: Capitalize the Full Names of Corporation, Government Agencies, Divisions, Departments, and Organizations. Rule 59: Capitalize Trade Names.Spelling. Rule 60: Know the Basic Rules of Spelling, Rule 61: If there are Variant Spellings, Use the Preferred Spelling. Rule 62: Keep a List of the Words You Repeatedly Misspell.7. Principles of Format. Rule 63: Use Wide Margins to Aid Readability. Rule 64: Use
            [4] => 9780020080954
        )

 

Notice the [0] element in both outputs: it has a line break in it. The page source shows no <br />, so I'm assuming it's picking up a \n or a \r somewhere.
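
One way to get rid of a stray line break like that is to trim every field after grouping. A small sketch, assuming only \r and \n at the ends of a field should be removed; the record below is a shortened, hypothetical example.

```php
<?php
// Hypothetical record as it comes out of the grouping loop,
// with the stray line break at the start of the first field.
$record = array("\n0020080956", 'USA00480508', 'BC', 'Library of Liberal Arts title. ', '9780023001406');

// Strip only CR/LF from both ends of every field; inner text is untouched.
$clean = array_map(function ($field) {
    return trim($field, "\r\n");
}, $record);

var_dump($clean[0]); // string(10) "0020080956"
```

Calling trim() without the second argument would also remove leading and trailing spaces, such as the one at the end of the description.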

Link to comment
Share on other sites

Did you try fgetcsv?

 

That's something I would consider, but given the sheer size of a few of the files I am working with, PHP is going to time out and not allow it. The other issue is that there is at least one file with no rows at all; it's just one run-on line, so to speak, and the only way to handle that one is to count the fields out and group them into arrays like I'm trying to do. That way I can also later exclude the elements from each array that I don't need.

Link to comment
Share on other sites

Here is a basic mock-up of how I would handle it. First, you will need to raise PHP's memory limit; this can be changed with ini_set or in the php.ini file. I would suggest ini_set so that only this script gets the extra memory. You will also need to call set_time_limit with 0 (for unlimited). Finally, you will need some type of access to the MySQL command line so you can import the file we are going to create, as you will not be able to do this over the internet due to the sheer size of the file and the number of inserts it is going to do. It just will not happen. This is untested, as I do not have large enough files, but the theory should work.

I should also mention that line breaks in text files are not <br />; they are a "hidden" character, \n. It only shows up when you view the file on the command line or in an editor. So your file may indeed have line breaks in it; however, with the code I am giving you, that will not matter. The following script takes into account whether there are line breaks or not, so it should work either way.

I tested it on a 200MB file, and it took about 15-20 seconds to complete. Any questions, let me know.

 

<?php
ini_set("memory_limit", "1024M"); // sets to 1 GB
set_time_limit(0);

$items = file_get_contents('test_text.txt');

// Match whole records with a regex so it does not matter whether the file
// has line breaks. The "s" modifier lets "." span a line break inside a field.
preg_match_all('~[0-9]+\|[a-z0-9]+\|[A-Z]{2}\|.*\|[0-9]+\|~Uis', $items, $matches);
$matches = $matches[0];

$sql = array();
foreach ($matches as $match) {
    $item = explode("|", $match);
    $sql[] = "('" . escapeSq($item[0]) . "', '" . escapeSq($item[1]) . "', '" . escapeSq($item[2]) . "', '" . escapeSq($item[3]) . "', '" . escapeSq($item[4]) . "')";
}

$fh = fopen('sql_file.sql', 'w');
fwrite($fh, "INSERT INTO table_name (col1, col2, col3, col4, col5) VALUES " . implode(",\n", $sql) . ";");
fclose($fh);

echo "Completed.\n";

// escapes backslashes and single quotes so they do not break the sql.
function escapeSq($string) {
    return str_replace(array("\\", "'"), array("\\\\", "\\'"), $string);
}
?>

 

Now, I am not sure whether MySQL limits how many rows a single extended insert can contain, so if that causes trouble we just need to add logic to start another insert after x records.

Link to comment
Share on other sites

@premiso, that looks exactly like what I am attempting to do. It worked on the first shot: I took a snippet of the file I am working with and it just worked.

I'm curious, though. I know MySQL will choke if my insert queries are too large. Is there a way to either break it into line-by-line insert queries, one query per line of output, or limit it to maybe 1000 or so rows per insert query?

 

Right now this one is putting out

 

INSERT INTO table_name (col1, col2, col3, col4, col5) VALUES ('0020080956', 'USA00480508', 'BC', 'From an interoffice memo to a fifty-page proposal, this is the definitive guide to business writing. Anyone who has ever had to write any business document will find "The Elements of Business Writing" the single most effective tool for producing clear, concise, and persuasive prose. Equally useful to executives and support staff, it shows how to: write clearly and powerfully; rid writing of jargon and pompous language; organize material effectively; and avoid errors in spelling, grammar, and usage. (paper) ', '9780020080954'),

('0020080956', 'USA00480508', 'TC', 'Acknowledgments.Introduction.1. Principles of Communication. Rule 1: Use the Active Voice. Rule 2: Avoid Long Sentences. Rule 3. Use Simple Language. Rule 4: Delete Words, Sentences, and Phrases That Do Not Add to Your Meaning. Rule 5: Break Your Writing into Short Sections. Rule 6: Use Specific and Concrete Terms. Rule 7: Write in a Natural, Conversational Style. Rule 8: Keep Ideas Parallel.2. Principles of Organization. Rule 9: Organize Your Material According to the Way Your Reader Thinks about the Subject. Rule 10: Organize Your Material Logically. Rule 11: Delete the Warm-Up Paragraph. Rule 12: Use an Executive Summary. Rule 13: Separate Fact from Opinion. Rule 14: Delete Unnecessary Closings. Rule 15: Use Headings and Subheadings.3. Principles of Wording and Phrasing. Rule 16: Avoid Wordy and Redundant Phrases. Rule 17: Use Small Words. Rule 18: Avoid Sexist Language. Rule 19: Know the Proper Use of the Most Commonly Misused Words and Phrases. Rule 20: Substitute Modern Business Language for Antiquated Phrases. Rule 21: Substitute Original Language for Cliches. Rule 22: Avoid Jargon.4. Principles of Tone. Rule 23: Write to Express, Not to Impress. Rule 24: Prefer Informal to Formal Language. Rule 25: Prefer Positive Words to Negative Words. Rule 26: In a Sentence Containing Both Good and Bad News, Give the Bad News First. Rule 27: Write to Change Behavior, Not to Express Anger. Rule 28: Be Your Most Pleasant Self. Rule 29: Use Contractions to Warm Up Your Message. Rule 30: Avoid Unnecessary Hedging. Rule 31: Avoid Sarcasm.5. Principles of Persuasion. Rule 32: Gain Your Reader\'s Attention in an Appropriate Manner. Rule 33: Awaken a Need for an Idea before Presenting the Idea. Rule 34: Stress Benefits, Not Features. Rule 35: Use Facts, Opinions, and Statistics to Prove Your Case. Rule 36: Don\'t Get Bogged Down in Unnecessary Details or Arguments. Rule 37: Tell the Reader What to Do Next. 
Rule 38: Before Making a Request, Give the Reader a Reason to Respond. Rule 39: Do Not Assume the Readers Has Been Persuaded by Your Argument.6. Principles of Punctuation, Grammar, Abbreviation, Capitalization, and Spelling.Punctuation. Rule 40: Use Commas to Indicate a Brief Pause. Rule 41: Use a Semicolon to Separate Independent Clauses Not Joined by a Conjunction. Rule 42: Use a Colon to Introduce a List or Explanation. Rule 43: Add an Apostrophe and ansto Form the Possessive Case of a Singular Noun. Rule 44: Hyphenate Two Words Compounded to Form an Adjective Modifier if They Precede a Noun. Rule 45: Use an Ellipsis to Show Hesitation or Omission. Rule 46: Use Parentheses to Add Explanatory Material That\'s Not Part of the Main Thought. Rule 47: Use a Dash to Interrupt -- or Highlight -- a Thought. Rule 48: Avoid Slash Construction. Rule 49: Put Commas Inside Quotation Marks.Grammar. Rule 50: Avoid Subject and Verb Disagreement. Rule 51: Avoid Improper Use of Reflexive Pronouns. Rule 52: Avoid Sentence Fragments and Run-On Sentences. Rule 53: Avoid Dangling Modifiers. Rule 54: Avoid Misplaced Modifiers.Abbreviations. Rule 55: Use Too Few Abbreviations Rather Than Too Many. Rule 56: Do Not Use an Apostrophe When Writing the Plural of an Abbreviation.Capitalization. Rule 57: Do Not Capitalize Words to Emphasize Their Importance. Rule 58: Capitalize the Full Names of Corporation, Government Agencies, Divisions, Departments, and Organizations. Rule 59: Capitalize Trade Names.Spelling. Rule 60: Know the Basic Rules of Spelling, Rule 61: If there are Variant Spellings, Use the Preferred Spelling. Rule 62: Keep a List of the Words You Repeatedly Misspell.7. Principles of Format. Rule 63: Use Wide Margins to Aid Readability. Rule 64: Use', '9780020080954'),

('0020351283', 'USA01098880', 'AA', 'Born in England in 1905, Ashley Montagu was educated at the University of London, University of Florence, and Columbia University. After serving as a research worker in natural history at the British Museum, he became curator of physical anthropology at the Welcome Medical Historical Museum in London. He has taught anthropology at Harvard, Princeton, and the University of California. He has been director of research for the New Jersey Committee on Growth and Development, and for many years served as chairman of the Anisfield-Wold Award Committee on Race Relations.', '9780020351283'),

('0023001402', 'USA00612675', 'SD', 'Library of Liberal Arts title. ', '9780023001406');

In one single insert query.

 

Sorry, I'm not trying to get you to write the whole thing for me. I just have a lot on my plate with the rest of the site I am working on; this is a small piece of a much bigger puzzle, and my mind is so wrapped around the other stuff that something that would normally be easy to figure out is just not coming to me. On top of that, this works very well for its purpose, and I don't want to muck it up by adding to it, breaking it, and then trying to fix it without knowing where I broke it. I don't know if that makes sense, but I just don't want to break what's working. And if it's not possible, then I don't want to spend a lot of effort trying to make it work, and I will work with what I've got.

 

I really do appreciate what you've helped me with thus far though.

Link to comment
Share on other sites

I understand that. What I've shown here is small, but for the largest file I have to work with I am likely looking at 1 million+ records. That's where I'm not sure whether a single query will work. If I can pull off that many rows, give or take, in one query, then this is a great solution; otherwise I have to break it down somewhere, somehow. And if I can't do this with what premiso posted, then I will have to sit down and figure out another way. But what he gave is definitely a good starting point for me, and it shows me I was approaching this from the wrong angle when I first started.

Link to comment
Share on other sites

This should break it at 1000 records per extended insert. If you want to change that, simply modify the if (($i % 1000) == 0) check and replace 1000 with whatever batch size you want.

 

<?php
ini_set("memory_limit", "1024M"); // sets to 1 GB
set_time_limit(0);

$items = file_get_contents('test_text.txt');

// Match whole records with a regex so it does not matter whether the file
// has line breaks. The "s" modifier lets "." span a line break inside a field.
preg_match_all('~[0-9]+\|[a-z0-9]+\|[A-Z]{2}\|.*\|[0-9]+\|~Uis', $items, $matches);
$matches = $matches[0];

$sql = array();
$cnt = count($matches);
$x=0;
for ($i=0; $i<$cnt; $i++) {
    $item = explode("|", $matches[$i]);
    
    // break at 1000 rows
    if (($i % 1000) == 0) {
        $x++;
        $sql[$x] = array();
    }
    
    $sql[$x][] = "('" . escapeSq($item[0]) . "', '" . escapeSq($item[1]) . "', '" . escapeSq($item[2]) . "', '" . escapeSq($item[3]) . "', '" . escapeSq($item[4]) . "')";
}

$fh = fopen('sql_file.sql', 'w');

foreach ($sql as $stmt) {
    fwrite($fh, "INSERT INTO table_name (col1, col2, col3, col4, col5) VALUES " . implode(",\n", $stmt) . ";\n");
}

fclose($fh);

echo "Completed.\n";

// escapes backslashes and single quotes so they do not break the sql.
function escapeSq($string) {
    return str_replace(array("\\", "'"), array("\\\\", "\\'"), $string);
}
?>

Link to comment
Share on other sites

Oh, so close.. new issue on this.

 

Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 1349279078 bytes)

 

But I'm not sure whether that refers to the server's processor/RAM limits in the host's settings, or whether it's just not reading the file because of its sheer size. Right now I am attempting to do this on a dedicated server from HostGator, not on my local machine. I'm not sure what my limits are with HostGator and the dedicated machine, but if I can't run this on the server, I can set up WAMP here on my Vista machine and run it through that. Does anyone know which settings in php.ini or Apache's httpd.conf I would have to change to make this work on my local machine?
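
The number in that error (1349279078 bytes) is about 1.25 GB, which is roughly the size of the largest file mentioned earlier, so it is likely file_get_contents() trying to hold the whole file in a single string. A possible way around that is to read the file in small chunks and emit one record at a time. The sketch below is only an illustration under some assumptions: readRecords() and its parameters are hypothetical names, every record is assumed to end with "|", and generators need PHP 5.5 or later.

```php
<?php
// Sketch: stream pipe-delimited records from a large file without loading it all.
// readRecords(), $fieldsPerRecord and $chunkSize are hypothetical names.
function readRecords($path, $fieldsPerRecord = 5, $chunkSize = 8192)
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $carry  = '';      // partial field left over from the previous chunk
    $fields = array(); // completed fields not yet emitted as a record

    while (!feof($fh)) {
        $chunk = fread($fh, $chunkSize);
        if ($chunk === false) {
            break;
        }

        $parts = explode('|', $carry . $chunk);
        $carry = array_pop($parts); // the last piece may be incomplete

        foreach ($parts as $part) {
            // Strip line breaks at the edges of a field, keep inner ones.
            $fields[] = trim($part, "\r\n");
            if (count($fields) === $fieldsPerRecord) {
                yield $fields;
                $fields = array();
            }
        }
    }

    fclose($fh);
}
```

Each record could then be escaped and written to the .sql file as it is read, for example inside foreach (readRecords('test_text.txt') as $item) { ... }, so that memory use stays roughly constant regardless of file size.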

Link to comment
Share on other sites

I set it to 2048M and got the same issue. HostGator's FAQ/support area claims that changing it like this is not a problem as long as you have a dedicated server, which from my understanding we do.

But I'm still getting the same issue after upping it to 2048M:

Fatal error: Out of memory (allocated 262144) (tried to allocate 1349279078 bytes). The "allocated" portion of the error makes me wonder, though, because 262144 bytes is only 256 KB. So it's still not giving me enough. I'm starting to think that I'm going to have to do this on my own machine, which is fine.

The only thing is that I have now set up a WAMP stack, specifically apache2triad, and upgraded my Apache server to the latest stable version, and what I get when I try it on my home server is a connection reset. So either way this is turning out to be a pain in the arse.

What I need to do is somehow bypass the HostGator limits, or figure out how to better configure my Apache conf and php.ini files.

 

Eh, frustrations..

Link to comment
Share on other sites

Honestly, I would do a clean install of WAMP on your machine. I would not mess with a dedicated production machine for a task like this. I would also use the CLI to run the program, meaning you do not even need Apache, so I would not install Apache at all. Simply PHP.

 

You can get the latest version of PHP and then modify the php.ini file to set memory_limit to 2048M. Then save the script in the same directory as PHP, change the file paths for the text file, and run this command from the PHP directory:

 

php -f yourscript.php

 

That should run through. The problem with doing this under Apache or through a browser is simply that they can have limitations of their own. By using just PHP, you effectively rule out one of them as the cause of the issue.

Link to comment
Share on other sites

I have never seen that error before, but I am sure it has to do with how much memory it is eating up. If you can, I would try to split the file into 500MB pieces and then run the script for each piece, making sure you save each SQL file under a numbered name.

 

A pain in the ass, but sometimes you have to do parts manually, especially when dealing with this amount of data. Unless someone has a better suggestion, I am out of ideas :)

 

Link to comment
Share on other sites

Eh, I suppose you're right. This turned from a hopeful fact-finding expedition to get a task done into the metaphorical TV remote search: yes, I can get up to change the channel, but I'd rather spend hours looking for the remote, and if I can't find it, then I must find a device to reach the TV from where I sit. All the while it comes back to me eventually doing it the way I should have in the first place, getting up and changing the channel on the TV manually.

 

premiso, thank you for everything..

Link to comment
Share on other sites

With large data, it might be worthwhile skipping PHP entirely and going for a more streamlined approach to feeding the database its data. My suggestion would be to use LOAD DATA INFILE, which can parse your pipe-separated values (even without newline separators) directly into a table. A query would look like:

 

LOAD DATA INFILE 'datafile.txt'
INTO TABLE table_name
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '|'
(col1,col2,col3,col4,col5);

 

That approach should be able to eat up many millions of rows without breaking a sweat.

Link to comment
Share on other sites
