kittrellbj Posted January 21, 2011

Here's what I'm trying to do, and I am having trouble getting started. It's a very simple process, but I didn't want to spend the next 6 hours in frustration, so some help getting started would be great. Here's the purpose of the script:

1. Allow the user to upload a text file through a form.
2. Take the text file and add HTML code to the beginning and end of each paragraph (a paragraph being a single line of text; paragraphs are usually separated by a line return).
3. Send the user an email with the HTML file attached and thank them.
4. Allow the system to throttle itself (one file at a time) so that many people using the site won't bog it down.

These files will probably be anywhere from 100 KB to 1,000 KB in size, usually in the 300-500 KB range.

Here's what I can do very easily:

1. Allow the user to upload a text file - very simple and straightforward.
2. Take the text file, add HTML... - this is what I need a little help figuring out. Each paragraph needs <p> at the beginning and </p> at the end, and the script will also search for keywords on certain lines (section headers) and add a centering tag to those, and so forth. I can handle the formatting rules, but making sure the loop runs correctly could be a problem.
3. Send the user an email... - very easy, I can do that myself.
4. Allow the system to throttle itself... - this could be tricky. I was thinking a database with a TINYINT field: 0 for not processed yet, 1 for processing, 2 for processed. A cron job checks the next entry on the list to see whether it needs to be sent to the processor, whether the file is already being processed, or whether it can be moved to a separate table (completed entries) and removed from the current queue. The cron job would also be responsible for triggering the "Your file is converted!" email and the attachment.

Any/all help would be greatly appreciated. I am going to work on the parts I can do myself, and I'll be checking back for the discussion - in between Mountain Dew runs.
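For the queue in step 4, the status-field idea could look something like this as a minimal sketch; the table and column names (conversion_queue, status, and so on) are assumptions for illustration, not an existing schema:

    // Hypothetical queue table for the 0/1/2 status design described above.
    $db = new PDO('mysql:host=localhost;dbname=converter', 'user', 'pass'); // placeholder credentials
    $db->exec("
        CREATE TABLE conversion_queue (
            id       INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
            filename VARCHAR(255) NOT NULL,        -- uploaded file on disk
            email    VARCHAR(255) NOT NULL,        -- where to send the finished HTML
            status   TINYINT NOT NULL DEFAULT 0,   -- 0 = pending, 1 = processing, 2 = done
            added    DATETIME NOT NULL
        )
    ");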
radi8 Posted January 21, 2011

Well, I have no idea about the throttle issue; that would be a server configuration matter, I would assume. As for parsing the text file and adding HTML elements to it, I would look at using the explode() function (http://php.net/manual/en/function.explode.php). It's a powerful tool and should at least get you started. Look at the samples in the PHP manual; there's good info there. I hope this gets you started. Others will no doubt have other ideas for you.
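A minimal sketch of the explode() approach, splitting on line returns as discussed; note it holds the entire file in memory at once ('input.txt' and 'output.html' are placeholders):

    $text  = file_get_contents('input.txt');
    $lines = explode("\n", $text);           // one array element per line
    $html  = '';
    foreach ($lines as $line) {
        $line = trim($line, "\r\n");         // drop any Windows-style ending too
        if ($line === '') {
            continue;                        // blank separator line, skip it
        }
        $html .= '<p>' . $line . "</p>\n";
    }
    file_put_contents('output.html', $html);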
kittrellbj Posted January 21, 2011

I was afraid you were going to say explode. I will start working with the explode option to see how terrible it is at working with files of this size. I suppose I can explode by using the line returns as the delimiter.
Pikachu2000 Posted January 21, 2011

Read it into an array with file() . . .
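file() does the read-and-split in one call, one line per array element; like explode() on the full contents, it loads the whole file into memory ('input.txt' is a placeholder):

    $lines = file('input.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        echo '<p>' . $line . "</p>\n";
    }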
kittrellbj Posted January 21, 2011

I'll try pulling it in with file(). Explode will probably hang too much on larger files.
DavidAM Posted January 22, 2011

See fopen(), fgets(), fwrite(), and fclose(). It looks like fgets() will be better than fread() here. Something along these lines:

    // Stream the file line by line so the whole thing never sits in memory.
    $fIn  = fopen($filename, 'r');
    $fOut = fopen($newFilename, 'w');
    while (($line = fgets($fIn)) !== false) {    // fgets() returns false at end of file
        // Strip the line ending, then wrap the line in paragraph tags.
        $line = '<p>' . str_replace(array("\r", "\n"), '', $line) . "</p>\n";
        fwrite($fOut, $line);
    }
    fclose($fIn);
    fclose($fOut);
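The section-header rule from the first post could ride along in the same loop. In this sketch the keyword list is made up for illustration, and an inline style stands in for the centering tag:

    $headers = array('CHAPTER', 'SECTION', 'PART');  // hypothetical keywords
    while (($line = fgets($fIn)) !== false) {
        $clean = str_replace(array("\r", "\n"), '', $line);
        $isHeader = false;
        foreach ($headers as $keyword) {
            if (stripos($clean, $keyword) === 0) {   // line starts with a keyword
                $isHeader = true;
                break;
            }
        }
        if ($isHeader) {
            fwrite($fOut, '<p style="text-align: center;">' . $clean . "</p>\n");
        } else {
            fwrite($fOut, '<p>' . $clean . "</p>\n");
        }
    }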
kittrellbj Posted January 22, 2011

Thank you very much, David. That looks like it will do the trick. Also, does anyone think that processing files as large as 2 MB would kill the server? It would only be doing one at a time, of course. Should I chop the files up into chunks or throttle it some other way?
ignace Posted January 22, 2011

Using David's proposal you won't even notice that there is a script running, so to speak, in contrast to using explode() and file(), which would pull the entire file contents into memory.
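A rough way to see the difference yourself ('input.txt' is a placeholder for a large file):

    $lines = file('input.txt');  // the whole file becomes an in-memory array
    printf("Peak memory with file(): %.1f MB\n", memory_get_peak_usage() / 1048576);
    // Running the fgets() loop above instead keeps peak memory near PHP's
    // baseline, since only one line is held at a time.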
kittrellbj Posted January 22, 2011

Yes, I tried it with a very large file on the server, about 5 MB in size. It ran and output the new file in about 12 seconds, and that's including latency (from my browser showing the "finished" page to the file appearing in the FTP folder). Hooking the processor to a cron job and a database will be a piece of cake.

I am thinking of running the cron job every 30 seconds to account for extremely large files being processed, but also to make sure that the people waiting for their conversions don't wait very long. On the other hand, I could run the cron every 10 minutes or so and process more files each time. I could also process each file at submission, if you think running this script 50-100 times at a single instant (processing up to 100 MB at a time) would be too much strain for an average server. I know cron jobs can stress the server, but I'd rather the system be self-sufficient so that, on the off chance 1,000 people submit their 1 MB files within the same 5 seconds, the server won't have problems. Of course, the service isn't popular yet - it's a start-up - so the system won't have huge numbers of submissions right off the bat (unless I get very lucky). Any thoughts?

Very nice, I appreciate it. This site always has such helpful people, and they never cease to impress.
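A hedged sketch of what one cron run of that worker might look like; the table and column names match the hypothetical conversion_queue schema earlier in the thread, and convertFile() stands in for the fgets()/fwrite() loop above:

    $db  = new PDO('mysql:host=localhost;dbname=converter', 'user', 'pass');
    // Grab the oldest pending job (status 0), if any.
    $job = $db->query("SELECT id, filename, email FROM conversion_queue
                       WHERE status = 0 ORDER BY id LIMIT 1")->fetch(PDO::FETCH_ASSOC);
    if ($job) {
        // Mark it as processing (status 1) so an overlapping run skips it.
        $db->prepare("UPDATE conversion_queue SET status = 1 WHERE id = ?")
           ->execute(array($job['id']));

        convertFile($job['filename']);  // hypothetical: the conversion loop above

        // Plain-text notification; building a MIME attachment is omitted here.
        mail($job['email'], 'Your file is converted!', 'Your HTML file is ready.');

        $db->prepare("UPDATE conversion_queue SET status = 2 WHERE id = ?")
           ->execute(array($job['id']));
    }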