Everything posted by joallen

  1. I have no control over the other site, unfortunately. I wish they would create an API for us, but that is a shot in the dark. The closest I can get is to use cURL; therefore I have no control over the speed of the requests. The speed of the requests varies dramatically as well: it may take 2 seconds to pull a single order, and then you click the same order and it takes 1 minute to pull. I take it this is completely out of my control, right? It isn't so much the visual content I am collecting; it is the data. Actually, the data is just displayed in a table with a bunch of links. Since it is different for each customer account, caching wouldn't help me here, but that is a great suggestion for other implementations of cURL. Thank you for explaining the background process. I am sure I can set up a cronjob in cPanel and use your technique with the database to process the large requests. When a user clicks the button to request the work it will simply add the "job" to the database, where a script running every 30 seconds will look for new requests; I have sketched out what I am picturing below. Will this be able to handle multiple requests? Let's say 4 different managers were pulling their work in the morning and all 4 requests are in the data table. Thank you for your help!
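Here is a rough, completely untested sketch of the worker I have in mind (the curl_jobs table, its columns, and the runMyCurlRequest function are made-up names standing in for the real thing):

<?php
// worker.php -- run on a schedule by cron
$db = mysqli_connect("localhost", "user", "pass", "mydb");

// Grab any queued jobs; 4 managers pulling work would simply be 4 rows here,
// processed one after another by this single worker.
$result = mysqli_query($db, "SELECT id, payload FROM curl_jobs WHERE status = 'queued'");
while ($job = mysqli_fetch_assoc($result)) {
    // Mark the job running so an overlapping cron run doesn't pick it up twice
    mysqli_query($db, "UPDATE curl_jobs SET status = 'running' WHERE id = " . (int)$job['id']);

    // The long cURL work happens here, outside any web request
    $data = runMyCurlRequest($job['payload']);

    // Store the result; the browser can poll a small script for status = 'done'
    $safe = mysqli_real_escape_string($db, $data);
    mysqli_query($db, "UPDATE curl_jobs SET status = 'done', result = '$safe' WHERE id = " . (int)$job['id']);
}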
  2. Try the following:

$conn = mysql_connect("localhost","root","Password");
$err_db = mysql_select_db('bd_amics', $conn); //Your select_db statement should use the connection
$sql = "INSERT INTO ".$_SESSION["use"][0]." (ID, Amic) VALUES ('".$_SESSION['person'][0]."', '1')";
mysql_query($sql);

You do not necessarily need to close the database connection; it will automatically close at the end of the script. If your script were massive and required various functions, procedures, etc., then closing a connection might be necessary. Also, I am not sure why you were doing:

mysql_query("SET NAMES utf8");

Is your server default not set to utf8? Lastly, in your mysql_query() statement you do not necessarily need to provide the connection information, since you are only working with a single connection. If you were using mysqli it would be a different story. Since it is clear you are just starting out with PHP, please get yourself set up with mysqli instead of mysql. I would hate for you to do what I did and write six months' worth of PHP only to realize mysql has been deprecated and will not receive any more updates or new features; it is a pretty hefty task to go back and modify all your mysql statements to mysqli. A mysqli version of the same insert is sketched below. Hope that works for you! Josh
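For reference, here is roughly what the same insert looks like with mysqli (untested sketch, using your same session values):

$conn = mysqli_connect("localhost", "root", "Password", "bd_amics"); //the database is selected right in the connect call

//The table name still comes from the session, as in your original
$table = $_SESSION["use"][0];
$id = mysqli_real_escape_string($conn, $_SESSION['person'][0]);
$sql = "INSERT INTO ".$table." (ID, Amic) VALUES ('".$id."', '1')";
mysqli_query($conn, $sql); //mysqli_query takes the connection as its first argument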
  3. I'm marking this as solved since requinix did answer the main question. For those who may be wondering what my takeaway is from this, I will sum it up: I cannot necessarily prevent these cURL requests from occurring, nor do I have the ability to speed them up, so my only option at this point was to ensure the max_connections setting for Apache was increased on the server (I had my host provider handle this). I also had them switch the PHP handler from SuPHP to DSO per their suggestion, though I am not sure what the benefit would be in my situation. The key takeaway from requinix is that Apache does not care which file on the server receives the connections, just that max_connections is not reached. Thanks for your help requinix! Josh
  4. Try dumping the $output_mtgox variable after it is decoded to view the array. Looking at blockchain, the data is a json-encoded array keyed by multiple currencies, something mtgox may not have had. I bet the array will look something like:

array(21){ ["usd"]=> array(5){ ["15m"]=>527, etc... }, etc.... }

So basically you will need to modify your echo statements to be something like:

<?php echo $mtgox_array['usd']['last']; ?>

I haven't tested it, but at least that should get you in the right direction; a fuller sketch is below. If you use blockchain you could technically go a step further than the example site you gave and provide the results in the various currencies by using a jQuery slideshow or something of the sort. p.s. Make sure you change the curl_init statement to the blockchain url. Good Luck! Josh
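Putting it together, something like this should be close (untested; double-check blockchain's docs for the exact ticker url, and note the currency keys may be uppercase, e.g. 'USD'):

<?php
//Fetch the ticker json; the url here is an assumption based on blockchain's public API
$ch = curl_init('https://blockchain.info/ticker');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);

//Decode to an associative array so we can index by currency code
$mtgox_array = json_decode($output, true);

echo $mtgox_array['USD']['last'];
?>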
  5. Is this a hosted website? If so, I would suggest reaching out to the provider to ensure MySQL was configured correctly. Do you use phpMyAdmin to access your data tables? Do you use cPanel to manage your site? Were you able to create any databases? Have you modified your mysql_connect statement (if using mysql) to log into the correct database with the correct login information? We need some more information. Hopefully some of those questions can at least get you looking in the right direction. Good Luck! Josh
  6. function checkSku($check_for){
    global $cc_db; //make sure your db object is available inside the function (adjust to however $cc_db is defined in your code)
    $query = $cc_db->query("SELECT `sku` FROM `product` WHERE `sku`='".$check_for."'");
    //NEED SOME ERROR HANDLING ON YOUR QUERY HERE
    $checking = $cc_db->num_rows($query);
    if ($checking < 1){
        return true;
    }else{
        return false;
    }
}

if(isset($_POST['submit'])){
    $product = $_POST['product'];
    $brand = $_POST['brand'];
    $description = $_POST['description'];
    $construct = strtoupper($product."-".$brand."-".$description);
    $number = 0;
    $isNewSku = false;
    while($isNewSku == false){
        $number = $number + 1;
        $number = sprintf("%03s", $number); //zero-pad to 3 characters, e.g. 001
        $check_for = $construct."-".$number;
        $isNewSku = checkSku($check_for);
    }
    //DO WHAT YOU WANT WITH YOUR $check_for VARIABLE, WHICH WILL CONTAIN YOUR NEW SKU, HERE
    echo $check_for.' is a new sku';
}

I haven't tested it, but I'm pretty sure it should work for what you need. Good Luck! Josh
  7. Give the following a try:

while ($row = mysql_fetch_assoc($result)){
    //DO WHAT YOU NEED WITH EACH ROW OF THE DATA HERE
}

By using mysql_fetch_assoc you can use the name of the column (e.g. $row['account_num']) instead of referencing the key in the array. You could even build your html and set it to a variable like this:

<?php
$business_name = $_GET['business_name'];
$this_query = "SELECT business_name, account_num, main_contact, business_phone, business_email, business_suite, business_address, business_city, business_region, business_province, business_postal
    FROM orders
    WHERE sale_datetime IN (SELECT Max(sale_datetime) FROM orders GROUP BY business_name)
    AND business_name = '$business_name'
    ORDER BY business_name ASC";
$mytablehtml = '';
$result = mysql_query($this_query);
if (!$result) {
    echo "Query did not run - ".mysql_error(); //This will spit out any errors with the sql statement
    exit();
}
while ($row = mysql_fetch_assoc($result)){
    $mytablehtml .= '<tr><td>'.$row['business_name'].'</td><td>'.$row['account_num'].'</td></tr>';
}
?>
<html>
<head></head>
<body>
<table>
<thead><tr><th>Business Name</th><th>Account Number</th></tr></thead>
<tbody><?php echo $mytablehtml; ?></tbody>
</table>
</body>
</html>

Keep in mind, the if (!$result) check will output the error with the sql statement on the page and stop there. This is a great way to debug your sql statements as it will pinpoint what the problem is. Personally, I like to remove the mysql_error() part of the echo statement in real applications and just echo text stating what the problem might be, since the mysql_error() output can contain information regarding file paths and server info. Hope that helps!
  8. You can handle something like this using PHP and cURL. Here is a cURL process which will take an array of urls and process them:

function requestData($urls){
    // Create get requests for each URL
    $mh = curl_multi_init();
    foreach($urls as $i => $url) {
        $ch[$i] = curl_init($url);
        curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
        curl_multi_add_handle($mh, $ch[$i]);
    }
    // Start performing the request
    do {
        $execReturnValue = curl_multi_exec($mh, $runningHandles);
    } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
    // Loop and continue processing the request
    while ($runningHandles && $execReturnValue == CURLM_OK) {
        // Wait forever for network
        $numberReady = curl_multi_select($mh);
        if ($numberReady != -1) {
            // Pull in any new data, or at least handle timeouts
            do {
                $execReturnValue = curl_multi_exec($mh, $runningHandles);
            } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
        }
    }
    // Check for any errors
    if ($execReturnValue != CURLM_OK) {
        trigger_error("Curl multi read error $execReturnValue\n", E_USER_WARNING);
    }
    // Extract the content
    foreach($urls as $i => $url) {
        // Check for errors
        $curlError = curl_error($ch[$i]);
        if($curlError == "") {
            $res[$i] = curl_multi_getcontent($ch[$i]);
        } else {
            $res[$i] = '';
        }
        // Remove and close the handle
        curl_multi_remove_handle($mh, $ch[$i]);
        curl_close($ch[$i]);
    }
    // Clean up the curl_multi handle
    curl_multi_close($mh);
    // Return the response data
    return $res;
}

Hope that gets you in the right direction. Good Luck! Josh
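You would call it with a plain array of urls, something like this (the example.com urls are just for illustration):

$urls = array(
    'http://www.example.com/page1',
    'http://www.example.com/page2'
);
$pages = requestData($urls);
//$pages[0] and $pages[1] now hold the html for each url (an empty string if that request errored)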
  9. Unfortunately there isn't much I can do to avoid the scraping, as the information I am requesting from the other site is also dynamic. It's a little difficult to explain, but what the scraping actually does is allow me to serve the html from the other site while handling all of the functions on my site. For instance, a user opens their route on my site and clicks the link to refresh their route. What that actually does is process a cURL request to the site which actually gives us the work, and return the html. I parse the html, swap out any hrefs with my own javascript:function, and then serve the page to the user. So now the user sees the content of the other site, but my site can "listen" in to what they are actually doing as they navigate. The user thinks the links are normal links, but what actually happens is a function is called which sends a cURL request for the html. Although that sounds like some fishy business, it actually isn't, and the users of my site are well aware of what is happening. It basically enables them to have access to not only my system, but it integrates the other system so our data can match perfectly. Yes, there are cURL requests which can take up to 5 minutes, but those are the requests which are handled when the admin users pull all of their work (generally in the morning). While the employees are working throughout the day the cURL requests are generally anywhere from 5-10 seconds, but I could imagine there may be hundreds of cURL requests happening simultaneously, all processed through the functions.php file. You mentioned creating a background process on the server to handle the long requests; can you elaborate on that a little more? I thought about this in the past and didn't really have any idea where to start. Can a background process be started on user interaction? Is there a way I can advise the user the process is complete? Is there a better method that you can think of to accomplish what I am trying to do here? It is working perfectly, but as the number of users increases, so will the number of calls to the functions.php file. Basically, as the users navigate the other site I am serving through my own, each link they click results in a cURL request. I really appreciate your help!
  10. You can accomplish this by using a .htaccess file in the root directory and letting Apache handle it rather than PHP. I am not completely versed in .htaccess, but I have used it in the past to develop a redirect site similar to bit.ly. A Google search on .htaccess should get you in the right direction. I apologize that I cannot give you concrete examples, but .htaccess came to mind when I read this post. You can have the .htaccess file put the referrer information into url variables so they can be read via $_GET in your php function when the .htaccess file redirects; a rough sketch of the idea is below. Good Luck! Josh
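To give a very rough idea of the approach (untested, and the file and variable names here are just placeholders; check the mod_rewrite docs for the exact flags you need):

# .htaccess in the document root
RewriteEngine On
# Send any request that isn't a real file to redirect.php,
# passing the requested path and the referrer along as url variables
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ redirect.php?path=$1&ref=%{HTTP_REFERER} [L,QSA]

Then in redirect.php both values arrive as normal $_GET variables:

<?php
$path = $_GET['path'];
$referrer = $_GET['ref'];
//...look up $path, log $referrer, then header('Location: ...') wherever you need
?>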
  11. Over the past few days I have been running into some issues with my server crashing due to Apache max connections issues. I am running my site off of a hosted Cloud VPS with 200GB of storage, 8192MB memory, 8TB of transfer, Apache, MySQL, PHP5, and CentOS. I am afraid the issue doesn't necessarily lie in the configuration of Apache but in the way I have scripted the PHP on my site, which is the reason I am reaching out here. My site isn't your average website; it is more of a web-based customer management program. There are currently only 2 pages you can actually access via the url bar (signin.php and index.php). All other content is loaded via AJAX and jQuery processes (.load and $.getScript). All AJAX requests are pointed toward a single file called functions.php, where a _POST parameter contains the function name and any additional _POST data required by the function. For example:

AJAX call:

$.post('functions/functions.php', {func:'myFunctionName', ops:'whatever', a:'whatever', b:'whatever'}, function(data){
    //DO WHATEVER I WANT WITH THE RETURN DATA HERE
}, "json");

PHP (functions.php):

require 'dbcon.php';
include 'main_class.php';
include 'f_customerdetail.php';
include 'f_listoptions.php';
include 'f_route.php';
include 'f_useractions.php';

if (isset($_POST['func'])){
    $userfunc = $_POST['func'];
    $funcops = $_POST['ops'];
    if ($funcops != ''){
        $userfunc($funcops);
    }else{
        $userfunc();
    }
}

The functions.php file includes all of my other php files containing all of the functions. Each of the other files (f_customerdata.php, f_route.php, f_payroll.php, etc.) contains a number of functions which handle that specific genre of the site; this was more of an organization method I used to keep track of things. Now that you have a little background, I want to know: is that a toxic way to do things? If I currently have 100 people using the site and every time they navigate it requests data from the functions.php file, then there are going to be a ton of requests pointing to that single file, thus causing Apache to crash; correct? There are multiple functions which use cURL to scrape data from another website as well; therefore, a connection to the functions.php file may last upwards of 5 minutes depending upon the function. A large issue is that all of the content on the site is completely dynamic; it is completely driven by getting data from the database and displaying it. Am I going about this correctly by having a single file handle all of the functions? Or do I need to re-approach it by pointing the AJAX requests directly to the file containing the functions for that particular situation? I know this is a large question. I am completely self-taught, with 4 years of experience, and have developed a massive project over the last 6 months. I just want to be sure I am going about this the correct way. Thank you for your input, Josh
  12. Verify the path to the php file containing your mysql_connect statement. Also I noticed a "p" at the end of mysql_connect. That may be the issue.
  13. I finally got it to work, but it was one of those "I have no clue what I did to make it work" type things. I modified a few sql statements in the processSubItem function which may have been hanging up the script. I also added "return true" at the end of the processSubItem function, which I did not think was necessary at all. Can someone advise whether it is necessary to return something if you do not intend to receive any info from a function? (A quick illustration of what I mean is below.) In this case the rolling_curl function calls processSubItem each time a cURL request completes. Is it necessary for the processSubItem function to return a value? Also, can someone advise what would constitute a "large" cURL request? Thousands of urls seems large to me, but what should a cURL multi request be able to process? And generally how fast? I know it depends on the server I am requesting information from, but I look at it this way: if I navigate to a page on that server it displays almost instantly, so should the cURL request for that same url complete in the same amount of time? This is bothering me because another site has somewhat already accomplished what I am trying to do. But, of course, I cannot view their PHP script and there is no way they would be willing to share it with me =0). Both of the cURL scripts work perfectly by the way; I do not believe that was the issue. So for those who are reading up on cURL multi-requests, the two provided above will more than likely suffice for your needs (again, I am able to pull 300 urls within a matter of 1.6 minutes max). I appreciate the help!!
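(For context on that question: as far as I know, a PHP function with no return statement simply returns NULL, and a caller that ignores the return value is perfectly fine either way. So in my case:

function processSubItem($content, $activity, $updated){
    //...parse $content and insert into the database...
    //no return statement; PHP returns NULL and nothing breaks
}

//inside rolling_curl the callback is invoked like this, with the result discarded:
$callback($output, $activities[$ref], 1);

Which suggests the "return true" by itself was probably not what fixed it.)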
  14. Your query:

$display_brands_query = "select manufacturers_id, manufacturers_name, manufacturers_image from manufacturers where manufacturers_image != '' ORDER BY RAND() Limit 18;";

Change to:

$display_brands_query = "select manufacturers_id, manufacturers_name, manufacturers_image from manufacturers where manufacturers_image IS NOT NULL ORDER BY RAND() Limit 18;";

Just noticed your select statement was correct... could it potentially be the smiley face in your script? lol. Can you edit your post and ensure your script is placed inside the forum's code tags? It would help us help you... Thank you! =0) Cheers!
  15. Use the following: mysql_real_escape_string($yourvariablehere). You have to tell MySQL to treat any special characters in the value as plain string data so they don't break the query. Also ensure the fields in your database are set up correctly. A field containing large amounts of text should be set to TEXT or BLOB. Fields with a smaller number of characters can be set to VARCHAR with a specified number of characters allowed... generally 255 max if you aren't 100% sure of the length of the values going to be inserted. 255 will at least accommodate urls if you are inserting them into a table. You may also want to specify which fields you are inserting into. See below:

<?php
if(isset($_POST['Guardar'])){
    $semana = mysql_real_escape_string($_POST['semana']);
    $video = mysql_real_escape_string($_POST['video']);
    $archivo = mysql_real_escape_string($_POST['archivo']);
    $nombre = mysql_real_escape_string($_POST['nombre']);
    $leccion = mysql_real_escape_string($_POST['leccion']);

    //check whether the week (semana) already exists
    $sql = "SELECT * FROM clases WHERE semana='".$semana."'";
    $result = mysql_query($sql);
    $existe = mysql_num_rows($result);

    if($existe == 0){
        $sql = "INSERT INTO clases (semana, nombre, archivo, leccion, video) VALUES ('".$semana."','".$nombre."','".$archivo."','".$leccion."','".$video."')";
        mysql_query($sql);
    }else{
        $row = mysql_fetch_assoc($result);
        //You should always have a field, in this case named id, which acts as a primary key and auto-number for records.
        $sql = "UPDATE clases SET nombre='".$nombre."', archivo='".$archivo."', nombre_leccion='".$leccion."', video='".$video."' WHERE id=".$row['id'];
        mysql_query($sql);
    }
    header("Location: clases.php");
}
?>

You will notice I use $row['id'] in the update script. You should always have a column which is an auto-number primary key for your tables; it makes updating data much more efficient. In the event you had the same value in semana on multiple rows in your table, updating by semana would update them all. If that was your intent then you can remove my change. I take it your database only contains unique values in the semana column. Cheers!
  16. First off, I sincerely appreciate your quick response!! The $info variable returns the following:

array(20) {
    ["url"]=> string(241) "I REMOVED THE URL DUE TO THE LEGAL STUFF =0) BUT IT WAS HERE"
    ["content_type"]=> string(23) "text/html;charset=UTF-8"
    ["http_code"]=> int(200)
    ["header_size"]=> int(220)
    ["request_size"]=> int(271)
    ["filetime"]=> int(-1)
    ["ssl_verify_result"]=> int(0)
    ["redirect_count"]=> int(0)
    ["total_time"]=> float(0.939846)
    ["namelookup_time"]=> float(1.5E-5)
    ["connect_time"]=> float(0.173083)
    ["pretransfer_time"]=> float(0.371921)
    ["size_upload"]=> float(0)
    ["size_download"]=> float(7950)
    ["speed_download"]=> float(8458)
    ["speed_upload"]=> float(0)
    ["download_content_length"]=> float(-1)
    ["upload_content_length"]=> float(0)
    ["starttransfer_time"]=> float(0.886433)
    ["redirect_time"]=> float(0)
}

And the $content variable did return the corresponding html for the url. It seems to have processed fine. It took 7.11 seconds to log the user in, fetch the sub-item urls (there were 35), and call the rolling_curl script to dump the aforementioned data. Let me elaborate a little further. Consider the following functions (albeit they are simply for logic's sake). The rolling_curl function will return true if it completes; should I be checking for that? And how would I go about doing that?

function getSubItems(){
    //log in users
    //get urls and put into an array named $urls
    //get activities and put into an array named $activities
    $curlData = array('urls'=>$urls, 'activities'=>$activities);
    $rd = new curlProcess;
    $processCurl = $rd->rolling_curl($curlData, 'processSubItem');
}

function processSubItem($content, $activity, $updated){
    //updated will always be 1 for now; you will see it in the rolling_curl callback.
    //array of data needed
    $dataneeded = array('blah','blah','blah','etc','etc');
    $datatoinsert = array();
    //Parse html using simpleHTMLDOM scripts
    $html = str_get_html($content);
    $tds = $html->find('td');
    foreach ($tds AS $td){
        //Get all the data I need and put into an array
        if (in_array($td->innerhtml, $dataneeded)){
            $datatoinsert[] = $td->innerhtml;
        }
    }
    foreach ($datatoinsert AS $data){
        //Insert the data into the database
    }
}

FYI: While I was writing this I ran the script again; it took 3.3 minutes and returned failed for all 35 sub-items. Thank you in advance!!!
  17. Your code:

<?php
include('header.php');
include('config.php');

if (isset($_POST['submit'])) {
    $username = $_POST['username'];
    $password = $_POST['password'];
    $query = "SELECT * FROM login WHERE user_name='$username' AND pass_word='$password' LIMIT 1";
    $result = mysql_query($query) or die(mysql_error());
    if(mysql_num_rows($result)) {
        header('location:home.php');
        exit;
    } else {
        $error = "Wrong Username / Password";
    }
}
?>
<table class="login" align="center">
  <tr><td class="table1">Student Information System</td></tr>
</table>
<div class="table2">
  <form method="post">
    <table class="table3" align="center">
      <?php if(isset($error)): ?>
      <tr><td colspan="2" style="color: red; font-weight: bold"><?php echo $error; ?></td></tr>
      <?php endif; ?>
      <tr><td>Username</td><td><input type="text" name="username"></td></tr>
      <tr><td>Password</td><td><input type="password" name="password"></td></tr>
      <tr><td colspan="2" align="center"><input type="submit" name="submit" value="LogIn"></td></tr>
    </table>
  </form>
</div>
<?php include('footer.php'); ?>

Change to:

<?php
include('header.php');
include('config.php');

if (isset($_POST['submit'])) {
    $username = $_POST['username'];
    $password = $_POST['password'];
    $error = '';
    $query = "SELECT * FROM login WHERE user_name='".$username."' AND pass_word='".$password."' LIMIT 1";
    $result = mysql_query($query) or die(mysql_error());
    $numresults = mysql_num_rows($result);
    if($numresults > 0) {
        //Needs something to set a cookie or session data to check if the user is logged in.
        header('location:home.php');
        exit;
    } else {
        $error = "Wrong Username / Password";
    }
}
?>
<table class="login" align="center">
  <tr><td class="table1">Student Information System</td></tr>
</table>
<div class="table2">
  <form method="post">
    <table class="table3" align="center">
      <tr><td colspan="2" style="color: red; font-weight: bold"><?php if(isset($error)){echo $error;} ?></td></tr>
      <tr><td>Username</td><td><input type="text" name="username"></td></tr>
      <tr><td>Password</td><td><input type="password" name="password"></td></tr>
      <tr><td colspan="2" align="center"><input type="submit" name="submit" value="LogIn"></td></tr>
    </table>
  </form>
</div>
<?php include('footer.php'); ?>

I will generally use mysql_num_rows to verify whether any results returned from a query. Although it is an additional step, it enables me to utilize the variable if needed in other parts of the script. It doesn't seem like you are setting anything such as a session variable or cookie so your other pages know the user is logged in. In essence, all anyone would have to do is navigate to your home.php page, if they knew the url, and nothing would stop them from viewing the information. How does your login page know they aren't already logged in? A simple method would be to set a session variable such as an md5 hash of the user's id in your database. See below:

<?php
//Start the session. This must be at the beginning of every page if you are going to check the session variable.
session_start();

//Check to see if the visitor is logged in by checking for $_SESSION['id']. If they are, simply send them to the home.php page.
if (isset($_SESSION['id'])){
    header('Location: home.php');
    exit;
}

include('header.php');
include('config.php');

if (isset($_POST['submit'])) {
    $username = $_POST['username'];
    $password = $_POST['password'];
    $error = '';
    $query = "SELECT * FROM login WHERE user_name='".$username."' AND pass_word='".$password."' LIMIT 1";
    $result = mysql_query($query) or die(mysql_error());
    $numresults = mysql_num_rows($result);
    if($numresults > 0) {
        $row = mysql_fetch_assoc($result);
        //This should reference a field in your database table that is set as the primary key and an auto-number. If you aren't using one then you should start.
        $_SESSION['id'] = md5($row['id']);
        header('Location: home.php');
        exit;
    } else {
        $error = "Wrong Username / Password";
    }
}
?>
<table class="login" align="center">
  <tr><td class="table1">Student Information System</td></tr>
</table>
<div class="table2">
  <form method="post">
    <table class="table3" align="center">
      <tr><td colspan="2" style="color: red; font-weight: bold"><?php if(isset($error)){echo $error;} ?></td></tr>
      <tr><td>Username</td><td><input type="text" name="username"></td></tr>
      <tr><td>Password</td><td><input type="password" name="password"></td></tr>
      <tr><td colspan="2" align="center"><input type="submit" name="submit" value="LogIn"></td></tr>
    </table>
  </form>
</div>
<?php include('footer.php'); ?>

Keep in mind, $_SESSION['id'] can be set to anything. You could do $_SESSION['myid'] or $_SESSION['itdoesntmatterwhatyounameit']. You are simply setting a variable in the user's session which you can access from any page. The session will eventually time out, at which point the user will be redirected to the login page if they try to navigate to a page which requires them to be logged in. A session can last for days, though. So if you require a timeout feature, you will need to write a script to check the time based upon however you decide, and if the user is supposed to be timed out then the script should destroy the session variable using unset($_SESSION['id']) (or unset($_SESSION['whatever your variable is'])) and redirect the user to the login.php page; a bare-bones sketch of that check follows below. If you decide to use the session variable method, then at the beginning of every php page which requires the user to be logged in you should have the following:

<?php
session_start();
if (!isset($_SESSION['id'])){
    header("Location: login.php");
    exit;
}
//// The rest of your page's script here
?>

Cheers!
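The bare-bones timeout sketch mentioned above (the 30-minute limit and the $_SESSION['last_activity'] name are just examples I made up; adjust to taste):

<?php
session_start();
$timeout = 30 * 60; //30 minutes, in seconds
if (isset($_SESSION['last_activity']) && (time() - $_SESSION['last_activity']) > $timeout){
    unset($_SESSION['id']);            //log the user out
    unset($_SESSION['last_activity']);
    header("Location: login.php");
    exit;
}
$_SESSION['last_activity'] = time();   //reset the clock on every page load
?>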
  18. The portion of your code below:

//header('Location: ?success');
//echo "your enquiry was submitted successfully, thank you,";
header('Location: contme.php');
echo "Form submitted succesfully, thank you";
} else {
    echo "Form failed to send";
}
}

Change to:

//header('Location: ?success');
//echo "your enquiry was submitted successfully, thank you,";
header('Location: thanks.htm');
exit; //nothing echoed after a Location header would be seen anyway
} else {
    echo "Form failed to send";
}
}
  19. I am having a heck of a time trying to process a large cURL request. I keep running into issues with the mysql server timing out and also with using the callback function within the cURL script (see below). What I am attempting to do is utilize cURL to log a user into a system (due to legality issues I cannot specify which) and pull all of their work for the day. I have been successful at pulling all of the work, but each order contains multiple sub-items, each with a specific url. For instance, 300 work orders would translate to approximately 2000 sub-items. Pulling the 300 work orders takes approximately 1.6 minutes. For some reason, pulling just 10 sub-items is taking upwards of 3 minutes. After hundreds (and I am not exaggerating) of attempts I have finally decided to reach out to see if someone can take a look at my script and offer some knowledge.

Here is the process from a logic standpoint:
  • Pull all user login data from the database and log the users into the system through cURL (works fine)
  • Request all activity and customer information and insert it into the database (works fine)
  • Get all sub-items and insert them into the database (ISSUES)

Here is the process from a script standpoint:
  • User clicks the "Import" button, which sends an AJAX request to run the importWork PHP function. This function only handles requesting the activity and customer information through cURL. (Due to the amount of time it takes for the sub-items to process, I have broken up the process.)
  • The importWork function returns, via json, the number of work orders processed. (In testing I have also had the importWork function store the urls for all of the sub-items in my database. The only issue is that the logins will start to time out, not on my server but on the server I am pulling the data from, before all the sub-items can process.)
  • javascript automatically sends another AJAX request to pull all of the sub-items.

I am using a cURL multi function to process the url requests. The function will return an array containing the html for each of the urls. I then parse the html to search for the underlying hrefs I need to access the work orders, customer information, and sub-items. So overall, my question is: what is the best way to handle a large cURL request of 2000 urls? Below you will see the rolling_curl function which I am attempting to use to handle the line items. For some reason it doesn't work at all. What I would like to do is simply send an array of urls to the rolling_curl function and have it request the html for each url. Once a url is finished processing, it should run the callback script to insert the data into the database. I figured it would be the best way to handle such a large request in a timely manner.

ROLLING CURL FUNCTION:
Explanation: a function will put all sub-item urls and the corresponding activity ids into an associative array and pass it to the rolling_curl function. The callback function will parse the html and insert the needed data into the database. The only thing this function is doing at this time is dumping "Failed". I have run the script using the same urls through the standard cURL multi function (see below) and verified it is pulling the html, so it isn't an issue with the urls.
public function rolling_curl($urldata, $callback = null, $custom_options = null) {
    set_time_limit(0);
    //extract data from $urldata
    $urls = $urldata['urls'];
    $activities = $urldata['activities'];
    // make sure the rolling window isn't greater than the # of urls
    $rolling_window = 95;
    $rolling_window = (sizeof($urls) < $rolling_window) ? sizeof($urls) : $rolling_window;
    $master = curl_multi_init();
    $curl_arr = array();
    // add additional curl options here
    $std_options = array(CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS => 5);
    $options = ($custom_options) ? ($std_options + $custom_options) : $std_options;
    // start the first batch of requests
    for ($i = 0; $i < $rolling_window; $i++) {
        $ch = curl_init();
        $options[CURLOPT_URL] = $urls[$i];
        curl_setopt_array($ch, $options);
        curl_multi_add_handle($master, $ch);
    }
    do {
        while(($execrun = curl_multi_exec($master, $running)) == CURLM_CALL_MULTI_PERFORM);
        if($execrun != CURLM_OK) break;
        // a request was just completed -- find out which one
        while($done = curl_multi_info_read($master)) {
            $info = curl_getinfo($done['handle']);
            if ($info['http_code'] == 200) {
                $output = curl_multi_getcontent($done['handle']);
                // request successful. process output using the callback function.
                $ref = array_search($info['url'], $urls);
                $callback($output, $activities[$ref], 1);
                // start a new request (it's important to do this before removing the old one)
                $ch = curl_init();
                $options[CURLOPT_URL] = $urls[$i++]; // increment i
                curl_setopt_array($ch, $options);
                curl_multi_add_handle($master, $ch);
                // remove the curl handle that just completed
                curl_multi_remove_handle($master, $done['handle']);
            } else {
                // request failed. add error handling.
                $dmp = 'Failed!';
                var_dump($dmp);
            }
        }
    } while ($running);
    curl_multi_close($master);
    return true;
}

STANDARD cURL MULTI FUNCTION:

public function requestData($urls) {
    set_time_limit(0);
    // Create get requests for each URL
    $mh = curl_multi_init();
    foreach($urls as $i => $url) {
        $ch[$i] = curl_init($url);
        curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
        curl_multi_add_handle($mh, $ch[$i]);
    }
    // Start performing the request
    do {
        $execReturnValue = curl_multi_exec($mh, $runningHandles);
    } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
    // Loop and continue processing the request
    while ($runningHandles && $execReturnValue == CURLM_OK) {
        // Wait forever for network
        $numberReady = curl_multi_select($mh);
        if ($numberReady != -1) {
            // Pull in any new data, or at least handle timeouts
            do {
                $execReturnValue = curl_multi_exec($mh, $runningHandles);
            } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
        }
    }
    // Check for any errors
    if ($execReturnValue != CURLM_OK) {
        trigger_error("Curl multi read error $execReturnValue\n", E_USER_WARNING);
    }
    // Extract the content
    foreach($urls as $i => $url) {
        // Check for errors
        $curlError = curl_error($ch[$i]);
        if($curlError == "") {
            $res[$i] = curl_multi_getcontent($ch[$i]);
        } else {
            return "Curl error on handle $i: $curlError\n";
        }
        // Remove and close the handle
        curl_multi_remove_handle($mh, $ch[$i]);
        curl_close($ch[$i]);
    }
    // Clean up the curl_multi handle
    curl_multi_close($mh);
    // Return the response data
    return $res;
}

Any assistance would be greatly appreciated!!! I am racking my head against my desk at this point =0). I am open to any suggestions. I will completely scrap the code and take an alternate approach if you would be so kind as to direct me accordingly. FYI: I am running on a hosted, shared server which I have little control over.
PHP plugins might not be a route I can take at this point. But if there is something you know of that will assist me, shoot it at me and I will talk with my hosting provider. THANK YOU!!!!