adam.w.k
Posted May 3, 2009

I've just made my first PHP application and have been coding it offline on my laptop using XAMPP. I got it working how I needed and uploaded it to my webspace. To my surprise, while it still works in general, there are some things it fails to do, which have baffled both me and my friend, who is a PHP programmer.

The application requests online pages using file_get_contents. Once it has the data, it strips the style HTML and writes the new data to a new file using fopen($data, 'w') and fwrite().

This works well on my laptop offline: you can request pages such as Wikipedia articles, some news articles and my own website's main page, and it strips most of the styling from the content. However, when I uploaded it to my webspace, it continued to work on news articles and my website's main page, but it fails with Wikipedia articles (it acts as if the URL were invalid and the file isn't found).

So basically the application works the same online as it does offline, with the exception of Wikipedia articles, which only work offline.

The main code being used is posted below. Please bear in mind this is my first time using PHP, so it might look a little ugly to you, but this is only for a relatively small 2 week project at university, so it doesn't have to be perfect.

<?php
error_reporting(E_ALL ^ E_NOTICE);

$new_file = "newFile.html";
$break = "<br/>";
$insert= "";
$extrabreaks = "";
$turn_off = false;

if(safe_output($_GET["action"]) == "Submit"){

    // INPUT TYPE //
    // If 'URL' checkbox is ticked //
    if(safe_output($_GET["source"]) == "url"){
        $original_file = "http://" . safe_output($_GET["url"]);
    }
    // If 'wiki' checkbox is ticked //
    else if(safe_output($_GET["source"]) == "wiki"){
        $original_file = "http://en.wikipedia.org/wiki/" . safe_output($_GET["wiki"]);
    }

    // BREAKS //
    // If 'extra breaks' checkbox is ticked //
    if(safe_output($_GET["extrabreaks"]) == "true"){
        $insert = "<br/>";
        $break .= "<br/>";
        $extrabreaks = true;
    }

    // TYPE OF RESULT //
    // If 'plain text' checkbox is ticked //
    if(safe_output($_GET["stripper"]) == "true"){
        $stripper = true;
    }
    // If 'HTML' checkbox is ticked //
    else{
        $stripper = false;
    }
}

// Lines
if(file_get_contents($original_file)){ // If file exists and is readable //
    // Store each line of file in $lines //
    $file = file_get_contents($original_file);
    $lines = explode("\n", $file);
    // Start function to run for every line of file //
    loop_lines($lines, $new_file, $stripper, $break, $insert, $extrabreaks, $turn_off);
}
else{ // If file doesn't exist //
    include "error_message.php";
}

// FUNCTIONS //

// Make output safe //
function safe_output($string){
    $string = trim($string);
    $string = strip_tags($string);
    $string = htmlspecialchars($string);
    return $string;
}

// Run tests on each line of original file to decide if it gets output //
function passes_tests($line){
    if(preg_match('{ *<\/?[pP]>}', $line) // Positive //
    || preg_match('{ *<[hH]1}', $line)
    || preg_match('{ *<[hH]2}', $line)
    || preg_match('{ *<[hH]3}', $line)
    || preg_match('{ *<\/?a[ >]}', $line)
    || preg_match('{ *<br ?\/>}', $line)
    || preg_match('{^[A-Z]+[a-zA-Z ]* ?,?[a-zA-Z ]*\.}', $line)
    || preg_match('{^[A-Z]+[a-zA-Z ]*}', $line)
    || preg_match('{the|of|you|your|that|with|they|have|had|were}', $line)
    || preg_match('{ *<\/?[lL][iI][ >]}', $line)
    || preg_match('{ *<\/?[uU][lL][ >]}', $line)
    || preg_match('{ *<\/?[oO][lL][ >]}', $line)
    || preg_match('{ *<\/?[dD][dD][ >]}', $line)
    || preg_match('{ *<\/?[tT][dD][ >]}', $line)
    || preg_match('{ *<\/?[tT][rR][ >]}', $line)
    || preg_match('{ *<\/?[tT][hH][ >]}', $line)
    || preg_match('{ *<\/?[eE][mM][ >]}', $line)
    || preg_match('{ *<\/?[sS][tT][rR][oO][nN][gG][ >]}', $line)
    || preg_match('{ *<\/?[dD][iI][vV][ >]}', $line)
    || preg_match('{ *<\/?[tT][aA][bB][lL][eE][ >]}', $line)
    || preg_match('{ *<\/?[sS][pP][aA][nN][ >]}', $line)
    && !preg_match('{^\.source}', $line) // Negative //
    && !preg_match('{.*? ?:.*?;}', $line)
    && !preg_match('{ *var .+=}', $line)){
        return true;
    }
    else{
        return false;
    }
}

// Replace necessary parts of each line before output //
function replace_parts($line){
    $line = preg_replace('{–}', '-', $line);
    $line = preg_replace('{—}', '—', $line);
    $line = preg_replace('{“}', '”', $line);
    $line = preg_replace('{â€}', '“', $line);
    $line = preg_replace('{“¢}', '•', $line);
    $line = preg_replace('{·}', '·', $line);
    $line = preg_replace('{if ?\(.*\) ?[;\{] ?}', '', $line);
    $line = preg_replace('{function ?.+ ?\(.*\) ?[ ;\)\}]}', '', $line);
    $line = preg_replace('{var .+?\;}', '', $line);
    $line = preg_replace('{@import ?[\'\"]/?.+/?.+\....[\'\"];}', '', $line);
    return $line;
}

// Detect Comments //
function detect_comment($line){
    if(preg_match('{\/\*}', $line) || preg_match('{<\!--}', $line)) {
        return true;
    }
    if(preg_match('{\*\/}', $line) || preg_match('{-->}', $line)) {
        return false;
    }
}

// Loops and outputs each line of data into the new file //
function loop_lines($lines, $new_file, $stripper, $break, $insert, $extrabreaks, $turn_off){
    // Grab onto the handle for new writing
    $file_handle = fopen($new_file, 'w');
    // Loop through the lines
    foreach($lines as $line){
        if(passes_tests($line)){
            // Replace necessary parts within line //
            $line = replace_parts($line);
            // If no comment detected //
            if(detect_comment($line) == false) {
                replace_parts($line);
                if($stripper == true){
                    // Insert one break
                    $insert = $break;
                    // Strip tags
                    $line = strip_tags($line);
                }
                if($extrabreaks == true){
                    // Write line to new file using handle
                    fwrite($file_handle, $line . $insert);
                }
                else{
                    // Write line to new file using handle
                    fwrite($file_handle, $line . $insert);
                }
            }
        }
    }
    include "download_link.php";
    // Close handle
    fclose($file_handle);
    // Display file
    readfile($new_file);
}
?>

I'd be extremely grateful if anyone can help me shed light on what's going on. I've already been to my only friend with PHP experience and he wasn't able to help, but he suggested I come on here and ask. Also, this is my first time posting on a forum of this kind, so please let me know if more info is needed or if I missed anything obvious.
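One way to narrow down a failure like the one described above is to stop treating every false result from file_get_contents as "file not found" and look at what the server actually answered. The sketch below is only an illustration under stated assumptions: the helper name fetch_with_diagnostics and the 'ExampleApp/0.1' user agent string are hypothetical and not part of the posted code, and the idea that a remote host might reject requests without a User-Agent header is a guess that nothing in the post confirms.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL or file and
// report the HTTP status line, so a refused request can be told apart from
// a missing page.
function fetch_with_diagnostics(string $url, string $userAgent): array
{
    // Assumption: some hosts refuse requests that carry no User-Agent.
    // PHP's http wrapper sends none unless one is configured, so set it here.
    $context = stream_context_create([
        'http' => [
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds
        ],
    ]);

    // '@' hides the warning; the failure is reported through the return value.
    $body = @file_get_contents($url, false, $context);

    // For http(s) URLs PHP fills $http_response_header in the local scope.
    // Entry 0 is the status line of the first response, for example
    // "HTTP/1.1 403 Forbidden"; redirects append further status lines.
    // For local files the variable is not set at all.
    $status = isset($http_response_header[0]) ? $http_response_header[0] : null;

    return [
        'ok'     => $body !== false,
        'status' => $status,
        'body'   => $body === false ? '' : $body,
    ];
}
```

A call such as fetch_with_diagnostics($original_file, 'ExampleApp/0.1') could replace the two separate file_get_contents calls in the posted script: the request is made once, an empty page is no longer mistaken for a failure, and the 'status' entry can be printed in the error branch to see whether the server refused the request or the page really does not exist.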