soycharliente Posted October 14, 2007

EDIT: I MARKED LINE 25

I have been trying to create a page that will eliminate everything between the <script> tags on a page and then render all the HTML that is left over. For some reason, the page won't load for me, but when I copy the source, paste it in a text editor, manually delete all the JavaScript, and then just render what is left over, it works fine.

I have an error and I don't know why I'm getting it. I don't know what is incorrect about the code.

Parse error: parse error, unexpected ':', expecting ']' in /fantasypoints.php on line 25

<?php
if (isset($_POST["submiturl"])) {
    $url = $_POST["url"];
    // create a new cURL resource
    $ch = curl_init();
    // set URL and other appropriate options
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    // grab URL and pass it to the browser
    $source = curl_exec($ch);
    // close cURL resource, and free up system resources
    curl_close($ch);
    $script_start = strpos($source, '<script');
    while ($script_start > 0) {
        $script_start = strpos($source, '<script');
        $script_end = strpos($source, '</script>', $script_start) + 9;
        $source = $source[:$script_start] + $source[$script_end:]; // THIS IS LINE 25
    }
}
?>
<html>
<body>
<form action="fantasypoints.php" method="post">
<p>Match URL: <input name="url" type="text" value="<?php echo $url; ?>" maxlength="500" size="50" /></p>
<p><input name="submiturl" type="submit" value="Submit" /></p>
</form>
<?php echo $source; ?>
</body>
</html>

MadTechie Posted October 14, 2007

Remove the :'s

soycharliente Posted October 15, 2007

That didn't work. It rendered: 0

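The 0 is consistent with how PHP treats these operators. PHP has no Python-style slice syntax, so with the colons removed `$source[$script_start]` is just one character, and `+` is arithmetic addition rather than concatenation. On PHP versions of that era, adding two non-numeric strings gives 0. A minimal sketch of the string equivalents, using `substr()` and the `.` operator; the sample input and the short variable names are made up for illustration:

```php
<?php
// Hypothetical sample input, not taken from the thread.
$source = 'abc<script>x()</script>def';

$start = strpos($source, '<script');
$end   = strpos($source, '</script>', $start) + strlen('</script>');

// substr($s, 0, $n) plays the role of a "[:n]" slice,
// and substr($s, $n) plays the role of a "[n:]" slice.
$before = substr($source, 0, $start);
$after  = substr($source, $end);

// "." concatenates strings; "+" would attempt numeric addition instead.
$joined = $before . $after;

echo $joined, "\n"; // prints "abcdef"
```

Newer PHP versions are stricter about arithmetic on non-numeric strings, so the `+` form may raise a warning or an error there instead of quietly producing 0.
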
soycharliente Posted October 15, 2007

I've made some progress. I've got the substring part down, but now it never exits the while loop and $source isn't being updated, and I can't figure out why. Here's the code thus far:

<?php
if (isset($_POST["submiturl"])) {
    $url = $_POST["url"];
    // create a new cURL resource
    $ch = curl_init();
    // set URL and other appropriate options
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    // grab URL and pass it to the browser
    $source = curl_exec($ch);
    // close cURL resource, and free up system resources
    curl_close($ch);
    while (strpos($source, '<script') !== FALSE) {
        $script_start = strpos($source, '<script');
        $script_end = strpos($source, '</script>', $script_start) + 9;
        $source = substr($source, 0, $script_start) . substr($source, $script_start, $script_end - $script_start);
    }
    echo $source;
}
?>
<form action="fantasypoints.php" method="post">
<p>Match URL: <input name="url" type="text" value="<?php echo $url; ?>" maxlength="500" size="50" /></p>
<p><input name="submiturl" type="submit" value="Submit" /></p>
</form>

soycharliente Posted October 15, 2007

Got it working. The error was right here:

substr($source, $script_start, $script_end - $script_start);

It needed to be:

substr($source, $script_end);

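With that fix, the loop keeps everything before the opening tag plus everything after the closing tag, so each pass removes one block and the loop ends once no `<script` remains. A minimal sketch of the corrected loop as a standalone function. The name `strip_script_blocks`, the case-insensitive `stripos()`, and the guard for a missing `</script>` are additions for illustration, not part of the posted code:

```php
<?php
// Hypothetical helper; the thread inlines this logic in fantasypoints.php.
function strip_script_blocks($source)
{
    $close = '</script>';

    // stripos() is case-insensitive, so <SCRIPT> is matched as well.
    while (($start = stripos($source, '<script')) !== false) {
        $end = stripos($source, $close, $start);

        if ($end === false) {
            // No closing tag: drop the remainder instead of looping forever.
            $source = substr($source, 0, $start);
            break;
        }

        // Keep the text before the block and the text after its closing tag.
        $source = substr($source, 0, $start)
                . substr($source, $end + strlen($close));
    }

    return $source;
}

echo strip_script_blocks('<p>a</p><script>x()</script><p>b</p><script src="y.js"></script>'), "\n";
// prints "<p>a</p><p>b</p>"
```

The guard matters because `strpos()` returns false when the closing tag is missing, and false + 9 evaluates to 9, which can leave the `<script` in place and keep the loop running.
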
sKunKbad Posted October 15, 2007

This might work for you:

<?php
$url = "removejs.htm";
// double quotes so that $url is interpolated into the message
$input = @file_get_contents($url) or die("Could not access file: $url");
$regexp = "(.*)<script(.*)<\/script>(.*)";
if(preg_match_all("/$regexp/si", $input, $matches)) {
    echo $matches[1][0];
    echo $matches[3][0];
    unset($matches[1][0]);
    unset($matches[2][0]);
    unset($matches[3][0]);
}
?>

I forgot the while loop for multiple instances, but it wouldn't be hard to do.

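For multiple instances, one alternative to wrapping the match in a loop is `preg_replace()`, which replaces every match in a single call. A minimal sketch, assuming a lazy `.*?` in place of the greedy `(.*)` so that each script block is matched on its own. The function name `strip_scripts_regex` is hypothetical:

```php
<?php
// Hypothetical helper, not from the thread.
function strip_scripts_regex($html)
{
    // .*? is lazy, so each <script>...</script> pair is matched separately.
    // The "s" modifier lets "." span newlines; "i" ignores case.
    $result = preg_replace('#<script\b[^>]*>.*?</script\s*>#si', '', $html);

    // preg_replace() returns null on error; fall back to the input in that case.
    return $result === null ? $html : $result;
}

echo strip_scripts_regex("a<script>1</script>b<SCRIPT type=\"x\">\n2\n</SCRIPT>c"), "\n";
// prints "abc"
```

Using `#` as the pattern delimiter avoids having to escape the slash in `</script>`. Like the string-based loop, this only handles well-formed script tags and is not meant as a security filter.
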