TravisT Posted May 14, 2011

Hello, I am new to the forum and somewhat new to PHP, nice to meet you all. This is the first time I have really scripted with PHP, so I'm still learning what tools I have and what things are called in PHP.

I have a list of urls and, as I loop through each one, I'd like to get information from the webpage. The <title> would be a good start. I also want to know the best way to compare data I have. I'll show the basic code below; I successfully go through each url in the text file and put it in a <ul><li> list just fine.

So if $url == http://www.youtube.com/file, what is the normal way to check whether the word "youtube" is in $url? I found preg_match() but I think I'm approaching the whole thing wrong because I get no output. I am an intermediate to somewhat advanced scripter in other languages similar to PHP; I just need to learn how you do the normal things in PHP.

So I'd like to compare a string "youtube" to a variable '$url'. And I would like to be able to grab the title or other info from the file $url. Here is what I have so far. (Recent research showed me how I should do this with an XML file, so I will probably change the .txt to .xml.) Can you please tell me what to look for, as I have been searching and can't really find a comprehensive answer.

I changed the whole page to an echo trying to fix something last night. Before it was written like..

<?php if ($true) { $var = value ?>
<html code>The value is <?php $var ?> .</html code>
<?php } ?>

/index.php

<?php
include 'include/header.html';

echo "<div id='wrapper'>
  <div id='left'>
    <div class='article'>
      <br />
      <p>";
echo "Today is " . date("l") . ", the " . date("jS") . " of " . date("F") . ".";

$lines = file('data/news.txt');
if ($lines) {
    foreach ($lines as $line_num => $line) {
        //trim() drops the newline that file() leaves at the end of each line
        $url = htmlspecialchars(trim($line));
        //Now I have url. I want to check the url and get the <title> & misc. data.
        //if youtube is in $url {html code to embed youtube};
        //my attempt was $x = file($url); but I got a lot of 404 and 403 errors.
        //now I fill html.
        echo "<ul id='menu1' class='auroramenu'>
          <li><a href='#'>Story ".$line_num."</a>
            <a style='display: none;' class='aurorashow' href='#'></a>
            <a style='display: inline;' class='aurorahide' href='#'></a>
            <ul>
              <br />
              <p>".$url."</p><br />
              <li style='text-align:right;'><a href='".$url."' target='_blank'>Read the story.</a></li>
            </ul>
          </li>
        </ul>";
    }
}

echo "</p><br />
    </div>
  </div>
  <div id='right'>";
include 'include/sidebar.html';
echo "</div><br class='clr' /></div><br />";
include 'include/footer.html';
?>

Thank you for your help.

Link: https://forums.phpfreaks.com/topic/236411-open-html-http-file-and-retrieve/

TravisT (Author) Posted May 14, 2011

I am reading through the forum right now.

anupamsaha Posted May 14, 2011

Use strstr() or stristr() (the case-insensitive variant) to get the job done. Read more here: http://php.net/manual/en/function.strstr.php

Hope it helps. Thanks!
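
A minimal sketch of the substring check suggested above, assuming the url is already in $url. The helper name urlContains() is made up for illustration and is not from the thread. It uses stripos(), which only reports the position of the match, next to stristr(), which returns the matching part of the string.

```php
<?php
// Hypothetical helper (not from the thread): true when $needle occurs
// anywhere in $haystack, ignoring case.
function urlContains($haystack, $needle)
{
    // stripos() returns the offset of the match (possibly 0) or false,
    // so compare with !== false instead of relying on truthiness.
    return stripos($haystack, $needle) !== false;
}

$url = 'http://www.youtube.com/file';

if (urlContains($url, 'youtube')) {
    echo "youtube link\n";
}

// stristr() works as well: it returns the rest of the string from the
// first match onward, or false when there is no match.
if (stristr($url, 'YouTube') !== false) {
    echo "stristr() matched too\n";
}
```

The strict !== false comparison matters because a match at the very start of the string has offset 0, which a loose comparison would treat as "not found".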

TravisT (Author) Posted May 14, 2011

Thanks! I'm trying that out. I found a post showing how to use preg_match() better and that helped. strstr() seems easier.
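
For comparison, here is a sketch of the same check done with preg_match(). The thread does not show the original attempt, so this is only a guess at what was missing: a PCRE pattern needs delimiters (the slashes below), and a literal word is safest when passed through preg_quote(). The helper name matchesWord() is hypothetical.

```php
<?php
$url = 'http://www.youtube.com/file';

// Hypothetical helper: case-insensitive search for a literal word.
function matchesWord($subject, $word)
{
    // preg_quote() escapes regex metacharacters in $word;
    // the second argument also escapes the '/' delimiter.
    $pattern = '/' . preg_quote($word, '/') . '/i';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $subject) === 1;
}

echo matchesWord($url, 'youtube') ? "match\n" : "no match\n";
```

For a fixed word like this, the plain string functions are simpler; preg_match() earns its keep once the pattern is more than a literal.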

QuickOldCar Posted May 14, 2011

Here's a way I came up with. If anyone has better or faster methods than this I'd love to hear it.

I parse the url to find the host, then match against that; otherwise you could easily be finding the word youtube or youtube.com in any part of a url. Example would be: http://mysite.com/out.php?url=http://www.youtube.com/movies

Stripping the protocol, exploding on the / , using $variable[0], and then preg_match also works. If you want fast displaying results on a page in whatever order, look into multi-curl. This is the simple method and should find most titles but not all.

<?php
//check if youtube function
function checkYoutube($inserturl) {
    $inserturl = strtolower(trim($inserturl));
    //add the http protocol only when no http or https protocol is present
    if (!preg_match('#^https?://#', $inserturl)) {
        $inserturl = "http://$inserturl";
    }
    $parsedUrl = parse_url($inserturl);
    if (!empty($parsedUrl['host'])) {
        $host = $parsedUrl['host'];
    } else {
        //no host parsed, fall back to the first path segment
        $parts = explode('/', isset($parsedUrl['path']) ? $parsedUrl['path'] : '', 2);
        $host = trim($parts[0]);
    }
    $checkhost = "youtube.com";
    // match
    if (preg_match('/' . preg_quote($checkhost, '/') . '/i', $inserturl)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

//read a file
$my_file = "urls.txt"; //change file name to yours
if (file_exists($my_file)) {
    $data = file($my_file);
    $total = count($data);
    echo "<br />Total urls: $total<br />";
    foreach ($data as $line) {
        //trim first so blank lines are really skipped
        $url = trim($line);
        if ($url != "" && checkYoutube($url) == TRUE) {
            //making sure any url has the http protocol
            if (!preg_match('#^https?://#i', $url)) {
                $url = "http://$url";
            }
            //using curl is better for more options, setting the timeout matters for speed versus accuracy
            $context = stream_context_create(array(
                'http' => array(
                    'timeout' => 8
                )
            ));
            //get the content from url
            $the_contents = @file_get_contents($url, false, $context);
            //alive or dead condition
            if (empty($the_contents)) {
                $status = "dead";
                $color = "#FF0000";
                $title = $url;
            } else {
                $status = "alive";
                $color = "#00FF00";
                //fall back to the url when the page has no <title>
                if (preg_match("/<title>(.*)<\/title>/Umis", $the_contents, $match)) {
                    $title = trim($match[1]);
                } else {
                    $title = $url;
                }
                //$title = htmlspecialchars($title, ENT_QUOTES);
                //saving data to database
            }
            //show results on page
            echo "<a style='font-size: 20px; background-color: #000000; color: $color;' href='$url' TARGET='_blank'>$title</a><br />";
        }
    }
} else {
    echo "Can't locate $my_file";
}
?>
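
The point about matching the host rather than the whole url can be shown with a small sketch. It is not the code from the post: the helper name isYoutubeHost() is hypothetical, it uses parse_url() with the PHP_URL_HOST component, and it assumes the url already carries a protocol (without one, parse_url() finds no host and the helper returns false).

```php
<?php
// Hypothetical helper: true only when the url's host is youtube.com
// or one of its subdomains (www.youtube.com, m.youtube.com, ...).
function isYoutubeHost($url)
{
    $host = parse_url(trim($url), PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // no host could be parsed
    }
    $host = strtolower($host);
    $suffix = '.youtube.com';

    return $host === 'youtube.com'
        || substr($host, -strlen($suffix)) === $suffix;
}

$redirect = 'http://mysite.com/out.php?url=http://www.youtube.com/movies';

// A plain substring test is fooled by the query string...
var_dump(stripos($redirect, 'youtube.com') !== false);
// ...while the host test only looks at mysite.com.
var_dump(isYoutubeHost($redirect));
var_dump(isYoutubeHost('http://www.youtube.com/movies'));
```

Comparing the end of the host against ".youtube.com" also avoids accepting look-alike hosts such as notyoutube.com, which a loose pattern match on the host would let through.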

QuickOldCar Posted May 14, 2011

I made a slight error, as I wasn't checking just the host area but the entire url. I made the changes here.

For anyone wanting to use this, just make a text file named urls.txt in the same folder as this script. Place the urls one per line.

<?php
//check if youtube function
function checkYoutube($inserturl) {
    $inserturl = strtolower(trim($inserturl));
    //add the http protocol only when no http or https protocol is present
    if (!preg_match('#^https?://#', $inserturl)) {
        $inserturl = "http://$inserturl";
    }
    $parsedUrl = parse_url($inserturl);
    if (!empty($parsedUrl['host'])) {
        $host = $parsedUrl['host'];
    } else {
        //no host parsed, fall back to the first path segment
        $parts = explode('/', isset($parsedUrl['path']) ? $parsedUrl['path'] : '', 2);
        $host = trim($parts[0]);
    }
    $checkhost = "youtube.com";
    // match against the host only, not the entire url
    if (preg_match('/' . preg_quote($checkhost, '/') . '/i', $host)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

//read a file
$my_file = "urls.txt"; //change file name to yours
if (file_exists($my_file)) {
    $data = file($my_file);
    $total = count($data);
    echo "<br />Total urls: $total<br />";
    foreach ($data as $line) {
        //trim first so blank lines are really skipped
        $url = trim($line);
        if ($url != "" && checkYoutube($url) == TRUE) {
            //making sure any url has the http protocol
            if (!preg_match('#^https?://#i', $url)) {
                $url = "http://$url";
            }
            //using curl is better for more options, setting the timeout matters for speed versus accuracy
            $context = stream_context_create(array(
                'http' => array(
                    'timeout' => 8
                )
            ));
            //get the content from url
            $the_contents = @file_get_contents($url, false, $context);
            //alive or dead condition
            if (empty($the_contents)) {
                $status = "dead";
                $color = "#FF0000";
                $title = $url;
            } else {
                $status = "alive";
                $color = "#00FF00";
                //fall back to the url when the page has no <title>
                if (preg_match("/<title>(.*)<\/title>/Umis", $the_contents, $match)) {
                    $title = trim($match[1]);
                } else {
                    $title = $url;
                }
                //$title = htmlspecialchars($title, ENT_QUOTES);
                //saving data to database
            }
            //show results on page
            echo "<a style='font-size: 20px; background-color: #000000; color: $color;' href='$url' TARGET='_blank'>$title</a><br />";
        }
    }
} else {
    echo "Can't locate $my_file";
}
?>
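
The code comment above notes that curl gives more options than file_get_contents(). A rough sketch of that route follows, under stated assumptions: the helper names fetchHtml() and extractTitle() and the User-Agent string are invented for illustration, and the PHP curl extension must be installed for fetchHtml() to work. Some servers reject requests that carry no User-Agent, which is one possible cause of the 403 errors mentioned in the first post, though the thread does not confirm that.

```php
<?php
// Hypothetical helper: fetch a page with cURL; returns the body or null.
function fetchHtml($url, $timeoutSeconds = 8)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (title-checker sketch)');
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and non-2xx answers (404, 403, ...) as "dead".
    return ($body !== false && $code >= 200 && $code < 300) ? $body : null;
}

// Hypothetical helper: pull the <title> text out of an HTML string.
function extractTitle($html)
{
    // [^>]* tolerates attributes on the tag; the s flag lets . span newlines.
    if (!preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $match)) {
        return null;
    }
    $title = html_entity_decode($match[1], ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace so multi-line titles print on one line.
    return trim(preg_replace('/\s+/', ' ', $title));
}

// Works on any HTML string, so it can be tried without a network call.
echo extractTitle("<html><head><title> My &amp; Page </title></head></html>"), "\n";
```

In the loop from the post, fetchHtml($url) would stand in for the file_get_contents() call, with a null result mapped to the "dead" branch. A regular expression is still a shortcut here; it will miss titles in unusual markup, in line with the post's remark that this finds most titles but not all.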