cihan Posted September 30, 2009

Hi, I'm a newbie and this is my first post. I need some help changing this code:

    $page = 0;
    $URL = "http://www.blabla.com/";
    $page = @fopen($URL, "r");
    print("Links at $URL<BR>\n");
    print("<UL>\n");
    while(!feof($page)) {
        $line = fgets($page, 255);
        while(eregi("HREF=\"[^\"]*\"", $line, $match)) {
            print("<LI>");
            print($match[0]);
            print("<BR>\n");
            $replace = ereg_replace("\?", "\?", $match[0]);
            $line = ereg_replace($replace, "", $line);
        }
    }
    print("</UL>\n");
    fclose($page);

This code grabs the links from a page and prints them as text. It works well, but what I want is to grab only rapidshare, megaupload and some other popular file-host links, and to output them as clickable links. Could somebody change the code to do that? Thank you...

cihan Posted September 30, 2009

BUMP. Any ideas?

Psycho Posted September 30, 2009

Note that this only works if they put the href content in double quotes. They could use single quotes or no quotes. I just didn't have the time to code for all three scenarios.

    <?php

    $url = "http://www.somesite.com";
    $allowedDomains = array ( 'rapidshare', 'megaupload' );

    $page = file_get_contents($url);
    preg_match_all("/<a.*?href="([^"]*).*?>.*?</a>/is", $page, $matches);

    echo "Links at $url<br /> ";
    echo "<ul> ";
    foreach ($matches[1] as $match) {
        $validDomain = false;
        foreach ($allowedDomains as $domain) {
            if (strpos($match, $domain)) {
                $validDomain = true;
                break;
            }
        }
        if ($validDomain) {
            echo "<li>$match</li>";
        }
    }
    echo "</ul> ";

    ?>

cihan Posted September 30, 2009

First of all, thank you for your answer. I get the following error when I try to use the code:

    Parse error: syntax error, unexpected '(' in /home/cihan/public_html/xxxxxxxxxxxx.com/wp-content/themes/clean-minimal/index.php on line 12

cags Posted September 30, 2009

Some of the characters in the string need to be escaped. I'm not great at regex, but try this...

    preg_match_all("/<a.*?href=\"([^\"]*).*?>.*?<\/a>/is", $page, $matches);

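The parse error comes from the unescaped double quotes inside a double-quoted PHP string, and the `/` in `</a>` also clashes with the `/` regex delimiter. One way to sidestep most of the escaping is to put the pattern in a single-quoted string and pick a different delimiter. The following is only a sketch of that idea; the `$sample` markup is made up for illustration and is not from the thread.

```php
<?php
// Same pattern as in the post above, written in a single-quoted PHP string
// with "#" as the regex delimiter, so neither the double quotes nor the "/"
// in "</a>" needs a backslash.
$pattern = '#<a.*?href="([^"]*).*?>.*?</a>#is';

// Hypothetical sample markup, only for illustration.
$sample = '<p><a href="http://rapidshare.com/files/1">one</a> '
        . '<a class="x" href="http://example.com/">two</a></p>';

preg_match_all($pattern, $sample, $matches);

// $matches[1] holds the captured href values, one per anchor found.
print_r($matches[1]);
```

Like the original, this pattern still only recognises href values wrapped in double quotes.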
Psycho Posted September 30, 2009

Odd, the forum accidentally removed the escaping on the double quotes. I had actually tested the code and it was working fine; I just copied and pasted it into the forum. cags has it correct.

cihan Posted September 30, 2009

cags said: Some of the characters in the string need to be escaped. I'm not great at Regex, but try this...

    preg_match_all("/<a.*?href=\"([^\"]*).*?>.*?<\/a>/is", $page, $matches);

Now it works! Thanks a lot, mjdamato and cags! This forum is great!

mjdamato said: Note that this only works if they put the href content in double quotes. They could use single quotes or no quotes.

Does anybody have some time to add these scenarios to the existing code above? Thanks in advance...

Psycho Posted September 30, 2009

Well, I was easily able to adapt it for double or single quotes, but not for no quotes. Note that the only changes are the preg_match_all() call and the $matches array index used in the foreach() loop.

    <?php

    $url = "http://www.somesite.com";
    $allowedDomains = array ( 'domain1', 'domain2' );

    $page = file_get_contents($url);
    // Group 1 captures the opening quote character; \\1 requires the same
    // character to close the value, which is captured in group 2.
    preg_match_all("/<a.*?href=(\"|\')(.*?)\\1.*?>.*?<\/a>/is", $page, $matches);

    echo "<pre>";
    //print_r($matches);
    echo "</pre>";

    echo "Links at $url<br /> ";
    echo "<ul> ";
    foreach ($matches[2] as $match) {
        $validDomain = false;
        foreach ($allowedDomains as $domain) {
            if (strpos($match, $domain)) {
                $validDomain = true;
                break;
            }
        }
        if ($validDomain) {
            echo "<li>$match</li>";
        }
    }
    echo "</ul> ";

    ?>

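The unquoted case can also be handled without a regex by letting an HTML parser read the attributes. This is a minimal sketch, assuming PHP 7 or later with the bundled DOM extension enabled; the function name `extractHrefsWithDom` is made up for illustration and is not part of the code posted in this thread.

```php
<?php
/**
 * Return the href value of every <a> element in an HTML string.
 * The parser copes with double-quoted, single-quoted and unquoted values.
 */
function extractHrefsWithDom(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}
```

The returned array could then be fed to the same domain-filtering loop in place of `$matches[2]`.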
cags Posted September 30, 2009

To match an anchor href that doesn't use quotes, could you not use a space or > as the ending criterion and make the quote at the start optional? The only unquoted HTML I can think of that would work correctly is

    <a href=http://www.google.com title="or some other attribute">

or

    <a href=http://www.google.com>

Something along these lines (this is untested, just theoretical):

    preg_match_all("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $page, $matches);

But as previously mentioned, I'm not that great with regex; I only wrote my first simple one earlier this week.

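One possible way to cover all three quoting styles in a single expression is to give each style its own alternative, so that a quoted value can only be closed by the quote character that opened it. The sketch below is one such variant, not the expression used in the thread; `extractHrefsWithRegex` is a hypothetical name, and it assumes PHP 7 or later.

```php
<?php
/**
 * Extract href values from <a> tags, whether the value is wrapped in
 * double quotes, single quotes, or not quoted at all.
 */
function extractHrefsWithRegex(string $html): array
{
    // (?| ... ) is a PCRE "branch reset" group: whichever alternative
    // matches, the captured value ends up in group 1.
    //   "([^"]*)"      double-quoted value
    //   '([^']*)'      single-quoted value
    //   ([^\s"'>]+)    unquoted value, ends at whitespace or ">"
    $pattern = '~<a\b[^>]*?\bhref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))~i';

    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}
```

Because each quoted alternative excludes only its own quote character, a value such as `name=O'Donnel` inside double quotes is captured whole.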
cihan Posted September 30, 2009

mjdamato said: Well, I was easily able to adapt it for double or single quotes, but not no quotes.

Thanks a lot, mjdamato, for the fast reply and the help; the code works like a charm! I assume it doesn't matter whether the links sit under an image or a button, and the code can still grab them?

The URLs are displayed as plain text. Can we make them clickable?

And one last question: I want to put this PHP code into WordPress posts. In WordPress I can output the permalink with <?php the_permalink(); ?>, but it doesn't work when I try to put that permalink code into the PHP code you've written (where http://www.somesite.com is). I think it's simply because I'm trying to put PHP code inside other PHP code, but I don't know what to do. Is there a solution for that?

I know I'm asking many questions, requesting a lot of help and taking up your time, but I'm really interested in PHP; it's like magic. Thanks again...

cihan Posted September 30, 2009

cags said: To match an anchor href which doesn't use quotes, could you not use space or > as the ending criteria and make the quote at the start optional?

    preg_match_all("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $page, $matches);

I'm a little bit confused; I don't know what the difference is between your preg_match_all and mjdamato's. Can I use either one of them, or does only one suit the code?

Psycho Posted September 30, 2009

cags said:

    preg_match_all("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $page, $matches);

I haven't tested that, but I'm not sure it would work. Well, at least not 100%. The expression I posted looks for a single or double quote and then looks for the same character to end the parameter value. With an "or" in both places, the match starts at a single or double quote and then ends at the first single quote, double quote or space, whichever comes first. So this href

    href="http://www.mysite.com/users?name=O'Donnel"

would only return "http://www.mysite.com/users?name=O", whereas my script will always get the entire value of a parameter delimited with single or double quotes. I don't consider myself a regex expert, and I'm sure there is a better expression.

cihan said: The urls are displayed in text format. Can we make them clickable? And the last question: I want to put this php code into wordpress posts.

To turn the displayed text into hyperlinks, just change the echo statement accordingly:

    echo "<li><a href=\"{$match}\">{$match}</a></li>";

As for your WordPress problem, I can't really help you as I've never used it, but the googling I've done indicates that the_permalink() is used to display the link to the current post being displayed. So I'm not sure how that applies to what you are doing.

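Two details of the filtering and output steps may be worth tightening. `strpos()` returns `0` when the needle sits at the very start of the string, which an `if` treats as false, and a plain substring test also accepts URLs that merely mention a host name somewhere in their path. Echoing a scraped URL straight into an attribute also leaves it unescaped. The sketch below compares the URL's host instead and escapes the output. It assumes PHP 7 or later and full domain names such as `rapidshare.com` in the allow list (the thread uses bare names like `rapidshare`); `isAllowedHost` and `renderLinkList` are hypothetical helper names.

```php
<?php
/** True if the link's host is one of the allowed domains or a subdomain. */
function isAllowedHost(string $link, array $allowedDomains): bool
{
    $host = parse_url($link, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // relative or malformed URL: no host to compare
    }
    $host = strtolower($host);

    foreach ($allowedDomains as $domain) {
        $suffix = '.' . strtolower($domain);
        // Accept "rapidshare.com" itself and hosts like "www.rapidshare.com".
        if ($host === strtolower($domain) || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

/** Build a <ul> of clickable, HTML-escaped links for the allowed hosts. */
function renderLinkList(array $links, array $allowedDomains): string
{
    $items = '';
    foreach ($links as $link) {
        if (isAllowedHost($link, $allowedDomains)) {
            // Escape before output so quotes or "&" in a scraped URL
            // cannot break out of the attribute.
            $safe = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');
            $items .= "<li><a href=\"{$safe}\">{$safe}</a></li>\n";
        }
    }
    return "<ul>\n{$items}</ul>\n";
}
```

With the script from this thread, the call would look something like `echo renderLinkList($matches[2], array('rapidshare.com', 'megaupload.com'));`.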
cihan Posted October 1, 2009

Thank you, mjdamato, for your help again; you're a PHP genie!

mjdamato said: As for your Wordpress problems I can't really help you as I've never used it. but the googling I've done indicates that the_permalink() is used to display the link to the current post being displayed. So, I'm not sure how that applies to what you are doing.

Yes, you're right: the_permalink() is used to display the link to the current post being displayed. I'll point the permalink at the website content that the rapidshare links are grabbed from. If you know how, please just show me how I can insert this permalink into the PHP code.

Or I have another idea: I want to create an RSS feed file (XML or PHP) that takes only the permalinks of another website (you see, the only problem is getting that link in order to grab the rapidshare etc. links in its content) and uses the PHP code you've written for the description or content area of the feed, because I want the rapidshare links as the description/content of the feed items. For each feed item, the PHP code uses the permalink to grab the rapidshare etc. links and puts them into the description area. So when I fetch that feed file once, I get all these links and can put them into my WordPress posts. That's better than running the code in the WordPress template files on every page load.

About the great PHP code you wrote: it's really powerful and works great. I don't know if it's possible to hide the output rapidshare etc. links behind a JavaScript (or similar) button with "click to display the download links" written on it. I hope I'm not asking for the impossible. Thank you again and again for your great help!

I'm editing my post because I have another idea: grabbing rapidshare etc. links with PHP code from three or four defined websites for given keywords, and using <?php the_title(); ?> to get the WordPress post title as the keywords.

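On the permalink question: `the_permalink()` echoes the URL, so its result cannot be assigned to a variable, and a `<?php ... ?>` tag cannot be nested inside a PHP string. WordPress also provides `get_permalink()`, which returns the URL instead of printing it. The sketch below assumes it runs inside a WordPress template, within the Loop; the `function_exists()` guard and the fallback URL are only there so the snippet also runs outside WordPress and are not something the thread's code needs.

```php
<?php
// get_permalink() returns the current post's URL as a string, whereas
// the_permalink() prints it. Outside WordPress the function does not exist,
// so fall back to the placeholder URL used earlier in the thread.
$url = function_exists('get_permalink')
    ? get_permalink()
    : 'http://www.somesite.com';

// From here on, $url can be used exactly like the hard-coded value,
// for example as the argument to file_get_contents($url).
```

Whether fetching a post's own permalink is the right source page depends on where the file-host links actually live, which the thread leaves open.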
cihan Posted October 1, 2009

Bump. Any ideas?