
Hi, I'm a newbie and this is my first post.

I need some help changing this code:

 

$page = 0;
$URL = "http://www.blabla.com/";
$page = @fopen($URL, "r");
print("Links at $URL<BR>\n");
print("<UL>\n");
while(!feof($page)) {
    $line = fgets($page, 255);
    while(eregi("HREF=\"[^\"]*\"", $line, $match)) {
        print("<LI>");
        print($match[0]);
        print("<BR>\n");
        $replace = ereg_replace("\?", "\?", $match[0]);
        $line = ereg_replace($replace, "", $line);
    }
}
print("</UL>\n");
fclose($page);

 

This code grabs links from a page as text. It works well, but what I want is to grab only rapidshare, megaupload and some other popular file-host links, in clickable format. Could somebody change this code to do that? Thank you...


Note that this only works if they put the href content in double quotes. They could use single quotes or no quotes. I just didn't have the time to code for all three scenarios.

 

<?php

$url = "http://www.somesite.com";

$allowedDomains = array
(
    'rapidshare',
    'megaupload'
);

$page = file_get_contents($url);
preg_match_all("/<a.*?href="([^"]*).*?>.*?</a>/is", $page, $matches);

echo "Links at $url<br /> ";
echo "<ul> ";
foreach ($matches[1] as $match)
{
    $validDomain = false;
    foreach ($allowedDomains as $domain)
    {
        if (strpos($match, $domain) !== false)
        {
            $validDomain = true;
            break;
        }
    }

    if ($validDomain)
    {
        echo "<li>$match</li>";
    }
}
echo "</ul> ";

?>

The double quotes and the forward slash inside the pattern need to be escaped. I'm not great at Regex, but try this...

 

preg_match_all("/<a.*?href=\"([^\"]*).*?>.*?<\/a>/is", $page, $matches);

 

Now it works! Thanks a lot mjdamato and cags! This forum is great!

 

mjdamato said: "Note that this only works if they put the href content in double quotes. They could use single quotes or no quotes."

Does anybody have some time to add those scenarios to the existing code above? Thanks in advance...

Well, I was easily able to adapt it for double or single quotes, but not no quotes.

 

Note, the only changes are the preg_match_all() pattern and the $matches array index used in the foreach() loop

 

<?php

$url = "http://www.somesite.com";

$allowedDomains = array
(
    'domain1',
    'domain2'
);

$page = file_get_contents($url);
preg_match_all("/<a.*?href=(\"|\')(.*?)\\1.*?>.*?<\/a>/is", $page, $matches);


echo "<pre>";
//print_r($matches); // uncomment to inspect the raw matches
echo "</pre>";

echo "Links at $url<br /> ";
echo "<ul> ";
foreach ($matches[2] as $match)
{
    $validDomain = false;
    foreach ($allowedDomains as $domain)
    {
        if (strpos($match, $domain) !== false)
        {
            $validDomain = true;
            break;
        }
    }

    if ($validDomain)
    {
        echo "<li>$match</li>";
    }
}
echo "</ul> ";

?>
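
The substring check in the loop above matches an allowed name anywhere in the URL, including the query string. A minimal sketch of a stricter check, assuming it is acceptable to compare against the host part only; `isAllowedHost` is a hypothetical helper name, not something from the script above:

```php
<?php
// Hypothetical helper: true when the URL's host contains one of the allowed names.
function isAllowedHost(string $href, array $allowedDomains): bool
{
    $host = parse_url($href, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // relative link, or a URL parse_url() could not read
    }
    foreach ($allowedDomains as $domain) {
        // stripos() returns false for "not found" and may return 0 for a
        // match at the start, so compare against false explicitly.
        if (stripos($host, $domain) !== false) {
            return true;
        }
    }
    return false;
}

$allowedDomains = array('rapidshare', 'megaupload');
var_dump(isAllowedHost('http://rapidshare.com/files/123', $allowedDomains));    // bool(true)
var_dump(isAllowedHost('http://example.com/?ref=rapidshare', $allowedDomains)); // bool(false)
```

Inside the foreach loop this could stand in for the inner domain loop, e.g. `if (isAllowedHost($match, $allowedDomains))`.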

To match an anchor href which doesn't use quotes, could you not use a space or > as the ending criteria and make the quote at the start optional? The only HTML forms I can think of that would work correctly are...

 

<a href=http://www.google.com title="or some other attribute">

or

<a href=http://www.google.com>

 

Something along the lines of (this is untested, just theoretical).

 

preg_match_all("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $page, $matches);

 

But as previously mentioned I'm not that great with Regex, only wrote my first simple one earlier this week.
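
One possible way to cover all three cases (double quotes, single quotes, no quotes) is to give each quoting style its own alternative, so a quoted value can only end at the same kind of quote. This is a sketch, not a tested replacement for the scripts in this thread; the sample markup is made up, and it assumes no attribute value before the href contains a > character:

```php
<?php
// Made-up sample markup covering the three quoting styles.
$page = <<<'HTML'
<a href="http://www.mysite.com/users?name=O'Donnel">double</a>
<a class='x' href='http://rapidshare.com/files/1'>single</a>
<a href=http://www.google.com title="or some other attribute">bare</a>
<a href=http://www.google.com>bare, no other attribute</a>
HTML;

// (?| ... ) is a PCRE "branch reset" group: whichever alternative matches,
// the captured value always ends up in group 1.
$pattern = '~<a(?=\s)[^>]*?\shref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))~i';

preg_match_all($pattern, $page, $matches);
print_r($matches[1]);
```

With this pattern the foreach loop would iterate over `$matches[1]`.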

 

 


Thanks a lot mjdamato for the fast reply and the help, the code works like a charm!

I assume it doesn't matter whether the links sit under an image or a button, and the code can still grab them?

The URLs are displayed as plain text. Can we make them clickable?

And the last question:

I want to put this PHP code into WordPress posts. In WordPress I can call permalinks with

<?php the_permalink(); ?>

but it doesn't work when I try to put this permalink code into the PHP code you've written (where http://www.somesite.com is). I think it's simply because I'm trying to put PHP code inside other PHP code, but I don't know what to do. Is there a solution for that? I know I'm asking many questions, requesting a lot of help and taking your time, but I'm really interested in PHP, it's like magic. Thanks again... :)

 


I'm a little bit confused: I don't know what the difference is between your preg_match_all and mjdamato's. Can I pick either one, or do both suit the code?

preg_match_all("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $page, $matches);

 

I haven't tested that, but I'm not sure it would work. Well, at least not 100%. The expression I posted looks for a single or double quote and then looks for the same character to end the parameter value. With an "or" in both places, the match would start at a single or double quote and then end at the first single quote, double quote, space or > it finds, whichever comes first.

 

So, this href

href="http://www.mysite.com/users?name=O'Donnel"

would only return "http://www.mysite.com/users?name=O", but my script will always get the entire value for a parameter delimited with single or double quotes.
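
A quick way to see the difference is to run both expressions from this thread against that href. The anchor text "profile" is made up for the test:

```php
<?php
$html = '<a href="http://www.mysite.com/users?name=O\'Donnel">profile</a>';

// Character class on both sides (the untested suggestion).
preg_match("/<a.*?href=[\"|\']?(.*?)[\"|\'| |>].*?.*?<\/a>/is", $html, $loose);

// Backreference to whichever quote opened the value.
preg_match("/<a.*?href=(\"|\')(.*?)\\1.*?>.*?<\/a>/is", $html, $strict);

echo $loose[1], "\n";   // http://www.mysite.com/users?name=O
echo $strict[2], "\n";  // http://www.mysite.com/users?name=O'Donnel
```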

 

I don't consider myself a regex expert and I'm sure there is a better expression though.
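
An alternative to refining the expression further is to let PHP's DOM extension parse the markup, which sidesteps the quoting question and also picks up links wrapped around images. A minimal sketch, assuming the DOM extension is enabled; the sample markup is made up, and in the script above `$page` would come from file_get_contents($url):

```php
<?php
// Made-up sample markup.
$page = <<<'HTML'
<a href="http://rapidshare.com/files/1"><img src="button.png" alt="download"></a>
<a href="http://www.mysite.com/users?name=O'Donnel">profile</a>
<a href=http://www.google.com>bare</a>
<a name="top">no href here</a>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($page);
libxml_clear_errors();

$links = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $links[] = $anchor->getAttribute('href');
    }
}
print_r($links);
```

The `$links` array can then go through the same allowed-domain loop as `$matches[2]`.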

 


 

 

To turn the displayed text into hyperlinks, just change the echo statement accordingly:

echo "<li><a href=\"{$match}\">{$match}</a></li>";
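
Since the value comes from someone else's page, a stray quote or angle bracket in it could break the generated markup. A hedged variant that escapes the value first; `linkItem` is a hypothetical helper name, and the fourth argument to htmlspecialchars() (double_encode = false) is there so entities already present in the scraped href, such as &amp;, are not encoded a second time:

```php
<?php
// Hypothetical helper: build one list item with a clickable, escaped link.
function linkItem(string $url): string
{
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8', false);
    return "<li><a href=\"{$safe}\">{$safe}</a></li>";
}

echo linkItem('http://rapidshare.com/files/1?a=1&b=2'), "\n";
// <li><a href="http://rapidshare.com/files/1?a=1&amp;b=2">http://rapidshare.com/files/1?a=1&amp;b=2</a></li>
```

In the loop this would replace the echo line with `echo linkItem($match);`.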

 

As for your WordPress problems, I can't really help you as I've never used it. But the googling I've done indicates that the_permalink() is used to display the link to the current post being displayed. So, I'm not sure how that applies to what you are doing.
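
For what it's worth, the_permalink() echoes the URL, so it cannot be used as a value inside other PHP code. WordPress also provides get_permalink(), which returns the URL as a string. A sketch of how that could feed the script above, assuming the code runs somewhere WordPress actually executes PHP (a template file, for example); the stub below is only a stand-in so the snippet also runs outside WordPress:

```php
<?php
// Stand-in for running outside WordPress. Inside WordPress the real
// get_permalink() already exists and this stub is skipped.
if (!function_exists('get_permalink')) {
    function get_permalink()
    {
        return 'http://www.somesite.com';
    }
}

// No nested <?php ... ?> tags needed: just assign the returned string.
$url = get_permalink();
echo "Links at $url<br /> ";
```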

Thank you mjdamato for your help again, you're a php genie!

 


 

Yes, you're right that "the_permalink() is used to display the link to the current post being displayed". I'll point the permalink at the website content where the rapidshare links are grabbed from. If you know how, please just show me how I can insert this permalink into the PHP code.

 

Or I have another idea:

I want to create an RSS feed (an XML or PHP file) that takes only the permalinks of another website; the only problem is taking each of those links in order to grab the rapidshare etc. links in its content. The PHP code you've written would go in the description or content area of the feed file, because I want the rapidshare links as the description/content of the feed items. For each feed item, the code would use the permalink to grab the rapidshare etc. links and put them into the description area. That way, when I fetch the feed file once, I get all these links and can put them into my WordPress posts. I think that's better than running the code in WordPress template files on every page load.

 

About the great PHP code you wrote:

It's really powerful and works great. I don't know if it's possible to hide the output rapidshare etc. links under a JavaScript (or similar) button labelled "click to display the download links". I hope I'm not asking for the impossible. :)
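
One option that needs no JavaScript at all is the HTML `<details>` element, which browsers that support it render as a click-to-expand block. A minimal sketch; `collapsibleList` is a hypothetical helper name, and it assumes the `<li>` items have already been built and escaped:

```php
<?php
// Hypothetical helper: wrap already-built <li> items in a collapsible block.
function collapsibleList(array $items, string $label): string
{
    $safeLabel = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');
    return "<details><summary>{$safeLabel}</summary><ul>"
         . implode('', $items)
         . "</ul></details>";
}

echo collapsibleList(
    array('<li><a href="http://rapidshare.com/files/1">http://rapidshare.com/files/1</a></li>'),
    'Click to display the download links'
);
```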

 

Thank you again and again for your great help!

 

I'm editing my post because I have another idea:

Grabbing rapidshare etc. links with PHP code from 3 or 4 defined websites for given keywords, and using the

<?php the_title(); ?>

code to get the WordPress title as the keywords. :)
