
problem with instarss.php


mboveiri


Hi,

I am using instarss to generate an RSS feed from the pictures on an Instagram profile.

https://gist.github.com/jonathanbell/48c46d9fc2913fba3292

The issue is that Instagram adds "?ig_cache_key=*" to each picture URL, so the result in the RSS feed is:

https://scontent.cdninstagram.com/t51.2885-15/e35/13712096_262107487501012_162919037_n.jpg?ig_cache_key=MTI5Njk2NDMzMzg2NzQyMDQ5MA%3D%3D.2

I am a beginner in PHP. Can anyone help me remove "?ig_cache_key=*" from each picture URL in instarss?

Thanks.


Edited by mboveiri

Why remove it? What's the problem?

 

 

Why is that an issue?

 

I want to use the result in a Telegram bot, and the Telegram API doesn't accept this type of URL as a normal picture.

The Telegram Bot API only accepts pictures if the URL ends with ".JPG" or ".PNG".

It treats "_162919037_n.jpg?ig_cache_key=MTI5Njk2NDMzMzg2NzQyMDQ5MA%3D%3D.2" as a malformed request.
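
To make the complaint concrete, here is a minimal sketch of the kind of check described above: does the URL end in an image extension? This only mirrors my description of the behaviour, it is not Telegram's actual validation, and the function name ends_with_image_extension is made up for illustration.

```php
<?php
// Hypothetical helper: true if the raw URL string ends in .jpg, .jpeg or .png
// (case-insensitive). This is an assumption about the check, not Telegram's code.
function ends_with_image_extension(string $url): bool
{
    return preg_match('/\.(jpe?g|png)$/i', $url) === 1;
}

$raw   = 'https://scontent.cdninstagram.com/t51.2885-15/e35/13712096_262107487501012_162919037_n.jpg?ig_cache_key=MTI5Njk2NDMzMzg2NzQyMDQ5MA%3D%3D.2';
$clean = 'https://scontent.cdninstagram.com/t51.2885-15/e35/13712096_262107487501012_162919037_n.jpg';

var_dump(ends_with_image_extension($raw));   // the query string hides the extension
var_dump(ends_with_image_extension($clean)); // the bare path ends in .jpg
```

With the query string attached the URL no longer ends in ".jpg", which is why I want it removed.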


I haven't seen anything to suggest this sort of behavior is not allowed...

 

What's your code? If it handles those URLs at some point between the RSS and Telegram then you can simply remove the query string before submitting them.
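
As a rough sketch of what "remove the query string" could look like in PHP (the helper name strip_query_string is hypothetical, and I am assuming the image is still served from the bare URL):

```php
<?php
// Hypothetical helper: keep everything before the first "?" or "#".
// strcspn() returns the length of the leading part of $url that contains
// neither character, so URLs without a query string come back unchanged.
function strip_query_string(string $url): string
{
    return substr($url, 0, strcspn($url, '?#'));
}

$url = 'https://scontent.cdninstagram.com/t51.2885-15/e35/13712096_262107487501012_162919037_n.jpg?ig_cache_key=MTI5Njk2NDMzMzg2NzQyMDQ5MA%3D%3D.2';
echo strip_query_string($url), PHP_EOL;
```

You would call it on each picture URL wherever your code reads it out of the feed, before handing it to Telegram.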


The code is:

<?php
    if (!isset($_GET['user'])) {
        if (!isset($_GET['hashtag'])) {
            exit('Not a valid RSS feed. You didn\'t provide an Instagram user or hashtag. Send one via a GET variable. Example .../instarss.php?user=snoopdogg');
        }
    }
    if (isset($_GET['user']) && isset($_GET['hashtag'])) {
        exit('Don\'t request both user and hashtag. Request one or the other.');
    }
    if (isset($_GET['user'])) {
        $html = file_get_contents('http://instagram.com/'.$_GET['user'].'/');
    }
    if (isset($_GET['hashtag'])) {
        $html = file_get_contents('http://instagram.com/explore/tags/'.$_GET['hashtag'].'/');
    }
    $html = strstr($html, '{"country_code');
    $html = strstr($html, '</script>', true);
    $html = substr($html, 0, -1);
    // for debugging... sigh........
    // echo $html;
    $data = json_decode($html);
    // more debugging... 
    // print_r($data->entry_data->ProfilePage[0]->user->media->nodes);
    if (isset($_GET['user'])) {
        if ($data->entry_data->ProfilePage[0]->user->media->nodes) {
            $nodes = $data->entry_data->ProfilePage[0]->user->media->nodes;
        } else {
            exit('Looks like this Instagram account is set to private or doesn\'t exist. We can\'t do much about that now, can we?');
        }
    }
    if (isset($_GET['hashtag'])) {
        $nodes = $data->entry_data->TagPage[0]->tag->media->nodes;
    }
    header('Content-Type: text/xml; charset=utf-8');
    $rss_feed = '<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>';
    if (isset($_GET['user'])) {
        $rss_feed .= '<title>'.$_GET['user'].'\'s Instagram Feed</title><atom:link href="http://'.$_SERVER['HTTP_HOST'].$_SERVER["REQUEST_URI"].'" rel="self" type="application/rss+xml" /><link>http://instagram.com/'.$_GET['user'].'</link><description>'.$_GET['user'].'\'s Instagram Feed</description>';
    }
    if (isset($_GET['hashtag'])) {
        $rss_feed .= '<title>Photos tagged with: '.$_GET['hashtag'].' on Instagram</title><atom:link href="http://'.$_SERVER['HTTP_HOST'].$_SERVER["REQUEST_URI"].'" rel="self" type="application/rss+xml" /><link>http://instagram.com/explore/tags/'.$_GET['hashtag'].'</link><description>Photos tagged with: '.$_GET['hashtag'].' on Instagram</description>';
    }
    foreach($nodes as $node) {
        $rss_feed .= '<item><title>';
        if(isset($node->caption) && $node->caption != '') {
            $rss_feed .= htmlspecialchars($node->caption, ENT_QUOTES);
        } else {
            $rss_feed .= 'photo';
        }
        // pubdate format could also be: "D, d M Y H:i:s T"
        $rss_feed .= '</title><link>https://instagram.com/p/'.$node->code.'/</link><pubDate>'.date("r", $node->date).'</pubDate>';
        if (isset($_GET['user'])) {
            $rss_feed .= '<dc:creator><![CDATA['.$_GET['user'].']]></dc:creator>';
        }
        $rss_feed .= '<description><![CDATA[<img src="'.$node->display_src.'" />]]></description><guid>https://instagram.com/p/'.$node->code.'/</guid></item>';
    } // foreach "node" (photo)
    $rss_feed .= '</channel></rss>';
    echo $rss_feed;
?>

It only generates RSS feeds from an Instagram user profile or hashtag. See this sample.

The script fetches pictures from the user profile and puts each of them into an RSS feed item.

The issue is that Instagram adds "?ig_cache_key=23456*" to each picture you upload, so the picture addresses in the RSS feed generated by instarss look like this:

https://scontent.cdninstagram.com/t51.2885-15/e35/13712096_262107487501012_162919037_n.jpg?ig_cache_key=MTI5Njk2NDMzMzg2NzQyMDQ5MA%3D%3D.2

I want to change the script so that it automatically removes "?ig_cache_key=*" from the picture addresses in the generated RSS feed.

Okay, see, now we've got a bit of a problem: Instagram doesn't like people doing what you're doing. Their Terms of Service specifically disallow crawling media (General Condition #9), their API policy says you can't "simply display User Content" without permission (General Terms #16), and the less precise "Platform Policy" list in their documentation says you can't crawl media without users' consent or automate requests (#4, #5).

 

And on a more technical note, their API doesn't provide a way for you to access user media without them specifically logging into Instagram. As far as I can tell.

 

I'm not sure there are any alternatives open to you...

Edited by requinix

I use it for my own account; the problem is not about Instagram's TOS.

Edited by mboveiri

If it's your own content then that deals with the more problematic restrictions, but you still can't scrape their site to get what you need. The API does provide a way, but it requires generating access tokens by logging into their site; there's no indication how long those tokens are good for (just an ominous warning that they might expire in the future) so you will probably have to re-login to keep the feed working indefinitely.

 

So here's what I propose:

 

You take the officially-supported API route. Make an application according to their guidelines, and to alleviate most of the burden of the API you can use a third-party library to do the communication parts, then use their /users/self/media/recent endpoint to get the images. Yes, it will take you a little longer, yes, it's not as simple as scraping, but that's the method they require so that's what you'll need to do.
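
To give an idea of the API route, here is a minimal sketch, not a finished client. The host and "/v1" prefix in the URL, the access_token parameter, and the data[].images.standard_resolution.url response layout are all assumptions on my part that you should check against their documentation; both function names are made up, and the sample payload is hand-written, not real API output.

```php
<?php
// Build the request URL for the recent-media endpoint.
// ASSUMPTION: host, "/v1" prefix and the access_token parameter name.
function recent_media_url(string $accessToken): string
{
    return 'https://api.instagram.com/v1/users/self/media/recent/?access_token='
        . urlencode($accessToken);
}

// Collect image URLs from a JSON response body.
// ASSUMPTION: each entry under "data" has images.standard_resolution.url.
function extract_image_urls(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['data']) || !is_array($decoded['data'])) {
        return []; // invalid JSON or unexpected shape
    }
    $urls = [];
    foreach ($decoded['data'] as $item) {
        if (isset($item['images']['standard_resolution']['url'])) {
            $urls[] = $item['images']['standard_resolution']['url'];
        }
    }
    return $urls;
}

// Hand-written sample payload, only to show the parsing step.
$sample = '{"data":[{"images":{"standard_resolution":{"url":"https://example.com/a.jpg"}}},{"id":"no-image"}]}';
print_r(extract_image_urls($sample));
```

In a real script you would fetch recent_media_url($token) (for example with file_get_contents, as instarss.php already does) and pass the body to extract_image_urls(); getting the token in the first place is the part that needs their login flow.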

Edited by requinix

Thanks, but that requires knowledge of the Instagram API, which I and many others don't have. This script gives users an easy way to get feeds from their own profiles.

OK, if helping to resolve this script's problems conflicts with your forum's TOS, I understand.

Thanks.


A quick fix would be to just strip everything after the ?

Add the $source line just above the line that appends the <description> element, and change that line to use $source instead of $node->display_src:

// Strip out the query params section
$source = explode('?', $node->display_src)[0];
$rss_feed .= '<description><![CDATA[<img src="'.$source.'" />]]></description><guid>https://instagram.com/p/'.$node->code.'/</guid></item>';