Drongo_III Posted September 29, 2012 (edited)

Hi guys,

I'm retrieving a Twitter feed as JSON, but I have a slight issue. The feed outputs as text, so links in tweets come through as plain text, e.g.:

    More Tweets to discover... in the Discover tab on http://t.co/coKFdEQL. http://t.co/6OfRxQeW

I've used preg_replace to turn the links into HTML links using the following code:

    $twitterFeed = file_get_contents('https://api.twitter.com/1/statuses/user_timeline.json?screen_name=twitter&count=4');
    $feedArray = json_decode($twitterFeed, true);

    // Pattern to correct date format
    $pattern = '/\+[0-9\s\b]+$/';
    $replacement = '';

    //$linkPattern = '/(http:\/\/[^\s]+)(?!\.)$/'; // This was my shot at forward reference

    // Pattern to match links
    $linkPattern = '/(http:\/\/[^\s]+)/';
    $linkReplace = "<a href=\"$1\">$1</a>"; // Replacement pattern to create a link

    foreach ($feedArray as $k => $v) {
        $date = preg_replace($pattern, $replacement, $v['created_at']);
        $text = preg_replace($linkPattern, $linkReplace, $v['text']);
        echo $date . "<br/>" . $text . "<br/><br/>";
    }

The problem is that the linkPattern also captures the trailing full stop at the end of the sentence (seen in the example above). The resulting link ends up as a 404 because that full stop shouldn't be part of the link.

Can anyone suggest either:

1) how the linkPattern can be adjusted so that it doesn't capture the trailing full stop
2) how I can rtrim the capture reference if it's a full stop

Or do I just need to do another preg_replace?

Thanks!

Edited September 29, 2012 by Drongo_III
Jessica Posted September 29, 2012

rtrim()
Christian F. Posted September 29, 2012

This is the Regular Expression I'm using for URIs; it should be good for your purposes too:

    $RegExp = '#^(?:(?:(?:f|ht)tps?|dchub)://)?((?:[\\w\\pL-]+\\.)+[a-z\\pL]{2,5})((?:/[\\w\\%-]*)+(?:(?:\\.\\w{1,6})?(\\?(?:[\\w-]+=[\\w-]+)(?:&[\\w-]+(?:=[\\w-]+)?)*&?)?)?)?\\z#ui';

You may be able to use filter_var() for this too, but no guarantees there. Worth checking out at least.
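For reference, a minimal sketch of the filter_var() idea mentioned above, assuming the goal is only to check whether a string that has already been isolated is a URL. The $candidates and $results names are illustrative and not from the thread. filter_var() with FILTER_VALIDATE_URL checks one whole string, so it does not by itself find links inside a tweet.

```php
<?php
// Hypothetical candidate strings; the first is a link from the sample tweet.
$candidates = ['http://t.co/6OfRxQeW', 'not a url'];

$results = [];
foreach ($candidates as $candidate) {
    // filter_var() returns the filtered string on success and false on failure.
    $results[$candidate] = filter_var($candidate, FILTER_VALIDATE_URL) !== false;
    echo $candidate, ' => ', $results[$candidate] ? 'valid' : 'invalid', "\n";
}
```

This is useful as a validation step after a link has been extracted, rather than as a replacement for the regular expression.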
Drongo_III Posted September 29, 2012 (Author)

That doesn't seem to match and replace the plain-text links with HTML links. And I won't lie, I don't follow half of that!

If I just wanted to rtrim the reference, how can I do that? Is that possible to do?
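One way to apply rtrim() to the matched link, sketched here as an assumption rather than something posted in the thread, is preg_replace_callback(): the callback receives each match, trims trailing punctuation from it, and puts the trimmed characters back after the closing tag. The linkify() helper name and the set of trimmed characters are illustrative choices, and the type declarations assume PHP 7 or later.

```php
<?php
// Sample tweet text taken from the question above.
$text = 'More Tweets to discover... in the Discover tab on http://t.co/coKFdEQL. http://t.co/6OfRxQeW';

// Same idea as the pattern in the question: "http://" followed by non-whitespace.
$linkPattern = '/http:\/\/[^\s]+/';

// linkify() is a hypothetical helper name used only for this illustration.
function linkify(string $text, string $pattern): string
{
    return preg_replace_callback($pattern, function (array $m): string {
        // Strip trailing sentence punctuation from the matched URL.
        $url = rtrim($m[0], '.,;:!?');
        // Whatever was stripped is appended again after the closing tag.
        $tail = substr($m[0], strlen($url));
        $safe = htmlspecialchars($url, ENT_QUOTES);
        return '<a href="' . $safe . '">' . $safe . '</a>' . $tail;
    }, $text);
}

$linked = linkify($text, $linkPattern);
echo $linked, "\n";
```

With the sample tweet, the full stop after the first link ends up outside the anchor, so the href no longer includes it.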
Christian F. Posted September 29, 2012 (edited)

You're right that I don't use it for replacing URIs with HTML anchors; I use it to validate them. To get it to replace, all you need to do is take out the RegExp anchors, so that it's not tied to the start and end of the string. The next step is to put a pair of parentheses around the whole expression, to save the result in subgroup 1, which you can then use in the replacement text.

Edit: In short, replace the first caret (^) with an opening parenthesis, and the "\\z" with a closing parenthesis, and you're set.

Upon re-reading your post, I see that you're only fetching links from Twitter feeds, which means you can simplify the RegExp quite a bit. Mine above matches a complete URI, and isn't needed for your purposes. Sorry about missing that the first time around.

In short, what you need is the following:

    $RegExp = '#(http://(?:\\w+\\.)+\\w+/\\w+)#u';

Edited September 29, 2012 by Christian F.
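A short sketch of how the simplified pattern above behaves on the sample tweet from the question. The second pattern, $lookbehind, is not from the thread: it is an assumed alternative that keeps the original "non-whitespace" match and uses a negative lookbehind so the match cannot end in sentence punctuation.

```php
<?php
// Sample tweet text taken from the question.
$text = 'More Tweets to discover... in the Discover tab on http://t.co/coKFdEQL. http://t.co/6OfRxQeW';
$replace = '<a href="$1">$1</a>';

// Simplified pattern from the post above: the path is limited to word
// characters, so the match stops before the trailing full stop.
$regExp = '#(http://(?:\\w+\\.)+\\w+/\\w+)#u';
$viaSimplified = preg_replace($regExp, $replace, $text);

// Assumed alternative (not from the thread): match any run of non-whitespace,
// then backtrack until the last matched character is not punctuation.
$lookbehind = '#(https?://\S+(?<![.,;:!?]))#';
$viaLookbehind = preg_replace($lookbehind, $replace, $text);

echo $viaSimplified, "\n";
echo $viaLookbehind, "\n";
```

The simplified pattern only matches http links whose path consists of word characters, which fits t.co short links. The lookbehind variant is looser and also accepts https and longer paths.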
Drongo_III Posted September 29, 2012 (Author)

Well, that worked perfectly. Thank you!

I will spend some time deciphering that regex too... I'll have it cracked by sometime next month.
Christian F. Posted September 29, 2012

You're welcome, glad I could help. Good luck on the deciphering as well.