gdfhghjdfghgfhf Posted May 10, 2010

Hello, I am using this script to remove links from a text:

    function xcleaner($url) {
        $U = explode(' ', $url);
        $W = array();
        foreach ($U as $k => $u) {
            $W = explode('.', $u);
            if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2)) {
                unset($U[$k]);
                return implode(' ',$U);
            }
        }
        return implode(' ',$U);
    }

The problem is that it also removes the first word after the link. For example:

    http://www.link.com hello my name is bob

would result in:

    my name is bob

How can I fix this? Also, I would like to replace the links with the word "(link)" instead of just removing them.

Thanks a lot!
ChemicalBliss Posted May 10, 2010

Have a look at preg_replace(): http://uk3.php.net/manual/en/function.preg-replace.php

An expression like this should suffice:

    // Regex taken from: http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/
    $regex = "/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/";
    $newdata = preg_replace($regex, "(link)", $data); // Where $data is the content you want to replace links in

-cb-
siric Posted May 10, 2010

Hi, I ran the script and it worked perfectly.

    $url = "http://www.link.com This is a test and hello my name is bob";
    $result = xcleaner($url);
    print $result;

    function xcleaner($url) {
        $U = explode(' ', $url);
        $W = array();
        foreach ($U as $k => $u) {
            $W = explode('.', $u);
            if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2)) {
                unset($U[$k]);
                return implode(' ',$U);
            }
        }
        return implode(' ',$U);
    }
ScotDiddle Posted May 10, 2010

ungovernable,

If I understand your request correctly, the following code produces what you want:

    link hello my name is bob
    http://link.com hello my name is bob

Scot L. Diddle, Richmond VA

    <?php

    Header("Cache-control: private, no-cache");
    Header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
    Header("Pragma: no-cache");

    function xcleaner($url) {

        $U = explode(' ', $url);
        $W = array();
        $link = array();

        $anchorLink = array_shift($U);

        // $anchorLink => http://www.link.com
        // $U[0] => hello
        // $U[1] => my
        // $U[2] => name
        // $U[3] => is
        // $U[4] => bob

        $hasHTTP = stristr($anchorLink,'http');

        if ($hasHTTP) {

            $W = explode('.', $anchorLink);
            $numOfWs = count($W);
            $W1 = $W[1];

            if ( ($numOfWs > 1 && $W1 != "com") || ($numOfWs > 2) ) {
                $link[] = $W1;
                $merge = array_merge($link, $U);
                $return = implode(' ', $merge);
                return $return;
            }
            else {
                $link[] = $anchorLink;
                $merge = array_merge($link, $U);
                $return = implode(' ', $merge);
                return $return;
            }
        }
    }

    $url1 = 'http://www.link.com hello my name is bob';
    $url2 = 'http://link.com hello my name is bob';

    echo xcleaner($url1) . "<br /><br/> \n";
    echo xcleaner($url2) . "<br /><br/> \n";

    ?>
gdfhghjdfghgfhf Posted May 10, 2010 Author

siric wrote: "Hi, I ran the script and it worked perfectly."

Yes, you are right... I just realized the given example works, but if I try with this text it does not work:

    http://www.dailymotion.com/video/x4o...me-french_news http://www.dailymotion.com/video/x4o...french-p2_news http://www.dailymotion.com/video/x4o...french-p3_news http://www.dailymotion.com/video/x4o...french-p4_news http://www.dailymotion.com/video/x4s...french-p5_news Super Size Me est un film documentaire américain réalisé par Morgan Spurlock. Le journaliste décide de se nourrir exclusivement chez McDonald’s pendant un mois et enquête à travers les États-Unis sur les effets néfastes du fast-food et de la célèbre chaîne spécialiste du hamburger, qui entraînent l'accroissement de l'obésité.

I don't understand where the problem comes from.

ScotDiddle wrote: "If I understand your request correctly, the following code produces what you want: link hello my name is bob / http://link.com hello my name is bob"

Actually, I want to replace ALL links with the text "link", so something like http://www.awebsite.com/hello/blabla/hi.php would be replaced by "link".
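A possible explanation, assuming the links in the real text are separated by line breaks rather than spaces: explode(' ', ...) does not split on newlines, so all the links and the word right after them end up in a single token, and because xcleaner() returns inside the loop it stops after removing that first token. The sketch below is one way around both issues. The function name replace_link_tokens and the "(link)" default are illustrative, not something from the thread.

```php
<?php
// Hypothetical rewrite of xcleaner(); the name replace_link_tokens is illustrative.
function replace_link_tokens($text, $placeholder = '(link)')
{
    // Split on runs of any whitespace (spaces, tabs, newlines) and keep the
    // separators, so the original spacing survives when the pieces are rejoined.
    $parts = preg_split('/(\s+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Only tokens that start with http:// or https:// are treated as links.
        if (preg_match('~^https?://~i', $part)) {
            $parts[$i] = $placeholder;
        }
    }

    // No early return: every token is checked before the text is rebuilt.
    return implode('', $parts);
}

$text = "http://www.link.com/a\nhttp://www.link.com/b\nSuper Size Me est un film";
echo replace_link_tokens($text), "\n";
```

With this approach a word such as "Super" that directly follows a link on the next line is kept, and every link in the text is replaced rather than only the first one.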
ChemicalBliss Posted May 11, 2010

Have you even tried the preg_replace() script I put together for you in my earlier post? It does exactly what you want.

-cb-
gdfhghjdfghgfhf Posted May 11, 2010 Author

Thanks a lot!!! It's working!!
gdfhghjdfghgfhf Posted May 17, 2010 Author

ChemicalBliss wrote: "Have you even tried this script i put together for you? It does exactly what you want."

I have a problem with this script. For example, this text:

    [DL]http://www.megaupload.com/?d=QI29F7AJ[/DL] maxi repressage de 1982 01 - couleurs sur paris.mp3 02 - maximum.mp3 03 - tout ce fric.mp3 04 - poupee de cire.mp3 05 - piano dub.mp3 AlbumArtSmall.jpg Folder.jpg OBERKAMPF-LP-Couleurs5tvert.jpg

will turn into:

    [DL](link)[/DL] maxi repressage de 1982 (link)3 (link)3 (link)3 (link)3 (link)3 AlbumArtSmall.jpg Folder.jpg OBERKAMPF-LP-Couleurs5tvert.jpg

I want to convert only the links that start with http://, but the script treats the mp3 file names in the list as links.

Any help would be appreciated!
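If only links that start with http:// (or https://) should be replaced, a much shorter pattern than the one quoted in the thread may be enough. The following is a minimal sketch under that assumption. The function name replace_http_links and the exact set of excluded characters are choices made for this example, not part of the original script.

```php
<?php
// Illustrative helper; the name and the excluded characters are assumptions.
function replace_http_links($text, $placeholder = '(link)')
{
    // Match http:// or https:// followed by anything that is not whitespace,
    // a square bracket, an angle bracket or a double quote. Excluding "["
    // keeps BBCode-style closing tags such as [/DL] out of the match.
    $pattern = '~https?://[^\s\[\]<>"]+~i';
    return preg_replace($pattern, $placeholder, $text);
}

$data = "[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL] maxi repressage de 1982 "
      . "01 - couleurs sur paris.mp3 02 - maximum.mp3 AlbumArtSmall.jpg";
echo replace_http_links($data), "\n";
```

Because a protocol is required, file names such as paris.mp3 are left alone. The trade-off is that bare addresses like www.link.com are no longer replaced, and punctuation that directly follows a URL (a trailing full stop, for instance) is swallowed into the match.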
gdfhghjdfghgfhf Posted May 17, 2010 Author

bump
gdfhghjdfghgfhf Posted May 21, 2010 Author

Bump!

Here's another example of a text that gets messed up once parsed with the function given in ChemicalBliss's post:

    Streaming: 1 - http://www.ubest1.com/index.php?video_user=13752|hdad|hdad_1272918833_vi deo.flv 2 - http://www.ubest1.com/user/hdad/video/13760 3 - http://www.ubest1.com/user/hdad/video/13767 4 - http://www.ubest1.com/user/hdad/video/13780
ChemicalBliss Posted May 21, 2010

Sorry about the slow reply, but it's a simple fix.

If you want it to only pick out URLs with http:// etc. (protocols), then change the ? (zero or one) to + (one or more) at the end of the protocol sub-pattern, e.g.:

    /(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)+(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

If you want to pick out specific URLs that do not have a protocol in the link, you can remove the [a-z]{2} (match 2 alphabetical characters) together with the | (or) in front of it; then you will have to add all the current two-letter top-level domains yourself (they can and most likely will change). This is why the regex matches "paris.mp" and "maximum.mp" etc.: they look like domains (and it's true, they do - https://www.mp/).

A better alternative? This one should match any 2-character domain, but only if it isn't followed by a 3rd letter or digit (so it would match .mp but not .mp3):

    /(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}[^a-z0-9]+))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

The last one may not be perfect; I have not tested it fully. Maybe the guys over at the REGEX forum on phpfreaks can help you further if you need it.

-cb-
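One caveat with the [a-z]{2}[^a-z0-9]+ idea is that the extra class has to match at least one character and then becomes part of the replaced text, so a two-letter domain at the very end of the input is missed and punctuation after a domain is swallowed. A negative lookahead, (?![a-z0-9]), checks the next character without consuming it. The sketch below compares the two on reduced host-name patterns written only for this illustration. They are not the full regex from the thread, and the same change in the full pattern is untested here.

```php
<?php
// Reduced host-name patterns for illustration only; not the full regex from the thread.
$consuming = '~(?:[-\w]+\.)+[a-z]{2}[^a-z0-9]+~';   // needs, and swallows, a following character
$lookahead = '~(?:[-\w]+\.)+[a-z]{2}(?![a-z0-9])~'; // only inspects what follows

$text = 'see www.mp, not paris.mp3, and also example.fr';

$withConsuming = preg_replace($consuming, '(link)', $text);
$withLookahead = preg_replace($lookahead, '(link)', $text);

echo $withConsuming, "\n"; // see (link)not paris.mp3, and also example.fr
echo $withLookahead, "\n"; // see (link), not paris.mp3, and also (link)
```

Both variants leave paris.mp3 alone, but only the lookahead version keeps the comma after www.mp and still catches example.fr at the end of the string.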