gdfhghjdfghgfhf Posted May 10, 2010

Hello, I am using this script to remove links from a text:

    function xcleaner($url) {
        $U = explode(' ', $url);
        $W = array();
        foreach ($U as $k => $u) {
            $W = explode('.', $u);
            if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2)) {
                unset($U[$k]);
                return implode(' ',$U);
            }
        }
        return implode(' ',$U);
    }

The problem is that it also removes the first word after the link. For example:

    http://www.link.com hello my name is bob

would result in:

    my name is bob

How can I fix this? Also, I would like to replace the links with the word "(link)" instead of just removing them.

Thanks a lot!

Link to comment: https://forums.phpfreaks.com/topic/201239-help-me-fix-this-script-please/
ChemicalBliss Posted May 10, 2010

Have a look at preg_replace(): http://uk3.php.net/manual/en/function.preg-replace.php

An expression like this should suffice:

    // Regex taken from: http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/
    $regex = "/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/";

    // $data is the content in which you want to replace links
    $newdata = preg_replace($regex,"(link)",$data);

-cb-
siric Posted May 10, 2010

Hi,

I ran the script and it worked perfectly.

    $url = "http://www.link.com This is a test and hello my name is bob";
    $result = xcleaner($url);
    print $result;

    function xcleaner($url) {
        $U = explode(' ', $url);
        $W = array();
        foreach ($U as $k => $u) {
            $W = explode('.', $u);
            if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2)) {
                unset($U[$k]);
                return implode(' ',$U);
            }
        }
        return implode(' ',$U);
    }
ScotDiddle Posted May 10, 2010

ungovernable,

If I understand your request correctly, the following code produces what you want:

    link hello my name is bob
    http://link.com hello my name is bob

Scot L. Diddle, Richmond VA

    <?php

    Header("Cache-control: private, no-cache");
    Header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
    Header("Pragma: no-cache");

    function xcleaner($url) {

        $U = explode(' ', $url);
        $W = array();
        $link = array();

        $anchorLink = array_shift($U);

        // $anchorLink => http://www.link.com
        // $U[0] => hello
        // $U[1] => my
        // $U[2] => name
        // $U[3] => is
        // $U[4] => bob

        $hasHTTP = stristr($anchorLink,'http');

        if ($hasHTTP) {

            $W = explode('.', $anchorLink);
            $numOfWs = count($W);
            $W1 = $W[1];

            if ( ($numOfWs > 1 && $W1 != "com") || ($numOfWs > 2) ) {
                $link[] = $W1;
                $merge = array_merge($link, $U);
                $return = implode(' ', $merge);
                return $return;
            }
            else {
                $link[] = $anchorLink;
                $merge = array_merge($link, $U);
                $return = implode(' ', $merge);
                return $return;
            }
        }
    }

    $url1 = 'http://www.link.com hello my name is bob';
    $url2 = 'http://link.com hello my name is bob';

    echo xcleaner($url1) . "<br /><br/> \n";
    echo xcleaner($url2) . "<br /><br/> \n";

    ?>
gdfhghjdfghgfhf Posted May 10, 2010 (Author)

Quote:
    Hi, I ran the script and it worked perfectly. [test code and xcleaner() as posted above]

Yes, you are right... I just realized the given example does work, but if I try with this text it does not:

    http://www.dailymotion.com/video/x4o...me-french_news
    http://www.dailymotion.com/video/x4o...french-p2_news
    http://www.dailymotion.com/video/x4o...french-p3_news
    http://www.dailymotion.com/video/x4o...french-p4_news
    http://www.dailymotion.com/video/x4s...french-p5_news
    Super Size Me est un film documentaire américain réalisé par Morgan Spurlock. Le journaliste décide de se nourrir exclusivement chez McDonald’s pendant un mois et enquête à travers les États-Unis sur les effets néfastes du fast-food et de la célèbre chaîne spécialiste du hamburger, qui entraînent l'accroissement de l'obésité.

I don't understand where the problem comes from.

Quote:
    ungovernable, If I understand your request correctly, the following code produces what you want:
    link hello my name is bob
    http://link.com hello my name is bob

Actually, I want to replace ALL links with the text "link", so something like http://www.awebsite.com/hello/blabla/hi.php would be replaced by "link".
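Two things in the posted xcleaner() may explain the failure on the longer text. It returns from inside the loop right after the first unset(), so at most one link is ever removed, and explode(' ') splits only on spaces, so links separated by line breaks (assuming the sample had one link per line) are never split apart. A minimal sketch that keeps the same "does this token look like a link" test but checks every token and substitutes "(link)" could look like this. The function name replace_link_tokens() is hypothetical, and the heuristic stays as loose as the original (a token such as "e.g." would still count as a link).

```php
<?php
// Sketch only: replace_link_tokens() is a hypothetical name, not from the thread.
// It reuses the original heuristic (token contains "http", or has a dot
// followed by more text) but never returns early.
function replace_link_tokens(string $text, string $placeholder = '(link)'): string
{
    // Visit every run of non-whitespace characters; spaces and newlines
    // between tokens are left exactly as they were.
    $result = preg_replace_callback('/\S+/', function (array $m) use ($placeholder): string {
        $token = $m[0];
        $parts = explode('.', $token);
        $looksLikeLink = stripos($token, 'http') !== false
            || (count($parts) > 1 && $parts[1] !== '')
            || count($parts) > 2;
        return $looksLikeLink ? $placeholder : $token;
    }, $text);

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $text;
}

echo replace_link_tokens("http://a.example/x\nhttp://b.example/y hello my name is bob");
```

Because the callback runs once per token, every link is replaced rather than only the first, and the original spacing and line breaks survive.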
ChemicalBliss Posted May 11, 2010

Quote:
    Have a look at preg_replace(), http://uk3.php.net/manual/en/function.preg-replace.php
    An expression like this should suffice: [regex and preg_replace() call as posted above]
    -cb-

Have you even tried the script I put together for you? It does exactly what you want.

-cb-
gdfhghjdfghgfhf Posted May 11, 2010 (Author)

Thanks a lot! It's working!
gdfhghjdfghgfhf Posted May 17, 2010 (Author)

Quote:
    Quote:
        Have a look at preg_replace(), http://uk3.php.net/manual/en/function.preg-replace.php
        An expression like this should suffice: [regex and preg_replace() call as posted above]
        -cb-
    Have you even tried this script i put together for you? It does exactly what you want.
    -cb-

I have a problem with this script. For example, this text:

    [DL]http://www.megaupload.com/?d=QI29F7AJ[/DL]
    maxi repressage de 1982
    01 - couleurs sur paris.mp3
    02 - maximum.mp3
    03 - tout ce fric.mp3
    04 - poupee de cire.mp3
    05 - piano dub.mp3
    AlbumArtSmall.jpg
    Folder.jpg
    OBERKAMPF-LP-Couleurs5tvert.jpg

will turn into:

    [DL](link)[/DL]
    maxi repressage de 1982
    (link)3
    (link)3
    (link)3
    (link)3
    (link)3
    AlbumArtSmall.jpg
    Folder.jpg
    OBERKAMPF-LP-Couleurs5tvert.jpg

I want to convert only the links that start with http://, but the script thinks the mp3 file names are links.

Any help would be appreciated!
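If only links that start with http:// (or https://) should be converted, a much smaller pattern than the one quoted above may be enough. The sketch below is an assumption about what counts as a link: the protocol followed by any run of characters that are not whitespace, square brackets, angle brackets, or double quotes. The helper name replace_http_links() is hypothetical. Excluding the square brackets is what keeps the closing [/DL] tag out of the match.

```php
<?php
// Sketch only: replace_http_links() is a hypothetical helper name.
// Assumption: a link is "http://" or "https://" followed by characters that
// are not whitespace, square brackets, angle brackets, or double quotes.
function replace_http_links(string $text, string $placeholder = '(link)'): string
{
    $pattern = '~\bhttps?://[^\s\[\]<>"]+~i';

    // preg_replace() returns null on error; fall back to the input.
    return preg_replace($pattern, $placeholder, $text) ?? $text;
}

$sample = "[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL] 01 - couleurs sur paris.mp3 Folder.jpg";
echo replace_http_links($sample);
// prints: [DL](link)[/DL] 01 - couleurs sur paris.mp3 Folder.jpg
```

File names such as paris.mp3 are left alone because nothing matches without the protocol. One known limitation of this sketch: punctuation directly after a link (for example a trailing full stop) is treated as part of the link.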
gdfhghjdfghgfhf Posted May 17, 2010 (Author)

bump
gdfhghjdfghgfhf Posted May 21, 2010 (Author)

Bump! Here's another example of a text that gets messed up once parsed with the function given in ChemicalBliss's post:

    Streaming:
    1 - http://www.ubest1.com/index.php?video_user=13752|hdad|hdad_1272918833_vi deo.flv
    2 - http://www.ubest1.com/user/hdad/video/13760
    3 - http://www.ubest1.com/user/hdad/video/13767
    4 - http://www.ubest1.com/user/hdad/video/13780
ChemicalBliss Posted May 21, 2010

Sorry about the late reply, but it's a simple fix. If you want it to only pick out URLs that start with a protocol (http:// etc.), change the ? (zero or one) to + (one or more) at the end of the protocol sub-pattern, e.g.:

    /(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)+(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

If you want to pick out URLs that do not have a protocol in the link, you can remove the [a-z]{2} alternative (match two alphabetical characters) together with its | (or). You will then have to add all the current two-letter top-level domains explicitly (they can and most likely will change). That alternative is why this regex matches "paris.mp" and "maximum.mp" etc.: they look like domains (and it's true, .mp is one - https://www.mp/).

A better alternative? This one should match any two-character domain, but only if it is not followed by a third letter or digit (so it would match .mp but not .mp3):

    /(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}[^a-z0-9]+))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

The last one may not be perfect; I have not tested it fully. Maybe the guys over at the Regex forum on phpfreaks can help you further if you need it.
-cb-
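One caveat about the [a-z]{2}[^a-z0-9]+ idea: the negated class has to consume at least one character, so a two-letter domain at the very end of the text would not match, and any characters it does consume (a following space, say) become part of what gets replaced. A negative lookahead checks the next character without consuming it. The sketch below shows only that part on a deliberately reduced pattern (three named TLDs plus the two-letter case); it is not the full regex from this thread, and the sample text is made up for illustration.

```php
<?php
// Sketch only: a reduced pattern, not the full regex from the thread.
// A two-letter TLD counts only when the next character is not a letter
// or digit. The lookahead (?![a-z0-9]) consumes nothing.
$pattern = '/\b(?:[-\w]+\.)+(?:com|org|net|[a-z]{2})(?![a-z0-9])/i';

$text = 'paris.mp3 example.mp example.com Folder.jpg';
echo preg_replace($pattern, '(link)', $text);
// prints: paris.mp3 (link) (link) Folder.jpg
```

The same lookahead could be placed after the top-level-domain group of the longer expression, but that combination is untested here.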
Archived
This topic is now archived and is closed to further replies.