
help me fix this script please



hello,

 

i am using this script to remove links in a text:

 

  Quote
function xcleaner($url)
  {
  $U = explode(' ', $url);

  $W = array();
  foreach ($U as $k => $u)
      {
      $W = explode('.', $u);
      if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2))
        {
        unset($U[$k]);
        return implode(' ',$U);
        }
      }
  return implode(' ',$U);
  }

 

the problem is that it will also remove the first word after the link

 

example:

  Quote
http://www.link.com hello my name is bob

 

would result in:

  Quote
my name is bob

 

how can i fix this ?

 

 

also, i would like to replace the links with the word "(link)" instead of just removing everything

 

thanks a lot!
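
A minimal sketch of one way to cover both requests, assuming a link is any whitespace-separated word that starts with http:// or https://. The function name xcleaner_sketch and the $placeholder parameter are made up for this illustration and are not part of the original script. It splits on any whitespace (not only spaces), keeps the separators, and does not return from inside the loop, so every link is replaced:

```php
<?php
// Hypothetical rewrite of xcleaner(); the name and the $placeholder
// parameter are assumptions made for this sketch.
function xcleaner_sketch($text, $placeholder = '(link)')
{
    // Split on runs of whitespace and keep the separators, so that
    // spaces and line breaks in the original text are preserved.
    $parts = preg_split('/(\s+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $k => $part) {
        // Only words that start with http:// or https:// count as links.
        if (preg_match('~^https?://~i', $part)) {
            $parts[$k] = $placeholder;
        }
    }

    // No early return inside the loop: every word has been checked.
    return implode('', $parts);
}

echo xcleaner_sketch("http://www.link.com hello my name is bob"), "\n";
// prints: (link) hello my name is bob
```

Words such as "paris.mp3" are left alone here because the check looks for the protocol only, not for dots.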


Have a look at preg_replace(),

http://uk3.php.net/manual/en/function.preg-replace.php

 

An expression like this should suffice (source noted in the comment):

// Regex taken from: http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/
$regex = "/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/";
$newdata = preg_replace($regex,"(link)",$data);

// Where $data is your content you want to replace links for

 

-cb-

Hi,

 

I ran the script and it worked perfectly.

 

$url = "http://www.link.com This is a test and hello my name is bob";
$result = xcleaner($url);

print $result;


function xcleaner($url) {

   $U = explode(' ', $url);

   $W =array();
   foreach ($U as $k => $u) 
      {
      $W = explode('.', $u);
      if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2))
         {
         unset($U[$k]);
         return implode(' ',$U);
         }
      }
   return implode(' ',$U);
   }

 

 

 


ungovernable,

 

If I understand your request correctly, the following code produces what you want:

 

 

link hello my name is bob

 

http://link.com hello my name is bob

 

 

Scot L. Diddle, Richmond VA

 

<?php

Header("Cache-control: private, no-cache");
Header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
Header("Pragma: no-cache");

function xcleaner($url)  {

   $U    = explode(' ', $url);
   $W 	 = array();
   $link = array();

   $anchorLink = array_shift($U);    // $anchorLink => http://www.link.com
                                     // $U[0]       => hello
                                     // $U[1]       => my
                                     // $U[2]       => name
                                     // $U[3]       => is
                                     // $U[4]       => bob

   $hasHTTP = stristr($anchorLink,'http');

   if ($hasHTTP) {

	   $W = explode('.', $anchorLink);

	   $numOfWs = count($W);

	   $W1 = $W[1];

	   if ( ($numOfWs > 1 && $W1 != "com") || ($numOfWs > 2) ) {

	       $link[] = $W1;

	       $merge = array_merge($link, $U);

	       $return = implode(' ', $merge);

	       return $return;

	     }
	     else {

	     	$link[] = $anchorLink;

	        $merge = array_merge($link, $U);

	        $return = implode(' ', $merge);

	  		return $return;

	  	}

   	}

   // No "http" in the first word: return the text unchanged.
   return $url;

}


   $url1 = 'http://www.link.com hello my name is bob';
   $url2 = 'http://link.com hello my name is bob';

   echo xcleaner($url1) . "<br /><br/> \n";
   echo xcleaner($url2) . "<br /><br/> \n";

?>

  Quote

Hi,

 

I ran the script and it worked perfectly.

 

$url = "http://www.link.com This is a test and hello my name is bob";
$result = xcleaner($url);

print $result;


function xcleaner($url) {

   $U = explode(' ', $url);

   $W =array();
   foreach ($U as $k => $u) 
      {
      $W = explode('.', $u);
      if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2))
         {
         unset($U[$k]);
         return implode(' ',$U);
         }
      }
   return implode(' ',$U);
   }

 

 

 

yes you are right... i just realized the given example will work

 

but if i try with this text it will not work:

 

  Quote
http://www.dailymotion.com/video/x4o...me-french_news

 

http://www.dailymotion.com/video/x4o...french-p2_news

 

http://www.dailymotion.com/video/x4o...french-p3_news

 

http://www.dailymotion.com/video/x4o...french-p4_news

 

http://www.dailymotion.com/video/x4s...french-p5_news

 

 

Super Size Me est un film documentaire américain réalisé par Morgan Spurlock. Le journaliste décide de se nourrir exclusivement chez McDonald’s pendant un mois et enquête à travers les États-Unis sur les effets néfastes du fast-food et de la célèbre chaîne spécialiste du hamburger, qui entraînent l'accroissement de l'obésité.

 

i don't understand where the problem comes from..
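
One possible explanation, assuming the links in that text are separated by line breaks rather than spaces: explode(' ', ...) splits on the space character only, so a link, the line breaks after it, and the first word that follows all end up in a single piece, and that whole piece is removed. The function also returns as soon as it has removed one piece, so later links are never examined. The sample string below is invented for illustration:

```php
<?php
// Made-up sample: two links separated by line breaks, then plain text.
$text = "http://example.com/a\nhttp://example.com/b\nSuper Size Me";

// explode(' ') splits on spaces only, so the line breaks stay inside
// the first piece together with the word "Super".
$tokens = explode(' ', $text);
var_dump($tokens[0] === "http://example.com/a\nhttp://example.com/b\nSuper"); // bool(true)

// Splitting on any whitespace keeps each link and each word separate.
$words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
var_dump($words[0] === "http://example.com/a"); // bool(true)
var_dump(count($words)); // int(5)
```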

 

  Quote
ungovernable,

 

If I understand your request correctly, the following code produces what you want:

 

 

 

link hello my name is bob

 

http://link.com hello my name is bob

 

actually, i want to replace ALL links with the text "link"

 

so something like

 

http://www.awebsite.com/hello/blabla/hi.php would be replaced by "link"


Have you even tried this script i put together for you?

It does exactly what you want.

 

-cb-

 

i have a problem with this script

 

for example, this text:

  Quote
[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL]

 

maxi  repressage de 1982

 

01 - couleurs sur paris.mp3

02 - maximum.mp3

03 - tout ce fric.mp3

04 - poupee de cire.mp3

05 - piano dub.mp3

AlbumArtSmall.jpg

Folder.jpg

OBERKAMPF-LP-Couleurs5tvert.jpg

 

will turn into:

  Quote
[DL](link)[/DL]

 

maxi  repressage de 1982

 

(link)3

(link)3

(link)3

(link)3

(link)3

AlbumArtSmall.jpg

Folder.jpg

OBERKAMPF-LP-Couleurs5tvert.jpg

 

i want to convert only the links that start with http://

 

but the script thinks the list of the mp3 names are links

 

any help would be appreciated!

bump!

 

here's another example of a text that will be messed up once parsed with the function given in ChemicalBliss's post

 

  Quote

Sorry bout the long reply but it's a simple fix.

 

If you want it to pick out only URLs that start with a protocol (http:// etc.), change the ? (zero or one) at the end of the protocol sub-pattern to + (one or more), e.g.:

 

/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)+(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/
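
If only links that start with http:// or https:// need replacing, a much shorter pattern may be enough. This is a sketch under that assumption, not a replacement for the full expression: the pattern and the shortened sample text are illustrative, and the character class deliberately stops at square brackets so that BBCode tags such as [/DL] survive.

```php
<?php
// Assumption: a "link" is http:// or https:// followed by characters that
// are not whitespace, square brackets, angle brackets, or double quotes.
// Excluding [ and ] stops the match before BBCode tags such as [/DL].
$pattern = '~\bhttps?://[^\s\[\]<>"]+~i';

$data = "[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL]\n"
      . "01 - couleurs sur paris.mp3\n"
      . "Folder.jpg";

$newdata = preg_replace($pattern, '(link)', $data);

echo $newdata, "\n";
// prints:
// [DL](link)[/DL]
// 01 - couleurs sur paris.mp3
// Folder.jpg
```

Because the protocol is mandatory in this pattern, file names like "paris.mp3" are never treated as domains.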

 

If you want to pick out URLs that do not have a protocol in the link, you can remove the [a-z]{2} alternative (match two alphabetic characters) together with the | (or) before it. You will then have to add all the current two-letter top-level domains explicitly (they can, and most likely will, change). That alternative is why this regex matches "paris.mp" and "maximum.mp" etc.: they look like domains (and it's true they do - https://www.mp/).

 

A Better Alternative?:

This one should match any two-character domain, but only if it isn't followed by a third letter or digit (so it would match .mp but not .mp3).

/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}[^a-z0-9]+))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

 

Last one may not be perfect, not tested it fully. Maybe the guys over at the REGEX forum on phpfreaks can help you further if you need it.
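
One thing to watch with the [^a-z0-9]+ addition: it consumes the character after the domain and needs at least one such character to be there, so it would probably miss a link at the very end of the text. A negative lookahead checks the next character without consuming it. The sketch below uses a deliberately simplified pattern (not the full expression above) to show the difference; the sample strings are invented:

```php
<?php
// Simplified illustration only: subdomains followed by a two-letter suffix.
$consuming = '/(?:[-\w]+\.)+[a-z]{2}[^a-z0-9]+/';   // approach from the post
$lookahead = '/(?:[-\w]+\.)+[a-z]{2}(?![a-z0-9])/'; // assumed alternative

// "maximum.mp3": neither pattern treats ".mp" as a domain suffix.
var_dump(preg_match($consuming, 'maximum.mp3')); // int(0)
var_dump(preg_match($lookahead, 'maximum.mp3')); // int(0)

// "example.co" at the end of the text: only the lookahead version matches.
var_dump(preg_match($consuming, 'see example.co')); // int(0)
var_dump(preg_match($lookahead, 'see example.co')); // int(1)

// The consuming version also swallows the space after the domain.
echo preg_replace($consuming, '(link)', 'example.co is up'), "\n"; // (link)is up
echo preg_replace($lookahead, '(link)', 'example.co is up'), "\n"; // (link) is up
```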

 

-cb-

Archived

This topic is now archived and is closed to further replies.
