help me fix this script please



hello,

 

i am using this script to remove links in a text:

 

function xcleaner($url)
  {
  $U = explode(' ', $url);

  $W = array();
  foreach ($U as $k => $u)
      {
      $W = explode('.', $u);
      if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2))
        {
        unset($U[$k]);
        return implode(' ',$U);
        }
      }
  return implode(' ',$U);
  }

 

the problem is that it will also remove the first word after the link

 

example:

http://www.link.com hello my name is bob

 

would result in:

my name is bob

 

how can i fix this ?

 

 

also, i would like to replace the links with the word "(link)" instead of just removing everything

 

thanks a lot!


Have a look at preg_replace(),

http://uk3.php.net/manual/en/function.preg-replace.php

 

An expression like this should suffice:

// Regex taken from: http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/
$regex = "/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/";
$newdata = preg_replace($regex,"(link)",$data);

// Where $data is your content you want to replace links for

 

-cb-


Hi,

 

I ran the script and it worked perfectly.

 

$url = "http://www.link.com This is a test and hello my name is bob";
$result = xcleaner($url);

print $result;


function xcleaner($url) {

   $U = explode(' ', $url);

   $W =array();
   foreach ($U as $k => $u) 
      {
      $W = explode('.', $u);
      if (stristr($u,'http') || (count($W) > 1 && $W[1] != "") || (count($W) > 2))
         {
         unset($U[$k]);
         return implode(' ',$U);
         }
      }
   return implode(' ',$U);
   }

 

 

 


ungovernable,

 

If I understand your request correctly, the code below produces what you want. For the two test strings in the script, it prints:

 

 

link hello my name is bob

 

http://link.com hello my name is bob

 

 

Scot L. Diddle, Richmond VA

 


<?php

Header("Cache-control: private, no-cache");
Header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
Header("Pragma: no-cache");

function xcleaner($url)  {

   $U    = explode(' ', $url);
   $link = array();

   $anchorLink = array_shift($U);    // $anchorLink => http://www.link.com
                                     // $U[0]       => hello
                                     // $U[1]       => my
                                     // $U[2]       => name
                                     // $U[3]       => is
                                     // $U[4]       => bob

   // Only the first word is examined. If it is not an http link,
   // return the text unchanged instead of returning nothing.
   if (stristr($anchorLink, 'http') === false) {
      return $url;
   }

   $W       = explode('.', $anchorLink);
   $numOfWs = count($W);
   $W1      = isset($W[1]) ? $W[1] : '';   // avoid an undefined-index notice

   if ( ($numOfWs > 1 && $W1 != "com") || ($numOfWs > 2) ) {
      $link[] = $W1;            // e.g. "link" from http://www.link.com
   }
   else {
      $link[] = $anchorLink;    // e.g. http://link.com is kept as-is
   }

   return implode(' ', array_merge($link, $U));

}


   $url1 = 'http://www.link.com hello my name is bob';
   $url2 = 'http://link.com hello my name is bob';

   echo xcleaner($url1) . "<br /><br/> \n";
   echo xcleaner($url2) . "<br /><br/> \n";

?>



yes you are right... i just realized the given example will work

 

but if i try with this text it will not work:

 

http://www.dailymotion.com/video/x4o...me-french_news

 

http://www.dailymotion.com/video/x4o...french-p2_news

 

http://www.dailymotion.com/video/x4o...french-p3_news

 

http://www.dailymotion.com/video/x4o...french-p4_news

 

http://www.dailymotion.com/video/x4s...french-p5_news

 

 

Super Size Me est un film documentaire américain réalisé par Morgan Spurlock. Le journaliste décide de se nourrir exclusivement chez McDonald’s pendant un mois et enquête à travers les États-Unis sur les effets néfastes du fast-food et de la célèbre chaîne spécialiste du hamburger, qui entraînent l'accroissement de l'obésité.

 

i don't understand where the problem comes from..

 


actually, i want to replace ALL links with the text "link"

 

so something like

 

http://www.awebsite.com/hello/blabla/hi.php would be replaced by "link"
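
A possible explanation for the multi-line text failing, offered as a sketch rather than a confirmed diagnosis: explode(' ', ...) splits only on spaces, so links separated by line breaks stay glued together in one token, and the return inside the loop stops after the first token is removed. The helper below is hypothetical (the name replace_links_by_token and the default placeholder are assumptions, not something from the thread). It splits on any whitespace, keeps that whitespace, and replaces every token that starts with http:// or https://.

```php
<?php
// Hypothetical helper, not from the thread: replace every whitespace-delimited
// token that starts with http:// or https:// with a placeholder.
function replace_links_by_token($text, $placeholder = 'link')
{
    // Split on runs of whitespace (spaces, tabs, newlines) and keep the
    // separators so the original layout survives the round trip.
    $parts = preg_split('/(\s+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // stripos() === 0 means the token begins with the scheme.
        if (stripos($part, 'http://') === 0 || stripos($part, 'https://') === 0) {
            $parts[$i] = $placeholder;
        }
    }

    // No early return inside the loop: every token is checked.
    return implode('', $parts);
}

echo replace_links_by_token(
    "http://www.awebsite.com/hello/blabla/hi.php\n\nhttp://www.link.com hello my name is bob"
);
```

Because the check is per token, a link wrapped in other characters (for example inside BBCode tags) would not be caught by this sketch; a regex-based replacement handles that case better.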



Have you even tried this script i put together for you?

It does exactly what you want.

 

-cb-



i have a problem with this script

 

for example, this text:

[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL]

 

maxi  repressage de 1982

 

01 - couleurs sur paris.mp3

02 - maximum.mp3

03 - tout ce fric.mp3

04 - poupee de cire.mp3

05 - piano dub.mp3

AlbumArtSmall.jpg

Folder.jpg

OBERKAMPF-LP-Couleurs5tvert.jpg

 

will turn into:

[DL](link)[/DL]

 

maxi  repressage de 1982

 

(link)3

(link)3

(link)3

(link)3

(link)3

AlbumArtSmall.jpg

Folder.jpg

OBERKAMPF-LP-Couleurs5tvert.jpg

 

i want to convert only the links that start with http://

 

but the script thinks the list of the mp3 names are links

 

any help would be appreciated!


bump!

 

here's another example of a text that will be messed up once parsed with the function given in ChemicalBliss's post

 


Sorry about the long reply, but it's a simple fix.

If you want it to pick out only URLs that have a protocol (http:// etc.), change the ? (zero or one) at the end of the protocol sub-pattern to + (one or more), e.g.:

 

/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)+(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/
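
To illustrate the effect of requiring a protocol, here is a much smaller pattern than the one above. It is an assumption made for illustration, not the regex from the linked article: it only matches text that begins with http://, https:// or ftp://, and it stops at whitespace, brackets, angle brackets or quotes so that surrounding tags are left alone.

```php
<?php
// Simplified, hypothetical pattern: a mandatory scheme followed by any run of
// characters that are not whitespace, square brackets, angle brackets or quotes.
$pattern = '~(?:https?|ftp)://[^\s\[\]<>"\']+~i';

$data = "[DL]http://www.megaupload.com/?d=QI29F7AJ[/DL]\n"
      . "01 - couleurs sur paris.mp3\n"
      . "AlbumArtSmall.jpg";

// File names have no scheme, so only the real link is replaced.
$newdata = preg_replace($pattern, '(link)', $data);

echo $newdata;
```

The trade-off is that bare domains such as www.link.com are no longer detected at all, which matches the request to convert only links that start with http://.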

 

If you also want to pick out URLs that do not have a protocol, you can remove the [a-z]{2} alternative (match any two letters) together with its | (or), and then list every current two-letter top-level domain explicitly (they can and most likely will change). That catch-all is why this regex matches "paris.mp" and "maximum.mp" etc.: they look like domains, and .mp really is a top-level domain (https://www.mp/).

 

A better alternative?

This one should match any two-letter domain ending, but only if it isn't followed by a third letter or a digit (so it would match .mp but not .mp3):

/(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\w+@)?(?#Subdomains)(?:(?:[-\w]+\.)+(?#TopLevel Domains)(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}[^a-z0-9]+))(?#Port)(?::[\d]{1,5})?(?#Directories)(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|\#)?(?#Query)(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?#Anchor)(?:\#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?/

 

The last one may not be perfect; I haven't tested it fully. Maybe the guys over at the REGEX forum on phpfreaks can help you further if you need it.
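
One thing worth checking in that last pattern: [^a-z0-9]+ consumes at least one following character, so whatever comes after the domain becomes part of the match, and a domain at the very end of the text cannot match in full. A negative lookahead is a common way to say "not followed by a letter or digit" without consuming anything. The two reduced patterns below are assumptions made for illustration (they are not the full regex from the post) and differ only in how they treat the character after the two-letter ending.

```php
<?php
// Reduced, hypothetical patterns: one or more "label." parts plus a
// two-letter ending. Only the handling of the following character differs.
$consuming = '/(?:[-\w]+\.)+[a-z]{2}[^a-z0-9]+/';   // ending used in the post
$lookahead = '/(?:[-\w]+\.)+[a-z]{2}(?![a-z0-9])/'; // alternative ending

$samples = array('maximum.mp3', 'see bbc.co.uk now', 'bbc.co.uk');

// Print each sample with the result of both patterns side by side.
foreach ($samples as $s) {
    echo $s, ' => ',
         preg_replace($consuming, '(link)', $s), ' | ',
         preg_replace($lookahead, '(link)', $s), "\n";
}
```

Both variants leave maximum.mp3 alone. The lookahead variant keeps the space after bbc.co.uk in place and still matches when the domain is the last thing in the text, whereas the consuming variant swallows the space and mangles the end-of-text case.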

 

-cb-
