rea|and

August 11, 2009

Hi,

I've added two patterns to your regex: one matches links, the other matches images. Both are generic, so you could replace the first with something more specific or add more image extensions to the second. Try it:

$BBCode_Text=preg_replace('/(https?:\/\/(www)?)\S+|\S+\.(jpe?g|gif|png)|\S{'.$character_limit.'}(?=\S)/im', "$0 ", $BBCode_Text);
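As a quick check of the pattern above, here is a minimal sketch. The limit of 10 and the sample text are assumptions for illustration, not values from the original thread.

```php
<?php
// Assumed values, for illustration only.
$character_limit = 10;
$BBCode_Text = 'See http://www.example.com/a/very/long/path and abcdefghijklmnopqrstuvwxyz';

// URLs and image file names are matched whole by the first two alternatives,
// so only other long runs of non-space characters get a space inserted.
$BBCode_Text = preg_replace(
    '/(https?:\/\/(www)?)\S+|\S+\.(jpe?g|gif|png)|\S{' . $character_limit . '}(?=\S)/im',
    "$0 ",
    $BBCode_Text
);

echo $BBCode_Text;
```

The URL stays in one piece, while the 26-letter word is broken into chunks of at most 10 characters.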

June 15, 2009

Try this one:

$pattern = '/<(a(?=[^>]+class=(\'|")Link(?:\\2))|span(?=[^>]+class=(\'|")Title(?:\\3)))[^>]+>(.*?)<\/\\1>/si';
preg_match_all($pattern, $result, $matches, PREG_SET_ORDER);

foreach ($matches as $val) {
   echo "Tag: {$val[1]} - {$val[4]}<br />";
}
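To see which groups the pattern fills, here is a small sketch run against made-up markup (the HTML from the original thread is not shown, so the sample string is an assumption).

```php
<?php
// Hypothetical sample markup.
$result = '<a href="x.html" class="Link">Home</a> '
        . '<span class="Title">News</span> '
        . '<a href="y.html" class="Other">Skipped</a>';

$pattern = '/<(a(?=[^>]+class=(\'|")Link(?:\\2))|span(?=[^>]+class=(\'|")Title(?:\\3)))[^>]+>(.*?)<\/\\1>/si';
preg_match_all($pattern, $result, $matches, PREG_SET_ORDER);

$found = array();
foreach ($matches as $val) {
    // $val[1] is the tag name, $val[4] the text between the tags.
    $found[] = $val[1] . ': ' . $val[4];
}
print_r($found);
```

Only the anchor with class "Link" and the span with class "Title" are reported; the anchor with class "Other" fails the lookahead.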

June 3, 2009

Thanks, will try it. Can't we just remove all tags except a few? I know some regex, but nothing this complex.

I know we can add ^, which means "NOT".

Can't we do something like that?

Have you tried strip_tags()? Its second parameter takes the allowable tags.
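A minimal sketch of the strip_tags() approach, assuming the tags to keep are the form controls mentioned in this thread; the input string is made up.

```php
<?php
// Hypothetical input, for illustration only.
$content = '<p>Name: <input name="n"> <b>bold</b></p>';

// Keep only the form controls; every other tag is removed.
$clean = strip_tags($content, '<input><textarea><select>');

echo $clean; // Name: <input name="n"> bold
```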

Anyway, if you want to do it with regex, try this one (I've modified Crayon Violent's regexp by adding a negative lookahead assertion):

$content=preg_replace('/<\/?(?!input|textarea|select)[^>]*>/','',$content);

In some cases it could have problems, for example HTML code inside HTML comments (I happened to try it against a page that had that).

May 13, 2009

Yup, that's the issue. I entered dd+d but when it's echoed it shows with whitespace, dd d instead.

In URL encoding a space is written as a plus (+), so a literal + gets decoded back into a space. I guess you need to URL-encode your "t.value" before passing it to JSON. Try searching for urlencode in the JS section of this forum.
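The effect can be reproduced on the PHP side. This sketch only illustrates the decoding rule; the actual fix in the thread belongs in the JavaScript that sends "t.value", which is not shown here.

```php
<?php
// A raw "+" in a URL-encoded value is decoded as a space...
var_dump(urldecode('dd+d'));     // string(4) "dd d"

// ...so the value has to be encoded before it is sent.
$encoded = rawurlencode('dd+d'); // "dd%2Bd"
var_dump(urldecode($encoded));   // string(4) "dd+d"
```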

May 12, 2009

Uhm... that print_r() shouldn't have been there... should have been:
function countContains($string, $chars)
{
preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
return count($matches[0]);
}
Sorry.

Hello, it's just a side note, but you could rewrite it this way, since preg_match_all() already returns the number of full matches:

function countContains($string, $chars){
    return preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
}
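A short usage sketch of the function above; the sample strings are made up.

```php
<?php
function countContains($string, $chars)
{
    // preg_match_all() returns the number of full pattern matches.
    return preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
}

echo countContains('hello world', 'lo'); // 5 (three "l" and two "o")
```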

January 3, 2008

[...]

Second, I would like to check for a "http://, ftp://" start to a url - and if there isn't one add "http:" to the front of it. (if there is a "[a-z]+://" then don't add anything) Can you look at my code and see what I am doing wrong?

Well, you can't hard-code your output inside the regular expression. What I said before is that with a conditional statement you know in advance that the match will follow either pattern "A" or pattern "B", but you can't force the output to be "A", "B" or something else.

You need to work with a callback function to check and force your output.
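A minimal sketch of the callback approach for the URL case in the question. The function name addScheme() and the pattern are assumptions for illustration, not the asker's code; the closure needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: prepend "http://" when the URL has no scheme.
function addScheme($url)
{
    return preg_replace_callback(
        '/^(?:([a-z]+):\/\/)?(.+)$/i',
        function ($m) {
            // $m[1] is empty when no "scheme://" prefix was present.
            $scheme = ($m[1] !== '') ? $m[1] : 'http';
            return $scheme . '://' . $m[2];
        },
        $url
    );
}

echo addScheme('www.example.com'), "\n";   // http://www.example.com
echo addScheme('ftp://example.com'), "\n"; // ftp://example.com
```

The regex only splits the string into scheme and rest; the decision about what to output is made in the callback.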

January 3, 2008

The syntax you are using is also a conditional statement; it lets you define two ways of matching a string in the same regexp. For example:

<?php 
# if a line starts w/ a number then match something, otherwise something else
#(?(condition)true case|false case)
$rex='/^(?(?=\d+)something|somethingelse)/m';

In a way you could use it to set your output, but that is not its main purpose. I'd keep using a callback function to check your values.
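Here is a small runnable instance of the conditional syntax; the pattern and input are made up to keep the two branches easy to tell apart.

```php
<?php
// If a line starts with a digit, match the leading number;
// otherwise match the leading lowercase letters.
$rex = '/^(?(?=\d)\d+|[a-z]+)/m';

preg_match_all($rex, "123abc\nfoo42", $mth);
print_r($mth[0]); // "123" from the first line, "foo" from the second
```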

December 27, 2007

Try this code:

<?php 
if (!preg_match("/^[a-z]+(,[a-z]+)+$/",$a))
{
   echo "no";
}
?>

August 20, 2007

Yes realand, it works. I changed the regex as you said in the comment.

So how can you make it work with google.com? I also have this site: http://www.vizual.co.in/enquiry_form.html

There the input boxes are present, but their source looks like this:

<INPUT name=name22 id=name22 value="" class="formfield">

There is no type="text" attribute, so can you change the regex?

Thanks in advance for the help.

Back from lunch.

The page you posted doesn't have any form tags, so the regex can't match anything. Still, I've modified the regexp to handle a missing type attribute and unquoted values, and it seems to work:

<pre><?php
$form=array();
preg_replace_callback('/<form[^>]+action=("|\')(.+?)(\\1).+?<\/form>/is','cb_form',$text);

function cb_form($mth){
global $form;
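// [^\s>]+ instead of \S+ keeps the closing ">" out of an unquoted name
preg_match_all('/<input(?(?=[^>]+type=)(?=[^>]+type=(?(?="|\')("|\')(?:text|password)(?:\\1)|(?:text|password))))[^>]+name=(?(?="|\')("|\')(.+?)(?:\\2)|([^\s>]+))/is',$mth[0],$names);
// quoted names land in group 3, unquoted ones in group 4; choose per field
$form[$mth[2]]=array();
foreach($names[0] as $i=>$full)
    $form[$mth[2]][]=($names[3][$i]!='')?$names[3][$i]:$names[4][$i];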
}
print_r($form);
?></pre>

August 20, 2007

Currently the regexps work only with double/single quotes (so not something like name=namefield).

What does that mean?

preg_match_all('/<input(?=[^>]+type="text")[^>]+name=("|\')(.+?)(\\1)/is',$mth[0],$names);

This regex works only for type="text". If I need both "password" and "text", where should I add it?

It means that if a form doesn't use quotes, like Google's, my code doesn't match the name fields. I could add that if you need it, but for now let's check whether it works.

As for the password problem, I wrote a comment just below the preg_match_all line explaining how to accept more than one type value.

August 20, 2007

Try something like this. I've only tested this code against the last page you posted. Currently the regexps only work with double- or single-quoted attributes (so not something like name=namefield).

<pre><?php 
$text='your html code';
$form=array();
preg_replace_callback('/<form[^>]+action=("|\')(.+?)(\\1).+?<\/form>/is','cb_form',$text);

function cb_form($mth){
global $form;
preg_match_all('/<input(?=[^>]+type="text")[^>]+name=("|\')(.+?)(\\1)/is',$mth[0],$names);
// to accept more than one type value
// '/<input(?=[^>]+type="(?:text|hidden)")[^>]+name=("|\')(.+?)(\\1)/is'
$form[$mth[2]]=$names[2];
}

print_r($form);
?></pre>
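For a self-contained run, here is a variant of the same idea as a sketch. It uses preg_match_all() with PREG_SET_ORDER instead of the callback and global array, and the helper name extractFormFields() and the sample page are made up.

```php
<?php
// Hypothetical helper: maps each form's action to the names of its text/password inputs.
function extractFormFields($html)
{
    $form = array();
    // Same form pattern as above, applied with preg_match_all() instead of a callback.
    preg_match_all('/<form[^>]+action=("|\')(.+?)(\\1).+?<\/form>/is', $html, $forms, PREG_SET_ORDER);
    foreach ($forms as $f) {
        preg_match_all('/<input(?=[^>]+type="(?:text|password)")[^>]+name=("|\')(.+?)(\\1)/is', $f[0], $names);
        $form[$f[2]] = $names[2];
    }
    return $form;
}

// Made-up sample page.
$text = '<form method="post" action="login.php">'
      . '<input type="text" name="user">'
      . '<input type="password" name="pass">'
      . '<input type="submit" name="go" value="Go">'
      . '</form>';

print_r(extractFormFields($text));
```

The submit button is skipped because its type is neither text nor password.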

July 25, 2007

Try this regex:

$rex='/("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(?:\1|$)/s';

(It is a part of the other regex I posted yesterday)

A little explanation:

$rex='/
("|\')		# match " or \'
(?:		
(?:\\\\\\\\)*|	# empty string or an even number of backslashes
		# like \\" or \\\\" or \\\\\\" etc
		# OR
.*?[^\\\\](?:\\\\\\\\)*	# non-empty string
			# ending with an even number of backslashes,
			# checking that the previous char is not a backslash
)
(?:\1|$)	# match the opening quote again or the end of the string
/xs';
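A small sketch of the compact form of this pattern applied to a made-up line of PHP source. The nowdoc keeps the sample verbatim, so the backslashes in it are literal.

```php
<?php
$rex = '/("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(?:\1|$)/s';

// Made-up sample line containing escaped quotes.
$code = <<<'SRC'
echo "say \"hi\"", 'it\'s';
SRC;

preg_match_all($rex, $code, $mth);
print_r($mth[0]); // the two quoted strings, escaped quotes included
```

Each match runs from an opening quote to the first matching quote that is not escaped by a backslash.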

July 23, 2007

Hmm, I've adapted a class I wrote for parsing PHP code so that it works as requested (I hope). It works the way lococobra deduced. The result is an array where even elements contain HTML code and odd ones PHP code. I've only run a few tests, so I can't guarantee anything.

<?php
include_once 'cl.split.code.php';  
$code=file_get_contents('some_mixed_code.php');

$hcode = new lh_splitCode() ;
$hcode->lh_splitting( $code ) ;
print_r( $hcode->lh_get_code() );
?>

Here is the class:


<?php 

/**
* Andrea Ponzi, b 1.0, 23/07/2007
*
*/


class lh_splitCode {

var $original_code ;
var $hliteCode ;
var $parsedCode ;
var $endphptag='[ENDPHPTAG]';
var $re_open_tag_php = '/(?>^(.*?)<\?(?:(?i)php)?(.*)$)/sS' ;
var $re_parse_mixed_code='/(?:("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(\1|$))|(?:(?:#|\/\/)(?m-s).*\r?\n)|(?:\/\*.*?(?:\*\/|$))|(?:\?>.*$)|<\?/sS';	

function __lh_initialize($code)
  	{
  		$this->original_code = $code ;
	$this->hliteCode = $this->original_code ;
	$this->parsedCode = array() ;
  	}	
  	
  	function lh_splitting( $code=false )
{
	$this->__lh_initialize($code);

	if ($this->original_code==false) 
		return false;

	$this->__lh_parsing_code();

	for($i=1,$c=count($this->parsedCode);$i<$c;$i+=2)
	$this->parsedCode[$i]=str_replace('[OPENPHP]','<?',$this->parsedCode[$i]);
}

function lh_get_code(){ return $this->parsedCode ; }	

function __lh_parsing_code(){

		while(preg_match($this->re_open_tag_php, $this->hliteCode, $mth)){
			$this->parsedCode[] = $mth[1] ;
			$this->hliteCode = preg_replace_callback
							  (
								$this->re_parse_mixed_code
							   ,array( $this,'__lh_parsing_engine_cback' )
							   ,$mth[2]
							);

	if ( strpos($this->hliteCode,$this->endphptag)!==false )
			{
				$tmp = explode($this->endphptag, $this->hliteCode) ;
				$this->parsedCode[] = $tmp[0] ;
				$this->hliteCode = $tmp[1] ;
			}

		}

	if (trim($this->hliteCode)!='')
		$this->parsedCode[] = $this->hliteCode ;
}

function __lh_parsing_engine_cback($mths) 
{
	if( $mths[0]=='' ) return '';
	if( $mths[0]=='<?' ) return '[OPENPHP]';

	$str=($mths[0][0]=='?')?$this->endphptag.substr($mths[0],2):$mths[0];
return $str ;
}	

}

EDIT: I forgot to say that the PHP tags themselves are stripped out by the split, so they don't appear in the results.

July 10, 2007

Try this code:

<?php 
$rex='/\[(\w+)\](?!.*?\[\\1\]).*?\[\/\\1\]/s';

while(preg_match($rex,$str,$mth))
  $str=str_replace($mth[0],'',$str);
?>
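To show how the loop behaves on nested tags, here is a sketch with a made-up input string.

```php
<?php
$rex = '/\[(\w+)\](?!.*?\[\\1\]).*?\[\/\\1\]/s';

// Made-up input with a nested pair of [b] tags.
$str = 'a[b]x[b]y[/b]z[/b]c';

// Each pass removes a block that has no further opening tag of the same
// name after it, so nested blocks are peeled off from the inside out.
while (preg_match($rex, $str, $mth)) {
    $str = str_replace($mth[0], '', $str);
}

echo $str; // ac
```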

July 6, 2007

MAC addresses are in hexadecimal representation, if I'm not wrong, so you also have to consider letter characters. Something like this:


echo preg_replace('/\b[A-F0-9]\b/i','0$0',$mac);
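A quick sketch with a made-up address, assuming the goal is to pad single-digit groups with a leading zero.

```php
<?php
// Made-up address with some single-digit groups.
$mac = '0:1a:B:3c:4:ff';

// Pad every group that consists of a single hex digit with a leading zero.
$padded = preg_replace('/\b[A-F0-9]\b/i', '0$0', $mac);

echo $padded; // 00:1a:0B:3c:04:ff
```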

July 3, 2007

Well, do you need to match only one line or many? Anyway, to match a single line try this:


$myString='|$INFO Some information01$Some information02$Some information03$Some information04$Some information05$Some information06$Some information07$Some information08$|';
$myString=preg_split('/\|\$INFO\s*|\$|\|/',$myString,-1,PREG_SPLIT_NO_EMPTY);
echo '<pre>'.print_r($myString,true).'</pre>';

July 2, 2007

Look just a few threads below this one: http://www.phpfreaks.com/forums/index.php/topic,147173.msg631053.html#msg631053

July 2, 2007

Try this:

preg_match_all('/<h2>.*?<\/h2>/is',$your_text,$mth);
echo implode("\n",$mth[0]);

June 25, 2007

Another way to write those characters is the \xXX syntax, where XX is the hexadecimal value of the character. So:

$rex='/\x20/'; # white space
$rex='/\x0A/'; # new line
# etc

which I guess is more readable.

June 25, 2007

I ran that code against your link. Anyway, try using only the first preg_match; it works on multiline strings and matches the div's content.

$htmlpage='your html code here';
if(preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s',$htmlpage,$mth))
    echo $mth[1];
else echo 'No matching found.';

June 24, 2007

If you want to match each line within the div, you need two regular expressions. The first matches the div and the second each line. Try this:

if(preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s',$htmlpage,$mth))
{
    preg_match_all('/<p>(.+?)<\/p>/s',$mth[1],$paragraphs);
    # if you want to exclude titles (span/strong lines)
    # preg_match_all('/<p>(?!<strong|<span)(.+?)(?:<br>)?<\/p>/s',$mth[1],$paragraphs);
    echo '<pre>'.print_r($paragraphs[1],true).'</pre>';
}
else echo 'No matching found.';

June 24, 2007

I didn't get whether you want the entire DIV's content or only the first paragraph. Anyway, for the latter try this one (the lazy .+? stops at the first closing </p>):

$rex='/(?<=<div class="stdcontent" id="topStoryPreviewDiv">)\s*<p>(.+?)<\/p>/s';

June 21, 2007

You could replace + with {3,21}

/^[a-zA-Z0-9 -]{3,21}$/

or use a lookahead assertion like

/^(?=.{3,21}$)[a-zA-Z0-9\x20-]+$/

In this case it first checks the number of characters and then checks which characters are used.
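A short sketch of the lookahead version in use; the wrapper name isValidName() and the test values are made up.

```php
<?php
// Hypothetical wrapper around the lookahead version of the pattern.
function isValidName($name)
{
    return preg_match('/^(?=.{3,21}$)[a-zA-Z0-9\x20-]+$/', $name) === 1;
}

var_dump(isValidName('ab'));                // false: too short
var_dump(isValidName('John Smith-2'));      // true
var_dump(isValidName(str_repeat('a', 22))); // false: too long
var_dump(isValidName('bad_char'));          // false: underscore not allowed
```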

June 20, 2007

Try to change your function in this way:


function filter($string)
{
  $pattern[0] = 'fuck';
  $pattern[1] = 'ass';
  $pattern[2] = 'shit';
  $replacement = 'beep';
  
  $pattern=implode('|',$pattern);
  #matches only whole words 
  return preg_replace("/\b($pattern)\b/i", $replacement, $string);
}

November 8, 2006

Yes, regex engines usually don't cope with nested patterns. But in this case I guess you could define one more pattern that matches <place*:
first replace the tags whose names start with <place followed by more characters (something like '/<place\B[^>]*>(.*?)<\/place\B[^>]*>/is'), and then replace the remaining ones (those that match <place exactly).

Posts posted by rea|and

[SOLVED] Breaking up long words without interfering with URLs

Php preg_match_all problem

How to clear all html tags except some from html file using regex?

Allowing the plus sign in a sting

Regex with custom character sets

Assigning Values within Regex

Assigning Values within Regex

[SOLVED] preg

Greatest Problem On Earth !

Greatest Problem On Earth !

Greatest Problem On Earth !

More problem with quotes

[SOLVED] Extremly compilcated regex/coding problem.

[SOLVED] Substring using preg_replace

[SOLVED] help with mac address

[SOLVED] Matching

[SOLVED] RegEx problem again...

[SOLVED] RegEx problem again...

spaces in a preg_match(?????)

preg_match HELP

preg_match HELP

preg_match HELP

[SOLVED] can I combine these two regexs into one?

[SOLVED] bad word filter

Remove Office Non-Standard Microsoft Office Tags
