rea|and

Posts posted by rea|and

  1. Try this one:

     

    $pattern = '/<(a(?=[^>]+class=(\'|")Link(?:\\2))|span(?=[^>]+class=(\'|")Title(?:\\3)))[^>]+>(.*?)<\/\\1>/si';
    preg_match_all($pattern, $result, $matches, PREG_SET_ORDER);
    
    foreach ($matches as $val) {
       echo "Tag: {$val[1]} - {$val[4]}<br />";
    }
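
    A quick way to check the pattern is to run it against a small fragment. This is only a sketch: the sample markup in $result is made up for illustration, and the pairs are collected in a hypothetical $found array instead of being echoed.

    ```php
    <?php
    // Made-up sample markup; only the first two elements carry the wanted classes.
    $result = '<a class="Link" href="#">Home</a>'
            . "<span class='Title'>News</span>"
            . '<a class="Other" href="#">Skip</a>';

    $pattern = '/<(a(?=[^>]+class=(\'|")Link(?:\\2))|span(?=[^>]+class=(\'|")Title(?:\\3)))[^>]+>(.*?)<\/\\1>/si';
    preg_match_all($pattern, $result, $matches, PREG_SET_ORDER);

    // $found is a hypothetical name: one [tag, text] pair per match.
    $found = array();
    foreach ($matches as $val) {
        $found[] = array($val[1], $val[4]); // group 1 = tag name, group 4 = inner text
    }
    // $found is now [['a', 'Home'], ['span', 'News']]
    ```

    The third link is skipped because its class attribute fails the lookahead.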
    
    

     

     

  2. Thanks.. will try it. Can't we do like just remove all tags except few. I know some regex but not so complex.

    I know we can add ^ which means "NOT".

    Can't we do something like that?

    Have you tried strip_tags()? Its second parameter takes the list of allowable tags.

    Anyway, if you want to do it with a regex, try this one (I've modified Crayon Violent's regexp by adding a negative lookahead assertion):

     

    $content=preg_replace('/<\/?(?!input|textarea|select)[^>]*>/i','',$content);

     

    It can misbehave in some cases, for example when HTML comments contain markup (I happened to try it against a page that had that).
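
    To compare the two approaches side by side, here is a minimal sketch. The sample $content is made up, and the regex carries the i flag so that upper-case tags such as <INPUT> are kept as well.

    ```php
    <?php
    // Made-up sample input.
    $content = '<p>Name: <input name="n"> <b>bold</b></p>';

    // strip_tags() keeps the tags listed in its second argument.
    $viaStripTags = strip_tags($content, '<input><textarea><select>');

    // Regex variant: the negative lookahead skips the tags to keep.
    $viaRegex = preg_replace('/<\/?(?!input|textarea|select)[^>]*>/i', '', $content);

    // Both give: Name: <input name="n"> bold
    ```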

     

     

     

  3. Yup, that's the issue. I entered dd+d but when it's echoed it shows with whitespace, dd d instead.

    A space is URL-encoded as a plus (+), so a literal + gets decoded back into a space. I guess you need to URL-encode your "t.value" before passing it to JSON. Try searching for urlencode in the JS section of this forum.
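
    The effect can be reproduced on the PHP side. This is just a sketch with a made-up value; in the browser the encoding step would be done with JavaScript's encodeURIComponent() rather than PHP's urlencode().

    ```php
    <?php
    $typed = 'dd+d';

    // Decoding the raw value turns the plus into a space.
    $decodedRaw = urldecode($typed);    // "dd d"

    // Encoding first protects the plus sign...
    $encoded = urlencode($typed);       // "dd%2Bd"

    // ...so it survives the decoding done on the receiving side.
    $roundTrip = urldecode($encoded);   // "dd+d"
    ```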

     

     

  4. Uhm... that print_r() shouldn't have been there... should have been:

     

    function countContains($string, $chars)
    {
    preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
    return count($matches[0]);
    }

     

    Sorry.

    Hello, just a side note: preg_match_all() already returns the number of full pattern matches, so you could rewrite it this way:

     

    function countContains($string, $chars){
    return preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
    }
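
    A short usage sketch of the shorter version, with made-up input strings:

    ```php
    <?php
    function countContains($string, $chars)
    {
        // preg_quote() escapes characters that are special inside the class, such as ] or -
        return preg_match_all('#[' . preg_quote($chars, '#') . ']#', $string, $matches);
    }

    $total  = countContains('hello world', 'lo'); // 5: three "l" and two "o"
    $dashes = countContains('a-b-c', '-');        // 2
    ```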

     

     

  5. [...]

    Second, I would like to check for a "http://, ftp://" start to a url - and if there isn't one add "http:" to the front of it. (if there is a "[a-z]+://" then don't add anything) Can you look at my code and see what I am doing wrong?

     

    Well, you can't hard-code your output inside the regular expression. What I said before is that with a conditional subpattern you know in advance that the match will be either "A" or "B", but you can't force the output to be "A", "B" or something else.

    You need a callback function to check the match and force the output.
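
    A minimal sketch of the callback idea, assuming the goal is to prepend "http://" when no "scheme://" prefix is present. The function name addScheme() and the exact pattern are hypothetical, not taken from the original code.

    ```php
    <?php
    // Hypothetical helper: keep an existing scheme, otherwise default to http.
    function addScheme($url)
    {
        return preg_replace_callback(
            '~^(?:([a-z][a-z0-9+.-]*)://)?(.+)$~i',
            function ($m) {
                // $m[1] is empty when the URL had no "scheme://" prefix
                $scheme = ($m[1] !== '') ? $m[1] : 'http';
                return $scheme . '://' . $m[2];
            },
            $url
        );
    }

    $a = addScheme('example.com');        // "http://example.com"
    $b = addScheme('ftp://example.com');  // "ftp://example.com"
    ```

    The regex only recognises the parts; the decision about what to output is taken in the callback.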

     

     

  6. The syntax you are using is also a conditional statement; it lets you set two ways of matching a string in the same regexp, e.g.:

     

    <?php 
    # if a line starts w/ a number then match something, otherwise something else
    #(?(condition)true case|false case)
    $rex='/^(?(?=\d+)something|somethingelse)/m';
    

     

    In a way you could use it to set your output, but that is not its main purpose. I'd keep using a callback function to check your values.
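
    As a concrete instance of the (?(condition)true case|false case) form, here is a small sketch. The branches (digits versus lower-case letters) and the sample input are made up for illustration.

    ```php
    <?php
    // If the line starts with a digit, take the leading digits;
    // otherwise take the leading lower-case letters.
    $rex = '/^(?(?=\d)\d+|[a-z]+)/m';

    preg_match_all($rex, "123abc\nfoo42", $m);
    // $m[0] is now ['123', 'foo']
    ```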

  7. yes realand it works i changed the regex as you told to do in the comment.

     

    so how can you make it to work with google.com ... ? also i have this site http://www.vizual.co.in/enquiry_form.html

     

    where the input boxes are available but their source is in this way

    <INPUT name=name22 id=name22 value="" class="formfield">

     

    there is no type="text" attribute . so can you change the regex ?

     

    thanks for the advanced help.

     

    Back from lunch.

    That page you posted doesn't have any form tags, so the regex can't match anything. Instead, I've modified the regexp to handle a missing type attribute and unquoted values; it seems to work:

    <pre><?php
    $form=array();
    preg_replace_callback('/<form[^>]+action=("|\')(.+?)(\\1).+?<\/form>/is','cb_form',$text);
    
    function cb_form($mth){
    global $form;
    preg_match_all('/<input(?(?=[^>]+type=)(?=[^>]+type=(?(?="|\')("|\')(?:text|password)(?:\\1)|(?:text|password))))[^>]+name=(?(?="|\')("|\')(.+?)(?:\\2)|(\S+))/is',$mth[0],$names);
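    // pick, per input, the quoted name (group 3) or the unquoted one (group 4)
    $form[$mth[2]]=array();
    foreach($names[0] as $i=>$full)
    	$form[$mth[2]][]=($names[3][$i]!='')?$names[3][$i]:$names[4][$i];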
    }
    print_r($form);
    ?></pre>
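
    The quoted/unquoted part of that regexp is easier to follow in isolation. Below is a simplified sketch that looks only at the name attribute; the sample markup and the $names variable are made up, and the type check is left out on purpose.

    ```php
    <?php
    // Made-up fragment mixing an unquoted and a quoted name attribute.
    $html = '<INPUT name=name22 id=name22 value="" class="formfield">'
          . '<input type="text" name="email">';

    // Conditional: if a quote follows "name=", capture up to the matching
    // quote (group 2); otherwise capture up to whitespace or ">" (group 3).
    $rex = '/\sname=(?(?=["\'])(["\'])(.+?)\1|([^\s>]+))/i';
    preg_match_all($rex, $html, $m);

    $names = array();
    foreach ($m[0] as $i => $whole) {
        $names[] = ($m[2][$i] !== '') ? $m[2][$i] : $m[3][$i];
    }
    // $names is now ['name22', 'email']
    ```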

  8. Currently the regexps work only with double/single quotes (so not something like name=namefield).

     

     

    What does that mean ?

     

    preg_match_all('/<input(?=[^>]+type="text")[^>]+name=("|\')(.+?)(\\1)/is',$mth[0],$names);

     

    this regex works only for type="text" , if i need for both "password" and "text" where i should i add ?

     

    It means that if a form doesn't quote its attribute values, like Google's, my code won't match the name fields. I could add that if you need it, but for now let's check whether it works.

     

    For the password problem I wrote a comment just below the preg_match_all line to explain how to add more than one attribute. :)

     

  9. Try something like this... I've tried this code only against the last page you've posted. Currently the regexps work only with double/single quotes (so not something like name=namefield).

     

    <pre><?php 
    $text='your html code';
    $form=array();
    preg_replace_callback('/<form[^>]+action=("|\')(.+?)(\\1).+?<\/form>/is','cb_form',$text);
    
    function cb_form($mth){
    global $form;
    preg_match_all('/<input(?=[^>]+type="text")[^>]+name=("|\')(.+?)(\\1)/is',$mth[0],$names);
    // for more than one type attribute 
    // '/<input(?=[^>]+type="(?:text|hidden)")[^>]+name=("|\')(.+?)(\\1)/is'
    $form[$mth[2]]=$names[2];
    }
    
    print_r($form);
    ?></pre>
    

  10. Try this regex:

     

    $rex='/("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(?:\1|$)/s';
    

    (It is a part of the other regex I posted yesterday)

     

    A little explanation:

    $rex='/
    ("|\')		# match " or \'
    (?:		
    (?:\\\\\\\\)*|	# empty string or an even number of backslashes
    		# like \\" or \\\\" or \\\\\\" etc
    		# OR
    .*?[^\\\\](?:\\\\\\\\)*	# non-empty string 
    			# ending with an even number of backslashes,
    			# checking that the char before them is not a backslash
    )
    (?:\1|$)	# match the same quote that opened the string, or the end of the input
    /xs';
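
    A small check of the pattern, as a sketch with a made-up input string that contains an escaped quote:

    ```php
    <?php
    $rex = '/("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(?:\1|$)/s';

    // Made-up input: a double-quoted string with an escaped quote,
    // followed by a single-quoted string.
    $subject = 'a "b \\" c" d \'e\'';
    preg_match_all($rex, $subject, $m);
    // $m[0][0] is "b \" c" (quotes included), $m[0][1] is 'e'
    ```

    The escaped quote does not end the first string because it is preceded by an odd number of backslashes.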
    

  11. Hmm, I've adapted a class I wrote for parsing PHP code so that it works as requested (I hope). It works the way lococobra deduced. The result is an array where even-indexed elements contain HTML code and odd-indexed ones PHP code. I've only run a few tests, so I can't guarantee anything :)

     

    <?php
    include_once 'cl.split.code.php';  
    $code=file_get_contents('some_mixed_code.php');
    
    $hcode = new lh_splitCode() ;
    $hcode->lh_splitting( $code ) ;
    print_r( $hcode->lh_get_code() );
    ?>
    

     

     

    Here is the class:

     

    
    <?php 
    
    /**
    * Andrea Ponzi, b 1.0, 23/07/2007
    *
    */
    
    
    class lh_splitCode {

        var $original_code ;
        var $hliteCode ;
        var $parsedCode ;
        var $endphptag='[ENDPHPTAG]';
        // everything before the first opening tag (group 1) and everything after it (group 2)
        var $re_open_tag_php = '/(?>^(.*?)<\?(?:(?i)php)?(.*)$)/sS' ;
        // quoted strings, line comments, block comments, a closing tag up to the end, or a stray opening tag
        var $re_parse_mixed_code='/(?:("|\')(?:(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)(\1|$))|(?:(?:#|\/\/)(?m-s).*\r?\n)|(?:\/\*.*?(?:\*\/|$))|(?:\?>.*$)|<\?/sS';

        function __lh_initialize($code)
        {
            $this->original_code = $code ;
            $this->hliteCode = $this->original_code ;
            $this->parsedCode = array() ;
        }

        function lh_splitting( $code=false )
        {
            $this->__lh_initialize($code);

            if ($this->original_code==false)
                return false;

            $this->__lh_parsing_code();

            // restore the opening tags that were masked inside the PHP chunks
            for($i=1,$c=count($this->parsedCode);$i<$c;$i+=2)
                $this->parsedCode[$i]=str_replace('[OPENPHP]','<?',$this->parsedCode[$i]);
        }

        function lh_get_code(){ return $this->parsedCode ; }

        function __lh_parsing_code(){

            while(preg_match($this->re_open_tag_php, $this->hliteCode, $mth)){
                $this->parsedCode[] = $mth[1] ;
                $this->hliteCode = preg_replace_callback
                                  (
                                    $this->re_parse_mixed_code
                                   ,array( $this,'__lh_parsing_engine_cback' )
                                   ,$mth[2]
                                  );

                if ( strpos($this->hliteCode,$this->endphptag)!==false )
                {
                    $tmp = explode($this->endphptag, $this->hliteCode) ;
                    $this->parsedCode[] = $tmp[0] ;
                    $this->hliteCode = $tmp[1] ;
                }

            }

            if (trim($this->hliteCode)!='')
                $this->parsedCode[] = $this->hliteCode ;
        }

        function __lh_parsing_engine_cback($mths)
        {
            if( $mths[0]=='' ) return '';
            if( $mths[0]=='<?' ) return '[OPENPHP]';

            // [0] replaces the old {0} string offset syntax, which PHP 8 no longer accepts
            $str=($mths[0][0]=='?')?$this->endphptag.substr($mths[0],2):$mths[0];
            return $str ;
        }

    }
    
    
    

     

    EDIT: I forgot to say that the PHP tags are split off, so they do not appear in the results.
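
    For comparison, the same kind of split can be sketched with PHP's built-in tokenizer instead of regular expressions. This is a different technique from the class above, not a rewrite of it: the function name splitMixedCode() is hypothetical, it relies on the tokenizer extension (token_get_all()), and unlike the class it leaves the PHP tags inside the PHP chunks.

    ```php
    <?php
    // Hypothetical helper: returns a list of ['html'|'php', text] runs.
    function splitMixedCode($code)
    {
        $parts = array();
        foreach (token_get_all($code) as $tok) {
            // Tokens are either [id, text, line] arrays or single-character strings.
            $isHtml = is_array($tok) && $tok[0] === T_INLINE_HTML;
            $text   = is_array($tok) ? $tok[1] : $tok;
            $type   = $isHtml ? 'html' : 'php';
            $last   = count($parts) - 1;
            if ($last >= 0 && $parts[$last][0] === $type) {
                $parts[$last][1] .= $text;       // extend the current run
            } else {
                $parts[] = array($type, $text);  // start a new run
            }
        }
        return $parts;
    }

    $parts = splitMixedCode('<b>hi</b><?php echo 1; ?><i>x</i>');
    // $parts holds three runs: html, php, html
    ```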

     

     

  12. Well, do you need to match only one line or many? Anyway, to match a single line try this:

     

    
    $myString='|$INFO Some information01$Some information02$Some information03$Some information04$Some information05$Some information06$Some information07$Some information08$|';
    $myString=preg_split('/\|\$INFO\s*|\$|\|/',$myString,-1,PREG_SPLIT_NO_EMPTY);
    echo '<pre>'.print_r($myString,true).'</pre>';
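
    If there are many lines, one option is to split the input into lines first and apply the same preg_split() to each. A minimal sketch, assuming every line follows the same |$INFO ...$| layout; the sample data is made up.

    ```php
    <?php
    // Made-up two-line input in the same layout as above.
    $input = '|$INFO a$b$|' . "\n" . '|$INFO c$d$|';

    $rows = array();
    // \R matches any kind of line break.
    foreach (preg_split('/\R/', $input, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $rows[] = preg_split('/\|\$INFO\s*|\$|\|/', $line, -1, PREG_SPLIT_NO_EMPTY);
    }
    // $rows is now [['a', 'b'], ['c', 'd']]
    ```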
    
    

  13. I used that code against your link. Anyway, try using only the first preg_match; it works for multiline strings and matches the div's content.

     

    $htmlpage='your html code here';
    if(preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s',$htmlpage,$mth))
        echo $mth[1];
    else echo 'No matching found.';
    

  14. If you want to match each line within the div you need two regular expressions. The first matches the div and the second each line. Try this:

     

    if(preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s',$htmlpage,$mth))
    {
    preg_match_all('/<p>(.+?)<\/p>/s',$mth[1],$paragraphs);
     # if you want to exclude titles (span/strong lines)
     # preg_match_all('/<p>(?!<strong|<span)(.+?)(?:<br>)?<\/p>/s',$mth[1],$paragraphs);
    echo '<pre>'.print_r($paragraphs[1],true).'</pre>';
    }
    else echo 'No matching found.';
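
    To see the two steps at work, here is a sketch run against a made-up fragment (the real page may be laid out differently). It uses the variant that skips title paragraphs.

    ```php
    <?php
    // Made-up markup: one title paragraph and two text paragraphs.
    $htmlpage = '<div class="stdcontent" id="topStoryPreviewDiv">'
              . '<p><strong>Title</strong></p>'
              . '<p>First line<br></p>'
              . '<p>Second line</p>'
              . '</div>';

    $paragraphs = array(1 => array());
    // Step 1: isolate the content of the div.
    if (preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s', $htmlpage, $mth)) {
        // Step 2: collect each paragraph, skipping those that open with strong or span.
        preg_match_all('/<p>(?!<strong|<span)(.+?)(?:<br>)?<\/p>/s', $mth[1], $paragraphs);
    }
    // $paragraphs[1] is now ['First line', 'Second line']
    ```

    Note that the first pattern stops at the first closing div, so it assumes there are no nested divs inside the block.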
    

  15. I didn't get whether you want the entire DIV's content or only the first paragraph. Anyway, for the latter try this one:

     

    $rex='/(?<=<div class="stdcontent" id="topStoryPreviewDiv">)\s*<p>(.+?)<\/p>/s';
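
    The lookbehind idea can be tried on a small fragment. A minimal sketch with made-up markup in $htmlpage; the pattern here is a lazy, dot-all variant so that it stops at the first closing p tag even when several paragraphs share a line.

    ```php
    <?php
    // Made-up markup with two paragraphs on the same line.
    $htmlpage = '<div class="stdcontent" id="topStoryPreviewDiv"> <p>First</p><p>Second</p></div>';

    // The lookbehind anchors the match right after the opening div tag.
    $rex = '/(?<=<div class="stdcontent" id="topStoryPreviewDiv">)\s*<p>(.+?)<\/p>/s';

    $first = preg_match($rex, $htmlpage, $m) ? $m[1] : null;
    // $first is "First"
    ```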
    

     

  16. Try to change your function in this way:

    
    function filter($string)
    {
      $pattern[0] = 'fuck';
      $pattern[1] = 'ass';
      $pattern[2] = 'shit';
      $replacement = 'beep';
      
      $pattern=implode('|',$pattern);
      #matches only whole words 
      return preg_replace("/\b($pattern)\b/i", $replacement, $string);
    }
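
    The whole-word behaviour is easy to verify with a small sketch. The function name filterWords(), its parameters and the mild placeholder words are all made up for this example; preg_quote() is added as a precaution in case a word contains regex metacharacters.

    ```php
    <?php
    // Hypothetical variant that takes the word list as a parameter.
    function filterWords($string, array $words, $replacement = 'beep')
    {
        // Escape each word so regex metacharacters in it are taken literally.
        $alternatives = implode('|', array_map(function ($w) {
            return preg_quote($w, '/');
        }, $words));
        // \b on both sides restricts the match to whole words.
        return preg_replace("/\b(?:$alternatives)\b/i", $replacement, $string);
    }

    $clean = filterWords('Heck, that darned thing!', array('darn', 'heck'));
    // $clean is "beep, that darned thing!"
    ```

    "darned" is left alone because the word boundary after "darn" is missing, which is the point of matching whole words only.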
    

     

     
