
Index everything between < and >, but ignore PHP, style and script


superaktieboy


Hi,

I have code like this (it is selected from a database and posted by users, so the real content differs from this example):
[code]
<?
echo 'this is php highlighted';
?>

<html>
this is as html highlighted
</html>

<script type="text/javascript">
// this is javascript highlighted
</script>

<style>
/*
this is css highlighted
*/
</style>
[/code]

Each highlighter has its own function: highlight_php, highlight_html, highlight_js and highlight_css. Besides those, I have a wrapper function 'highlight($input, $unparse = false)'.

This is what I use to highlight the HTML:
[code]
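// Collect every <...> chunk, then highlight each one in place.
preg_match_all('#(<.*?>)#is', $highlighted, $matches);
// Use < rather than <=: the last valid index is count - 1.
for ($i = 0; $i < count($matches[0]); $i++)
{
    $replacer = highlight_html($matches[0][$i]);
    $highlighted = str_replace($matches[0][$i], $replacer, $highlighted);
}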
[/code]
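To see what that pattern actually picks up, here is a small check on a made-up sample (the `$sample` string is only an assumption for illustration, not real data from the database). It suggests that `<.*?>` treats a PHP block and the style tags like any other tag:

```php
<?php
// Hypothetical sample input, not the real user-posted content.
$sample = "<?php echo 'x'; ?>\n<p>text</p>\n<style>p { color: red; }</style>";

// Same pattern as in the loop above.
preg_match_all('#(<.*?>)#is', $sample, $matches);

// Print every chunk that would be handed to highlight_html().
foreach ($matches[0] as $match) {
    echo $match, "\n";
}
```

The whole `<?php ... ?>` block comes back as one match, and `<style>` and `</style>` come back as two more, so all of them end up in the HTML highlighter.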

This is the function highlight_html():
[code]
function highlight_html($input)
{
/**
* These three arrays make it easier to preg_replace keywords with colors
*/
$commands_boldblue = array
(
'#(\ba\b)#', '#(\babbr\b)#', '#(\babove\b)#', '#(\bacronym\b)#', '#(\baddress\b)#', '#(\bapplet\b)#', '#(\barea\b)#',
'#(\barray\b)#', '#(\bb\b)#', '#(\bbase\b)#', '#(\bbdo\b)#', '#(\bbgsound\b)#', '#(\bbig\b)#', '#(\bxmp\b)#',
'#(\bblink\b)#', '#(\bblockquote\b)#', '#(\bbody\b)#', '#(\bbox\b)#', '#(\bbr\b)#', '#(\bbutton\b)#', '#(\bcaption\b)#',
'#(\bcenter\b)#', '#(\bcite\b)#', '#(\bcode\b)#', '#(\bcol\b)#', '#(\bcolgroup\b)#', '#(\bcomment\b)#', '#(\bdd\b)#',
'#(\bdel\b)#', '#(\bdfn\b)#', '#(\bdir\b)#', '#(\bdiv\b)#', '#(\bdl\b)#', '#(\bdoctype\b)#', '#(\bdt\b)#', '#(\bem\b)#',
'#(\bembed\b)#', '#(\bfieldset\b)#', '#(\bfig\b)#', '#(\bform\b)#', '#(\bframe\b)#', '#(\bframeset\b)#', '#(\btfoot\b)#',
'#(\bh\b)#', '#(\bh1\b)#', '#(\bh2\b)#', '#(\bh3\b)#', '#(\bh4\b)#', '#(\bh5\b)#', '#(\bh6\b)#', '#(\bhead\b)#',
'#(\bhr\b)#', '#(\bhta\b)#', '#(\bhtml\b)#', '#(\bi\b)#', '#(\biframe\b)#', '#(\bimg\b)#', '#(\binput\b)#', '#(\bins\b)#',
'#(\bisindex\b)#', '#(\bkbd\b)#', '#(\blabel\b)#', '#(\blegend\b)#', '#(\bli\b)#', '#(\blink\b)#', '#(\blisting\b)#',
'#(\bmap\b)#', '#(\bmarquee\b)#', '#(\bmenu\b)#', '#(\bmeta\b)#', '#(\bmulticol\b)#', '#(\bnextid\b)#', '#(\bnobr\b)#',
'#(\bnoframes\b)#', '#(\bnoscript\b)#', '#(\bnote\b)#', '#(\bol\b)#', '#(\boptgroup\b)#', '#(\boption\b)#', '#(\bp\b)#',
'#(\bparam\b)#', '#(\bplaintext\b)#', '#(\bpre\b)#', '#(\bq\b)#', '#(\brange\b)#', '#(\broot\b)#', '#(\bs\b)#',
'#(\bsamp\b)#', '#(\bselect\b)#', '#(\bsmall\b)#', '#(\bsound\b)#', '#(\bspacer\b)#', '#(\btbody\b)#', '#(\btd\b)#',
'#(\bsqrt\b)#', '#(\bstrike\b)#', '#(\bstrong\b)#', '#(\bsub\b)#', '#(\bsup\b)#', '#(\btext\b)#', '#(\btextarea\b)#',
'#(\btable\b)#', '#(\btextflow\b)#', '#(\bth\b)#', '#(\bthead\b)#', '#(\btitle\b)#', '#(\btr\b)#', '#(\btt\b)#',
'#(\bu\b)#', '#(\bul\b)#', '#(\bvar\b)#', '#(\bwbr\b)#',
/**
* we replaced the text 'span' with 'c3Bhbg' by using
* the function base64_encode().
* so the next text is 'span'
*/
'#(\bc3Bhbg\b)#',
/**
* we replaced the text 'style' with 'c3R5bGU' by using
* the function base64_encode().
* so the next text is 'style'
*/
'#(\bc3R5bGU\b)#',
/**
* we replaced the text 'font' with 'Zm9udA' by using
* the function base64_encode().
* so the next two texts are
* [1] basefont
* [2] font
*/
'#(\bbaseZm9udA\b)#',
'#(\bZm9udA\b)#'
);

$commands_red = array
(
'#(\baccept\b)#', '#(\baccesskey\b)#', '#(\baction\b)#', '#(\balign\b)#', '#(\balink\b)#', '#(\balt\b)#',
'#(\bapplicationname\b)#', '#(\barchive\b)#', '#(\baxis\b)#', '#(\bbackground\b)#', '#(\bbehavior\b)#', '#(\bbelow\b)#',
'#(\bcellpadding\b)#', '#(\bbgproperties\b)#', '#(\bborder\b)#', '#(\bcellspacing\b)#', '#(\bchar\b)#',
'#(\bcharoff\b)#', '#(\bcharset\b)#', '#(\bchecked\b)#', '#(\bclass\b)#', '#(\bclassid\b)#', '#(\bclear\b)#',
'#(\bcodebase\b)#', '#(\bcodetype\b)#', '#(\bcols\b)#', '#(\bcompact\b)#', '#(\bhttp-equiv\b)#',
'#(\bcontent\b)#', '#(\bcoords\b)#', '#(\bdata\b)#', '#(\bdatetime\b)#', '#(\bdeclare\b)#', '#(\bdefer\b)#',
'#(\bdirection\b)#', '#(\bdisabled\b)#', '#(\bdynsrc\b)#', '#(\benctype\b)#', '#(\bequiv\b)#', '#(\bface\b)#',
'#(\bfor\b)#', '#(\bframeborder\b)#', '#(\bframespacing\b)#', '#(\bgutter\b)#', '#(\bheaders\b)#', '#(\bheight\b)#',
'#(\bhref\b)#', '#(\bhreflang\b)#', '#(\bhspace\b)#', '#(\bicon\b)#', '#(\bid\b)#', '#(\bismap\b)#',
'#(\blanguage\b)#', '#(\bleftmargin\b)#', '#(\blongdesc\b)#', '#(\bloop\b)#', '#(\blowsrc\b)#', '#(\bmarginheight\b)#',
'#(\bmarginwidth\b)#', '#(\bmaximizebutton\b)#', '#(\bmaxlength\b)#', '#(\bmedia\b)#', '#(\bmethod\b)#', '#(\bmethods\b)#',
'#(\bminimizebutton\b)#', '#(\bmultiple\b)#', '#(\bname\b)#', '#(\bnohref\b)#', '#(\bnoresize\b)#', '#(\bnoshade\b)#',
'#(\bnowrap\b)#', '#(\bobject\b)#', '#(\bonabort\b)#', '#(\bonblur\b)#', '#(\bonchange\b)#', '#(\bonclick\b)#',
'#(\bondblclick\b)#', '#(\bonfocus\b)#', '#(\bonkeydown\b)#', '#(\bonkeypress\b)#', '#(\bonkeyup\b)#', '#(\bonload\b)#',
'#(\bonmousedown\b)#', '#(\bonmousemove\b)#', '#(\bonmouseout\b)#', '#(\bonmouseover\b)#', '#(\bonmouseup\b)#',
'#(\bonreset\b)#', '#(\bonselect\b)#', '#(\bonsubmit\b)#', '#(\bonunload\b)#', '#(\bprofile\b)#', '#(\bprompt\b)#',
'#(\breadonly\b)#', '#(\brel\b)#', '#(\brev\b)#', '#(\brows\b)#',
'#(\brules\b)#', '#(\brunat\b)#', '#(\bscheme\b)#', '#(\bscope\b)#', '#(\bscrollamount\b)#', '#(\bscrolldelay\b)#',
'#(\bshape\b)#', '#(\bshowintaskbar\b)#', '#(\bsingleinstance\b)#', '#(\bsize\b)#', '#(\bsrc\b)#', '#(\bstandby\b)#',
'#(\bstart\b)#', '#(\bsummary\b)#', '#(\bsysmenu\b)#', '#(\btabindex\b)#', '#(\btarget\b)#', '#(\btopmargin\b)#',
'#(\btype\b)#', '#(\burn\b)#', '#(\busemap\b)#', '#(\bvalign\b)#', '#(\bvalue\b)#', '#(\bvaluetype\b)#',
'#(\bversion\b)#', '#(\bvlink\b)#', '#(\bvrml\b)#', '#(\bvspace\b)#', '#(\bwidth\b)#', '#(\bwindowstate\b)#',
'#(\bwrap\b)#', '#(\bscrolling\b)#', '#(\bselected\b)#',
/**
* we replaced the text 'color' with 'Y29sb3I' by using
* the function base64_encode().
* So the next five texts are
* [1] bordercolor
* [2] bordercolordark
* [3] bordercolorlight
* [4] bgcolor
* [5] color
*/
'#(\bborderY29sb3I\b)#',
'#(\bborderY29sb3Idark\b)#',
'#(\bborderY29sb3Ilight\b)#',
'#(\bbgY29sb3I\b)#',
'#(\bY29sb3I\b)#',
/**
* we replaced the text 'span' with 'c3Bhbg' by using
* the function base64_encode().
* so the next two texts are
* [1] rowspan
* [2] colspan
*/
'#(\browc3Bhbg\b)#',
'#(\bcolc3Bhbg\b)#',
/**
* we replaced the text 'style' with 'c3R5bGU' by using
* the function base64_encode().
* so the next text is 'borderstyle'
*/
'#(\bborderc3R5bGU\b)#'
);

$commands_purple = array
(
'#(\b_blank\b)#', '#(\bblack\b)#', '#(\bblue\b)#', '#(\bbottom\b)#', '#(\bgreen\b)#', '#(\bhidden\b)#', '#(\bleft\b)#',
'#(\bmagenta\b)#', '#(\bmiddle\b)#', '#(\borange\b)#', '#(\bpublic\b)#', '#(\bpurple\b)#', '#(\bred\b)#', '#(\bright\b)#',
'#(\btop\b)#', '#(\bwhite\b)#', '#(\byellow\b)#'
);

/**
* First we convert < and > to HTML entities, otherwise the browser won't display them
*/
$temp = str_replace('<', '&lt;', $input);
$temp = str_replace('>', '&gt;', $temp);

/**
* Then we give the things in the arrays $commands_purple,
* $commands_red and $commands_boldblue their own color
*/
$temp = preg_replace($commands_purple, '<span style="color: #9C029C;font-weight:bold">$1</span>', $temp);
$temp = preg_replace($commands_boldblue, '<span style="color: #0000FF;font-weight:bold">$1</span>', $temp);
$temp = preg_replace($commands_red, '<span style="color: red;font-weight:bold">$1</span>', $temp);
return $temp;
}
[/code]
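A rough sketch of a different way to do the coloring, in case it helps: build one pattern per color group from plain word lists and run a single preg_replace_callback pass. Because the text is scanned only once, the inserted span markup is never scanned again, so in this sketch the words 'span', 'style', 'color' and 'font' should not need the base64 masking. The names build_keyword_pattern(), highlight_keywords(), `$groups` and `$colors` are made up for this example, the word lists are shortened, and the closure assumes PHP 5.3 or later.

```php
<?php
// Hypothetical, shortened keyword lists; the real lists are the arrays above.
$groups = array(
    'tag'   => array('html', 'body', 'span', 'style', 'font'),
    'attr'  => array('href', 'color', 'bgcolor', 'http-equiv'),
    'value' => array('red', 'left', '_blank'),
);
// Same colors as the three preg_replace calls in highlight_html().
$colors = array('tag' => '#0000FF', 'attr' => 'red', 'value' => '#9C029C');

// Build one regex with a named group per color group.
function build_keyword_pattern(array $groups)
{
    $parts = array();
    foreach ($groups as $name => $words) {
        $quoted = array();
        foreach ($words as $word) {
            $quoted[] = preg_quote($word, '#');
        }
        $parts[] = '(?P<' . $name . '>\b(?:' . implode('|', $quoted) . ')\b)';
    }
    return '#' . implode('|', $parts) . '#';
}

// Wrap every keyword in a colored span, in a single pass over the text.
function highlight_keywords($escaped, array $groups, array $colors)
{
    $pattern = build_keyword_pattern($groups);
    return preg_replace_callback($pattern, function ($m) use ($groups, $colors) {
        foreach (array_keys($groups) as $name) {
            // Only the group that actually matched is non-empty.
            if (isset($m[$name]) && $m[$name] !== '') {
                return '<span style="color: ' . $colors[$name]
                    . ';font-weight:bold">' . $m[0] . '</span>';
            }
        }
        return $m[0];
    }, $escaped);
}

// Input is assumed to be already escaped (< and > turned into entities).
echo highlight_keywords('&lt;span style="color: red"&gt;', $groups, $colors);
```

Adding a keyword then means adding one word to a list instead of writing a new pattern by hand.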

But when I call this function it also indexes PHP, script and style. As far as I know, the cause is this line: 'preg_match_all('#(<.*?>)#is', $highlighted, $matches);'
So does anyone know how to index only the HTML and ignore PHP, script and style?

greetzz

Oh, by the way, I encoded some words because otherwise it wouldn't work properly. You'll see what they are in the arrays.
This is not really regex, but in the function highlight_html() I put the following before I change anything:
[code]
<?php
// strpos() takes the haystack first; return PHP, style and script chunks unchanged.
if (strpos($input, 'style') !== false || strpos($input, '<?') !== false || strpos($input, 'script') !== false)
{
    return $input;
}
?>
[/code]

(stupid that I didn't come up with that earlier :P)
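The substring check above also skips any ordinary tag that merely contains the word 'style' or 'script', for example a tag with a style attribute. One possible alternative is to keep those blocks out at the matching stage. This is only a sketch: highlight_tags_only() and highlight_html_stub() are made-up names, the stub just stands in for the real highlight_html(), and the closure assumes PHP 5.3 or later. The first alternative in the pattern swallows whole PHP, script and style blocks and returns them untouched; only what is left over is treated as an HTML tag.

```php
<?php
// Stand-in for the real highlight_html(); it only escapes and wraps the tag.
function highlight_html_stub($tag)
{
    return '<b>' . htmlspecialchars($tag) . '</b>';
}

// Highlight plain HTML tags, leave PHP, script and style blocks as they are.
function highlight_tags_only($source)
{
    // Group 1: protected blocks. Group 2: any other tag.
    $pattern = '#(<\?.*?\?>|<script\b.*?</script\s*>|<style\b.*?</style\s*>)|(<[^>]*>)#is';

    return preg_replace_callback($pattern, function ($m) {
        if ($m[1] !== '') {
            // A PHP, script or style block: return it unchanged.
            return $m[0];
        }
        return highlight_html_stub($m[0]);
    }, $source);
}

echo highlight_tags_only("<?php echo '<b>'; ?><p>hi</p><style>p{}</style>");
```

The untouched blocks could then be passed to highlight_php, highlight_js and highlight_css separately.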

