
highlighting search terms



Well, I started this in the regular PHP section, but it no longer fits there. Suffice it to say, I'm trying to take the individual search terms that are being $_POSTed and highlight them in the search results.

The [url=http://www.phpfreaks.com/forums/index.php/topic,122532.0.html]Original Post[/url] talked about using str_replace to handle this. The new problem is that the same search terms also show up inside HTML tags (like <img src="search_term">), and those occurrences get replaced too.

I'm trying "/\b(?!<.+?>)search_term\b/" -- but it's still finding "search_term" inside <img src="search_term">.

Thanks!
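
For what it's worth, a likely reason that pattern still matches: the lookahead (?!<.+?>) is checked at the spot where search_term starts, so it only asks "does a tag begin right here?" and never looks at what surrounds the term. One common workaround, sketched below with made-up sample HTML, is to move the lookahead after the term and reject any match that reaches a ">" before the next "<". This assumes reasonably well-formed markup with no stray ">" in the text.

```php
<?php
// Made-up sample: one occurrence inside a tag, one in plain text.
$html = '<img src="search_term"> search_term';

// Pattern from the post: the lookahead only tests whether a tag
// starts at the current position, so both occurrences match.
$attemptCount = preg_match_all('/\b(?!<.+?>)search_term\b/', $html);

// Sketch of a workaround: after the term, refuse the match if a ">"
// comes before any "<" (meaning we are still inside a tag).
$highlighted = preg_replace('/\bsearch_term\b(?![^<>]*>)/', '<b>$0</b>', $html);

echo $attemptCount, "\n";
echo $highlighted, "\n";
```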

Either [url=http://www.phpfreaks.com/forums/index.php/topic,99040.msg389868.html#msg389868]separate the tags from the content and process[/url], or just analyze the non-tagged content:

[code]
<pre>
<?php
$tests = array(
    '<img src="search_term">',
    '<a>search_term</a>',
    '<a>Xsearch_termX</a>',
);

$term = 'search_term';
echo "Searching for <b>$term</b>...<br>";
foreach ($tests as $test) {
    echo htmlspecialchars($test), ' => ';
    $test = preg_replace_callback(
        '/>(.+?)</',
        create_function(
            '$matches',
            'return preg_replace("/\b(' . preg_quote($term) . ')\b/", "<b>\\\1</b>", $matches[0]);'
        ),
        $test
    );
    echo htmlspecialchars($test), '<br>';
}
?>
</pre>

[/code]

Wow, that's intense. All kinds of functions I've never heard of. In fact after reading the manual page on create_function(), I still don't understand it. $matches doesn't exist anywhere outside of create_function()???

Anyway, it's doing the same thing my original preg_replace() was doing (finding matches inside tags). Here's the new function, minus the array stuff:
[code]<?php
$test="<table>\r\n<tr>\r\n\t<td><img src=\"somestring.jpg\" alt=\"\"></td>\r\n</tr>\r\n<tr>\r\n\t<td>somestring</td>\r\n</tr>\r\n</table>\r\n";
$term = 'somestring';
$test = preg_replace_callback(
    '/>(.+?)</',
    create_function(
        '$matches',
        'return preg_replace("/\b(' . preg_quote($term) . ')\b/", "<span style=\"background:#FF0;\">\\\1</span>", $matches[0]);'
    ),
    $test
);
echo $test;
?>[/code]

And the output:

[code]<table>
<tr>
        <td><img src="<span style="background:#FF0;">somestring</span>.jpg" alt=""></td>
</tr>
<tr>
        <td><span style="background:#FF0;">somestring</span></td>
</tr>
</table>[/code]
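
On the $matches question: as far as I understand it, preg_replace_callback() calls your function once per match and hands it the match array as its only argument, so $matches only exists inside the callback. create_function() just builds that function from two strings (the argument list and the body). On PHP 5.3+ an anonymous function does the same job and is easier to read, and create_function() itself was removed in PHP 8.0. A minimal sketch with a made-up subject string:

```php
<?php
$term = 'somestring';

$result = preg_replace_callback(
    '/\b' . preg_quote($term, '/') . '\b/',
    function (array $matches) {
        // $matches[0] is the text matched by the whole pattern.
        return strtoupper($matches[0]);
    },
    'one somestring, two somestring'
);

echo $result, "\n";
```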

hey, check this out...
[code]<?php
$test="<table>\r\n<tr>\r\n\t<td><img src=\"somestring.jpg\" alt=\"\"></td>\r\n</tr>\r\n<tr>\r\n\t<td>somestring</td>\r\n</tr>\r\n</table>\r\n";
preg_match("/>(.+?)</",$test,$new_array);
print_r($new_array);
?>[/code]

result:
[code]Array
(
    [0] => ><img src="somestring.jpg" alt=""><
    [1] => <img src="somestring.jpg" alt="">
)
[/code]
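
A possible explanation for that result: .+? needs at least one character and "." is happy to match "<", so right after <td> it swallows the whole <img> tag as if it were content. The text after <table> and <tr> is skipped because "." does not match the line feed without the s modifier. Swapping the dot for a negated class keeps tags out of the capture. A small sketch on a shortened, made-up version of the test string:

```php
<?php
$test = "<td><img src=\"somestring.jpg\" alt=\"\"></td>\r\n<td>somestring</td>";

// The dot can match "<", so the first capture is the entire <img> tag.
preg_match_all('/>(.+?)</', $test, $dot);

// [^<] can never match "<", so only real text nodes are captured.
preg_match_all('/>([^<]+)</', $test, $negated);

print_r($dot[1]);
print_r($negated[1]);
```

Both patterns still consume the surrounding ">" and "<", which is one reason a lookbehind for ">" is handy here.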

here's what />(.*?)</ matches in the example:
[code]    [0] => Array
        (
            [0] => ><
            [1] => ><
            [2] => > <
            [3] => >somestring<
            [4] => > <
        )

    [1] => Array
        (
            [0] =>
            [1] =>
            [2] => 
            [3] => somestring
            [4] => 
        )
[/code]

Dude, it's magic. One more thing that is not quite working right: MySQL is returning case-insensitive results. Is there any way to make [code=php:0]'/\b(' . preg_quote($newstring) . ')\b/'[/code] case-insensitive?

Also, if you're willing to educate, I'm struggling to follow this part of the code. (?<=>) is a lookbehind for ">"? It's working perfectly to locate the text I'm searching for, even text that isn't preceded by ">". I don't get it; is the lookbehind optional?

[tt]([^<]+)[/tt] is capturing the CRs and NLs, and [tt](?<=>)[/tt] is still anchoring at ">". Observe:

[code]
<pre>
<?php
$test = "<table>\r\n<tr>\r\n\t<td><img src=\"somestring.jpg\" alt=\"\"></td>\r\n</tr>\r\n<tr>\r\n\t<td>somestring</td>\r\n</tr>\r\n</table>\r\n";
preg_match_all('/(?<=>)([^<]+)/', $test, $matches);
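// Map real CR/LF characters to the visible two-character sequences \r and \n.
$replace = array(
    "\n" => '\n',
    "\r" => '\r',
);
foreach ($matches as &$array) {
    foreach ($array as &$match) {
        // strtr() does the substitution without the /e modifier, which PHP 7 removed.
        $match = strtr($match, $replace);
    }
    unset($match); // break the reference left over from the inner loop
}
unset($array);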
print_r($matches);
?>
</pre>

[/code]
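
Putting the pieces from this thread together, here is a rough sketch of the whole highlighter. The function name highlight_terms() and the sample markup are mine, not from the posts above. The outer pattern is the (?<=>)([^<]+) one discussed here, and the inner pattern adds the i modifier, which is the usual way to make a PCRE match case-insensitive.

```php
<?php
// highlight_terms() is a hypothetical helper name used for illustration.
function highlight_terms($html, $term)
{
    // Whole-word match on the term; the trailing "i" ignores case.
    $inner = '/\b(' . preg_quote($term, '/') . ')\b/i';

    return preg_replace_callback(
        // A run of non-"<" characters that starts right after a ">".
        '/(?<=>)([^<]+)/',
        function (array $m) use ($inner) {
            return preg_replace($inner, '<span style="background:#FF0;">$1</span>', $m[0]);
        },
        $html
    );
}

$test = "<td><img src=\"somestring.jpg\" alt=\"\"></td>\r\n<td>SomeString</td>";
$out = highlight_terms($test, 'somestring');
echo $out;
```

Text that comes before the first tag is never preceded by ">", so this sketch leaves it alone. That also suggests the lookbehind is not optional: it is what keeps the match from starting inside a tag. Wrapping the input in a container element is one way around the leading-text limitation.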

  • 11 months later...

I do it like this:

[code]<?php
$arrayofwords = array("This", "text", "need", "words");

$str = 'This is my <img src="" title="This image text"> long text <a href="#">words</a> where I need to highlight words in the HTML text.';

// The lookahead skips words that sit inside a tag or in link text running up to </a>.
$str = preg_replace("/(?!(?:[^<]+>|[^>]+<\/a>))\b(" . implode('|', $arrayofwords) . ")\b/is", "<strong>\\1</strong>", $str);

echo $str;
?>[/code]
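
One caveat if the word list comes from user input: words containing regex metacharacters (".", "/", "+" and so on) would change the pattern. A hedged sketch of the same idea with each word passed through preg_quote() first. The word list and sample string are made up, and the lookahead here is the simpler "no > before the next <" check rather than the exact one from the post above.

```php
<?php
// Made-up word list; "v1.0" contains a regex metacharacter.
$words = array('v1.0', 'text');

// Escape every word, including the "/" delimiter used below.
$quoted = array_map(function ($w) {
    return preg_quote($w, '/');
}, $words);

$pattern = '/\b(' . implode('|', $quoted) . ')\b(?![^<>]*>)/i';

$str = 'Some <img title="text v1.0"> text about v1.0 and v1x0.';
$marked = preg_replace($pattern, '<strong>$1</strong>', $str);

echo $marked, "\n";
```

Without the escaping, the "." in "v1.0" would also match the "x" in "v1x0".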


  • 6 months later...
  • 3 months later...

Also, if you want to exclude search_term from within head/script/a blocks as well as from within tags:

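[code]$html = preg_replace_callback(
    '~(<head>.*?</head>|<script\b[^>]*>.*?</script>|<a\b[^>]*>.*?</a>)|search_term(?![^<>]*>)~is',
    // Group 1 is only set when a whole head/script/a block matched: return it untouched.
    // Otherwise the match is a search_term outside any tag: wrap it.
    // An anonymous function is used because create_function() does not exist in PHP 8.
    function ($matches) {
        return isset($matches[1]) ? $matches[1] : "<strong>$matches[0]</strong>";
    },
    $html
);[/code]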

