
Trouble getting <img> alt text with regular expression


Greaser9780


Here is what I have:

$regexp = "<img\s[^>]*alt=(\"??)([^\" >]*?)\\1[^>]*>";
if (preg_match_all("/$regexp/siU", $input, $img, PREG_SET_ORDER)) {
    foreach ($img as $alt) {
        echo "$alt[2] <br />";
    }
}

 

I am a regular expression noob. This prints empty results. Can anyone find the error?
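One likely cause, assuming the alt text on the pages being parsed contains spaces: the character class `[^\" >]` excludes the space, so a value such as `alt="Site logo"` can never be captured up to its closing quote. Because the quote group is optional, the engine then falls back to matching no quote and a zero-length value, which still counts as a match but leaves `$alt[2]` empty. The `U` modifier also inverts the greediness of every quantifier, which makes the pattern harder to reason about. The following is a minimal sketch of one way around this, not a definitive fix; the sample `$input` is made up for illustration, and the pattern only handles alt values wrapped in single or double quotes.

```php
<?php
// Hypothetical sample input, made up for illustration.
$input = '<p><img src="logo.png" alt="Site logo" width="10">'
       . "<IMG alt='Second image' src='b.png'></p>";

// Group 1 captures the opening quote and \1 requires the same quote to close.
// Group 2 is lazy and may contain spaces, so multi-word alt text survives.
// The U modifier is dropped so that quantifiers keep their usual meaning.
$regexp = '/<img\s[^>]*?\balt\s*=\s*(["\'])(.*?)\1[^>]*>/si';

$alts = [];
if (preg_match_all($regexp, $input, $img, PREG_SET_ORDER)) {
    foreach ($img as $alt) {
        $alts[] = $alt[2];
        echo htmlspecialchars($alt[2]) . "<br />\n";
    }
}
```

Unquoted alt values and malformed tags are not covered here. For arbitrary real-world pages, an HTML parser is generally more robust than a regular expression.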

I'm in the midst of building a search engine. I am going to rank sites for each search on things like the percentage of query words to total words in various groups (alt text, description words, title, keywords) and so on. I did, however, get the following to work:

$regexp = "<img src=\"(.*)\" alt=\"(.*)\">";
if (preg_match_all("/$regexp/siU", $input, $img, PREG_SET_ORDER)) {
    foreach ($img as $alt) {
        // adv_count_words() is my own word-counting function
        $alt_words = adv_count_words($alt[2]);
        echo "$alt[2]  -  $alt_words<br />";
    }
}

 

So far I can get the words and word count for the following:

title

meta description

meta keywords

link text

link href

heading text

bold text

underlined text

img alt text

Now I'm starting to wonder how to put it all in the db. The links, alt text, headings, and bold text are all listed one per line. I hate to insert them with a foreach loop, since that will cause many small queries, but maybe that is better than one ginormous query.
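A middle ground between many small queries and one huge query is to insert the rows in chunks, with one multi-row INSERT per chunk. The sketch below only builds the statements and their parameter lists, so it runs without a database. The table name `page_terms`, the column names, the helper `build_bulk_insert()`, and the sample rows are all hypothetical, and the chunk size is an assumption to tune.

```php
<?php
/**
 * Build one multi-row INSERT with positional placeholders.
 * Table and column names are hypothetical and must come from trusted code,
 * never from user input, because they are interpolated into the SQL.
 */
function build_bulk_insert(string $table, array $columns, array $rows): array
{
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $rowPlaceholder))
    );
    // Flatten the rows into one parameter list, in column order.
    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }
    return [$sql, $params];
}

// Hypothetical rows, as the extraction step might produce them.
$rows = [
    ['page_id' => 1, 'source' => 'alt',     'content' => 'Site logo'],
    ['page_id' => 1, 'source' => 'heading', 'content' => 'Welcome to the site'],
    ['page_id' => 1, 'source' => 'bold',    'content' => 'Important'],
];

// A chunk size of 2 is tiny on purpose, to show the splitting.
$statements = [];
foreach (array_chunk($rows, 2) as $chunk) {
    $statements[] = build_bulk_insert('page_terms', ['page_id', 'source', 'content'], $chunk);
}

// With a PDO connection in $db, each statement could then be run as:
//   $db->prepare($sql)->execute($params);
foreach ($statements as [$sql, $params]) {
    echo $sql, ' -- ', count($params), " params\n";
}
```

Wrapping the chunked inserts for one page in a single transaction is another common way to cut the per-query overhead, if the database in use supports transactions.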

Then there is the problem of what to do when a search is performed. How exactly should the percentage of user query words to keywords be checked?

1. Check for all keywords together.

  If all are found, what is the percentage?

  If not all are found, break the string into individual words and check the percentage for those.

or

2. Break the user query into individual words and check the percentage that way?
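The two options can be combined in one function: try the whole query as a phrase first, and fall back to individual words when the phrase is absent. The sketch below is one possible reading of the scheme, not a settled design. The helper names `words()` and `keyword_percent()` are hypothetical, the tokenizer is ASCII-only, and the "percentage" is taken to mean matched words divided by total words in the field.

```php
<?php
// Split text into lowercase ASCII words. A simple stand-in for a real tokenizer.
function words(string $text): array
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

/**
 * Option 1 with a fallback to option 2: look for the whole query as a
 * phrase; if it never occurs, count individual query words instead.
 * Returns matched words as a percentage of all words in $text.
 */
function keyword_percent(string $query, string $text): float
{
    $queryWords = words($query);
    $textWords  = words($text);
    if (!$queryWords || !$textWords) {
        return 0.0;
    }

    // Count non-overlapping occurrences of the full phrase.
    $n = count($queryWords);
    $phraseHits = 0;
    $i = 0;
    while ($i + $n <= count($textWords)) {
        if (array_slice($textWords, $i, $n) === $queryWords) {
            $phraseHits++;
            $i += $n;
        } else {
            $i++;
        }
    }

    if ($phraseHits > 0) {
        $matched = $phraseHits * $n;
    } else {
        // Fallback: count every text word that is one of the query words.
        $querySet = array_flip($queryWords);
        $matched = 0;
        foreach ($textWords as $word) {
            if (isset($querySet[$word])) {
                $matched++;
            }
        }
    }

    return 100.0 * $matched / count($textWords);
}

$phraseScore  = keyword_percent('cheap red shoes', 'Buy cheap red shoes today');
$partialScore = keyword_percent('cheap red shoes', 'Red shoes and blue hats');
echo $phraseScore, "\n", $partialScore, "\n";
```

A raw percentage favours short fields, so a partial match in a three-word alt text can outscore a full phrase match in a long description. Weighting phrase matches higher, or weighting each group (title, alt text, headings) separately, may be worth considering.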
