
Little regexp for SEO purposes


AudiS2


Hi Gang,

 

I am trying to create a PHP function that will parse an HTML string and alter all <img> tags, adding alt and title attributes that match the src filename.

 

In HTML terms, this is what I am trying to do:

 

<img src="http://www.someserver.com/.../ferrari.jpg">

 

should be replaced with

 

<img src="http://www.someserver.com/.../ferrari.jpg" alt="ferrari" title="ferrari" >

 

If you understand html you will know what I mean.

 

So the function should work like this:

 

1. Accept a string containing HTML with zero or more <img> tags

2. Extract the name of the image from the src attribute. src can be found in several forms:

src="http:/.../image.ext"

src = "http:/.../image.ext"

src ='http:/.../image.ext'

src=http:/.../image.ext

3. Add alt="image" and title="image" (without extension) to the <img> tag

4. Repeat for all images

5. Return altered html (yikes!)

 

I hope someone can help.

 


<pre>
<?php
$data = <<<DATA
	<h1>Test</h1>
	ABC
	<img src="http://www.someserver.com/.../ferrari.jpg">
	DEF
	<img src='http://www.someserver.com/.../ferrari.jpg'>
	GHI
	<img src=http://www.someserver.com/.../ferrari.jpg alt="cheese">
	JKL
	<img src = http://www.someserver.com/.../ferrari.jpg title="perl">
DATA;

function process ($matches) {
	$tag = $matches[0];
	### Get source: double-quoted, single-quoted, or unquoted, with optional spaces around "=".
	if (!preg_match('/\ssrc\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i', $tag, $source)) {
		return $tag;
	}
	### Swap with file's base name (no directory, no extension of any length).
	if (!preg_match('%[^/]+(?=\.[a-z0-9]+\z)%i', $source[1], $name)) {
		return $tag;
	}
	### Escape quotes so the name cannot break out of the new attribute.
	$name = htmlspecialchars($name[0], ENT_QUOTES, 'UTF-8', false);
	### Add missing pieces; an existing alt or title is left untouched.
	foreach (array('alt', 'title') as $attribute) {
		if (!preg_match('/\s' . $attribute . '\s*=/i', $tag)) {
			$tag .= ' ' . $attribute . '="' . $name . '"';
		}
	}
	return $tag;
}

### The pattern stops before the closing ">", so new attributes are appended inside the tag.
echo preg_replace_callback('/<img\s[^>]+/i', 'process', $data);
?>
</pre> 
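
If regex turns out to be too fragile for your markup, an alternative is to let PHP's DOM extension do the parsing. This is only a minimal sketch, not a drop-in replacement: it assumes the dom extension is enabled, that the input is an HTML fragment (no <html> or <body> of its own) in plain ASCII or Latin-1, and `add_img_alt_title` is a made-up name for illustration.

```php
<?php
// Hypothetical helper (name invented for this sketch).
// Adds alt/title to every <img> that lacks them, using the src file name.
function add_img_alt_title(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so its contents can be pulled back out afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        // Take only the path part of the URL, so a query string is ignored.
        $path = parse_url($img->getAttribute('src'), PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // missing or malformed src
        }
        // File name without directory and without extension.
        $name = pathinfo($path, PATHINFO_FILENAME);
        if ($name === '') {
            continue;
        }
        // Keep any alt/title that is already there.
        foreach (['alt', 'title'] as $attribute) {
            if (!$img->hasAttribute($attribute)) {
                $img->setAttribute($attribute, $name);
            }
        }
    }

    // Serialize the children of the wrapper <div>, i.e. the original fragment.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling `echo add_img_alt_title($data);` with the test data above should give the same attributes as the regex version. The trade-off is that the DOM serializer rewrites the markup as it goes (for example, unquoted attribute values come back quoted), so the output is not a byte-for-byte copy of the input.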
