Simple Regex problem driving me mad

paulious · October 23, 2009

Hi everyone, I have just started using regex and I can't see where I am going wrong. I want to use it to search for this string within a website:

Inspection number</th>
<td>327355</td>

It should look for "Inspection number", continue along until the </td> tag, and return the number. I tried something simple like this:

$res = preg_match(
    "Inspection number.........[0-9]${6}"

but it complains: preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash in /home/paulious/public_html/email-extractor.php on line 38

I also tried:

"Inspection number</th><td>(*.)</td>",

with and without the escape character \ in front of the /.

It is only the $pattern parameter that is confusing me; the rest I have working fine. Thanks in advance.

MadTechie · October 23, 2009

Try this:

$text = 'Inspection number</th>
<td>327355</td>';
// % is the delimiter; group 1 captures the digits inside the <td>.
if (preg_match('%Inspection number</th>\s*<td>(\d*?)</td>%sm', $text, $regs)) {
    $result = $regs[1];
} else {
    $result = 'N/A';
}
echo $result;

EDIT:

Delimiter must not be alphanumeric or backslash

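means you didn't add the delimiters (in my example I used %).

So this would also be okay, using / as the delimiter. The / characters inside the pattern then have to be escaped, (*.) has to become (.*?), and \s* is needed to allow for the line break between the tags:

'/Inspection number<\/th>\s*<td>(.*?)<\/td>/',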
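To make the point about delimiters concrete, here is a minimal sketch that runs the same pattern with three different delimiters against the sample text from the first post. The variable names are only illustrative. Note that with / as the delimiter, each / inside the pattern has to be escaped.

```php
<?php
// Sample input taken from the first post in the thread.
$text = "Inspection number</th>\n<td>327355</td>";

// The same pattern written with three different delimiters.
// Only the "/" version needs the slashes inside the pattern escaped.
$patterns = [
    '%Inspection number</th>\s*<td>(\d*)</td>%',
    '#Inspection number</th>\s*<td>(\d*)</td>#',
    '/Inspection number<\/th>\s*<td>(\d*)<\/td>/',
];

$results = [];
foreach ($patterns as $pattern) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $results[] = preg_match($pattern, $text, $m) === 1 ? $m[1] : 'N/A';
}

print_r($results);
```

All three entries should hold the same captured number, since the delimiter only marks where the pattern starts and ends.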

paulious · October 23, 2009

Thank you, that worked perfectly. The only change I made was not to define $text as you have; instead I used

$text = file_get_contents($_REQUEST['url']);

which basically reads in all of the source code of the specified URL.

Please could you explain how your $pattern string works? I will need to understand it to scan for other similar matches.
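The file_get_contents() variant described above could be sketched roughly as follows. The function names extractInspectionNumber and fetchInspectionNumber are hypothetical, not from the thread, and the sketch assumes the page contains the markup exactly as shown in the first post. file_get_contents() returns false on failure, so that case is checked before matching.

```php
<?php
// Hypothetical helper: pulls the inspection number out of a page's HTML.
// Returns null when the pattern is not found.
function extractInspectionNumber(string $html): ?string
{
    if (preg_match('%Inspection number</th>\s*<td>(\d+)</td>%', $html, $regs) === 1) {
        return $regs[1];
    }
    return null;
}

// Hypothetical wrapper around the download step.
// file_get_contents() returns false when the URL cannot be read.
function fetchInspectionNumber(string $url): string
{
    $html = file_get_contents($url);
    if ($html === false) {
        return 'N/A';
    }
    return extractInspectionNumber($html) ?? 'N/A';
}

// Offline demonstration using the snippet from the first post.
$sample = "<th>Inspection number</th>\n<td>327355</td>";
$found = extractInspectionNumber($sample);
echo $found, "\n";
```

This sketch uses \d+ (one or more digits) rather than \d*, so an empty cell counts as "not found" instead of returning an empty string.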

MadTechie · October 23, 2009

"Please could you explain how your $pattern string works? as i will need to understand it to scan for other similar matches"

Sure

%Inspection number</th>\s*<td>(\d*)</td>%sm

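% ... % = the delimiters that mark where the pattern starts and ends; the s and m after the closing % are modifiers (s lets . match newlines, m makes ^ and $ match at line breaks), and neither affects this pattern because it contains no ., ^ or $

Inspection number</th> = a literal match (Inspection number</th> matches Inspection number</th>)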

\s* = matches zero or more whitespace characters (i.e. spaces, newlines, tabs, etc.)

<td> = literal match (matches <td>)

(\d*) = captures zero or more digits (\d matches a single digit), up to the </td> that follows; the lazy form (\d*?) used in the code captures the same thing here

</td> = literal match (matches </td>)
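As a small illustration of the walkthrough, the sketch below runs the pattern against a few made-up variations of the input to show what \s* and (\d*) accept. The inputs other than the first are invented for the example.

```php
<?php
$pattern = '%Inspection number</th>\s*<td>(\d*)</td>%';

// Only the first input comes from the thread; the rest are invented.
$inputs = [
    'newline between tags' => "Inspection number</th>\n<td>327355</td>",
    'no whitespace'        => "Inspection number</th><td>327355</td>",
    'tabs and spaces'      => "Inspection number</th> \t <td>42</td>",
    'empty cell'           => "Inspection number</th><td></td>",
    'letters in cell'      => "Inspection number</th><td>abc</td>",
];

$captured = [];
foreach ($inputs as $name => $input) {
    // $regs[1] holds whatever the (\d*) group captured; null means no match.
    $captured[$name] = preg_match($pattern, $input, $regs) === 1 ? $regs[1] : null;
}

var_dump($captured);
```

The empty cell still matches, because * allows zero digits, while the cell containing letters does not match at all.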

I hope that makes sense

Summary

\d = a digit

\s = a whitespace character

* = zero or more of the preceding item
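For scanning other, similar rows, one possible approach is to build the pattern from the label text. This is only a sketch: valueAfterLabel is a hypothetical helper, the second row in the sample HTML is invented, and it assumes every value is a run of digits in the <td> right after the </th>. preg_quote() escapes any regex metacharacters in the label, and its second argument also escapes the delimiter.

```php
<?php
// Hypothetical helper: finds the digits in the <td> that follows a given
// header label. Returns null when the label or the digits are not found.
function valueAfterLabel(string $html, string $label): ?string
{
    // Second argument to preg_quote() escapes the % delimiter as well.
    $pattern = '%' . preg_quote($label, '%') . '</th>\s*<td>(\d+)</td>%';
    return preg_match($pattern, $html, $regs) === 1 ? $regs[1] : null;
}

// First row is from the thread; the second row is invented for illustration.
$html = "<tr><th>Inspection number</th>\n<td>327355</td></tr>"
      . "<tr><th>Unique reference number</th>   <td>100200</td></tr>";

var_dump(valueAfterLabel($html, 'Inspection number'));        // "327355"
var_dump(valueAfterLabel($html, 'Unique reference number'));  // "100200"
var_dump(valueAfterLabel($html, 'Missing label'));            // NULL
```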
