Simple Regex problem driving me mad


paulious

Hi everyone, I have just started using regex and I can't see where I am going wrong. I want to use it to search for this string within a website:

 

Inspection number</th>
<td>327355</td>

 

It should look for "Inspection number", continue along until the </td> tag, and return the number. I tried something simple like this:

 

$res = preg_match(
    "Inspection number.........[0-9]${6}"

 

but it complains  -  preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash in /home/paulious/public_html/email-extractor.php on line 38

 

I also tried:

 

"Inspection number</th><td>(*.)</td>", 

with and without the escape character \ in front of /

 

 

It is only the $pattern parameter that is confusing me; the rest I have working fine. Thanks in advance.

 


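Try this:

$text = 'Inspection number</th>
<td>327355</td>';
// % is the delimiter; \s* skips the line break between </th> and <td>
if (preg_match('%Inspection number</th>\s*<td>(\d*?)</td>%sm', $text, $regs)) {
    // $regs[1] holds whatever the first pair of brackets captured
    $result = $regs[1];
} else {
    $result = 'N/A';
}
echo $result;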

 

EDIT:

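"Delimiter must not be alphanumeric or backslash"

means you didn't add the delimiters around the pattern (in my example I used %). Without them, PHP treats the first character of the pattern, the "I" of "Inspection", as the delimiter, and letters are not allowed.

So this would also be okay, using # as the delimiter instead (a backslash cannot be used as one). Note that the wildcard is written (.*?), not (*.), and that \s* is needed because there is a line break between </th> and <td> in your source:

'#Inspection number</th>\s*<td>(.*?)</td>#',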

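To make the delimiter point concrete, here is a small sketch that runs the same pattern with three different delimiters against the markup from the question. The variable names are just for illustration, and I have used \d+ (one or more digits) rather than \d* so that an empty cell would not count as a match.

```php
<?php
// Sample markup copied from the question.
$text = "Inspection number</th>\n<td>327355</td>";

// The same pattern wrapped in three different delimiter characters.
$patterns = [
    '%Inspection number</th>\s*<td>(\d+)</td>%',
    '#Inspection number</th>\s*<td>(\d+)</td>#',
    // With "/" as the delimiter, every literal "/" in the pattern must be escaped.
    '/Inspection number<\/th>\s*<td>(\d+)<\/td>/',
];

$results = [];
foreach ($patterns as $pattern) {
    // preg_match() returns 1 on a match and fills $m; $m[1] is the first capture group.
    $results[] = preg_match($pattern, $text, $m) === 1 ? $m[1] : 'N/A';
}

print_r($results);
```

All three should capture the same number. Picking a delimiter that does not appear in the pattern (% or # here) saves you from escaping every / in the HTML tags.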

Thank you, that worked perfectly :) The only change I made was not to define $text as you have; instead I used

 


$text = file_get_contents($_REQUEST['url']);

 

which basically reads in all of the source code of the specified URL.

 

Please could you explain how your $pattern string works? I will need to understand it to scan for other similar matches.



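Sure.

%Inspection number</th>\s*<td>(\d*)</td>%sm

% ... % = the delimiters that mark where the pattern starts and ends. The letters after the closing % (s and m) are modifiers; they only change how . and ^ $ behave, and none of those appear here, so they could be dropped.

Inspection number</th> = a literal match (Inspection number</th> matches Inspection number</th>)

\s* = matches zero or more whitespace characters (i.e. spaces, returns, tabs, etc.)

<td> = literal match (matches <td>)

(\d*) = matches zero or more digits (\d) and captures them; the brackets are what put the number into $regs[1]

</td> = literal match (matches </td>), which is where the run of digits has to end

I hope that makes sense.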
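If it helps to see the pieces in action, here is a rough sketch that runs the pattern (without the sm modifiers) against a few variations of the markup. The second and third samples and the non-numeric cell are made up for illustration; only the first comes from the original post.

```php
<?php
$pattern = '%Inspection number</th>\s*<td>(\d*)</td>%';

// Three variants: a newline, nothing, and a tab plus spaces between the tags.
$samples = [
    "Inspection number</th>\n<td>327355</td>",
    "Inspection number</th><td>327355</td>",
    "Inspection number</th>\t  <td>327355</td>",
];

$captured = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample, $regs)) {
        // $regs[0] is the whole matched text, $regs[1] is what (\d*) captured.
        $captured[] = $regs[1];
    }
}

print_r($captured);

// \d only matches digits, so a cell containing letters does not match at all.
$noMatch = preg_match($pattern, "Inspection number</th>\n<td>abc</td>");
var_dump($noMatch); // int(0)
```

Because \s* means "zero or more", the pattern copes with any amount of whitespace between </th> and <td>, including none.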

 

Summary

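\d = a digit (0-9)

\s = a whitespace character (space, tab, return)

* = zero or more of whatever comes just before it

( ) = capture whatever matches inside, so it ends up in $regs[1]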
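Since you want to scan for other similar matches, one possible way to reuse the pattern is to wrap it in a small function that takes the label as a parameter. This is only a sketch: cell_after_label() is a name I made up, the "Inspection date" row is invented sample data, and it assumes PHP 7.1 or later (for the ?string return type) and that the value sits in a plain <td> directly after the </th>.

```php
<?php
/**
 * Hypothetical helper (not from the thread): returns the text of the <td>
 * that follows a <th> ending in the given label, or null if nothing matches.
 */
function cell_after_label(string $html, string $label): ?string
{
    // preg_quote() escapes regex metacharacters in the label and, via its
    // second argument, the % delimiter as well.
    $quoted = preg_quote($label, '%');

    // [^<]* captures everything up to the next tag, so non-numeric cells work too.
    $pattern = '%' . $quoted . '</th>\s*<td>([^<]*)</td>%';

    return preg_match($pattern, $html, $regs) === 1 ? trim($regs[1]) : null;
}

// Made-up sample markup; the second row is invented for illustration.
$html = "<th>Inspection number</th>\n<td>327355</td>\n"
      . "<th>Inspection date</th>\n<td>12 May 2009</td>";

$number  = cell_after_label($html, 'Inspection number');
$date    = cell_after_label($html, 'Inspection date');
$missing = cell_after_label($html, 'Inspector name');

var_dump($number, $date, $missing);
```

Swapping (\d*) for ([^<]*) is a design choice on my part: it lets the same helper pull out dates or names as well as numbers, at the cost of no longer checking that the cell is numeric.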
