Preg Match Question Multiple Values


factoring2117

I need to preg_match multiple values, but my code only grabs the first value and then stops.

 

Here is an example

 

HTML Code:

<Td><a href="list.php?a=add&id=274619213&g=1">me</a></td>
<Td ><a href="list.php?a=add&id=463335839&g=1">me</a></td>
<Td><a href="list.php?a=add&id=106690164&g=1">me</a></td>

 

I need to extract the id number from every line on the page, and there are hundreds of them. This is the code I have so far. I believe I need a for statement, but I don't know how to set it up.

 

if (preg_match('#&id=(.+)&g=1#', $html, $matches)) {
    $id = $matches[1];
}

 

Please help me figure this out.

 

Thank you.


Here is my example:

 

$data = <<<HTML
<Td><a href="list.php?a=add&id=274619213&g=1">me</a></td>
<Td ><a href="list.php?a=add&id=463335839&g=1">me</a></td>
<Td><a href="list.php?a=add&id=106690164&g=1">me</a></td>
HTML;

preg_match_all('#<td[^>]*><a.+?id=(\d+).*?>.*?</td>#is', $data, $matches);
echo '<pre>'.print_r($matches[1], true);

 

output:

Array
(
    [0] => 274619213
    [1] => 463335839
    [2] => 106690164
)

 

I am making some assumptions:

a) I use the i modifier for case insensitivity (there may be <td or <Td or <TD), and the s modifier in case some segments within the pattern are on another line. Most likely you won't need the s, but I added it as a safeguard.

b) Since one of the examples is <Td > (there is a space there), I used <td[^>]*> to match anything up to, and including, the >.

c) I am assuming that all ids are found within the a tag...

 

The solution I provided is a 'quick and dirty' way, which isn't necessarily bulletproof. But for the example you provided, assuming the pages have that sort of formatting, it should do the trick.
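For comparison, the smallest change to the snippet in the question would be to swap preg_match for preg_match_all, which keeps scanning after the first hit. This is only a sketch: it assumes the links are written with a bare & exactly as in the sample (a page that encodes them as &amp; would need the pattern adjusted), and $count and $ids are names made up for the illustration.

```php
<?php
$html = <<<HTML
<Td><a href="list.php?a=add&id=274619213&g=1">me</a></td>
<Td ><a href="list.php?a=add&id=463335839&g=1">me</a></td>
<Td><a href="list.php?a=add&id=106690164&g=1">me</a></td>
HTML;

// preg_match() returns after the first match; preg_match_all() collects every match.
// \d+ replaces .+ so the capture cannot run past the digits of the id.
$count = preg_match_all('#&id=(\d+)&g=1#', $html, $matches);

// $matches[1] holds the first capture group of each match, in page order.
$ids = $matches[1];

foreach ($ids as $id) {
    echo $id, "\n";
}
```

No explicit for statement is needed to find the ids; the loop is only there to use them afterwards.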

 

I think you could also use this pattern:

#<td[^>]*><a[^>]+id=(\d+).*?>.*?</td>#is

 

The [^>]+ will match up to the last character before the first > of the opening a tag, then backtrack to find id=.... I would wager this method is slower; however, it might add an extra layer of assurance, since it checks that id= appears inside the opening a tag and does not match some id somewhere else.

 

EDIT: Actually, I'm not so sure about that last example / explanation, so just try the first one and see what it gives you.
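One way to check how the two patterns differ is to run both against a row where id= shows up outside the link. The two-row sample below is made up for that purpose (it is not from the original page), and the variable names are only for illustration.

```php
<?php
// Hypothetical sample: in row one, "id=5" sits in the cell text, not in the <a> tag.
$data = '<td><a href="x.php">me</a> id=5</td>' . "\n"
      . '<td><a href="y.php?id=7">me</a></td>';

// First pattern: .+? may run past the end of the opening <a> tag.
$lazy = '#<td[^>]*><a.+?id=(\d+).*?>.*?</td>#is';

// Second pattern: [^>]+ cannot cross a >, so id= must be inside the opening <a> tag.
$bounded = '#<td[^>]*><a[^>]+id=(\d+).*?>.*?</td>#is';

preg_match_all($lazy, $data, $lazyMatches);
preg_match_all($bounded, $data, $boundedMatches);

echo 'lazy:    ', implode(', ', $lazyMatches[1]), "\n";    // lazy:    5
echo 'bounded: ', implode(', ', $boundedMatches[1]), "\n"; // bounded: 7
```

On this sample the first pattern picks up the stray id=5 and, because of the s modifier, its single match runs on through the second row, so the real id 7 is never reported. The second pattern skips row one and returns 7. On the three rows from the question both patterns return the same three ids.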


Another alternative, using DOMDocument and DOMXPath:

 

$data = <<<HTML
<Td><a href="list.php?a=add&id=274619213&g=1">me</a></td>
<Td ><a href="list.php?a=add&id=463335839&g=1">me</a></td>
<Td><a href="list.php?a=add&id=106690164&g=1">me</a></td>
HTML;

$dom = new DOMDocument;
@$dom->loadHTML($data); // @ silences the parser warnings this loose markup triggers
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//td/a[@href]'); // every <a> with an href directly inside a <td>
foreach ($aTag as $val) {
    if (preg_match('#id=(\d+)#', $val->getAttribute('href'), $match)) {
        echo $match[1] . "<br />\n";
    }
}

 

This would be a better alternative IMO. Feels more solid with less room for mishaps.

For this to work on a site page, you would change:

 

@$dom->loadHTML($data);

 

to:

 

@$dom->loadHTMLFile('http://www.whateversite.whatever'); // insert the URL in question within the quotes.
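A possible variation on the DOM approach above drops the second regex and lets PHP's own URL helpers (parse_url and parse_str) pull the id parameter out of each href. It is a sketch under a few assumptions: the sample rows are wrapped in a table so the fragment is complete markup, libxml_use_internal_errors() stands in for the @ operator, and $ids, $link and $params are names chosen for the example.

```php
<?php
$data = <<<HTML
<table><tr>
<Td><a href="list.php?a=add&id=274619213&g=1">me</a></td>
<Td ><a href="list.php?a=add&id=463335839&g=1">me</a></td>
<Td><a href="list.php?a=add&id=106690164&g=1">me</a></td>
</tr></table>
HTML;

// Collect parser warnings internally instead of suppressing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$ids = [];

foreach ($xpath->query('//td/a[@href]') as $link) {
    // Take only the query-string part of the href, e.g. "a=add&id=...&g=1".
    $query = parse_url($link->getAttribute('href'), PHP_URL_QUERY);
    if (!is_string($query)) {
        continue; // no query string on this link
    }

    // Split the query string into an associative array of parameters.
    parse_str($query, $params);
    $id = $params['id'] ?? null;

    // Keep the value only if it is a plain run of digits.
    if (is_string($id) && ctype_digit($id)) {
        $ids[] = $id;
    }
}

print_r($ids);
```

Reading the parameter by name means a link such as list.php?userid=9 would not be mistaken for an id, which the looser #id=(\d+)# pattern would accept.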

 
