Get two patterns on Html File


thales.pereira

Hello all,

 

Again, I'm stuck on a problem.

 

The following output is an extract from an HTML file of groups, where name is the group name and membership lists the members that have access to the group. I'm trying to extract that information and insert it into a database (I have no problem with that part), but I can't figure out how to extract each group name together with all the members of that group.

 

 

 

In this example I pasted only two groups, but the original file has 20+.

 

 

 

<TD valign="top"><B>name</B></TD> 
<TD>Administrators</TD> 
</TR> 
<TR> 
<TD valign="top"><B>membership</B></TD> 
<TD> 
<TABLE> 
<TR> 
<TD>Administrator</TD> 
</TR> 
<TR> 
<TD>MWM</TD> 
</TR> 
<TR> 
<TD>MW_USER</TD> 
</TR>
<TD valign="top"><B>name</B></TD> 
<TD>Developers</TD> 
</TR> 
<TR> 
<TD valign="top"><B>membership</B></TD> 
<TD> 
<TABLE> 
<TR> 
<TD>user_a</TD> 
</TR> 
<TR> 
<TD>user_b</TD> 
</TR> 
<TR> 
<TD>user_c</TD> 
</TR>
and more...

 

 

So far, I'm only able to get the group names:

 

   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, "http://$iUser:$iPass@$iHost/invoke/wm.server.access/groupList");
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   $output = curl_exec($ch);
   curl_close($ch);

   preg_match_all('#name</B></TD>\s<td>([^<]+)#i', $output, $match);
   print_r($match[1]);



 

Current Output:

Array
(
    [0] => Administrators
    [1] => Developers
)  

 

 

But what I really need is something like:

 

 

Array
(
    [0] => Administrators
    Array
    (
          [0] => Administrator
          [1] => MWM
          [2] => MW_USER
    )

    [1] => Developers
    Array
    (
          [0] => user_a
          [1] => user_b
          [2] => user_c
    )
)  

 

 

 

 

If anyone has an idea of how I can do this, that would be awesome :)

Thanks in advance!

 

 

 


I think it was I who gave you the last reply to your question about this and provided you with a working regex. Anyhow, your current regex doesn't match anything at all with the content you gave me.

 

What you're dealing with here is a repeating series, and a single regex can't hand you every repetition of a capture group, so you'll have to solve it some other way. If you have a string like "mamamamama" and want to capture every ma with '#(ma)*#', the group keeps only the last ma it matched, not the ones before it. That's a big shortcoming of PCRE; maybe a future version will expose the intermediate captures.
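
One way around this, sketched below, is to do it in two passes: first cut the page into one chunk per group, then run preg_match_all inside each chunk to collect the members. This is only a sketch. The function name extractGroups is made up for illustration, and it assumes the real page looks like the extract you pasted, in particular that the membership table is the last thing in each group, so every non-empty cell after the membership label is a member.

```php
<?php
// Hypothetical helper (not from the original post): returns
// [groupName => [member, member, ...], ...].
function extractGroups(string $html): array
{
    $groups = [];

    // Pass 1: split on the "name" label cell, giving one chunk per group.
    $chunks = preg_split('#<B>name</B>\s*</TD>#i', $html);
    array_shift($chunks); // discard whatever precedes the first group

    // Matches a <TD> whose content is plain, non-empty text.
    $cell = '#<TD>\s*([^<\s][^<]*?)\s*</TD>#i';

    foreach ($chunks as $chunk) {
        // The first plain-text cell in the chunk is the group name.
        if (!preg_match($cell, $chunk, $nameMatch)) {
            continue;
        }

        // Pass 2: every plain-text cell after the "membership" label is a member
        // (assumption: nothing else follows the membership table in a group).
        $members = [];
        $parts = preg_split('#<B>membership</B>\s*</TD>#i', $chunk, 2);
        if (isset($parts[1])) {
            preg_match_all($cell, $parts[1], $memberMatches);
            $members = $memberMatches[1];
        }

        $groups[$nameMatch[1]] = $members;
    }

    return $groups;
}

// Demonstration of the capture-group limitation:
preg_match('#(ma)*#', 'mamamamama', $m);
// $m[0] is the whole string, $m[1] is just one "ma" (the last repetition).
```

You would call it as extractGroups($output) with the string returned by curl_exec. The result is keyed by group name, for example $groups['Administrators'] would hold Administrator, MWM and MW_USER for the extract above, which loops cleanly into database inserts. If the markup turns out to be less regular than the sample, parsing it with DOMDocument is probably the sturdier option.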
