Parsing HTML code to get all input fields


I'm trying to retrieve all the input fields from the HTML (retrieved via cURL).  I need to get the fields' names and values.

The following matches only TWO fields; it doesn't give me the names but DOES give me the values.  The quote below the code shows what print_r() sends to the browser.

[code]
$page = '<input name="action" type="hidden" value="transmit">
                        SELECTION <FONT COLOR=Green> »<INPUT NAME=Fld1 TYPE=Text SIZE=2 MAXLENGTH=2  VALUE=""></FONT>«
                                                        »<INPUT NAME=Fld2 TYPE=Text SIZE=1 MAXLENGTH=1  VALUE="">«
              <br>
              <input type="submit" style="width:75px" name="transmit" value="Transmit"><br>
              <input type="submit" style="width:75px" name="quit" value="Quit"><br>
              <input type="submit" style="width:75px" name="exit" value="Exit">
';
// Group 1 is meant to capture the field name, group 2 the field value.
preg_match_all('/<INPUT.*NAME=(.*).*VALUE=("")>/imU', $page, $matches);
echo "<pre>";
print_r($matches);
echo "</pre>";
[/code]

[quote]
<pre>Array
(
    [0] => Array
        (
            [0] => <INPUT NAME=Fld1 TYPE=Text SIZE=2 MAXLENGTH=2  VALUE="">
            [1] => <INPUT NAME=Fld2 TYPE=Text SIZE=1 MAXLENGTH=1  VALUE="">
        )

    [1] => Array
        (
            [0] =>
            [1] =>
        )

    [2] => Array
        (
            [0] => ""
            [1] => ""
        )

)
</pre>
[/quote]
I added a question mark after each star (*) and it's working BETTER, but it's still not right.

Now this is what I get:
[quote]
<pre>Array
(
    [0] => Array
        (
            [0] => <INPUT NAME=Fld1 TYPE=Text SIZE=2 MAXLENGTH=2  VALUE="">
            [1] => <INPUT NAME=Fld2 TYPE=Text SIZE=1 MAXLENGTH=1  VALUE="">
        )

    [1] => Array
        (
            [0] => Fld1 TYPE=Text SIZE=2 MAXLENGTH=2 
            [1] => Fld2 TYPE=Text SIZE=1 MAXLENGTH=1 
        )

    [2] => Array
        (
            [0] => ""
            [1] => ""
        )

)
</pre>
[/quote]
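A likely explanation for why the question mark changed the result: the pattern uses the /U (PCRE_UNGREEDY) modifier, which inverts greediness. Under /U a bare `.*` is lazy and `.*?` becomes greedy. Here's a minimal sketch of that effect, assuming a made-up one-line `$tag` shaped like one of the fields above; the third pattern (no /U, a negated character class) is just one possible way to stop the name at the first space, not the only one.

```php
<?php
// Hypothetical one-line sample, shaped like one of the fields in $page.
$tag = '<INPUT NAME=Fld1 TYPE=Text SIZE=2 MAXLENGTH=2  VALUE="">';

// Under /U a bare .* is lazy, so the name group settles for an empty string.
preg_match('/NAME=(.*).*VALUE=/iU', $tag, $lazy);

// Under /U a trailing ? flips the quantifier back to greedy, so the group
// swallows everything between NAME= and VALUE=.
preg_match('/NAME=(.*?).*?VALUE=/iU', $tag, $greedy);

// Without /U, a negated class ends the name at the first whitespace or '>'.
preg_match('/NAME=([^\s>]+)/i', $tag, $name);

var_dump($lazy[1], $greedy[1], $name[1]);
```

The first two captures line up with the two print_r dumps above (empty name, then name plus the trailing attributes); the third should hold just the field name.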

Somehow I need to deal with the fact that the name isn't in the same place in all the fields.  Sometimes there's text (type, size, etc.) between the name and the value; other times they're right next to each other.
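One way around the attribute-order problem is to split the job into two passes: first grab each whole `<input ...>` tag, then pull `name` and `value` out of that tag separately, so their relative position stops mattering. This is only a sketch under assumptions: `$html` is a shortened stand-in for the `$page` string, and `inputAttr()` is a hypothetical helper, not a built-in. It will not cope with a `>` inside an attribute value or with single-quoted attributes.

```php
<?php
// Shortened, hypothetical stand-in for the $page string posted above.
$html = '<input name="action" type="hidden" value="transmit">
<INPUT NAME=Fld1 TYPE=Text SIZE=2 MAXLENGTH=2  VALUE="">
<input type="submit" style="width:75px" name="quit" value="Quit">';

// Hypothetical helper: return one attribute of a single tag, quoted or not.
function inputAttr(string $tag, string $attr): ?string
{
    $re = '/\s' . preg_quote($attr, '/') . '\s*=\s*(?:"([^"]*)"|([^\s">]+))/i';
    if (!preg_match($re, $tag, $m)) {
        return null;                       // attribute not present in this tag
    }
    // Group 2 is only set for an unquoted value; otherwise use group 1.
    return $m[2] ?? $m[1];
}

$fields = [];
preg_match_all('/<input\b[^>]*>/i', $html, $tags);   // pass 1: whole tags
foreach ($tags[0] as $tag) {                          // pass 2: attributes
    $name = inputAttr($tag, 'name');
    if ($name !== null) {
        $fields[$name] = inputAttr($tag, 'value') ?? '';
    }
}
print_r($fields);
```

The result is a plain name => value array, which is usually the shape you want when re-posting the form with cURL.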
UPDATE:  OK, I've kept working on this and got it to pull the field names AND the values.  But it seems to be ignoring the fields that were defined in lower case, even though I've used the "PCRE_CASELESS" modifier (/i).
I figured it out.  It wasn't the caseless issue after all: the "missing" fields had values, but my pattern assumed the value would be blank.

I figured I'd share what ended up working.  But I DO have one question:  what does [b]PREG_PATTERN_ORDER[/b] do?  I tried it both with the flag and without it, and in both cases the results looked the same to me.
[code]
preg_match_all('/<INPUT.*?NAME=(.*)[\s]+.*VALUE="(.*)">/imU',$page,$matches, PREG_PATTERN_ORDER);
[/code]
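On the PREG_PATTERN_ORDER question: it is the default ordering for preg_match_all(), which would explain why adding or removing it made no visible difference. The difference only shows up against PREG_SET_ORDER, which groups results per match instead of per capture group. A small sketch, assuming a made-up two-field `$html` and a deliberately simplified pattern (not the one above):

```php
<?php
// Hypothetical two-field sample; the pattern is simplified for illustration.
$html = '<input name="quit" value="Quit"><input name="exit" value="Exit">';
$re   = '/<input name="([^"]*)" value="([^"]*)">/i';

// Default grouping: one array per capture group.
// $byPattern[1] holds every name, $byPattern[2] every value.
preg_match_all($re, $html, $byPattern, PREG_PATTERN_ORDER);

// Alternative grouping: one array per match.
// $bySet[0] holds the full match, name and value of the first field.
preg_match_all($re, $html, $bySet, PREG_SET_ORDER);

print_r($byPattern);
print_r($bySet);
```

PREG_SET_ORDER tends to be handier when you want to loop over fields one at a time, since each element already pairs a name with its value.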