
Hopefully a simple Regexp to process some HTML files


kabbink


Hey all,

 

Sorry, I have never used regular expressions in PHP before, so this is a beginner's question!

 

What I have is some HTML source code files with lines similar to the following:

 

<td width="13%" class="AdminCenterTableContentsOdd"><input type="text" class="AdminTextBox" name="txtLastName244" maxlength="50" size="6" value="lastname" ID="Text35"></td>

 

<td width="13%" class="AdminCenterTableContentsOdd"><input type="text" class="AdminTextBox" name="txtFirstName244" maxlength="50" size="6" value="firstname" ID="Text36"></td>

 

<td width="13%" class="AdminCenterTableContentsOdd"><input type="text" class="AdminTextBox" name="txtEmail244" maxlength="255" size="6" value="some@email.com" ID="Text37"></td>

 

What I need to do is grab the values from these lines and output them to a CSV file or just get them in an array or something.

 

Note that the number after the field name (244 in the lines above) is different for each customer in the HTML files, so that part will have to allow any number.

 

Hope this is an easy one...

 

Thanks for the help and sorry for the noob questions!

Kevin

 


dsaba,

 

One problem is I have about 100 of these entries per page.  They all have other stuff in between them (table formatting etc).

 

How could I keep the records together?

 

Also, if it isn't too much trouble, can you explain the regexp you posted above so that next time I can answer some questions instead of asking them?

 

Thanks a lot for the help so far!

Kevin


Thanks a lot for the help so far!

No problem. :)

 

I can answer some questions instead of asking them

That's a great way to show your appreciation for the help you've received. I try to do the same.

 

When I see a regex I don't understand, I look up each symbol, much like looking up the words of an unclear sentence in a dictionary.

-------------------------------------------------------------------------------

~ is the delimiter that marks the start and end of the pattern (see the manual; delimiters are required for PCRE/preg regexes)

name="txt is a string literal

[^\d] matches any character that is not a digit

+ matches 1 or more of these

(\d*) matches 0 or more digits and captures them in the 1st subgroup, $out[1]

" is another string literal, matching exactly "

 

 

Here's a good reference to understand the symbols:

http://www.regular-expressions.info/reference.html

I also have a good number of samples on my regextester; you can play with them in real time. Experimenting and learning through example is a great way to learn.
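To make the symbol breakdown concrete, here is a small sketch you can run. The pattern is pieced together from the symbols explained above, so treat it as a reconstruction and not a quote of the earlier post. The $html string is a shortened copy of one of the sample lines, and $out is the array that preg_match_all fills in. Note that this pattern captures the number after the field name, not the value attribute.

```php
<?php
// Pattern rebuilt from the symbol-by-symbol breakdown (an assumption, not a quote).
// ~ ... ~      delimiters
// name="txt    literal text
// [^\d]+       one or more non-digit characters (the field name)
// (\d*)        zero or more digits, captured as group 1
// "            the closing quote of the name attribute
$pattern = '~name="txt[^\d]+(\d*)"~';

// One of the sample lines from the question.
$html = '<input type="text" class="AdminTextBox" name="txtLastName244" maxlength="50" size="6" value="lastname" ID="Text35">';

preg_match_all($pattern, $html, $out);

// $out[0] holds the full matches, $out[1] the contents of the first capture group.
print_r($out[1]);
```

For the sample line, $out[1] contains the single string 244, which is the customer number.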

 

 

One problem is I have about 100 of these entries per page.  They all have other stuff in between them (table formatting etc).

 

How could I keep the records together?

The regex I specified works fine on the data you supplied; I cannot guess at regexes for data I have not seen. Regexes are specific to the data. What do you mean by keeping the records together?
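If "keeping records together" means putting each customer's last name, first name and email on one row, one possible approach is to capture the field name, the number and the value in a single pattern and group the matches by number. The sketch below is only a guess at what is needed: the pattern, the $records array and the spacer row in the sample are all made up for illustration, and it assumes that name always comes before value inside each input tag, as in the sample lines.

```php
<?php
// Hypothetical pattern (not from the thread):
// group 1 = field name, group 2 = customer number, group 3 = value attribute.
// Assumes name="..." appears before value="..." within the same tag.
$pattern = '~name="txt([A-Za-z]+)(\d+)"[^>]*?value="([^"]*)"~';

// Sample input modelled on the lines in the question, with unrelated markup in between.
$html = <<<'HTML'
<td><input type="text" name="txtLastName244" maxlength="50" size="6" value="lastname" ID="Text35"></td>
<tr class="spacer"></tr>
<td><input type="text" name="txtFirstName244" maxlength="50" size="6" value="firstname" ID="Text36"></td>
<td><input type="text" name="txtEmail244" maxlength="255" size="6" value="some@email.com" ID="Text37"></td>
HTML;

// PREG_SET_ORDER returns one array per match, which is easier to loop over.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

// Group the values by customer number so each customer becomes one record.
$records = [];
foreach ($matches as $m) {
    [, $field, $id, $value] = $m;
    $records[$id][$field] = $value;
}

// Write one CSV row per customer: number, last name, first name, email.
$csv = fopen('php://output', 'w');
foreach ($records as $id => $row) {
    $line = [$id, $row['LastName'] ?? '', $row['FirstName'] ?? '', $row['Email'] ?? ''];
    fputcsv($csv, $line, ',', '"', '\\');
}
fclose($csv);
```

Swapping 'php://output' for a file path would write the rows to a CSV file instead of printing them. Because the grouping key is the number after the field name, the table markup between the inputs does not matter as long as each input tag itself looks like the samples.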

 

 

 
