why isn't this working?

tadakan · June 22, 2011

I'm trying to write a regex for a URL that always starts with the same subdomain.domain and so on, and then varies in the arguments that are sent to it.

What I originally thought was working:

%http://appraisalzone.lendervend.com/AssignOrder.aspx.[a-zA-Z0-9/?=&-]{1,}/.$%

I'm parsing this out of an email, and it turns out that the email has the link in a somewhat different form than it appears in a client. It looks like the link is partially in hex?

Anyway, the print_r of the email read from stdin gives this link:

<a =
href=3D"http://appraisalzone.lendervend.com/AssignOrder.aspx?orderid=3D35=
7262&userid=3D116315&customerorderid=3DEFC33B67-A60D-434B-99E1-7D=
4FA5C9DA1C

There are also either newlines or carriage returns in there, and I'm not sure how to allow for them in the regex.

Adam · June 23, 2011

The email's in HTML entities. Just run it through html_entity_decode first. As for matching new-lines, you can add the 's' modifier after the last percent:

[...] {1,}/.$%s
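As a small illustration of what the 's' modifier changes, here is a sketch using a made-up two-line string (the string and variable names are assumptions for the example, not taken from the email):

```php
<?php
// Hypothetical two-line string, used only to show the effect of the 's' modifier.
$text = "first\nsecond";

// Without 's', the dot does not match "\n", so the pattern cannot span both lines.
$without = preg_match('%first.second%', $text);

// With 's' (PCRE_DOTALL), the dot matches "\n" as well.
$with = preg_match('%first.second%s', $text);

var_dump($without, $with);
```

Only the dot is affected; a character class such as [a-zA-Z0-9/?=&-] still will not match a line break unless one is added to the class.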

Edit: Actually, if the email string is split over multiple lines, you might want to remove the line breaks first, using something similar to:

$string = str_replace(array("\n", "\r"), '', $string);

.. as the 's' modifier only lets the dot character match across line breaks; the rest of the pattern is unaffected.
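A note on the encoding: the `=3D` sequences and the lone `=` at the end of each line in the sample look like quoted-printable transfer encoding (where `=3D` is an encoded `=` and a trailing `=` marks a soft line break) rather than HTML entities. If that is what is going on, PHP's quoted_printable_decode() should undo both in one step. Below is a minimal sketch under that assumption; the helper name extract_order_link and the tightened pattern are illustrative and not from the thread.

```php
<?php
// Sketch only: assumes the body is quoted-printable encoded, as the "=3D"
// and trailing "=" in the sample suggest. extract_order_link() is a
// hypothetical helper name.
function extract_order_link(string $raw): ?string
{
    // Turns "=3D" back into "=" and drops the "=" + line-break soft breaks.
    $decoded = quoted_printable_decode($raw);

    // Also decode entities such as "&amp;" in case the HTML part contains them.
    $decoded = html_entity_decode($decoded);

    // The dots and the "?" are escaped so that they only match literally.
    $pattern = '%http://appraisalzone\.lendervend\.com/AssignOrder\.aspx\?[A-Za-z0-9=&-]+%';

    return preg_match($pattern, $decoded, $m) === 1 ? $m[0] : null;
}

// The fragment from the question, with CRLF line endings assumed.
$raw = "<a =\r\n"
     . "href=3D\"http://appraisalzone.lendervend.com/AssignOrder.aspx?orderid=3D35=\r\n"
     . "7262&userid=3D116315&customerorderid=3DEFC33B67-A60D-434B-99E1-7D=\r\n"
     . "4FA5C9DA1C";

echo extract_order_link($raw), "\n";
```

Because the decode step already removes the soft line breaks, the pattern does not need the 's' modifier or a separate str_replace() pass in this case.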

tadakan · June 23, 2011

It was the html entities.

Thank you so much. I thought I needed to figure out how to use the MIME parsing abilities of PHP, and that was looking daunting.
