[SOLVED] Maybe regex?

seventheyejosh · September 11, 2009

Hello all!

I've never attempted regex stuff, so I hope this is the right place.

Basically I'm checking the http_referer, but I just want to know the base site.

so if they come from

http://www.site.com/reg/

or

http://site.com/view/

or whatever, I would like that to be matched up with

'site'

unless that is too difficult, in which case i can force it to be

'http://www.site.com'

and just chop off the ending piece, but I'd really like to get whatever is after 'www.' and before '.org'/'.com'. If there is no 'www.', then just whatever is after 'http://' and before '.org'/'.com'.

Is this doable? Thanks much

Adam · September 11, 2009

Odd one. Thought there'd be a little more built-in functionality for this but can't seem to find anything. Came up with this in the end, seems a little bloated for what it is though:

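if (isset($_SERVER['HTTP_REFERER']))
{
    // The key is HTTP_REFERER (a single R in the middle), matching the header's spelling.
    // parse_url() pulls out just the host, e.g. "www.site.com"; the cast guards against a missing host.
    if (preg_match('/^(www\.)?([^\.]+)/', (string) parse_url($_SERVER['HTTP_REFERER'], PHP_URL_HOST), $matches))
    {
        // $matches[2] is the first host label after the optional "www."
        print $matches[2];
    }
}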
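For reuse, the snippet above could be wrapped in a small helper. This is only a sketch: the function name baseSiteName is made up for illustration, and it assumes the referer is a full URL with a scheme, so that parse_url() can find a host.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the thread).
// Returns the first host label after an optional "www.", or null when
// parse_url() cannot extract a host from the given URL.
function baseSiteName($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // parse_url() gives null or false when there is no host
    }
    if (preg_match('/^(www\.)?([^.]+)/', $host, $matches)) {
        return $matches[2];
    }
    return null;
}

var_dump(baseSiteName('http://www.site.com/reg/')); // string(4) "site"
var_dump(baseSiteName('http://site.com/view/'));    // string(4) "site"
var_dump(baseSiteName('/reg/index.php'));           // NULL (no host in a relative path)
```

Keep in mind that the pattern only skips a leading www., so a host such as blog.site.com would give blog rather than site.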

seventheyejosh · September 11, 2009

looks like that might be what i needed

referrer was misspelled tho (the key is HTTP_REFERER, with a single r in the middle), so it took me a few mins to get it working

could you elaborate on the

'/^(www\.)?([^\.]+)/'

part?

i get that the 2nd param is the parsed url, and the 3rd is the output, correct?

Thanks a lot!

nrg_alpha · September 12, 2009

Quoting seventheyejosh:

"could you elaborate on the '/^(www\.)?([^\.]+)/' part? i get that the 2nd param is the parsed url, and the 3rd is the output, correct?"

The opening and closing / characters are delimiters (every preg pattern needs delimiters, although they don't need to be /; the PHP manual's PCRE section covers the alternatives). So, excluding the delimiters:

^ - From the beginning of the source string.

(www\.)? - If www followed by a dot is present, capture it (captures use parentheses, (), to store what is matched within them into variables), but this capture is optional because of the ? character.

([^.]+) - This capture contains [^.]+, which is a character class, [..], that looks to match (or not match) a single character listed within the class at the current location in the source string. In this case, since the first character within the character class is ^, the class is negated, so this says: match anything that is NOT a dot, one or more times (one or more times is represented by the + sign).
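To see those pieces in action, here is a small sketch that prints what preg_match() stores in $matches. The sample hosts are made up for illustration. The second pattern uses # as the delimiter just to show that / is not required, and it escapes the dot inside the class to show that [^.] and [^\.] behave the same, since a dot inside a character class is already literal.

```php
<?php
// Sample hosts are illustrative assumptions, not taken from real traffic.
$hosts = array('www.site.com', 'site.org');

foreach ($hosts as $host) {
    // preg_match() returns 1 on a match and fills $matches by reference:
    // [0] = whole match, [1] = optional "www." group, [2] = text up to the next dot.
    $found = preg_match('/^(www\.)?([^.]+)/', $host, $matches);
    echo $host, ' => found=', $found, ', name=', $matches[2], "\n";
}

// Same pattern with # as the delimiter and the dot escaped inside the class.
preg_match('#^(www\.)?([^\.]+)#', 'www.site.com', $alt);
var_dump($alt === array('www.site', 'www.', 'site')); // bool(true)
```

So the second argument is the string being searched (here the host from parse_url()), and the third is an array that preg_match() fills in with the captures.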

You can read up about regex in these links:

http://www.phpfreaks.com/tutorial/regular-expressions-part1---basic-syntax

http://www.regular-expressions.info/

http://weblogtoolscollection.com/regex/regex.php

These should be more than enough to help kickstart things.

Adam · September 14, 2009

Thank you nrg_alpha

seventheyejosh · September 14, 2009

Thank you both, actually
