
[SOLVED] Maybe regex?


seventheyejosh


Hello all!

 

I've never attempted regex stuff, so I hope this is the right place.

 

Basically I'm checking the http_referer, but I just want to know the base site.

 

so if they come from

 

http://www.site.com/reg/

 

or

 

http://site.com/view/

 

or whatever, I would like that to be matched up with

 

'site'

 

unless that is too difficult, in which case I can force it to be

 

'http://www.site.com'

 

and just chop off the ending piece. But I'd really like to get whatever comes after 'www.' and before '.org'/'.com' or, if there is no 'www.', whatever comes after 'http://' and before '.org'/'.com'.

 

Is this doable? Thanks much :)


Odd one. Thought there'd be a little more built-in functionality for this but can't seem to find anything. Came up with this in the end, seems a little bloated for what it is though:

 

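// Note: the server variable is spelled HTTP_REFERER (one "R"), matching the HTTP header.
if (isset($_SERVER['HTTP_REFERER']))
{
    // parse_url() returns null when the URL has no host, so check before matching.
    $host = parse_url($_SERVER['HTTP_REFERER'], PHP_URL_HOST);
    if (is_string($host) && preg_match('/^(www\.)?([^.]+)/', $host, $matches))
    {
        // $matches[2] is the first host label after the optional "www."
        print $matches[2];
    }
}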
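
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the function name referer_site_name is made up for illustration, and it assumes the "site" you want is always the first host label after an optional "www." (so a referer from blog.site.com would give 'blog', not 'site').

```php
<?php
// Hypothetical helper (not from the original post): returns the first host
// label after an optional "www." prefix, or null if no host can be parsed.
function referer_site_name(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // malformed URL, or no host component (e.g. no scheme)
    }
    // Non-capturing group for the optional "www.", so the name is group 1.
    if (preg_match('/^(?:www\.)?([^.]+)/i', $host, $matches)) {
        return $matches[1];
    }
    return null;
}

// The two example URLs from the question:
$fromWww  = referer_site_name('http://www.site.com/reg/');
$fromBare = referer_site_name('http://site.com/view/');
```

With the two URLs from the question, both calls should give 'site'. In real use you would pass $_SERVER['HTTP_REFERER'] after checking it with isset(), since browsers are free to omit the header.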


could you elaborate on the

 

'/^(www\.)?([^.]+)/'

 

part?

 

i get that the 2nd param is the parsed url, and the 3rd is the output, correct?

 

Thanks a lot!

 

The opening and closing / characters are delimiters (every preg pattern needs delimiters, although they don't need to be /; you can read up about that here). So, excluding the delimiters:

^ - From the beginning of the source string.
(www\.)? - If www followed by a dot is present, capture it (captures use parentheses, (), to store what is matched within them into variables), but this capture is optional, due to the ? character.
([^.]+) - This capture involves [^.]+, which is a character class, [..], that looks to match (or not match) a single character listed within the character class at the current location within the source string. In this case, since the first character within the character class is ^, the character class is negated, so this states: match anything that is NOT a dot, one or more times (one or more times is represented by the + sign).
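
To see how those pieces end up in the third parameter, here is a small sketch that runs the same pattern against two host names. The variable names ($withWww, $withoutWww) are just for illustration. The second argument to preg_match() is the string being searched, and the third is an array that gets filled with the whole match at index 0 and each capture after it.

```php
<?php
$pattern = '/^(www\.)?([^.]+)/';

// Host with a leading "www."
preg_match($pattern, 'www.site.com', $withWww);
// $withWww[0] is the whole match:  'www.site'
// $withWww[1] is the first capture: 'www.'
// $withWww[2] is the second capture: 'site'

// Host without "www.": the optional first capture matches nothing,
// so index 1 is an empty string and index 2 still holds the name.
preg_match($pattern, 'site.com', $withoutWww);

print_r($withWww);
print_r($withoutWww);
```

That is why the code earlier in the thread prints $matches[2] rather than $matches[1]: index 2 holds the name whether or not the "www." was there.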

You can read up about regex in these links:

http://www.phpfreaks.com/tutorial/regular-expressions-part1---basic-syntax
http://www.regular-expressions.info/
http://weblogtoolscollection.com/regex/regex.php

These should be more than enough to help kickstart things.

 

