Hi everyone.

 

I'm absolutely useless at regex, so I was wondering if anybody would be able to help me.

I'm using the code below to check whether a domain name is valid, but I have just noticed a problem that I have no clue how to solve.

With the code below it is possible to enter a domain with more than one TLD from the array, such as "testing.com.net".

Does anybody know how I would adjust the regex so that only one array value is accepted as the TLD?

 

$tld_list = array('.com','.net','.org');

$label = '[\\w][\\w\\.\\-]{0,61}[\\w]'; 
$tld = '[\\w]+'; 

foreach($lines as $line)
{
    // check that each line/domain is in the valid format, else record an error for each domain
    if(preg_match( "/^($label)\\.($tld)$/", $line, $match ) && in_array($match[2], $tld_list ))
    {
        // valid domain, nothing to do
    }
    else
    {
        $error[] = $line . " is not a valid domain!<br />";
    }
}

 

Thanks


That code doesn't look like it would even work as is, because the leading period of the tld would not be part of the match array.  Seems like that array should be array('com', 'net'..... etc);

 

With that said, you could use the same idea and add a conditional check of substr($match[1], -3). 

 

if(preg_match( "/^($label)\\.($tld)$/", $line, $match )  && (!in_array(substr($match[1], -3), $tld_list)) && in_array($match[2], $tld_list ))

Hi guys,

 

Yeah, sorry, the array in the actual code is $tld_list = array('com','net','org', etc.). I forgot to copy it, so I quickly typed it out and added the dots by mistake.

 

I've just tried the code provided, but unfortunately it does not work.

 

I don't intend to be offensive, but unless I'm missing some requirement, this is a terrible way to validate a url.

Do you only want to validate certain TLDs? What about .cc, .us, .ws, .jp, .info, etc.... ?

You'd be much better off validating the pattern only and not the actual text. (what regular expressions are meant to do)

Such as, instead of a list of tlds, use the pattern: "\.[a-zA-Z]{2,4}".

What exactly do you want to validate, just a domain name? no protocol, uri, etc ?

hi markjoe,

 

The actual code I'm using has a list of over 400 TLDs; I only typed a few here as an example.

It's just domains I'm trying to validate, for a domain script I'm working on. I have to validate the actual text rather than just the pattern, because otherwise a user could add something like "example.thisisnotadomain".

I got the original script from a post on a forum some time ago, but I can't remember where it is.

".thisisnotadomain" would actually not pass "\.[a-zA-Z]{2,4}". The "{2,4}" says: 2 minimum, 4 maximum.

But if you need to validate only certain tlds, then I see how you're stuck to the array.

 

Instead of using "+" (which says 1 or more), use {min, max}. You could even iterate through the array and find min and max values.
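As a rough sketch of the "find min and max" idea above: the snippet below derives the bounds from the array and builds the quantifier from them. The short `$tld_list` and the helper name `tld_shape_ok` are made up for illustration and are not from the original script.

```php
<?php
// Hypothetical stand-in for the real list, which is said to hold over 400 entries.
$tld_list = array('com', 'net', 'org', 'mobi');

// Length in characters of the shortest and longest entry.
$lengths = array_map('strlen', $tld_list);
$min = min($lengths);
$max = max($lengths);

// {min,max} bounds the number of characters allowed in the TLD part.
$tld = '[a-zA-Z]{' . $min . ',' . $max . '}';

// Hypothetical helper: true when the text after the last dot has a plausible length.
function tld_shape_ok($tld_pattern, $domain) {
    return preg_match('/\.(' . $tld_pattern . ')$/', $domain) === 1;
}
```

This only checks the shape of the ending. Something like "example.aaa" still passes, so an in_array() lookup against $tld_list is still needed if only listed TLDs should be accepted.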

I've just been playing with the {min, max} you suggested. Would I be right in saying that {min, max} defines the minimum and maximum number of characters the array value can have?

Using {1,3} it worked for .com but not .mobi.

Is there a way to make {min, max} count array values rather than characters?

If I use it as a number of characters, the user could enter something like .aaa, which isn't a real TLD. So I think I definitely need to use the set array values.

 

Thanks :)

 

Hi guys,

 

Yeah, sorry, the array in the actual code is $tld_list = array('com','net','org', etc.). I forgot to copy it, so I quickly typed it out and added the dots by mistake.

I've just tried the code provided, but unfortunately it does not work.

 

 

Did you do any analysis of why that is? I type things in off the top of my head, so there may be some small issue that needs debugging, but I can tell you that it is meant to do what you originally asked for: take the host part of the name and check whether its last 3 characters are com, net, etc.

Hi gizmola,

 

I didn't fully understand the code to begin with; I got it from another forum a while back.

It's not that I want to check that the last 3 characters are .com etc., as some TLDs like .mobi or .co.uk are longer. I want to make sure that only one array value can be used.

I've just got back from work, so I will have a play and see what I can do with it, but when it comes to regex I haven't got a clue lol.

 

Thanks

All the code I provided was meant to do was check that a value from the same array that is used to validate the TLD (com, net, org, etc.) does not also appear at the end of the host part, so no .com.com or .net.com. That's what you originally asked for. That extra check has nothing to do with regex at all. There is a piece of that code that is regex, but the check itself is done by taking a string and using in_array to see whether or not it is in the $tld_list array.

 

This is not hard to understand even if you don't know regex.  Just go through the code looking at the manual for any of the functions being used.  You can take these functions or pieces of them, assign them to strings and echo or var_dump their values in order to debug things so you understand better how they work.
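A minimal sketch of that debugging approach, using the $label and $tld patterns from the first post and the "testing.com.net" input the thread started with:

```php
<?php
// Patterns copied from the original post.
$label = '[\\w][\\w\\.\\-]{0,61}[\\w]';
$tld   = '[\\w]+';

$line = 'testing.com.net';

if (preg_match("/^($label)\\.($tld)$/", $line, $match)) {
    var_dump($match[1]);             // the host part captured by $label
    var_dump($match[2]);             // the part treated as the TLD
    var_dump(substr($match[1], -3)); // the string the extra in_array() check looks up
}
```

For this input the dumps show "testing.com", "net" and "com", which makes it easy to see which piece each in_array() call is actually testing.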

Hi gizmola,

 

I finally figured out why it's not working after days of looking at it. It only works with 3-letter TLDs. If I try .mobi.net or .co.uk.net, they pass the validation. I tried changing the -3 in the substr to -4 to account for longer TLDs, but then 3-letter TLDs stop working.

Do you know how I'd be able to change this so it accounts for any TLD in the array, whatever its length? It needs to work for TLDs such as .co.uk but not .co.uk.net, for example. .co.uk would be in the array.

 

Thanks mate.

I may be wrong, but wouldn't it be easier to implode your $tld_list array into your pattern, using the pipe character as the separator? That way preg_match will only match the URLs that end in one of the TLDs in your array.

$label = '[\\w][\\w\\.\\-]{0,61}[\\w]'; 

$tld_list = array('com', 'net', 'org', 'co.uk', 'mobi');
$tld_s = implode('|', array_map('preg_quote', $tld_list));

foreach($lines as $line)
{
    // check that each line/domain is in the valid format, else record an error for each domain
    if(!preg_match( "/^($label)\\.($tld_s)$/", $line, $match ))
    {
        $error[] = $line . " is not a valid domain!<br />";
    }
}

 

The problem is with your regex pattern.

 

If the url is www.test.com.net

 

The first bit ^($label) which is expanded to

^([\\w][\\w\\.\\-]{0,61}[\\w])

Is matching www.test.com.

 

And the second part to your pattern \\.($tld_s), which is expanded to

\\.(com|net|org|co\.uk)

Is matching the .net part of your url.

 

So the problem is that your first pattern is too greedy. You'll need to modify it so it doesn't match your TLDs. I have come up with a different pattern to try. For $label, set it to the following pattern: www\.?([\w-]+){0,61}.
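One way to sketch the same idea is to keep dots out of the label entirely, so the label can never swallow a TLD. The helper name `is_valid_domain`, the stricter label pattern, and the short list below are assumptions for illustration, not code from this thread. The sketch also assumes exactly one label before the TLD (no www. and no subdomains).

```php
<?php
// Hypothetical helper, not part of the original script.
function is_valid_domain($line, array $tld_list) {
    // One label: letters, digits and hyphens only (no dots), 1 to 63 characters,
    // not starting or ending with a hyphen.
    $label = '[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?';

    // Escape each TLD for use between "/" delimiters, then join with "|".
    $quoted = array();
    foreach ($tld_list as $t) {
        $quoted[] = preg_quote($t, '/');
    }
    $tld_s = implode('|', $quoted);

    // The whole line must be: label, a dot, then exactly one listed TLD.
    return preg_match('/^' . $label . '\.(?:' . $tld_s . ')$/i', $line) === 1;
}

// Short stand-in for the real list of TLDs.
$tld_list = array('com', 'net', 'org', 'co.uk', 'mobi');

var_dump(is_valid_domain('example.co.uk', $tld_list));
var_dump(is_valid_domain('testing.com.net', $tld_list));
```

Because the label cannot contain a dot, everything after the first dot has to be a single entry from the list, so "example.co.uk" is accepted while "testing.com.net" and "example.co.uk.net" are rejected.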

Thank you so much, you are a life saver!!!

I'm not allowing the www. to be entered on my site, so I changed your www\.?([\w-]+){0,61} to ([\w-]+){0,61} and it is working perfectly.

Please tell me if I have edited your regex badly, as I'm useless with it lol, but it seems to be perfect.

 

Thanks again i really appreciate everyone's help!

The www. bit is optional; that is what the question mark after the www\. part represents.

 

That's not the case at all.

 

www\.? means, match www optionally followed by a dot.

 

To make the whole www\. sequence optional, you'd need to wrap it into a (usually non-capturing) group (like (?:www\.)?).
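A small sketch of the difference, with "example" as a made-up stand-in for the rest of the host name:

```php
<?php
// Illustrative patterns; "example" stands in for the rest of the host name.
$dot_optional    = '/^www\.?example$/';      // "www" is required, only the dot is optional
$prefix_optional = '/^(?:www\.)?example$/';  // the whole "www." prefix is optional

$inputs = array('www.example', 'wwwexample', 'example');

// Record 1 (match) or 0 (no match) for each pattern and input.
$results = array();
foreach ($inputs as $input) {
    $results[$input] = array(
        'dot_optional'    => preg_match($dot_optional, $input),
        'prefix_optional' => preg_match($prefix_optional, $input),
    );
}

print_r($results);
```

The first pattern accepts "wwwexample" and rejects the bare "example"; the grouped version does the opposite, which is usually what is wanted for an optional www. prefix.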
