perky416 Posted May 2, 2011

Hi everyone. I'm absolutely useless at regex, so I was wondering if anybody would be able to help me. I'm using the code below to check whether a domain name is valid, but I have just noticed a problem that I have no clue how to solve. With the code below it is possible to enter a domain with more than one TLD from the array, such as "testing.com.net". Does anybody know how I would adjust the regex so that only one array value is valid as the TLD?

    $tld_list = array('.com','.net','.org');
    $label = '[\\w][\\w\\.\\-]{0,61}[\\w]';
    $tld = '[\\w]+';
    foreach($lines as $line) {
        // check that each line/domain is in the valid format, else return an error for each domain
        if(preg_match( "/^($label)\\.($tld)$/", $line, $match ) && in_array($match[2], $tld_list )) {
        } else {
            $error[] = $line . " is not a valid domain!<br />";
        }
    }

Thanks

Link to comment: https://forums.phpfreaks.com/topic/235378-allow-only-1-array-value/
requinix Posted May 2, 2011

Did you know that CNET owns com.com? There's nothing wrong with testing.com.net. You should not invalidate that.
gizmola Posted May 2, 2011

That code doesn't look like it would even work as is, because the leading period of the TLD would not be part of the match array. It seems like that array should be array('com', 'net', ...) and so on. With that said, you could use the same idea and add a conditional check of substr($match[1], -3):

    if(preg_match( "/^($label)\\.($tld)$/", $line, $match ) && (!in_array(substr($match[1], -3), $tld_list)) && in_array($match[2], $tld_list ))
perky416 (Author) Posted May 2, 2011

Hi guys, yeah, sorry, the array in the actual code is $tld_list = array('com','net','org', ...) and so on. I forgot to copy it, so I just quickly typed it and added the dots by mistake. I've just tried the code provided, but unfortunately it does not work.
markjoe Posted May 2, 2011

I don't intend to be offensive, but unless I'm missing some requirement, this is a terrible way to validate a URL. Do you only want to validate certain TLDs? What about .cc, .us, .ws, .jp, .info, etc.? You'd be much better off validating the pattern only and not the actual text (which is what regular expressions are meant to do). For example, instead of a list of TLDs, use the pattern "\.[a-zA-Z]{2,4}". What exactly do you want to validate, just a domain name? No protocol, URI, etc.?
perky416 (Author) Posted May 2, 2011

Hi markjoe, the actual code I'm using has a list of over 400 TLDs; I only typed a few here as an example. It's just domains I'm trying to validate, for a domain script I'm working on. I have to validate the actual text rather than just the pattern, because otherwise a user could add something like "example.thisisnotadomain". I got the original script from a post on a forum some time ago, but I can't remember where it is.
markjoe Posted May 2, 2011

".thisisnotadomain" would actually not pass "\.[a-zA-Z]{2,4}". The "{2,4}" says: 2 minimum, 4 maximum. But if you need to validate only certain TLDs, then I see why you're stuck with the array. Instead of using "+" (which says 1 or more), use {min,max}. You could even iterate through the array and find the min and max lengths.
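The idea of deriving the bounds from the array could be sketched roughly as below. This is only an illustration: the sample `$tld_list`, the dot-free label `[\w-]+`, and the helper name `tld_ok` are assumptions, not code from the thread.

```php
<?php
// Hypothetical sample list; the real list in the thread has over 400 entries.
$tld_list = array('com', 'net', 'org', 'mobi');

// Length of each TLD, then the shortest and the longest.
$lengths = array_map('strlen', $tld_list);
$min = min($lengths);
$max = max($lengths);

// Length-bounded TLD pattern built from the array, here [a-zA-Z]{3,4}.
// Note: no space inside the braces, or PCRE may treat them literally.
$tld = '[a-zA-Z]{' . $min . ',' . $max . '}';

// Hypothetical helper: the length bound only narrows the regex,
// so the in_array() check against the list is still required.
function tld_ok($domain, $tld, array $tld_list) {
    return preg_match('/^[\w-]+\.(' . $tld . ')$/', $domain, $m) === 1
        && in_array($m[1], $tld_list, true);
}
```

The length bound by itself would still let a made-up TLD such as ".aaa" through the regex, which is why the list lookup stays in place.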
perky416 (Author) Posted May 2, 2011

I've just been playing with the {min,max} you suggested. Would I be right in saying that {min,max} defines the minimum and maximum number of characters the array value can have? Using {1,3} it worked for .com but not .mobi. Is there a way to define the {min,max} number of array values rather than the number of characters? If I use it as a number of characters, the user could enter something like .aaa, which isn't a correct TLD. So I think I definitely need to use the set array values. Thanks
gizmola Posted May 2, 2011

Quoting perky416: "Hi guys, yeah, sorry, the array in the actual code is $tld_list = array('com','net','org', ...) and so on. I forgot to copy it, so I just quickly typed it and added the dots by mistake. I've just tried the code provided, but unfortunately it does not work."

Did you do any analysis of why that is? I type things in off the top of my head, so there may be some small issue that can be debugged, but I can tell you that it is meant to do what you originally asked for: take the host part of the name and check whether its last 3 characters are com, net, etc.
perky416 (Author) Posted May 3, 2011

Hi gizmola, I didn't fully understand the code to begin with; I got it from another forum a while back. It's not that I want to check that the last 3 characters are .com etc., as some TLDs like .mobi or .co.uk are longer, but that I want to make sure that only one array value can be used. I've just got back from work, so I will have a play and see what I can do with it, but when it comes to regex I haven't got a clue lol. Thanks
gizmola Posted May 3, 2011

All the code I provided was meant to do was use the same array that validates the TLD (com, net, org, etc.) to make sure a TLD does not also appear at the end of the host, so no .com.com or .net.com. That's what you originally asked for. That check has nothing to do with regex at all. There is a piece of that code that is regex, but the check itself is done by taking a string and using in_array to see whether or not it is in the $tld_list array. This is not hard to understand even if you don't know regex. Just go through the code, looking up each function being used in the manual. You can take these function calls or pieces of them, assign the results to variables, and echo or var_dump their values in order to debug things and understand better how they work.
perky416 (Author) Posted May 7, 2011

Hi gizmola, I finally figured out why it's not working after days of looking at it. It only works with 3-letter TLDs. If I try .mobi.net or .co.uk.net, they pass the validation. I tried changing the -3 in the substr to -4 to account for longer TLDs, but then 3-letter TLDs stop working. Do you know how I'd be able to change this to account for any TLD in the array regardless of its length? It needs to work for TLDs such as .co.uk but not .co.uk.net, for example. .co.uk would be in the array. Thanks mate.
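One possible way to make that check independent of TLD length, without any regex, is to compare against each array entry using its own length. The sketch below is an assumption-laden illustration rather than the code from the thread: the function names `trailing_tld` and `has_single_tld` are hypothetical, and it only checks the TLD part, not the characters allowed in the rest of the name.

```php
<?php
// Hypothetical helper: returns the longest entry of $tld_list that $host
// ends with (on a dot boundary), or null if there is none.
function trailing_tld($host, array $tld_list) {
    $best = null;
    $dotted = '.' . $host; // so a host equal to a TLD also counts
    foreach ($tld_list as $tld) {
        $suffix = '.' . $tld;
        if (substr($dotted, -strlen($suffix)) === $suffix
            && ($best === null || strlen($tld) > strlen($best))) {
            $best = $tld;
        }
    }
    return $best;
}

// Hypothetical helper: true when $line ends in exactly one listed TLD.
function has_single_tld($line, array $tld_list) {
    $tld = trailing_tld($line, $tld_list);
    // No listed TLD, or nothing in front of it.
    if ($tld === null || strlen($line) <= strlen($tld) + 1) {
        return false;
    }
    // Strip ".<tld>" and make sure what is left does not end in a TLD too.
    $label = substr($line, 0, -(strlen($tld) + 1));
    return trailing_tld($label, $tld_list) === null;
}
```

Because the comparison includes the leading dot, a name like "telecom.net" is not mistaken for one ending in ".com", which the fixed substr(..., -3) check would reject.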
wildteen88 Posted May 7, 2011

I may be wrong, but wouldn't it be easier to implode your $tld_list array into your pattern, using the pipe character as the separator? That way preg_match will only match the URLs that end in the TLDs within your array.

    $label = '[\\w][\\w\\.\\-]{0,61}[\\w]';
    $tld_list = array('com', 'net', 'org', 'co.uk', 'mobi');
    // escape regex metacharacters (the dot in co.uk) before joining with |
    $tld_s = implode('|', array_map('preg_quote', $tld_list));
    foreach($lines as $line) {
        // check that each line/domain is in the valid format, else return an error for each domain
        if(preg_match( "/^($label)\\.($tld_s)$/", $line, $match )) {
        } else {
            $error[] = $line . " is not a valid domain!<br />";
        }
    }
perky416 (Author) Posted May 7, 2011

Hi wildteen88, thanks for your suggestion, but unfortunately I have just tried it and it accepts domains such as test.com.net. Both TLDs are in my array list, but they shouldn't be used together.
wildteen88 Posted May 7, 2011

The problem is with your regex pattern. Say the URL is www.test.com.net.

The first bit, ^($label), which is expanded to ^([\\w][\\w\\.\\-]{0,61}[\\w]), is matching www.test.com.

The second part of your pattern, \\.($tld_s), which is expanded to \\.(com|net|org|co\.uk|mobi), is matching the .net part of your URL.

So the problem is that your first pattern is too greedy: it allows dots, so it can swallow a TLD. You'll need to modify it so it doesn't match your TLDs. I have come up with a different pattern to try. For $label, set it to the following pattern: www\.?([\w-]+){0,61}
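The difference between a label that allows dots and one that does not can be seen by running both against the same input. This is a minimal sketch: `$greedy_label` is the pattern from the thread, while `$strict_label` (`[\w-]{1,63}`) and the helper `match_domain` are assumptions made for illustration, not the pattern posted above.

```php
<?php
$tld_list = array('com', 'net', 'org', 'co.uk', 'mobi');
$tld_s = implode('|', array_map('preg_quote', $tld_list));

// Label from the thread: dots are allowed inside it, so it can absorb "test.com".
$greedy_label = '[\w][\w.\-]{0,61}[\w]';
// Dot-free label (an assumption for this sketch): it cannot contain a TLD.
$strict_label = '[\w-]{1,63}';

// Hypothetical helper: returns the match array, or null when there is no match.
function match_domain($label, $tld_s, $line) {
    return preg_match("/^($label)\\.($tld_s)$/", $line, $m) === 1 ? $m : null;
}

foreach (array('www.test.com.net', 'test.co.uk') as $line) {
    var_dump(match_domain($greedy_label, $tld_s, $line));
    var_dump(match_domain($strict_label, $tld_s, $line));
}
```

A dot-free label also rejects subdomains such as www.test.com, so it only fits when bare registrable names are expected.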
perky416 (Author) Posted May 7, 2011

Thank you so much, you are a life saver!!! I'm not allowing the www. to be entered on my site, so I changed your www\.?([\w-]+){0,61} to ([\w-]+){0,61} and it is working perfectly. Please tell me if I have badly edited your regex, as I'm useless with it lol, but it seems to be perfect. Thanks again, I really appreciate everyone's help!
wildteen88 Posted May 7, 2011

The www. bit is optional; that is what the question mark after the www\. part represents. I'm no regex expert, but yeah, I guess it should work for you.
salathe Posted May 8, 2011

Quoting wildteen88: "The www. bit is optional; that is what the question mark after the www\. part represents."

That's not the case at all. www\.? means: match www, optionally followed by a dot. To make the whole www\. sequence optional, you'd need to wrap it in a (usually non-capturing) group, like (?:www\.)?.
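The two forms can be compared directly. The snippet below is a small sketch; the host names and variable names are made up for illustration.

```php
<?php
// www\.? requires "www" and makes only the dot optional.
$dot_optional = '/^www\.?example\.com$/';
// (?:www\.)? makes the whole "www." prefix optional.
$prefix_optional = '/^(?:www\.)?example\.com$/';

$hosts = array('www.example.com', 'wwwexample.com', 'example.com');
foreach ($hosts as $host) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%-16s dot_optional=%d prefix_optional=%d\n",
        $host,
        preg_match($dot_optional, $host),
        preg_match($prefix_optional, $host));
}
```

Only the grouped form accepts the bare "example.com", and only the ungrouped form accepts "wwwexample.com".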