ankur0101 Posted February 5, 2012

Hi,

I am making a whois script where the textbox name is domain:

$domain = $_POST['domain'];

A user should submit a domain such as abc123-abc.tld

abc123-abc can consist of lowercase letters, numbers and '-'.
tld can consist only of lowercase letters.

So I want to write something like:

if (regular expression condition as described above) { Success } else { Invalid domain name }

Need help.

https://forums.phpfreaks.com/topic/256466-how-to-match-domain-with-this-type/
joe92 Posted February 5, 2012

Will domains be prefixed with http:// and www.? Will they be submitted with trailing GET information such as '.tld/?index.php'? Both of those questions will change the regex. However, from what you have said so far, the following should suffice:

if(preg_match("/^[a-z0-9-]+\.[a-z]{2,10}([a-z]{2,5})?$/", $domain))

Hope that helps you,
Joe

Edit: Noticed you said lowercase letters, so I removed case insensitivity.
ankur0101 Posted February 5, 2012

Thank you so much, problem solved.
ragax Posted February 5, 2012

Not sure, but it seems to me there might be a slight bug. As Joe says, "It's always a damned typo!"

As it is, the regex matches a.zzzzzzzzzzzzzzz

That is because, as written, the last two character classes

[a-z]{2,10}([a-z]{2,5})?

do not really make sense unless something is missing: apart from the capture (which ankur doesn't seem to care about), that part of the regex is equivalent to a simple

[a-z]{2,15}

Joe, did you mean to say:

if(preg_match("/^[a-z0-9-]+\.[a-z]{2,5}$/", $domain))

Or am I missing something? Just waking up, so that's entirely possible.

Wishing you all a fun Sunday.
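The equivalence ragax describes can be checked directly. The following is a minimal sketch, assuming a few sample inputs made up for illustration (only the 15-letter case comes from the post); it runs Joe's first pattern and the simplified {2,15} form side by side.

```php
<?php
// Joe's first pattern as posted, and the simplified form ragax describes.
$posted     = '/^[a-z0-9-]+\.[a-z]{2,10}([a-z]{2,5})?$/';
$simplified = '/^[a-z0-9-]+\.[a-z]{2,15}$/';

// Sample inputs chosen for illustration (assumptions, apart from the 15-letter case).
$samples = array(
    'a.' . str_repeat('z', 15), // the 15-letter ending from the post
    'a.' . str_repeat('z', 16), // one letter more than either pattern allows
    'abc123-abc.com',           // ordinary domain
    'a.z',                      // ending shorter than two letters
);

foreach ($samples as $s) {
    // preg_match returns 1 on a match and 0 on no match.
    printf("%-20s posted=%d simplified=%d\n", $s,
        preg_match($posted, $s), preg_match($simplified, $s));
}
```

Both columns agree on every input here, which is the point of the bug report: the unanchored second group only extends the letter run instead of adding a second dotted part.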
joe92 Posted February 6, 2012

Oops! It's always a damned typo! Haha.

That parenthesis isn't meant to capture anything. It was meant to be an optional group (hence the question mark) and it was meant to start with a dot. I personally don't see the point in telling the regex engine not to remember a group when we're talking about a maximum of 5-10 bytes. That part should say:

[a-z]{2,10}(\.[a-z]{2,5})?

That is to cover TLDs such as a .co.uk address or a .gov.uk address. Also, I was recently reading that personal TLD strings are very soon going to be launched onto the world wide web. In fact, I think the 'reveal' date of all the new ones is going to be 1st May. TLDs such as .museum and .aero have already been introduced, although they are reserved in this case for museums and aerospace firms. Let's not forget .name, which allows individuals (I'm guessing just the fabulously wealthy/famous ones) to have their name in a TLD. Once they start becoming more common you could have TLDs such as .hammersmith or .frankenstein. Got to accommodate the future now.

Anyway, good spot playful! Here is the complete code I suggest you use, ankur. It will allow some wrong ones through (e.g. .zzzzzz.zz), but unless you put a very big OR statement at the end to capture every type of legal TLD ((com?|co\.uk|info|..etc.)) you won't be able to get around it, I'm afraid.

if(preg_match("/^[a-z0-9-]+\.[a-z]{2,10}(\.[a-z]{2,5})?$/", $domain))

Here's the Root Zone Database if you do wish to do that though.

Joe
ragax Posted February 6, 2012

Hi Joe,

Quote: "That should say: [a-z]{2,10}(\.[a-z]{2,5})?"

But that matches aa (no TLD needed).

So my question still stands: did you mean to say

if(preg_match("/^[a-z0-9-]+\.[a-z]{2,5}$/", $domain))

Thank you for your interesting information about the future of TLDs, and the link!

Wishing you a fun day.
joe92 Posted February 6, 2012

Hi playful,

Quote: "That should say: [a-z]{2,10}(\.[a-z]{2,5})? But that matches aa (No tld needed.)"

Yes, that alone does, but not when included in the entire regex.

Quote: "So my question still stands: did you mean to say if(preg_match("/^[a-z0-9-]+\.[a-z]{2,5}$/", $domain))"

No, I meant what I posted. I'll explain. The OP states:

Quote: "A user should submit domain such as abc123-abc.tld"

That means the domain can consist of three parts: abc123-abc . tld

Each part is:

abc123-abc - [a-z0-9-]+
. - \.
tld - [a-z]{2,10}(\.[a-z]{2,5})?

The reason the TLD part contains so much is because of how many types of TLD there are. If we limit the TLD to just [a-z]{2,5} as you are suggesting, then we are ruling out co.uk addresses etc.; they will fail on the extra dot. However, the .uk is an optional part of the TLD, as .com and .net etc. don't include it. That is why I encase it in parentheses followed by a question mark. If it's there, include it; if it isn't, the regex carries on, because that group is not imperative to the entire match.

Hence I stand by my original regex. It will allow for every type of TLD available and must match characters and a dot before it:

if(preg_match("/^[a-z0-9-]+\.[a-z]{2,10}(\.[a-z]{2,5})?$/", $domain))

Hope I explained myself properly,
Joe
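The part-by-part breakdown above can be sketched as follows. This is only an illustration: the helper name is_valid_domain and the sample inputs in the comments are assumptions, not something posted in the thread.

```php
<?php
// The three parts Joe lists, joined into the full anchored pattern.
$name = '[a-z0-9-]+';                  // abc123-abc
$dot  = '\.';                          // the literal dot
$tld  = '[a-z]{2,10}(\.[a-z]{2,5})?';  // com, or co.uk via the optional second part

$pattern = '/^' . $name . $dot . $tld . '$/';

// Hypothetical helper (name is an assumption): true when the whole
// string matches the anchored pattern.
function is_valid_domain($domain, $pattern)
{
    return preg_match($pattern, $domain) === 1;
}

// Example calls:
//   is_valid_domain('abc123-abc.com', $pattern)  one dotted part after the name
//   is_valid_domain('example.co.uk', $pattern)   optional second part present
//   is_valid_domain('aa', $pattern)              no dot, so the full pattern rejects it
```

Because the pattern is anchored with ^ and $, the tld fragment on its own matching aa does not matter; the name and the dot must come first. A leading www. is rejected, since only one optional extra dotted part is allowed.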
ragax Posted February 6, 2012

Hi Joe,

Perfectly clear. I thought that when you wrote

Quote: "That should say: [a-z]{2,10}(\.[a-z]{2,5})?"

in your second message, you meant that to be the entire regex. This surprised me, as it matches aa. I missed the bottom of that message, where your corrected expression lived. My bad.

Quote: "Hence I stand by my original regex."

Your second version of your original regex.

For the record, then, this regex matches

a.aaaaaaaaaa

but not

a.aa.aaaaaaaaaa

In other words, you can have a ten-letter TLD, but only if it is not preceded by a sub-TLD. This makes me wonder if you wouldn't prefer the middle part to be the optional one, rather than the last part. Not a big deal, though, and I think I can hear your answer from here ("I stand by my regex"?)

I don't mean to enter a nitpicking contest. For the record, the history of the convo is that at first I thought I saw a bug, and there was one. Then I thought I saw a second bug, and I saw wrong. At this stage, if you are happy with the feature above, no probs, just thought I'd bring it up. There are a million ways to match a URL, all with their personalities. Nothing wrong with that; it's just nice to "date" a little and get to know a personality before getting married.

Wishing you a fun day
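The asymmetry described above can be reproduced with a short sketch. The second pattern is only one possible reading of "make the middle part the optional one"; it was not posted in the thread and is an assumption for illustration.

```php
<?php
// Joe's pattern: the optional part comes last and is capped at 5 letters.
$lastOptional = '/^[a-z0-9-]+\.[a-z]{2,10}(\.[a-z]{2,5})?$/';

// Assumed alternative (not from the thread): the optional short part sits
// in the middle, and the final part may be up to 10 letters long.
$middleOptional = '/^[a-z0-9-]+(\.[a-z]{2,5})?\.[a-z]{2,10}$/';

$inputs = array(
    'a.' . str_repeat('a', 10),    // ten-letter tld, no sub-tld
    'a.aa.' . str_repeat('a', 10), // ten-letter tld preceded by a sub-tld
    'example.co.uk',
);

foreach ($inputs as $input) {
    printf("%-18s last-optional=%d middle-optional=%d\n", $input,
        preg_match($lastOptional, $input), preg_match($middleOptional, $input));
}
```

The trade-off simply moves: with the middle part optional and capped at 5 letters, a long second-level part followed by a short ending is what gets rejected instead.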
joe92 Posted February 6, 2012

Hey playful,

From what I've seen of TLDs, the optional extension on the end is usually the shorter of the two TLD components. That's why I made it the shorter one. It's neither here nor there though. Once you start trying to match URLs in a block of text you can't even end at the TLD, for the URL might be pointing to a file 3 subdirectories down the tree. This is all specific to the OP's needs.

And true, I stand by the second version of my original regex, haha.

Joe
ankur0101 Posted February 7, 2012

I would like to ask one more question. I have an array such as:

$whoisservers = array(
"ac" => "whois.nic.ac", "ae" => "whois.nic.ae", "aero" => "whois.aero", "af" => "whois.nic.af",
"ag" => "whois.nic.ag", "al" => "whois.ripe.net", "am" => "whois.amnic.net", "arpa" => "whois.iana.org",
"as" => "whois.nic.as", "asia" => "whois.nic.asia", "at" => "whois.nic.at", "au" => "whois.aunic.net",
"az" => "whois.ripe.net", "ba" => "whois.ripe.net", "be" => "whois.dns.be", "bg" => "whois.register.bg",
"bi" => "whois.nic.bi", "biz" => "whois.biz", "bj" => "whois.nic.bj", "br" => "whois.registro.br",
"bt" => "whois.netnames.net", "by" => "whois.ripe.net", "bz" => "whois.belizenic.bz", "ca" => "whois.cira.ca",
"cat" => "whois.cat", "cc" => "whois.nic.cc", "cd" => "whois.nic.cd", "ch" => "whois.nic.ch",
"ci" => "whois.nic.ci", "ck" => "whois.nic.ck", "cl" => "whois.nic.cl", "cn" => "whois.cnnic.net.cn",
"com" => "whois.verisign-grs.com", "coop" => "whois.nic.coop", "cx" => "whois.nic.cx", "cy" => "whois.ripe.net",
"cz" => "whois.nic.cz", "de" => "whois.denic.de", "dk" => "whois.dk-hostmaster.dk", "dm" => "whois.nic.cx",
"dz" => "whois.ripe.net", "edu" => "whois.educause.edu", "ee" => "whois.eenet.ee", "eg" => "whois.ripe.net",
"es" => "whois.ripe.net", "eu" => "whois.eu", "fi" => "whois.ficora.fi", "fo" => "whois.ripe.net",
"fr" => "whois.nic.fr", "gb" => "whois.ripe.net", "gd" => "whois.adamsnames.com", "ge" => "whois.ripe.net",
"gg" => "whois.channelisles.net", "gi" => "whois2.afilias-grs.net", "gl" => "whois.ripe.net", "gm" => "whois.ripe.net",
"gov" => "whois.nic.gov", "gr" => "whois.ripe.net", "gs" => "whois.nic.gs", "gw" => "whois.nic.gw",
"gy" => "whois.registry.gy", "hk" => "whois.hkirc.hk", "hm" => "whois.registry.hm", "hn" => "whois2.afilias-grs.net",
"hr" => "whois.ripe.net", "hu" => "whois.nic.hu", "ie" => "whois.domainregistry.ie", "il" => "whois.isoc.org.il",
"in" => "whois.inregistry.net", "info" => "whois.afilias.net", "int" => "whois.iana.org", "io" => "whois.nic.io",
"iq" => "vrx.net", "ir" => "whois.nic.ir", "is" => "whois.isnic.is", "it" => "whois.nic.it",
"je" => "whois.channelisles.net", "jobs" => "jobswhois.verisign-grs.com", "jp" => "whois.jprs.jp", "ke" => "whois.kenic.or.ke",
"kg" => "www.domain.kg", "ki" => "whois.nic.ki", "kr" => "whois.nic.or.kr", "kz" => "whois.nic.kz",
"la" => "whois.nic.la", "li" => "whois.nic.li", "lt" => "whois.domreg.lt", "lu" => "whois.dns.lu",
"lv" => "whois.nic.lv", "ly" => "whois.nic.ly", "ma" => "whois.iam.net.ma", "mc" => "whois.ripe.net",
"md" => "whois.ripe.net", "me" => "whois.meregistry.net", "mg" => "whois.nic.mg", "mil" => "whois.nic.mil",
"mn" => "whois.nic.mn", "mobi" => "whois.dotmobiregistry.net", "ms" => "whois.adamsnames.tc", "mt" => "whois.ripe.net",
"mu" => "whois.nic.mu", "museum" => "whois.museum", "mx" => "whois.nic.mx", "my" => "whois.mynic.net.my",
"na" => "whois.na-nic.com.na", "name" => "whois.nic.name", "net" => "whois.verisign-grs.net", "nf" => "whois.nic.nf",
"nl" => "whois.domain-registry.nl", "no" => "whois.norid.no", "nu" => "whois.nic.nu", "nz" => "whois.srs.net.nz",
"org" => "whois.pir.org", "pl" => "whois.dns.pl", "pm" => "whois.nic.pm", "pr" => "whois.uprr.pr",
"pro" => "whois.registrypro.pro", "pt" => "whois.dns.pt", "re" => "whois.nic.re", "ro" => "whois.rotld.ro",
"ru" => "whois.ripn.net", "sa" => "whois.nic.net.sa", "sb" => "whois.nic.net.sb", "sc" => "whois2.afilias-grs.net",
"se" => "whois.iis.se", "sg" => "whois.nic.net.sg", "sh" => "whois.nic.sh", "si" => "whois.arnes.si",
"sk" => "whois.ripe.net", "sm" => "whois.ripe.net", "st" => "whois.nic.st", "su" => "whois.ripn.net",
"tc" => "whois.adamsnames.tc", "tel" => "whois.nic.tel", "tf" => "whois.nic.tf", "th" => "whois.thnic.net",
"tj" => "whois.nic.tj", "tk" => "whois.dot.tk", "tl" => "whois.nic.tl", "tm" => "whois.nic.tm",
"tn" => "whois.ripe.net", "to" => "whois.tonic.to", "tp" => "whois.nic.tl", "tr" => "whois.nic.tr",
"travel" => "whois.nic.travel", "tv" => "tvwhois.verisign-grs.com", "tw" => "whois.twnic.net.tw", "ua" => "whois.net.ua",
"ug" => "whois.co.ug", "uk" => "whois.nic.uk", "us" => "whois.nic.us", "uy" => "nic.uy",
"uz" => "whois.cctld.uz", "va" => "whois.ripe.net", "vc" => "whois2.afilias-grs.net", "ve" => "whois.nic.ve",
"vg" => "whois.adamsnames.tc", "wf" => "whois.nic.wf", "ws" => "whois.website.ws", "yt" => "whois.nic.yt",
"yu" => "whois.ripe.net");

The TLD of the submitted domain should match one of the TLDs in this array. If it doesn't match, the script should go back to index.php with header().

I am confused about how to do this.
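One way to approach this, sketched under some assumptions: take everything after the last dot of the submitted domain and check whether it is a key in $whoisservers. The helper name find_whois_server is hypothetical, and the array below is a shortened copy of the one in the post, kept small for illustration.

```php
<?php
// Shortened copy of the $whoisservers array from the post (illustration only).
$whoisservers = array(
    "com" => "whois.verisign-grs.com",
    "net" => "whois.verisign-grs.net",
    "uk"  => "whois.nic.uk",
    "in"  => "whois.inregistry.net",
);

// Hypothetical helper: returns the whois server for the domain's last label,
// or null when that label is not a key in the array.
function find_whois_server($domain, array $servers)
{
    $pos = strrpos($domain, '.');
    if ($pos === false) {
        return null; // no dot at all, so there is no tld to look up
    }
    $tld = substr($domain, $pos + 1); // text after the last dot
    return isset($servers[$tld]) ? $servers[$tld] : null;
}

// Possible use in the form handler (left as comments so the sketch runs standalone):
// $server = find_whois_server($_POST['domain'], $whoisservers);
// if ($server === null) {
//     header('Location: index.php'); // unknown tld: back to the form
//     exit;
// }
```

Looking only at the last label means something.co.in resolves through the "in" entry, so second-level parts such as co or com need no entries of their own. header() has to be called before any output is sent, and exit stops the rest of the script from running after the redirect.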
ankur0101 Posted February 23, 2012

Quote: "For the record, then, this regex matches a.aaaaaaaaaa but not a.aa.aaaaaaaaaa. In other words, you can have a ten-letter tld, but only if it is not preceded by a sub-tld."

Yes, I forgot that point. What should I do for domains such as:

something.co.in
something.com.mx

Thanks
ankur0101 Posted February 23, 2012

Hi,

I am using the following pattern:

/^[a-z0-9][a-z0-9\-]+[a-z0-9](\.[a-z]{2,4})+$/i

Is that right?
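A quick way to answer that is to run the pattern against a few inputs. The sketch below does so; the sample inputs and the $adjusted pattern are assumptions added for illustration, not something posted in the thread.

```php
<?php
// The pattern from the post above, unchanged.
$posted = '/^[a-z0-9][a-z0-9\-]+[a-z0-9](\.[a-z]{2,4})+$/i';

// Hypothetical adjustment (an assumption): no /i flag, names of one or two
// characters allowed, and dotted parts of up to 10 letters.
$adjusted = '/^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z]{2,10})+$/';

$inputs = array(
    'something.co.in',   // the two cases asked about earlier
    'something.com.mx',
    'ab.com',            // two-character name
    'ABC.COM',           // uppercase input
    'example.museum',    // six-letter tld from the $whoisservers array
    '-abc.com',          // leading hyphen
);

foreach ($inputs as $input) {
    printf("%-18s posted=%d adjusted=%d\n", $input,
        preg_match($posted, $input), preg_match($adjusted, $input));
}
```

The posted pattern does accept something.co.in and something.com.mx. It also requires at least three characters before the first dot, accepts uppercase because of the /i flag (the first post asked for lowercase only), and rejects .museum and .travel because of the {2,4} limit. Whether those matter depends on the inputs the whois script is meant to accept.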