.com or domain detection not working


Monkuar

if (preg_match('^[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu|COM|ORG|NET|MIL|EDU)$', $text)) {

message("You cannot post links or urls unless you have made 10 Posts");
}

 

$text is the user's input.

 

When I try ".com", it doesn't return true. Any idea why?
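
A likely explanation, sketched under a couple of assumptions: preg_match() expects the pattern to be wrapped in delimiters such as /.../, and the pattern above has none, so PHP takes the leading ^ as the delimiter and the call fails. Even with delimiters added, the character class in front of the dot needs at least one character, so a bare ".com" still would not match. The helper name is_bare_domain is made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): the pattern above with "/"
// delimiters added and the "i" flag replacing the upper-case alternatives.
function is_bare_domain(string $text): bool
{
    // Anchored with ^ and $, so the WHOLE string must be one domain-like token.
    $pattern = '/^[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu)$/i';
    return preg_match($pattern, $text) === 1;
}

var_dump(is_bare_domain('example.com'));         // whole string is a domain
var_dump(is_bare_domain('.com'));                // nothing in front of the dot
var_dump(is_bare_domain('see example.com now')); // surrounding text defeats ^...$
```

Because of the anchors, this form only catches posts that consist of nothing but a domain, which is probably not what a link filter wants.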


I meant to edit my post, but the site won't let me.

 

I got it to work:

 

if (preg_match("/^[a-zA-Z0-9]*((-|\.)?[a-zA-Z0-9])*\.([a-zA-Z]{2,4})$/", $text)) {
message("You cannot post links or urls unless you have made 10 Posts");
}

 

But if people enter ".com2", it still lets them through. I need to add a wildcard after the .com, so that any text characters following .com or .net still trigger the error. Any ideas, fellas?

Really, this is not an ideal regex, as it will let a lot of unwanted things through.

I recommend you use filter_var():

 

var_dump(filter_var("example@example.com", FILTER_VALIDATE_EMAIL));
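
In case it helps, here is a small sketch of what filter_var() does with the URL filter instead of the email one. As far as I know, FILTER_VALIDATE_URL only accepts a complete URL with a scheme, so it validates a whole string rather than finding a link inside a longer post. The sample strings are made up.

```php
<?php
// Sample inputs are illustrative, not from the thread.
$inputs = [
    'http://example.com',            // full URL with a scheme
    'example.com',                   // no scheme
    'visit http://example.com now',  // URL buried in other text
];

$results = [];
foreach ($inputs as $input) {
    // filter_var() returns the filtered string on success and false on failure.
    $results[$input] = filter_var($input, FILTER_VALIDATE_URL) !== false;
}

var_dump($results);
```

Only the first input should pass, which suggests filter_var() suits checking a single field better than scanning free-form post text.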


AyKay, he wants to find URLs, not email addresses.

 

Monkuar, please try this code:

 

<?php

$text = 'abc example.com2 abc';

if (preg_match('/(?<=[a-z0-9])\.(com|org|net|mil|edu|de|us|uk|au|info)/i', $text)) {
echo "You cannot post links or urls unless you have made 10 Posts";
}

?>
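
To show how that pattern behaves on a few inputs, here is a rough sketch that wraps it in a function. The name has_domain_suffix and the sample strings are assumptions for illustration; the pattern itself is the one from the snippet above.

```php
<?php
// Hypothetical wrapper around the lookbehind pattern from the post above.
function has_domain_suffix(string $text): bool
{
    // (?<=[a-z0-9]) demands a letter or digit right before the dot.
    // The pattern is not anchored, so anything may follow the suffix.
    $pattern = '/(?<=[a-z0-9])\.(com|org|net|mil|edu|de|us|uk|au|info)/i';
    return preg_match($pattern, $text) === 1;
}

$samples = [
    'abc example.com2 abc', // suffix followed by extra characters
    'EXAMPLE.NET',          // upper case, covered by the "i" flag
    '.com',                 // nothing before the dot
    'no links here',        // no dot at all
    'file.details',         // ordinary word starting with "de" after a dot
];

foreach ($samples as $sample) {
    echo $sample, ' => ', has_domain_suffix($sample) ? 'blocked' : 'allowed', PHP_EOL;
}
```

The last sample is flagged too, because ".de" matches at the start of ".details". That kind of false positive is the price of not requiring anything after the suffix.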


If you really want to list every type of domain ending then here is a convenient list for you:

http://www.iana.org/domains/root/db/

 

But fair warning, this will take time to complete.

 

Otherwise, I would suggest changing your code to look for http:// or www. (or the two together), followed by the central part (consisting of letters, numbers and dashes), followed by a dot and the TLD. The TLD can have one or two parts, but can never contain a digit.

 

Something like this:

~(https?://(www\.)?|www\.)[a-z0-9-]+\.[a-z]{2,5}(\.[a-z]{2,5})?~i

 

But then again, that will also match nonexistent domains such as www.a.zzz. It's your call on this one: speed vs. accuracy. The big alternation listing every TLD takes longer to evaluate, whereas my snippet accepts some invalid domains but runs much faster.
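
A quick sketch of that prefix-based pattern in use, with the dots after www escaped so they match only a literal dot. The function name has_prefixed_url and the test strings are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the prefix-based pattern.
function has_prefixed_url(string $text): bool
{
    // Requires http://, https:// or www. in front, then a label, a dot and
    // a 2-5 letter ending, optionally followed by a second ending.
    $pattern = '~(https?://(www\.)?|www\.)[a-z0-9-]+\.[a-z]{2,5}(\.[a-z]{2,5})?~i';
    return preg_match($pattern, $text) === 1;
}

var_dump(has_prefixed_url('go to http://example.com')); // scheme present
var_dump(has_prefixed_url('www.example.co.uk'));        // two-part ending
var_dump(has_prefixed_url('www.a.zzz'));                // made-up ending still matches
var_dump(has_prefixed_url('example.com'));              // no prefix, so no match
```

The last case is the flip side of this approach: a bare example.com slips past, since nothing marks it as a link.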

 

Hope this helps you,

Joe

I was a bit off-key with this one; disregard my original post.

Looking at the solutions here, parse_url() will still allow invalid URLs through, like test@test.com2.

If you want the filter to be very tight, which it appears that you do, something along the lines of one of the regex patterns provided is what you want.


parse_url() doesn't allow or disallow anything. It simply lets you do the checking more easily by breaking the URL into separate components.

 

It says so right in the description, my mistake.

I have not fiddled around with the components argument, so I am not exactly sure what it can return.

What would the advantage be of using parse_url() over a regex in this case? Seems like using parse_url() would require much more code.
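
To make the trade-off a bit more concrete, a rough sketch: parse_url() with the PHP_URL_HOST component returns just the host when the string already looks like a URL with a scheme. It does not search free text, so some pattern matching would likely still be needed to find candidates first. extract_host is a made-up helper name.

```php
<?php
// Hypothetical helper: returns the host part of a string, or null when
// parse_url() does not find one.
function extract_host(string $candidate): ?string
{
    $host = parse_url($candidate, PHP_URL_HOST);
    // parse_url() yields null for a missing component and false for
    // seriously malformed input; both are mapped to null here.
    return is_string($host) ? $host : null;
}

var_dump(extract_host('http://www.example.com/page?id=1')); // host is found
var_dump(extract_host('example.com'));                      // no scheme: treated as a path
var_dump(extract_host('test@test.com2'));                   // also treated as a path
```

Once a host has been extracted, the ending can be checked with a simple string comparison instead of a long regex, which may be the main advantage.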

Works exactly as I asked.

 

Now one more problem:

Users are spamming sites with spaces, like " rofl . c o m ".

Is there a way to fix that too? The wildcard part is working.
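
One possible approach, sketched with assumptions: strip all whitespace from a copy of the text before running the same suffix pattern. The helper name contains_spaced_domain is made up, and this is a rough idea rather than a tested filter.

```php
<?php
// Hypothetical helper: collapses whitespace, then applies the suffix pattern.
function contains_spaced_domain(string $text): bool
{
    // Remove every run of whitespace so "rofl . c o m" becomes "rofl.com".
    // preg_replace() returns null on error, so fall back to the raw text.
    $collapsed = preg_replace('/\s+/', '', $text) ?? $text;

    $pattern = '/(?<=[a-z0-9])\.(com|org|net|mil|edu|de|us|uk|au|info)/i';
    return preg_match($pattern, $collapsed) === 1;
}

var_dump(contains_spaced_domain(' rofl . c o m '));   // spaced-out domain
var_dump(contains_spaced_domain('hello world'));      // nothing suspicious
var_dump(contains_spaced_domain('I agree. Come on')); // ordinary sentence
```

Be aware that the third sample is flagged as well, because joining the words produces ".Come". Collapsing whitespace makes the filter much more aggressive on normal sentences.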

 

 

@ManiacDan, @Pikachu, @AyKay, thanks for your code and advice too.

 

Gotta love me some regex for the morning


People are going to bypass your filter no matter how strict you make it.

 

Next, it'll be website(dot)com. Once that's blocked, it'll be "add dot com to the end of 'website'". Or how about redirectors using domain names not on your whitelist? http://tiny.cc/diHe2b

 

You need to ban users that attempt to bypass your protections. It's going to happen no matter how strict you try to make your RegEx.

True... I can't really stop it.

 

thank you
