Validating a URL from form


Kemik

Hello,

 

What would be the best way to validate the following kinds of URLs?

http://www.url.com

http://subdomain.url.co.uk

http://www.url.co.uk/myweb/

 

If possible, I'd like to add http:// to any URLs where the user enters www. before the domain. Failing that, I can work around it by displaying "Ensure http:// is included when submitting your website URL".

 

I'm trying to adapt the current template I have for the email address...

 

<?php

      /* Email error checking */
      $field = "email";  //Use field name for email
      if(!$subemail || strlen($subemail = trim($subemail)) == 0){
         $form->setError($field, "* Email not entered");
      }
      /* Check if valid email address */
      else{
          // eregi() was removed in PHP 7; preg_match() needs delimiters and the "i" flag.
          $regex = "/^[_+a-zA-Z0-9-]+(\.[_+a-zA-Z0-9-]+)*"
                 ."@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]{1,})*"
                 ."\.([a-zA-Z]{2,}){1}$/i";
         if(!preg_match($regex,$subemail)){
            $form->setError($field, "* Email invalid");
         }
         $subemail = stripslashes($subemail);
      }
?>

 

I've found http://regexlib.com/REDetails.aspx?regexp_id=732; however, I'm not 100% sure it will match my requirements, or whether I can just replace the $regex value with the one listed on that page.


Is this just a single field you are receiving input from?  It might be harder to find a URL sans protocol in a large text block, but if you just have a single line, that shouldn't be difficult.

 

First, you can default the form element to contain "http://", that way users won't have to type it themselves (God forbid!).

 

<input type="text" name="url" value="http://">

 

Second, you can use this code to add http:// if users haven't:

 

<?php

$_POST['url'] = trim($_POST['url']);

// Prepend http:// when the input does not already look like a full URL.
if (!preg_match('|^[a-z]+://[\w-]+(?:\.[\w-]+)*(/\S*)?$|i', $_POST['url'])) {
    $_POST['url'] = 'http://' . $_POST['url'];
}

// That's a really general URL match.  For something more simple/specific,
// use the following regular expression:
// |^http://|

?>


Any suggestions for the validation bit?

 

Mmm, do you mean checking that the URL is properly formatted?  How specific do you want to get?  The regex in the first preg_match above is a general server/path match (actually I should have included a possible port, too).  The protocol is indeterminate (http, ftp, news, rss, etc.).  And there are plenty of servers without www as the first bit, http://news.google.com/ for instance.
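
For illustration, here is a rough sketch of that same general match with an optional port added. The function name looksLikeUrl is made up for this example, and the (?::\d+)? group is only one assumed way of allowing a port; adjust to taste.

```php
<?php

// Hypothetical helper: the general pattern from the earlier post,
// plus an optional ":port" after the host (an assumption, see above).
function looksLikeUrl($url)
{
    // scheme://host[.host...][:port][/path]
    $pattern = '|^[a-z]+://[\w-]+(?:\.[\w-]+)*(?::\d+)?(/\S*)?$|i';
    return preg_match($pattern, trim($url)) === 1;
}

var_dump(looksLikeUrl('http://www.url.com'));         // bool(true)
var_dump(looksLikeUrl('http://localhost:8080/test')); // bool(true)
var_dump(looksLikeUrl('www.url.com'));                // bool(false)
```

It still accepts any protocol, so it only answers "does this look like a URL at all?".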

 

So how specific do you want to get?


Not very tbh.

 

It's simply to prevent SQL injection or anything of that sort and to prompt the user to add some kind of URL. So if the preg_match() you gave will prevent those, then I'll just use that. I don't mind what the user submits as it's for their profile.


So these links will almost invariably be http:// links to the point where you don't want to allow anything else, right?

 

<?php

// Trim unnecessary whitespace.
$_POST['url'] = trim($_POST['url']);

// Add the http:// protocol if absent.
if (!preg_match('|^http://|i', $_POST['url'])) {
    $_POST['url'] = 'http://' . $_POST['url'];
}

// Very simple "validation": http:// followed by one or more non-space characters.
if (!preg_match('|^http://\S+$|i', $_POST['url'])) {
    echo "Invalid URL.  DO IT AGAIN!";
}

?>

 

(Thought:  Are you storing these links in a database?  If the pages are dynamically generated, why not STRIP the "http://" instead of adding it?  That'll save you seven bytes of storage space per link in your database, and it can be easily added when serving the data.)
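
A minimal sketch of that strip-then-re-add idea, assuming every stored link is plain http. The helper names stripHttp and withHttp are hypothetical, purely for illustration:

```php
<?php

// Hypothetical helpers; the names are illustrative only.

// Remove a leading "http://" (case-insensitive) before storing the value.
function stripHttp($url)
{
    return preg_replace('|^http://|i', '', trim($url));
}

// Put the protocol back when rendering the stored value as a link.
function withHttp($stored)
{
    return 'http://' . $stored;
}

$stored = stripHttp('http://www.url.co.uk/myweb/');
echo $stored, "\n";           // www.url.co.uk/myweb/
echo withHttp($stored), "\n"; // http://www.url.co.uk/myweb/
```

Input that never had the protocol (www.url.com) passes through stripHttp() unchanged, so both kinds of user input end up stored the same way. When printing the link into a page, it would still need htmlspecialchars().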

 

As far as SQL injection goes, you should be fine with mysql_real_escape_string(), assuming you're using MySQL.


Yes, I'll be inserting it into a MySQL database. That's a good suggestion too. Always looking to make my system more efficient. Wouldn't that mean I'd need to strip http:// though, instead of adding?

Thanks again for your help, I'll insert the code into the script.

 

By the way, which is more secure: mysql_real_escape_string() or stripslashes()?


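mysql_real_escape_string() is purpose-built for inserting data into the database; it escapes quotes and backslashes, and characters such as \r and \n as well.  stripslashes() does the opposite job (it removes backslashes), so it isn't a security measure.  If anyone tries to inject anything, it's just going to end up inside a string, so no worries.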

 

I was working on a better URL validator (although I'm sure there are some floating around on the 'net)... but I need to do some real work, so maybe I'll post it tomorrow.
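
This isn't the validator mentioned above. As one possible stand-in, here is a small sketch built on PHP's filter_var() with FILTER_VALIDATE_URL, followed by a check that the protocol is http. The function name isValidHttpUrl is hypothetical, and limiting the protocol to http is an assumption taken from the earlier discussion.

```php
<?php

// Hypothetical helper: accept only well-formed URLs that use http.
function isValidHttpUrl($url)
{
    $url = trim($url);

    // filter_var() returns false when the string is not a well-formed URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Restrict the protocol to http (assumption: no other protocols wanted).
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && strtolower($scheme) === 'http';
}

var_dump(isValidHttpUrl('http://www.url.com'));          // bool(true)
var_dump(isValidHttpUrl('http://subdomain.url.co.uk'));  // bool(true)
var_dump(isValidHttpUrl('http://www.url.co.uk/myweb/')); // bool(true)
var_dump(isValidHttpUrl('www.url.com'));                 // bool(false)
var_dump(isValidHttpUrl('ftp://url.com'));               // bool(false)
```

Input without a protocol fails here, so this would run after the step that adds http://, not before it.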
