vanleurth Posted January 13, 2011

Hola everybody!

I'm putting together a web app similar to Digg and was wondering if there is a function or code example I can use to stop users from submitting the same URL twice. For example, right now a user can submit:

1. http://www.example.com?post01
2. http://example.com?post01
3. www.example.com?post01

I want the web app to check first whether the link has already been submitted and reject the duplicate submission. Any ideas?

Thank you,
V.
QuickOldCar Posted January 14, 2011

What I did was drop any protocols and prefixes like http://, https:// and www., and also remove the end slashes. In my case I use the result as my titles, but you can use it just for checking purposes.

There is actually a lot to this. You need to lowercase just the domain area in case they capitalize it. The check runs when the form is inserted, so users can type a URL in any way: http://aol.com, http://www.aol.com, http://www.aol.com/, aol.com or anything similar can be inserted and end up as the same value.

I then resolve them through curl. You get URLs such as http://mysite.com, which could be the exact same URL as http://mysite.com/index.html or http://mysite.com/index.php or http://mysite.com/index.asp and on and on. That's why I try to let curl resolve them. Javascript redirects aren't too pleasant, but you should be able to follow any normal redirects.

I've been working on my login system so you can't browse my index right now, but the system I described works for me and took me a great deal of time to figure out. I did leave the non-login areas live though, like the add page. So try a URL in any way and you will see it will not create a duplicate.

http://dynaindex.com/add
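The "resolve through curl" step is described above without code, so here is a minimal sketch of one way it might look. The function name resolveUrl() and the timeout and redirect limits are assumptions for illustration, not the poster's actual code. It sends a HEAD-style request, follows normal HTTP redirects, and returns the final URL, falling back to the input when nothing can be resolved.

```php
<?php
// Hypothetical helper, not from the original posts: follow normal HTTP
// redirects and return the final URL. Javascript and meta-refresh redirects
// are not followed. Returns the input unchanged if the request fails.
function resolveUrl(string $url, int $timeoutSeconds = 5): string
{
    if (!function_exists('curl_init')) {
        return $url; // curl extension not loaded, nothing to resolve with
    }

    $ch = curl_init($url);
    if ($ch === false) {
        return $url;
    }

    curl_setopt_array($ch, array(
        CURLOPT_NOBODY         => true,  // headers only; some servers dislike this
        CURLOPT_FOLLOWLOCATION => true,  // follow 3xx redirects
        CURLOPT_MAXREDIRS      => 5,     // assumed limit, avoids redirect loops
        CURLOPT_RETURNTRANSFER => true,  // don't print the response
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ));

    $ok       = curl_exec($ch);
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    unset($ch); // release the handle

    if ($ok === false || !is_string($finalUrl) || $finalUrl === '') {
        return $url; // could not resolve, keep what the user typed
    }
    return $finalUrl;
}
```

A reasonable design is probably to keep the URL the user typed for display and use the resolved one only for the duplicate check, since a redirect can change the address considerably.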
QuickOldCar Posted January 14, 2011

I had some time while waiting for some huge folders to transfer, so I thought I'd be nice and write up a function to clean the URLs and another to check whether two of them are similar.

The concept is to eliminate all the stuff that would make them different but would ultimately go to the same or a similar URL. That includes any protocol, the www., the end slash, a # at the end and a ? at the end. You will find that the www. and the end slash are sometimes required and sometimes not, because the website owners did not allow for both forms. It is best to use curl to try to resolve the URLs first, but then the stored URL would differ from what the user inserted if there was a normal redirect. Lowercase the domain name; the code below leaves the path and query as typed.

Here's the function file compareurl.php:

<?php
function cleanUrl($input_url)
{
    $input_url = trim($input_url);
    if ($input_url == '') {
        return '';//empty value, let the caller redirect somewhere
    }

    //remove any protocol in any letter case, then add one back so parse_url() can find the host
    $input_url = preg_replace('#^(https?|ftp|feed)://#i', '', $input_url);
    $new_url = "http://$input_url";

    $host_parse_url = strtolower((string) parse_url($new_url, PHP_URL_HOST));//lowercase host area
    $host_parse_url = preg_replace('/^www\./', '', $host_parse_url);//remove www. if exists
    $path_parse_url = (string) parse_url($new_url, PHP_URL_PATH);//the file location or path
    $query_parse_url = (string) parse_url($new_url, PHP_URL_QUERY);//the query
    $fragment_parse_url = (string) parse_url($new_url, PHP_URL_FRAGMENT);//the end fragment

    //port, user and password are omitted from the cleaned url
    $cleaned_url = $host_parse_url . $path_parse_url;
    if ($query_parse_url != '') {
        $cleaned_url .= "?$query_parse_url";//add the ? back to front of query
    }
    if ($fragment_parse_url != '') {
        $cleaned_url .= "#$fragment_parse_url";//add # back to beginning of fragment
    }
    //if you want the query or fragment gone, remove it above

    return rtrim($cleaned_url, "/?#");//remove any end slash, ? or # left over
}

function compareUrl($url1, $url2)
{
    $clean1 = cleanUrl($url1);
    return $clean1 != '' && $clean1 == cleanUrl($url2);//empty values never match
}
?>

Here are some example URLs and usage:

<?php
//usage
//compareUrl() requires 2 variables to check against
include('compareurl.php');

//sample url's
$url1 = "http://www.site.com/mail/";
$url2 = "HTTP://SITE.com/mail?";
$url3 = "site.com/mail/";
$url4 = "https://site.com/mail?";
$url5 = "site.com/mail/";
$url6 = "mysite.com";
$url7 = "http://site.com?";
$url8 = "http://site.com/index.php?";
$url9 = "HTTP://SITE.COM/index.php/?";
$url10 = "http://site.com/index.php";

//the pairs to check against each other
$checks = array(
    array($url1, $url2),
    array($url3, $url4),
    array($url5, $url6),
    array($url6, $url7),
    array($url8, $url9),
    array($url9, $url10),
);

foreach ($checks as $check) {
    list($first, $second) = $check;
    if (compareUrl($first, $second)) {
        echo "$first and $second are the same <br />";//reject insert code
    } else {
        echo "$first and $second are different <br />";//accept insert code
    }
}
?>
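The comments above mention "reject insert code" and "accept insert code" without showing them. One possible way to do that part, sketched here under assumptions, is to store the cleaned value from cleanUrl() in a column with a unique index and let the database refuse the second copy. The table name links, its columns and the function submitLink() are hypothetical, and the sketch assumes a PDO connection set to throw exceptions.

```php
<?php
// Hypothetical sketch, not from the original posts. Assumed schema:
//   CREATE TABLE links (
//       clean_url    VARCHAR(255) NOT NULL UNIQUE,
//       original_url TEXT NOT NULL
//   )
// $cleanUrl is expected to be the output of cleanUrl() from compareurl.php.
// Returns true if the link was stored, false if it was a duplicate.
function submitLink(PDO $db, string $cleanUrl, string $originalUrl): bool
{
    $stmt = $db->prepare(
        'INSERT INTO links (clean_url, original_url) VALUES (?, ?)'
    );

    try {
        return $stmt->execute(array($cleanUrl, $originalUrl));
    } catch (PDOException $e) {
        // SQLSTATE class 23 is an integrity constraint violation,
        // which here means the cleaned URL is already in the table.
        if ($e->getCode() == '23000') {
            return false;
        }
        throw $e; // anything else is a real error
    }
}
```

Letting the unique index do the check avoids a separate lookup before every insert, and it still holds if two users submit the same link at the same moment. The form handler would call something like submitLink($db, cleanUrl($url), $url) and show a "already submitted" message when it returns false.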
vanleurth Posted February 13, 2011 (Author)

Wow! Thank you so much for this code. I haven't tested it yet, but I'm very excited to give it a try.

Thank you so much again,
V.