bunnyali2013, posted November 21, 2012 (edited): I know how to validate a URL in PHP using FILTER_VALIDATE_URL or simply a regular expression. However, I want to know how I can validate a URL to see if it contains a file. For example: www.xxxx.com/abc.exe, www.xxxx.com/abc/abc.jpg, etc. As you can see, the links contain a file: one is an executable and the other is an image. How can I validate a URL to know whether it has a file or not? I do not want URLs that contain a file to be submitted through my form. Is there a regular expression or some other way to do that?
AyKay47, posted November 21, 2012:
$path_arr = pathinfo('http://www.xxxx.com/abc.exe');
$path_extension = $path_arr['extension']; // exe
Note that you will want to check for an empty string or null value in case an extension cannot be found.
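A minimal sketch of the extension lookup with the missing-extension case handled. It assumes one refinement over the snippet above: parse_url() is used to isolate the path first, because calling pathinfo() on a whole URL reads the query string as part of the file name and, for a URL with no path such as http://www.xxxx.com, reports "com" as the extension. The helper name url_extension() is hypothetical.

```php
<?php
// Hypothetical helper: returns the lowercase extension of the URL's path,
// or null when the path has no extension.
function url_extension(string $url): ?string
{
    // parse_url() isolates the path, so the host, query string and fragment are ignored.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // no path at all, e.g. "http://www.xxxx.com"
    }

    // With PATHINFO_EXTENSION, pathinfo() returns '' when there is no extension.
    $ext = pathinfo($path, PATHINFO_EXTENSION);

    return $ext === '' ? null : strtolower($ext);
}
```

Returning null rather than an empty string makes the "no extension found" case explicit for the caller.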
bunnyali2013 (thread author), posted November 21, 2012 (edited): Well, first I check whether the URL text field is empty, and if so I show an error message. Then I check that the URL is valid using a regular expression or FILTER_VALIDATE_URL. If it is not valid, error message. Then I check whether the URL contains an extension. If yes, error message; otherwise proceed. Good? By the way, with $path_arr['extension'], can I check many extensions together?
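The three steps described above could be sketched roughly as follows, with in_array() as one way to check several extensions together. The function name validate_link(), the contents of BLOCKED_EXTENSIONS and the error messages are all assumptions made for illustration, not anything given in the thread. Note that FILTER_VALIDATE_URL requires a scheme, so www.xxxx.com/abc.exe without http:// is rejected at the second step.

```php
<?php
// Hypothetical blocklist; adjust it to whatever the site should refuse.
const BLOCKED_EXTENSIONS = ['exe', 'zip', 'jpg', 'png', 'gif'];

// Returns an error message, or null when the link is acceptable.
function validate_link(string $input): ?string
{
    // Step 1: reject an empty field.
    $url = trim($input);
    if ($url === '') {
        return 'Please enter a URL.';
    }

    // Step 2: reject anything that is not a well-formed URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return 'That does not look like a valid URL.';
    }

    // Step 3: reject paths ending in a blocked extension.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (in_array($ext, BLOCKED_EXTENSIONS, true)) {
        return 'Direct links to files are not accepted.';
    }

    return null;
}
```

A blocklist only catches the extensions it names; an allowlist, or rejecting any extension at all, would be a stricter variant of the same check.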
mrMarcus, posted November 21, 2012: Just checking the file extension does not offer any real additional security. You want to check the signature/MIME type.
bunnyali2013 (thread author), posted November 21, 2012: A MIME type for a link? I thought that applied only to file uploads. But a MIME type can be modified, right?
Muddy_Funster, posted November 21, 2012: $path_arr['extension'] returns the extension in the URL. You would need to apply your own logic to check that extension against whatever you want to validate. That is also why AyKay47 said you need a separate check to make sure there is an extension at all.
bunnyali2013 (thread author), posted November 21, 2012: Well, I will check if it has an extension. For example, does this work? if (($path_extension) == "exe") { echo "Error" };
Muddy_Funster, posted November 21, 2012: That's an example of how you would use it, yeah.
bunnyali2013 (thread author), posted November 21, 2012: But is that how we do it? I mean, is the code good? I can't use a text editor on this computer, otherwise I would check it myself.
mrMarcus, posted November 21, 2012: Are these paths to files being stored on your server? What I mean is: are these paths to files that have been uploaded to your server by public users? That's why I suggested you check the MIME type of the file. I had just assumed the files were present on the server and not just standalone URLs.
Muddy_Funster, posted November 21, 2012 (edited): Well, that exact code won't work because you have put the ; after the } rather than at the end of the echo line, but other than that it's fine. Edit: I just noticed you have extra parentheses around the $path_extension variable. I don't know if this will throw an error or not (not something I have ever done), but they don't need to be there regardless.
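For reference, a corrected version of that check might look like the sketch below. The semicolon has to terminate the statement inside the braces; a semicolon after the closing brace is just an empty statement. The extra parentheses around the variable are valid PHP and do not raise an error, so dropping them is purely cosmetic. The wrapper function check_extension() is an assumption added here so the result can be returned rather than echoed.

```php
<?php
// Hypothetical wrapper around the check discussed above.
function check_extension(string $path_extension): string
{
    // The statement inside the braces ends with a semicolon;
    // nothing is needed after the closing brace.
    if ($path_extension == "exe") {
        return "Error";
    }

    return "";
}
```

Calling echo check_extension($path_extension); reproduces the behaviour of the original snippet.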
AyKay47, posted November 21, 2012: Quoting mrMarcus: "Just checking the file extension does not offer any real additional security. You want to check the signature/MIME type." You're absolutely wrong about that. The server uses the extension to determine what to do, not the MIME type.
bunnyali2013 (thread author), posted November 21, 2012 (edited): Oh lol yes, I typed too fast and did not pay attention to the semicolon. @mrMarcus, no, this is link validation. Suppose you come to my website and insert a link, which is stored in my database. I don't accept links that point directly to a file, like www.xxxx.com/abc.exe. These stored links will appear on a page, and if a link points to a file, the file will be downloaded straight away when a user clicks it. This is why I have to check. No files are uploaded to my server, so I think MIME is useless here, as it's not a file upload.
mrMarcus, posted November 21, 2012: Quoting AyKay47: "You're absolutely wrong about that. The server uses the extension to determine what to do, not the MIME type." That's a bold statement. An .exe file can be uploaded as a .jpg and renamed back to .exe on the server (with gained permissions) and executed. It is quite possible, so I am not absolutely wrong about that. Simply checking the extension is not enough... if you care about system integrity. However, I am a bit off-topic here, as I thought he was addressing files that currently resided on the server. So, for simple URL validation, simply checking the extension should suffice.
mrMarcus, posted November 21, 2012: Quoting bunnyali2013: "Oh lol yes, I typed too fast and did not pay attention to the semicolon. @mrMarcus, no, this is link validation. Suppose you come to my website and insert a link, which is stored in my database. I don't accept links that point directly to a file, like www.xxxx.com/abc.exe. These stored links will appear on a page, and if a link points to a file, the file will be downloaded straight away when a user clicks it. This is why I have to check. No files are uploaded to my server, so I think MIME is useless here, as it's not a file upload." My bad. I assumed you were also hosting the said files.
AyKay47, posted November 21, 2012: Quoting mrMarcus: "That's a bold statement. An .exe file can be uploaded as a .jpg and renamed back to .exe on the server (with gained permissions) and executed. It is quite possible, so I am not absolutely wrong about that. Simply checking the extension is not enough... if you care about system integrity. However, I am a bit off-topic here, as I thought he was addressing files that currently resided on the server. So, for simple URL validation, simply checking the extension should suffice." A user would only be able to tamper with uploaded files if permissions allowed it, which they shouldn't. A MIME type can easily be spoofed.
mrMarcus, posted November 21, 2012: Quoting AyKay47: "A user would only be able to tamper with uploaded files if permissions allowed it, which they shouldn't. A MIME type can easily be spoofed." Shouldn't isn't the same as can't. Don't rely on that. Checking both the MIME type and the extension is in no way a guarantee of safety. OP doesn't need to worry about this, however, as he's not storing files on his server.
bunnyali2013 (thread author), posted November 21, 2012 (edited): Exactly, I am not storing files on the server. If this were a file hosting site, things would be different. I cannot accept direct file links. I will try to do it as suggested earlier, with pathinfo(), and I will post here if something goes wrong. By the way, I really wanted to make an image hosting site, but that involves a lot of security. MIME and file extension checks would be too basic. The damn Apache accepts more than one file extension on a file. .htaccess has to be configured to accept only files, .htaccess has to be outside the directory, files have to be stored in another directory outside, the firewall should be configured, antivirus, mod_security stuff, damn, forget that!
salathe, posted November 21, 2012: You appear to be under the impression that URLs dictate the file type of the resource. This simply is not true. There is no reason why http://example.org/foo.exe could not be a completely normal web page (or an animated kitten GIF), nor why http://example.org/foo/ could not be the location of an executable. If you still want to "validate" URLs based on whether they appear to contain what might possibly be a direct download link to a file, then sure, go ahead, but it seems like a silly idea to me. P.S. AyKay47's post isn't a very good example, even if you were to go down that route.
bunnyali2013 (thread author), posted November 21, 2012: You are right, salathe! I have just been informed that checking the file extension is lame, as there are many files, like Bash scripts, for which this extension check will not work. It will not work on short links either. I think I will just validate the URL normally. Any last ideas? Take this website as an example: http://katzbb.com/submit.html. Even though it says not to post direct links to files, I just tested it and their validation did not work. It accepted a ZIP. It seems they did not add any validation, as it would be useless!
DavidAM, posted November 26, 2012: If you are simply trying to protect your users from nefarious (other) users, you may want to look into curl. After verifying that the URL is formatted properly, you can use curl to send a HEAD request to the URL. If the request fails, either the server is down or the user entered an incorrect or inactive URL. If the request succeeds, the HEAD request sends you all of the headers (but none of the content). Check the Content-Type header to see what kind of data the URL sends. You can also look for a Content-Disposition header (which is used to force a download). The Content-Type should be more definitive than the file extension on the link.
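The HEAD-request approach described above could be sketched as follows, assuming PHP's curl extension is available. The function names fetch_head() and headers_indicate_file() are hypothetical, and so is the rule used to decide what counts as a "file": here, any response sent with Content-Disposition: attachment, or with a Content-Type other than text/html. That rule is an assumption for illustration and would need tuning for a real site.

```php
<?php
// Send a HEAD request and return the raw response headers,
// or null if the request fails (server down, bad or inactive URL).
function fetch_head(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // HEAD request: headers only, no content
        CURLOPT_HEADER         => true, // include the headers in the returned string
        CURLOPT_RETURNTRANSFER => true, // return the response instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects, e.g. short links
        CURLOPT_TIMEOUT        => 10,
    ]);
    $response = curl_exec($ch);

    return is_string($response) ? $response : null;
}

// Decide from raw headers whether the URL appears to serve a download.
// Assumed rule: "attachment" disposition, or any non-HTML content type.
function headers_indicate_file(string $rawHeaders): bool
{
    // When redirects are followed there is one header block per response;
    // only the last block describes the final resource.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $last = end($blocks);

    $headers = [];
    foreach (preg_split('/\r?\n/', $last) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skips the status line, e.g. "HTTP/1.1 200 OK"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    $disposition = strtolower($headers['content-disposition'] ?? '');
    if (strpos($disposition, 'attachment') === 0) {
        return true;
    }

    $type = strtolower($headers['content-type'] ?? '');

    return $type !== '' && strpos($type, 'text/html') !== 0;
}
```

In use, a null result from fetch_head($url) would be reported as an unreachable link, and a true result from headers_indicate_file() as a direct file link. Keeping the header parsing separate from the network call lets the decision rule be checked without contacting any server. Headers are whatever the remote server chooses to send, so this is still a heuristic rather than a guarantee.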