Joshua4550, posted May 13, 2010

Hey, it's me again - YES! ;o

From the title, you may be thinking "HTML? This is the PHP section!", but please read on...

I have (finally) created a basic dynamic page which users can edit using forms, and I have allowed HTML, because I want HTML! I just don't want them to be able to inject markup that messes up the default code of the template; it should only affect their own code. I was thinking of making an application to check the HTML code and, if it is full of errors, give them an error message and not proceed with the submission, but that would probably be time-consuming to create. With this in mind, I decided to post here for you experts' advice/answers.

Okay, so basically I need to know whether there is a way, and what the most logical way is, to make it so they cannot inject into the default page. What I mean is:

Template:

```php
<?php
/*
 * $input would be defined here, grabbing content from
 * a database, but after knowing this - you should understand what I mean.
 */
echo '
<div id="class1">
Hello there ' . $input . '
</div>
';
?>
```

Okay, so if this was the case, what I'm asking for is the most logical/efficient way to let $input be printed without it interfering with the other code, so if $input = "</div>" then it shouldn't end the <div id="class1">. Get it?

Hope there's an answer, and looking forward to what you think.
mattal999, posted May 13, 2010

strip_tags() could be your solution. You can add a second parameter for "safe" tags. So you would specify <p>, <a>, etc. An example:

```php
echo '
<div id="class1">
Hello there ' . strip_tags($input, "<p><a><br><font>") . '
</div>
';
```

Or, you could use this class.
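One caveat that may matter with the allow-list form of strip_tags(): tags named in the second argument are kept together with all of their attributes, and tags that are removed leave their inner text behind. A minimal sketch, using a made-up sample input and the variable name `$clean` purely for illustration:

```php
<?php
// Hypothetical user input: an allowed tag carrying an event-handler
// attribute, followed by a tag that is not on the allow-list.
$input = '<p onclick="alert(1)">Hi</p><script>alert(2)</script>';

// Only <p> and <a> are allowed. Other tags are removed, but the text
// between them stays, and attributes on allowed tags are not touched.
$clean = strip_tags($input, '<p><a>');

echo $clean, "\n";
```

So an allow-list alone would not stop attribute-based injection; it only controls which tag names survive.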
Joshua4550 (Author), posted May 13, 2010

I don't want to strip tags, because the template I'm ACTUALLY using has almost every tag they'd want to use. I want them to still be able to use any tag, but without interfering with the rest of the page. If only PHP/HTML had a way to print it as a "new page" inside this one, kind of like a tag.

Any other answers, or is this just not possible?
mattal999, posted May 13, 2010

Well, you would have to write a function to check how many <div> tags there are in the user's code. Then check how many </div> tags there are, and compare the two. If there are more </div> tags, then clip the code on the last one (or just remove any excess tags). I'll give it a crack now.
Joshua4550 (Author), posted May 13, 2010

Funny, when I said an application for an HTML code checker, that's EXACTLY what I had in mind. Thing is, there must be a way in PHP to ask if they use <"anything here"> or </"anything here"> or <"anything here" />, right? If so, it wouldn't be that hard to make, I suppose.
mattal999, posted May 13, 2010

EDIT AGAIN: Ok this works.

```php
<?php
if(!function_exists("stripos")) { // For PHP 4 (My Local Server)
    function stripos($haystack, $needle){
        return strpos($haystack, stristr($haystack, $needle));
    }
}

$data = 'aaa <div class="main">content</div> bbb</div>';
preg_match_all("/<div.*?>/", $data, $starttags);
preg_match_all("/<\/div>/", $data, $endtags);
$countstart = count($starttags[0]);
$countend = count($endtags[0]);
if($countstart > 0 || $countend > 0) {
    if($countstart > $countend) {
        $data .= str_repeat("</div>", ($countstart - $countend));
    } else if($countstart < $countend) {
        $reverse = strrev($data);
        $i = ($countend - $countstart);
        while($i > 0) {
            $nextdiv = stripos($reverse, ">vid/<"); // ">vid/<" is "</div>" reversed.
            $reverse = substr($reverse, 0, $nextdiv) . substr($reverse, ($nextdiv + 6)); // Get up to the ">vid/<" and after it.
            $i--;
        }
        $data = strrev($reverse);
    }
}
echo $data;
?>
```
Joshua4550 (Author), posted May 13, 2010

Seems logical, and looks like it'd work, but what about other tags? I'm guessing there's something you can do: your code uses .*?, so instead of div, something that represents "anything", but maybe anything except a /.

My sentence may not make sense, but bear with me because I'm sleepy. But do you get what I mean?
mattal999, posted May 13, 2010

Well, the .*? actually allows for anything after the <div part, so things like <div id="asdf"> are matched as well as <div>. To do it for other tags, you would need to put it in a loop and have the "div" replaced with your tag name.

This is always the problem with HTML validation systems, because a computer will never have human instinct. It can't know what tags to look for unless you specify them.
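The loop described above might be sketched roughly as follows. The function name `tagImbalance` and the tag list are illustrative assumptions, not code from the thread, and the patterns assume attribute values never contain a literal `>`.

```php
<?php
/**
 * Count opening and closing tags for each tag name in $tags.
 * Returns array(tagName => openCount - closeCount); 0 means the counts match.
 * Hypothetical helper written for this sketch.
 */
function tagImbalance($html, array $tags)
{
    $result = array();
    foreach ($tags as $tag) {
        $name = preg_quote($tag, '/');
        // Opening tags: "<div>" or "<div ...>", but not self-closing "<div ... />".
        $open = preg_match_all('/<' . $name . '\b[^>]*(?<!\/)>/i', $html, $m);
        // Closing tags: "</div>".
        $close = preg_match_all('/<\/' . $name . '\s*>/i', $html, $m);
        $result[$tag] = $open - $close;
    }
    return $result;
}

$data = 'aaa <div class="main">content</div> bbb</div>';
print_r(tagImbalance($data, array('div', 'span', 'a')));
```

A negative value means more closing than opening tags for that name, a positive value the reverse.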
Joshua4550 (Author), posted May 13, 2010

But surely, just like you used .*?, there's something to specify anything. In fact, if .*? means anything, couldn't we use something like:

```php
preg_match_all("/<.*?>/", $data, $starttags);
preg_match_all("/<\/.*?>/", $data, $endtags);
```

? Also, maybe one for a tag that ends itself (e.g. <div />)?
mattal999, posted May 13, 2010

In theory, yes. The problem with this, though, is that it will confuse tags. Say that you have '<div><a href="/"><span></span>'. The first array will have '<div><a><span>' and the second '</span>'. It will see that there are fewer </ tags than <*> tags, and add two more </div>s. You need to be able to know which end tags to add and where. That is where human instinct is needed.

The best you can do is to put it in a loop and go through all tags you want to check.
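A different technique from the counting discussed in this thread is to walk the tags in order and keep a stack of open elements, which is one way a program can tell which end tag is missing and where. The sketch below is an assumption-laden illustration: `firstNestingError` and the `$void` list are made up for this example, and the regular expression does not cope with `>` inside attribute values, comments, or scripts.

```php
<?php
/**
 * Walk the tags in document order, tracking open elements on a stack.
 * Returns null when every tag is properly nested, otherwise a message
 * describing the first problem found. Hypothetical helper.
 */
function firstNestingError($html)
{
    // Elements that never take a closing tag (partial, illustrative list).
    $void = array('br', 'hr', 'img', 'input', 'meta', 'link');
    $stack = array();

    // Groups: 1 = leading "/", 2 = tag name, 3 = trailing "/" of "<x />".
    preg_match_all('/<(\/?)([a-z][a-z0-9]*)\b[^>]*?(\/?)>/i', $html, $tags, PREG_SET_ORDER);

    foreach ($tags as $t) {
        $name = strtolower($t[2]);
        if ($t[1] === '/') {
            // Closing tag: it must match the most recently opened element.
            $open = array_pop($stack);
            if ($open !== $name) {
                return $open === null
                    ? "Unexpected </$name> with nothing open"
                    : "Expected </$open> but found </$name>";
            }
        } elseif ($t[3] !== '/' && !in_array($name, $void, true)) {
            // Opening tag that is neither self-closing nor a void element.
            $stack[] = $name;
        }
    }
    return $stack ? 'Unclosed <' . end($stack) . '>' : null;
}

var_dump(firstNestingError('<div><a href="/"><span></span>'));
var_dump(firstNestingError('</div><div>'));
```

The second call is the interesting one: the counts of opening and closing tags are equal, yet the fragment would still close an enclosing template element, which a pure count comparison cannot detect.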
Joshua4550 (Author), posted May 13, 2010

So maybe, instead of adding tags where needed, it notifies the user that there's a problem, and tells them how many more of one kind ({<tag> || </tag>}) than the other there are?
mattal999, posted May 13, 2010

Yes, that would work. Hope I've been of assistance.
Joshua4550 (Author), posted May 13, 2010

tar
Joshua4550 (Author), posted May 13, 2010

```php
<?php
$data = 'lol <a dd="/"> <div> <gflool> </gflool> </div> </a> </lol>';
preg_match_all("/<.*?>/", $data, $starttags);
preg_match_all("/<\/.*?>/", $data, $endtags);
$countstart = count($starttags[0]);
$countend = count($endtags[0]);
$output = "";
if($countstart > 0 || $countend > 0) {
    if($countstart > $countend) {
        $output = "Theres more opening tags than closing tags";
    } else if($countstart < $countend) {
        $output = "Theres more closing tags than opening tags";
    } else {
        $output = "Your code is fine!";
    }
}
echo $output;
?>
```

It seems that this doesn't work, because it always says "Theres more opening tags than closing tags", even though there are actually more closing tags. I made the last closing tag an opening tag, but the output remained the same. I also tried using elseif rather than else if, but still no workie!

Anything I'm doing wrong?
mattal999, posted May 13, 2010

The code looks alright, but let's debug it. Add print_r($starttags[0]); just below the first regular expression, and do the same for the $endtags one. Then paste the output here.
Joshua4550 (Author), posted May 13, 2010

Ooh, it would seem otherwise:

```php
$data = 'lol <a dd="/"> <div> <gflool> </gflool> </div> </a> </lol>';
preg_match_all("/<.*?>/", $data, $starttags);
preg_match_all("/<\/.*?>/", $data, $endtags);
print_r($starttags[0]);
print_r($endtags[0]);
```

```
Array ( [0] => [1] => [2] => [3] => [4] => [5] => [6] => )
Array ( [0] => [1] => [2] => [3] => )
```
Joshua4550 (Author), posted May 13, 2010

Anyone see the error that's causing this?
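A likely explanation for the output in the previous posts, offered as a sketch rather than a confirmed diagnosis: "/<.*?>/" matches every tag, closing tags included, so the "opening" array holds all seven tags while the closing array holds four. The array values also look empty, probably because a browser treats the printed tags as markup. The lookahead pattern and the escaping step below are suggestions, not code from the thread.

```php
<?php
$data = 'lol <a dd="/"> <div> <gflool> </gflool> </div> </a> </lol>';

// "/<.*?>/" matches every tag, closing ones included.
$all = preg_match_all('/<.*?>/', $data, $alltags);
// A negative lookahead for "/" right after "<" keeps only opening tags.
$open = preg_match_all('/<(?!\/).*?>/', $data, $starttags);
// Closing tags, as in the original post.
$close = preg_match_all('/<\/.*?>/', $data, $endtags);

echo "all: $all, opening: $open, closing: $close\n";

// Escape the matches before printing, so a browser shows the tags as
// text instead of interpreting them as markup.
echo '<pre>' . htmlspecialchars(print_r($starttags[0], true)) . '</pre>';
```

With the lookahead in place the comparison sees three opening tags against four closing tags for this sample.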
mattal999, posted May 13, 2010

Try these regular expressions:

```php
preg_match_all("/<([A-Z][A-Z0-9]*)\b[^>]*>/", $data, $starttags);
preg_match_all("/<\/([A-Z][A-Z0-9]*)\b[^>]*>/", $data, $endtags);
```

If that fails, try these:

```php
preg_match_all("/</?\w+\s+[^>]*>/", $data, $starttags);
preg_match_all("/<\/?\w+\s+[^>]*>/", $data, $endtags);
```

I actually just realised that image tags will be counted by the first but not by the second. You'll need to come up with a fix to exclude image tags. I can't help as I'm off now, but good luck!
Joshua4550 (Author), posted May 13, 2010

Sadly, both fail. Could we use something like this for the pattern?

```
|[[\/\!]*?[^\[\]]*?]|si
```

This is obviously what they use to replace BB tags, but can't we just dissect this pattern and use it for HTML tags?
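Some probable reasons the suggested patterns fail, stated as a reading of the code rather than something the thread confirms: the first pair uses [A-Z] without the "i" modifier, so lowercase tag names never match; in the second pair, the first pattern contains an unescaped "/" that ends the delimiter early, and "\s+" requires whitespace after the tag name, so a bare <div> is skipped. A small sketch of the first pair with only the "i" modifier added:

```php
<?php
$data = 'lol <a dd="/"> <div> <gflool> </gflool> </div> </a> </lol>';

// As posted: [A-Z] is case-sensitive, so lowercase tags do not match.
$strict = preg_match_all('/<([A-Z][A-Z0-9]*)\b[^>]*>/', $data, $m);

// The same two patterns with the "i" modifier added.
$open = preg_match_all('/<([A-Z][A-Z0-9]*)\b[^>]*>/i', $data, $starttags);
$close = preg_match_all('/<\/([A-Z][A-Z0-9]*)\b[^>]*>/i', $data, $endtags);

echo "without i: $strict, opening: $open, closing: $close\n";
```

Because the opening pattern requires a letter directly after "<", it no longer picks up closing tags, though self-closing tags such as <img /> would still be counted as opening tags.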
Joshua4550 (Author), posted May 14, 2010

Anyone know?
anups, posted May 14, 2010

You can use the htmlentities() function; it will display the exact HTML source on the page instead of rendering it.

```php
<?php
echo '
<div id="class1">
Hello there ' . htmlentities($input) . '
</div>
';
?>
```
Joshua4550 (Author), posted May 14, 2010

You didn't read the thread. Anyone actually know?
mattal999, posted May 14, 2010

Ok I checked this, and it seems to work as expected. I also allowed for <img /> tags to be missed, as they are a standalone element.

```php
<?php
if(!function_exists("stripos")) { // For PHP 4 (My Local Server)
    function stripos($haystack, $needle){
        return strpos($haystack, stristr($haystack, $needle));
    }
}

$data = 'lol <a dd="/"> <div> <gflool> </gflool> </div> </a> </lol> <img src="" />';
preg_match_all("/<(?:\"[^\"]*\"['\"]*|'[^']*'['\"]*|[^'\">])+>/i", $data, $starttags);
$i = 0;
foreach($starttags[0] as $starttag) {
    if(substr($starttag, 0, 2) == "</") {
        unset($starttags[0][$i]);
    } else if(substr($starttag, -2) == "/>") {
        unset($starttags[0][$i]);
    }
    $i++;
}
//print_r($starttags);
preg_match_all("/<\/(?:\"[^\"]*\"['\"]*|'[^']*'['\"]*|[^'\">])+>/i", $data, $endtags);
//print_r($endtags);
$countstart = count($starttags[0]);
$countend = count($endtags[0]);
if($countstart > 0 || $countend > 0) {
    if($countstart > $countend) {
        /* $data .= str_repeat("</".$tag.">", ($countstart - $countend)); */
        echo "You have too many opening tags! Please try again.";
    } else if($countstart < $countend) {
        /*
        $reverse = strrev($data);
        $i = ($countend - $countstart);
        while($i > 0) {
            $nextdiv = stripos($reverse, ">".$tag."/<"); // ">vid/<" is "</div>" reversed.
            $reverse = substr($reverse, 0, $nextdiv) . substr($reverse, ($nextdiv + 6)); // Get up to the ">vid/<" and after it.
            $i--;
        }
        $data = strrev($reverse);
        */
        echo "You have too many closing tags! Please try again.";
    } else {
        echo $data;
    }
}
?>
```
Joshua4550 (Author), posted May 14, 2010

Amazing, thanks a lot. When you say it allows <img />, it will allow any tag which includes a closing element inside itself too, right? I.e. <anytag />

Edit: Read the code thoroughly and I see it will work, thanks a lot again!
mattal999, posted May 14, 2010

No problem! Glad I could help, even if it did take 2 days