
XSS Injections with htmlspecialchars($string, ENT_QUOTES, 'UTF-8');


Monkuar


htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
I'm using this code with UTF-8. I need someone to help craft up some small XSS injections against it.

 

I heard htmlspecialchars doesn't stop all XSS attacks, so I'm wondering what kind of XSS attack you can craft that loads a simple cookie loader. (Basically, a simple JavaScript injection is all I'm trying to find, because people can use cookie loaders with it, and that's not good.)

 

Post your code, and once you do, I'll copy it, run it through the code I posted above, and see if there are any vulnerabilities. (On my localhost server.)

Thanks!

 

$str = user input.

 

Oh, and here are the BBCode regexes that the text passes through before the function above is applied.

 

$text = preg_replace( "#\[b\](.+?)\[/b\]#is", "<span class='b'>\\1</span>", $text );
			$text = preg_replace( "#\[i\](.+?)\[/i\]#is", "<i>\\1</i>", $text );
			$text = preg_replace( "#\[u\](.+?)\[/u\]#is", "<u>\\1</u>", $text );
			$text = preg_replace( "#\[s\](.+?)\[/s\]#is", "<s>\\1</s>", $text );
			
			//Spoiler
			
			$text = preg_replace( "#\[right\](.+?)\[/right\]#is", "<div style='text-align:right'>$1</div>", $text );
			
			//Beautiful Colors
			$text = preg_replace( "%\[colou?r=([a-zA-Z]{3,20}|\#[0-9a-fA-F]{6}|\#[0-9a-fA-F]{3})](.+?)\[/colou?r\]%msi", '<span style="color: \\1">\\2</span>', $text );
			$text = preg_replace( "%\[colou?r=(rgb\(\d{1,3}, ?\d{1,3}, ?\d{1,3}\))](.+?)\[/colou?r\]%msi", '<span style="color: \\1">\\2</span>', $text );
If anyone can craft up an XSS for this, I'd appreciate it, because I need this to be secure.

Edited by Monkuar

If you need this to be secure, then don't write your own BBCode parser. You will end up with vulnerabilities. Even if we told you that everything is fine (which we don't), that wouldn't prove anything. We're a bunch of random programmers who've spent maybe 5 minutes on your code – that's laughable. The only way to be halfway sure is to use an established parser, run the result through HTML Purifier and add Content Security Policy on top.

 

The htmlspecialchars() function is fine for preventing classical injections, but your task is much more complex. You do want the user to generate HTML, you just want to limit their abilities. Simple escaping won't help you with that.

 

For example, CSS contexts are inherently unsafe, especially in older browsers (e.g. dynamic expressions). Even seemingly trivial contexts like the href attribute can be used for attacks:

<?php

header('Content-Type: text/html; charset=utf-8');


$xss_1 = 'javascript:alert("XSS 1")';
$xss_2 = 'data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTIDInKTwvc2NyaXB0Pg==';

?>
<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>XSS test</title>
    </head>
    <body>
        <p>
            The strings are properly escaped, yet still they can be used for XSS.
        </p>
        <a href="<?= htmlspecialchars($xss_1, ENT_QUOTES, 'UTF-8') ?>">Click for XSS 1</a><br>
        <a href="<?= htmlspecialchars($xss_2, ENT_QUOTES, 'UTF-8') ?>">Click for XSS 2</a>
    </body>
</html>

Trying to fix those vulnerabilities is futile, because it will just be an arms race between you and the attackers: You fix something, they come up with something new.

 

You need a fundamentally different approach: an established library and multiple layers of protection.

Edited by Jacques1

Thanks Jac. I ran those payloads through my parser, but I'm not sure why the XSS still isn't triggering.

 

I'm returning:

 

<a href='http://data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTIDInKTwvc2NyaXB0Pg==' target='_blank'>data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTIDInKTwvc2NyaXB0Pg==</a><br /><br />[URL]javascript:alert("XSS 1")[/URL]

 

Upon clicking the link, nothing happens.

 

I can try HTML Purifier, but I don't know how bloated it is, and I'm not sure if I even need it at this point. I guess once you or I find an injectable payload, I might do it, but as of right now... meh.

 

I've also tried all of these:

 

http://jeffchannell.com/Other/bbcode-xss-howto.html

 

None of them work. I'm trying hard to figure out why.

Edited by Monkuar
Link to comment
Share on other sites

Security doesn't work like this.

 

If you assume that your code is secure until somebody proves the opposite, you're doing it wrong. The absence of evidence is not the evidence of absence. Even if none of the persons you ask is able to come up with a fancy attack (which may very well be the case), that doesn't mean anything. Maybe they're not smart enough, maybe they're not good at attacking applications, maybe they just don't care about those silly break-my-code challenges.

 

Do you really want to base the entire security of your application on that? Then obviously your website isn't important at all. Otherwise you wouldn't gamble with it.

 

Security is about thinking ahead. You don't wait until somebody breaks into your application so that you can fix this specific hole. You make sure that there are no holes in the first place. We can definitely help you with that. But if you're more interested in entertainment, this is probably the wrong forum.

 

 

 

// For the sake of completeness: The reason why the example attack vectors don't work for you is because your parser doesn't understand URLs. It blindly prepends “http://”, even if there's already a scheme. That's a bug.
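
To make that concrete, here's a minimal illustration (just a sketch based on the output you posted, not code from your parser) of what the blind prepend does to the data: payload:

<?php

// Illustration only: the parser doesn't recognise the data: scheme,
// so it prepends "http://" even though a scheme is already there.
$xss_2 = 'data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTIDInKTwvc2NyaXB0Pg==';

$href = 'http://' . $xss_2;

echo $href, "\n";
// http://data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTIDInKTwvc2NyaXB0Pg==
// The browser now sees an ordinary http URL with a nonsensical host instead of
// a data: URI, so the payload never runs -- by accident, not by design.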

Edited by Jacques1


Yeah, I get your point, okay. I am not trying to be entertaining; I apologize if I came off that way.

 

The regex for building my URL is as follows:

 

 

 

$text = preg_replace( "#\[url\](\S+?)\[/url\]#ise"                                       , "regex_build_url(array('html' => '\\1', 'show' => '\\1'))", $text );
Function for regex_build_url:

 

 

function regex_build_url($url=array()) {
	
		$skip_it = 0;
		
		// Make sure the last character isn't punctuation.. if it is, remove it and add it to the
		// end array
		if ( (strlen($url['show']) < 2 )){
			return $url['html'];
		}
		
		
		if ( preg_match( "/([\.,\?]|!)$/", $url['html'], $match) )
		{
			$url['end'] .= $match[1];
			$url['html'] = preg_replace( "/([\.,\?]|!)$/", "", $url['html'] );
			$url['show'] = preg_replace( "/([\.,\?]|!)$/", "", $url['show'] );
		}
		
		// Make sure it's not being used in a closing code/quote/html or sql block
		
		if (preg_match( "/\[\/(html|quote|code|sql)/i", $url['html']) )
		{
			return $url['html'];
		}
		
		// clean up the ampersands
		$url['html'] = preg_replace( "/&/" , "&" , $url['html'] );
		
		// Make sure we don't have a JS link
		$url['html'] = preg_replace( "/javascript:/i", "java script: ", $url['html'] );
		
		// Do we have http:// at the front?
		
		if ( ! preg_match("#^(http|news|https|ftp|aim)://#", $url['html'] ) )
		{
			$url['html'] = 'http://'.$url['html'];
		}
		
		//-------------------------
		// Tidy up the viewable URL
		//-------------------------

 
		
		if ( (strlen($url['show']) -58 ) < 3 )  $skip_it = 1;
		

		
		// Make sure it's a "proper" url
		
		if (!preg_match( "/^(http|ftp|https|news):\/\//i", $url['show'] )) $skip_it = 1;
		
		$show  = $url['show'];
		
		if ($skip_it != 1) {
			$stripped = preg_replace( "#^(http|ftp|https|news)://(\S+)$#i", "\\2", $url['show'] );
			$uri_type = preg_replace( "#^(http|ftp|https|news)://(\S+)$#i", "\\1", $url['show'] );
			
			//$show = $uri_type.'://'.substr( $stripped , 0, 35 ).'...'.substr( $stripped , -15   );
		}
		
		return $url['st'] . "<a href='".$url['html']."' target='_blank'>".$show."</a>" . $url['end'];
		
	}
You're right, it does just blindly prepend http://, but that's what stopped your specific attack. Now, since you gave me your XSS attack, I improved the code and added this:

 

if (filter_var($url['html'], FILTER_VALIDATE_URL) === FALSE) {
    return $url['html'];
}
Which will just return the plain text, because those data: payloads are not valid URLs now :). So, step by step, I am improving this. It seems like the url tags are very well protected at the moment; can you spot any other red flags?

 

I am not saying my code is 100% secure. I believe code cannot be 100% secure, and there is always, ALWAYS room for someone to craft up an exploit given enough resources and time. That's why I posted here! And I do appreciate your help so far. I would never have added filter_var without seeing your XSS... So, believe it or NOT, you are making a difference to me. And I do appreciate it.

Edited by Monkuar

The point is that the approach is already wrong. It's a classical anti-pattern which has a long history of causing security issues. The bugtrackers and vulnerability databases are full of this never-ending break-fix cycle:

  • A developer writes an XSS filter, thinking that they've finally figured out how to do it.
  • Somebody else breaks the filter.
  • The developer fixes the problem, hoping that now the filter works.
  • Somebody else breaks the filter.
  • ... and so on.

Sure, you could say that the filter gets better each time. But I'd rather say that it was garbage from the beginning and should be thrown away. Because each time the filter breaks, it puts all users at risk.

 

If you want to make this experience yourself, well, go ahead. But this has nothing to do with proper programming. At all.



 

So, I guess I'm going your route, as you've been more than patient with me. Is HTML Purifier the only library that's worth using (in your opinion)? I do actually agree with what you're saying now; no one has really been this blunt with me before, and my ego is insanely high. I'm looking at this from an objective standpoint too, and I think your path is the right path, regardless of my issues with HTML Purifier's bloat or the whack-a-mole of patching every time a user finds a loophole. But a good question remains: even with HTML Purifier, the risk is still there, right? It's just much lower than with what I'm using now, correct?

Edited by Monkuar
Link to comment
Share on other sites

First of all: Great reaction. I understand that you've probably invested a lot of time and effort into the filtering approach, so you'd have every right to stubbornly insist on it. But you don't. That's pretty rare these days. :)

 

Note that I didn't suggest HTML Purifier alone. It's actually a three-layer approach:

  • Use an established BBCode parser which has already proven itself in real applications. This greatly reduces the risk of “stupid” mistakes.
  • Since there may still be subtle bugs in the parser, it's a good idea to keep it in a kind of “sandbox”. That's what HTML Purifier is for: It makes sure that the parser is restricted to specific HTML tags like <b> or <a>.
  • In addition to the server-side protection, you also tell the browser that it shouldn't execute arbitrary inline scripts. This is what the Content-Security-Policy header does.

The combination of all three layers provides maximum security, and it's still pretty lightweight given the complex task. The only way to be even more secure is to not allow user comments in the first place.
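
Roughly, the layers could be wired together like this (just a sketch: parse_bbcode() is a stand-in for whatever established parser you pick, not a real function, and the HTML Purifier whitelist is only an example):

<?php

// Assumes Composer autoloading and the HTML Purifier package (ezyang/htmlpurifier).
require 'vendor/autoload.php';

// Layer 3: Content-Security-Policy. Inline scripts are blocked; inline styles stay
// allowed because the colour/align BBCodes produce style attributes.
header("Content-Security-Policy: default-src 'self'; style-src 'self' 'unsafe-inline'");
header('Content-Type: text/html; charset=utf-8');

// Layer 1: an established BBCode parser turns the user's post into HTML.
// parse_bbcode() is a placeholder for that parser's API.
$dirty_html = parse_bbcode(isset($_POST['comment']) ? $_POST['comment'] : '');

// Layer 2: HTML Purifier sandboxes the result to a whitelist of tags, attributes
// and URL schemes.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'b,i,u,s,br,span[style],div[style],a[href]');
$config->set('URI.AllowedSchemes', array('http' => true, 'https' => true));
$purifier = new HTMLPurifier($config);

echo $purifier->purify($dirty_html);

Even if the parser in layer 1 lets something slip through, layers 2 and 3 still have to be beaten before a script actually runs.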

Edited by Jacques1


 

 

Yeah, I already added the headers and will be using HTML Purifier as another layer of protection. For the BBCode package, I'm not sure yet. I feel like just grabbing phpBB's or something, since I know they have spent a lot of time on theirs and they know what they're doing. I'll try to find a well-respected and secure one and not use my own.

 

In any event, the thing that got me from what you said was

Do you really want to base the entire security of your application on that? Then obviously your website isn't important at all. Otherwise you wouldn't gamble with it.

 

Which really hit the nail on the head. Working on this RPG game for the past several months and then not even caring about security is pretty childish and doesn't show true dedication. I do honor honest criticism every once in a while, so don't be afraid to give me more of it! It will only help me, trust me! :)

Edited by Monkuar
