encoding a pathname


ginerjm

I seem to be having a problem with json_encode. This is my first time using it. I have some PHP-generated filenames that include their relative paths, along with some attributes such as extension, width and height. I have made a PHP array out of these, with the filename string as the first-level key and the attributes as an array under each of those keys.

 

An example:

Array (
    [/photos/mainmenu/Chateau_dAgassac] => Array ( [img] => jpg [caption] => txt [w] => 600 [h] => 400 )
    [/photos/mainmenu/IMG_1358] => Array ( [img] => jpg [caption] => txt [w] => 600 [h] => 400 )
    [/photos/mainmenu/IMG_1367] => Array ( [img] => jpg [caption] => txt [w] => 600 [h] => 400 )
)

 

I then did a json_encode of this array and assigned it to a JS variable, and it looks the way I want it to look in my JS code when I view the source of the output page. Except for one thing: all the keys (filenames) have a backslash in front of every forward slash in the path. This complicates things when I try to take a filepath/name string from my HTML and use it to grab the attributes for that filename.

 

An example of the js array:

img_data = {

"\/photos\/mainmenu\/Chateau_dAgassac":{"img":"jpg","caption":"txt","w":600,"h":400}, "\/photos\/mainmenu\/IMG_1358":{"img":"jpg","caption":"txt","w":600,"h":400},

"\/photos\/mainmenu\/IMG_1367":{"img":"jpg","caption":"txt","w":600,"h":400}

}

 

This is literally how it looks in a view of the source code of my browser page.

 

What's the trick to accessing this js array now when I have a filename (href) in my html <img> tag that obviously no longer matches what I have in my js array?


Do not turn off slash escaping! This is a very important security feature which prevents cross-site scripting attacks. The PHP developers haven't implemented this just for fun.

 

Try this code with standard JSON encoding:

<?php

$json = array(
	'foo' => '</script><script>alert(/XSS/);</script><script>',
);

?>
<script>
	var foo = <?= json_encode($json) ?>;
</script>

Nothing bad happens. Now the same thing with JSON_UNESCAPED_SLASHES as suggested by Ch0cu3r:

<?php

$json = array(
	'foo' => '</script><script>alert(/XSS/);</script><script>',
);

?>
<script>
	var foo = <?= json_encode($json, JSON_UNESCAPED_SLASHES) ?>;
</script>

Congratulations, we now have a cross-site scripting vulnerability, because any occurrence of the term “</script>” within the JSON data terminates the current script context and allows the user to create a new one with arbitrary JavaScript code.
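The difference between the two outputs can also be checked in plain JavaScript. This is only a sketch: `escapeSlashes` is a made-up helper, not a PHP or browser API, and it merely imitates what json_encode() does to forward slashes by default.

```javascript
// Hypothetical helper: imitates the way PHP's json_encode() escapes
// forward slashes by default ("/" becomes "\/").
function escapeSlashes(json) {
    return json.replace(/\//g, '\\/');
}

const payload = { foo: '</script><script>alert(/XSS/);</script><script>' };

// JSON.stringify() leaves slashes alone, much like JSON_UNESCAPED_SLASHES.
const unescaped = JSON.stringify(payload);
const escaped = escapeSlashes(unescaped);

// The HTML parser ends a script element at the first "</script" it sees,
// so only the unescaped variant can break out of the element.
console.log(unescaped.includes('</script')); // true
console.log(escaped.includes('</script'));   // false

// Both variants decode to exactly the same data.
console.log(JSON.parse(escaped).foo === payload.foo); // true
```

So the escaping costs nothing on the JavaScript side: the decoded value is identical, but the raw text no longer contains anything the HTML parser could mistake for a closing tag.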

 

So this is not a solution. It doesn't even address the actual problem.

 

The problem is that ginerjm takes some PHP value and drops it right into a script element. While that may seem like the obvious thing to do, it's really an awful approach which can lead to all kinds of security vulnerabilities and bugs (as we just saw). Never generate JavaScript code. Instead, make a PHP script which serves the JSON-encoded data and then fetch it with JavaScript. jQuery even has the getJSON() method which automatically decodes the response:

$.getJSON('image_data.php', function (image_data) {
	// image_data is the already decoded array
});

If you absolutely must embed the data into your HTML document, then JSON-encode it, HTML-escape it, put it into a hidden div element and parse it with JavaScript using JSON.parse().


Even if the data I am using in this case is simply filenames and values that I have collected from the file system?

 

 


If you absolutely must embed the data into your HTML document, then JSON-encode it, HTML-escape it, put it into a hidden div element and parse it with JavaScript using JSON.parse().

 

Care to elaborate on what your last point means, in code terms?
 


 

Even if the data I am using in this case is simply filenames and values that I have collected from the file system?

 

You should always write proper code and treat every dynamic value as if it was dangerous. Yes, sometimes you may get away with a buggy implementation. But why take the risk?

 

 

 

Care to elaborate on what your last point means, in code terms?

<?php

$data = array(
    'foo' => 42,
    'bar' => 123,
);

?>
<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>JSON test</title>
        <style>
            .hidden {
                display: none;
            }
        </style>
        <script src="http://code.jquery.com/jquery-1.11.1.min.js"></script>
        <script>
            $(function () {
                var data = JSON.parse($('#my-data').text());
                console.log(data);
            });
        </script>
    </head>
    <body>
        <div id="my-data" class="hidden"><?= htmlspecialchars(json_encode($data), ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') ?></div>
    </body>
</html>

Note, however, that this is not the recommended solution and should only be used if Ajax is definitely not an option.
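For readers who don't use jQuery, the same step can be sketched with the plain DOM API. The function name `readEmbeddedJson` is an assumption made for this illustration; `getElementById` and `textContent` are the standard DOM calls.

```javascript
// Reads JSON from the text content of an element and decodes it.
// "doc" is normally the global document; it is a parameter here only
// so the function can be tried outside a browser.
function readEmbeddedJson(doc, id) {
    const container = doc.getElementById(id);
    if (container === null) {
        throw new Error('No element with id "' + id + '"');
    }
    // textContent returns the text with HTML entities already decoded,
    // so this is the original JSON string produced by json_encode().
    return JSON.parse(container.textContent);
}
```

In the page above this would be called as `readEmbeddedJson(document, 'my-data')` once the document has loaded, in place of the jQuery line.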

Edited by Jacques1

 

 

Congratulations, we now have a cross-site scripting vulnerability, because any occurrence of the term “</script>” within the JSON data terminates the current script context and allows the user to create a new one with arbitrary JavaScript code.

Whoa! Interesting that it applies to style tags too. I guess the DOM doesn't know the context of the tags until the document tree is parsed.


Jacques1 - Sorry, but I don't read jQuery. Just looking at it, I don't get how you have taken my PHP-generated data and turned it into a JS array.

 

I know you have lectured me before on security concerns, but I still fail to see how data that I am generating without user input or interference can end up being a security concern for my application.

 

And yes - ajax is available but what is that going to do for me when all I want to do is load a js array.  How does one ajax call to a php script do what I want to do better than my current approach?


Moving ahead here is what I have done.  Remember - this data is derived from using glob() on a given path to get image files and then using getimagesize() to get the size and building a php array element for each file found.  In my mind - this data is not suspect.

 

in my php I did:

 

 $json_imgs = json_encode($ar_imgs,JSON_UNESCAPED_SLASHES);

to create my JS input.

 

In my html output I did:

 

 
$code=<<<heredocs
 <div style='display:none;'>
 <script type="text/javascript">
 img_data = $json_imgs;
 </script>
 </div>
heredocs;
 echo $code;

 

And now I have a js array which can be read by my js function which was my goal.   What I don't see is why Jacques1 used a display:none in his example since my results are clearly visible to anyone viewing the source of the web page.


If you don't know jQuery yet, you should learn it. It will save you a lot of time, and it's simply one of those tools every web developer should know.

 

The two techniques I explained earlier are standard practices and part of the OWASP recommendations. Of course you don't have to follow best practices. If you think you can get away with a quick hack, well, go ahead. But if you're interested in a proper solution, it's a good idea to adhere to one basic rule:

 

Escape all dynamic input. Don't try to distinguish between “safe” and “unsafe” values. It's a waste of time, it's unreliable, and it's missing the point. Even if a value is “safe”, it may still lead to bugs.

 

Applying this rule to the problem at hand leads to one of the two solutions: Either you escape the data properly and put it into a hidden div container. Or you fetch it with Ajax and avoid the need for escaping altogether.


I understand your concern, but I am at a loss as to how this situation is a concern. It is a compendium of data from image files on my site which I have uploaded. Period.

 

What Exactly would I need to do to escape this data?  And how (again) would Ajax be a solution?

 

I did put this into a hidden div container, but it still shows up in the source code when viewed in the browser. 


Do you still get these backslashes if you decode the object on the server side, with something like this:

<?php

$json = json_encode(
    array(
        'http://calendar.com/content/' => array(
            'English' => array(
                'One',
                'January'
            ),
            'French' => array(
                'Une',
                'Janvier'
            )
        )
    )
);

var_dump(json_decode($json,true));
Edited by jazzman1

I understand your concern, but I am at a loss as to how this situation is a concern. It is a compendium of data from image files on my site which I have uploaded. Period.

 

Where the data comes from is irrelevant.

 

 

 

What Exactly would I need to do to escape this data?  And how (again) would Ajax be a solution?

 

Well, the first step is to let json_encode() do its job and not interfere with the default escaping. As I've explained in #3, there's actually a reason why the PHP core developers escape all forward slashes by default: Without this, the content of a JSON string may terminate the script element. This can happen accidentally or as part of a cross-site scripting attack.

 

I'm not saying that you will run into problems if you leave out the escaping this time. Do we immediately crash into a tree and die if we don't put on the seatbelt? No, but we still put it on, because it's a sensible precaution. 

 

Frankly, I don't understand this obsession with hacky workarounds which only do the bare minimum required for the task. If your goal is to write robust code, you want the exact opposite: A solution which always works reliably, not just under specific circumstances.

 

In that sense, Ajax is a very reliable solution, because it lets you avoid the escaping issue altogether: You echo the data with PHP, you parse it with JavaScript (which automatically removes the backslashes), and that's it.
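A quick way to convince yourself that the backslashes are harmless (a sketch using one key from the first post, not code from the application): inside a JavaScript or JSON string, `\/` is simply another way of writing `/`, so the decoded key is the plain path.

```javascript
// The object literal as json_encode() printed it into the page source.
const img_data = {
    "\/photos\/mainmenu\/IMG_1358": {"img": "jpg", "caption": "txt", "w": 600, "h": 400}
};

// A path as it would come from the HTML, with plain slashes.
const path = '/photos/mainmenu/IMG_1358';

console.log(Object.keys(img_data)[0] === path); // true
console.log(img_data[path].w);                  // 600

// JSON.parse() treats "\/" the same way.
const parsed = JSON.parse('{"\\/photos\\/mainmenu\\/IMG_1358":{"w":600}}');
console.log(parsed[path].w);                    // 600
```

The backslashes exist only in the source text. Once the string has been read by the JavaScript engine or by JSON.parse(), a lookup with the ordinary path finds the entry.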

 

 

 

I did put this into a hidden div container, but it still shows up in the source code when viewed in the browser. 

 

So? The data isn't secret, it's just not part of the website layout (unless you have a particularly geeky site).


From your last - the data isn't secret, no.  But why put it in a div tag?  I had it as part in the JS portion of my output at first and only moved it to a div tag at your suggestion and don't know why.

 

So - if I do not use the escape option on my php json_encode call, how do I read the damn stuff in js when it has all those backslashes in it?

 

And for the third time - how would ajax alleviate all of your concerns?  Please, please show me.

Edited by ginerjm

Maybe it helps to compare this problem with a more common task: Let's say you want to pass a PHP value to a query.

 

You have three options now:

  • You just drop the value directly into the query. This is not a good idea, because the value may interfere with the query and cause bugs or even an SQL injection vulnerability.
  • You escape the value and then insert it into the query. This is much better but somewhat fragile.
  • You use a prepared statement and avoid the underlying problem altogether. This is by far the best solution.

Going back to your actual problem, you're currently using the first option: You just drop the JSON string directly in your JavaScript code. This is not a good idea, because the data may interfere with the JavaScript code or the HTML markup (as demonstrated above).

 

The second option would be to escape the data. Unfortunately, we cannot use standard HTML escaping within script elements, because HTML has special parsing rules for those. What we do instead is use a simple HTML element like div or span as a data container: We HTML-escape the JSON string, put it into the container and then read and parse it with JavaScript. I've demonstrated this in reply #5.

 

By far the best option (the “prepared statement”) is to load the data with a separate Ajax request:

<?php

/*
 * The script for providing the image data.
 */

header('Content-Type: application/json');

echo json_encode(array(
	'foo' => "/",
	'bar' => 123,
));

The page which loads the data:

<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>JSON test</title>
        <script src="http://code.jquery.com/jquery-1.11.1.min.js"></script>
        <script>
            $(function () {
                // load the data with Ajax
                $.getJSON('image_data.php', function (image_data) {
                    console.log(image_data);
                });
            });
        </script>
    </head>
    <body>

    </body>
</html>

Now there's no code insertion whatsoever. The PHP script provides the data as a JSON object, and JavaScript loads and parses it. This includes removing the backslashes.

Edited by Jacques1

Is this the ajax implementation?  If it is how is it different than what I'm doing?  The same php code will produce my data again, it is merely the js that is doing something different, yes?  Since I haven't learned JQ in the last 3 hours, I don't understand what you are showing me.


I'm not understanding the difference either; they are both coming from php. Either a json object embedded in the php page, or retrieving the json object via ajax.

 

I understand that if you are using user-supplied data from the public that it could be harmful to just "include" it, but that would be true whichever way you retrieved it, wouldn't it?


Not sure why this is so hard to understand.

 

You know how Ajax works, right? You make an HTTP request with JavaScript and get back the response as a JavaScript string. Well, and this string contains your JSON data. You didn't have to make any dubious code injections from PHP into some script element to get the data. You just fetched it with pure JavaScript.

 

Why is this important? Because now the data cannot cause any trouble. No matter what it contains, it's just a JavaScript string.
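In code, without jQuery, the idea looks roughly like this. It is a sketch under assumptions: `decodeResponse` and `loadImageData` are names invented for the illustration, and the endpoint is the `image_data.php` script from the earlier example. `fetch()` is the browser's built-in way to make the request.

```javascript
// Decodes the body of an Ajax response. Whatever the text contains,
// JSON.parse() only builds data from it and never runs it as code.
function decodeResponse(responseText) {
    return JSON.parse(responseText);
}

// Requests the JSON from the server and decodes it.
async function loadImageData(url) {
    const response = await fetch(url);
    if (!response.ok) {
        throw new Error('Request failed with status ' + response.status);
    }
    return decodeResponse(await response.text());
}

// A response body as PHP would send it, with escaped slashes.
const body = '{"foo":"<\\/script><script>alert(1)<\\/script>","bar":123}';
const data = decodeResponse(body);
console.log(typeof data.foo); // "string"
console.log(data.bar);        // 123
```

In a page you would call `loadImageData('image_data.php').then(function (image_data) { console.log(image_data); });`. The markup-looking value in `foo` stays an ordinary string because it never passes through the HTML parser.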

 

Think about my comparison again: Fetching the data with Ajax is similar to what a prepared statement does in the context of database queries. Yes, you can insert your PHP values directly into an SQL query. But then you have to be very careful that you escape the values correctly, and if you don't get it right, you end up with an SQL injection vulnerability (or at least a bug). A prepared statement doesn't have those problems, because you don't have to insert any values in the first place.

 

So using Ajax is a good idea for the same reasons why prepared statements are a good idea – and I hope we all agree that prepared statements are far superior to manual escaping.

 


When I do an ajax process, I use js to make the call, but it is calling a PHP script which returns a "string".  WTH is a 'javascript string'?  And (as mentioned by others in this conversation) how is receiving this string any different than pasting it into the code like I'm doing now?

 

Yes I use prepared queries but I don't see your example as being helpful to me here.

 

Thank you for your continued patience too.


@ginerjm, I've got a question for you. What is the difference between these two snippets?

<?php

$html = <<<EOT

 <!DOCTYPE html>
<html>
    <head>
        <title>TODO supply a title</title>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
    </head>
    <body>
        <div>TODO write content</div>
    </body>
</html>       
        
        
EOT;

And simple HTML:

<!DOCTYPE html>
<html>
    <head>
        <title>TODO supply a title</title>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
    </head>
    <body>
        <div>TODO write content</div>
    </body>
</html>
Edited by jazzman1

Yes, there is a difference, and it's called performance. A few years ago I ran a simple benchmark in which PHP parsed a large block of HTML content over roughly 100,000 requests, and the result surprised me very much. In the first snippet, the PHP parser has to process all the content between the <?php ... ?> tags before passing the data to Apache; in the second one, it doesn't.

PHP developers strongly recommend not running HTML, JavaScript or other "client-side" content through the PHP parser. Generating JavaScript on the server side is not a good idea; use Ajax, as Jacques1 already mentioned. That is my point.
