
scandir() fails for directory names containing spaces (Linux)


MutantJohn


Hey guys, I'm having some trouble creating a small file explorer.

 

I'm designing a site and so far everything has been working very well. But if a directory name contains spaces, scandir() fails every time.

 

Here's my current PHP code : 

 

 

<html>
 
    <head>
        
        <style>
        /* TODO : Add awesome styling here */
        </style>
 
    </head>
 
    <body>
 
<?php
 
    $uri = $_SERVER[ "REQUEST_URI" ];
    
    // Begin sanitation of the URI because I suck at Apache's
    // rewrite module. I guess this is more server agnostic, in
    // that sense
    $num_items = 3;
    $split_uri = explode( "/", $uri, $num_items );
    $uri = "./" . $split_uri[ $num_items - 1 ];
 
    // var_dump( $uri );
 
    // now, loop through directory and build table contents...
 
    $files = scandir( $uri );
 
    if ( $files === false )
    {
        echo "Requested URI could not be converted into a valid directory name" . PHP_EOL;
        return;
    }
 
    // Use DOMDocument because I think it's cleaner, more modular and overall
    // more maintainable than just raw echo calls
    $dom = new DOMDocument();
    $table = $dom->createElement( "table" );
    $tbody = $dom->createElement( "tbody" );
 
    foreach ( $files as $file )
    {
        if ( $file == "." || $file == ".." )
        {
            continue;
        }
 
        $row = $dom->createElement( "tr" );
        $cols = array( "name" => $dom->createElement( "td" ),
                       "size" => $dom->createElement( "td" ) );
 
        if ( is_dir( $uri . $file ) === true )
        {
            $a = $dom->createElement( "a", $file );
            $a->setAttribute( "href", $file );
            $cols[ "name" ]->appendChild( $a );
        }
        else
        {
            $cols[ "name" ]->nodeValue = $file;
 
            $filesize = filesize( $uri . $file );
 
            if ( $filesize !== false )
            {
                $cols[ "size" ]->nodeValue = $filesize . " bytes";
            }
            else
            {
                echo "An error occurred while trying to read filesize" . PHP_EOL;
            }
        }
 
        foreach ( $cols as $col )
        {
            $row->appendChild( $col );
        }
 
        $tbody->appendChild( $row );
    }
 
    $table->appendChild( $tbody );
    $dom->appendChild( $table );
 
    $dom->formatOutput = true;
    echo $dom->saveHTML();
?>
 
    </body>
 
</html>

 

Basically, I'm using a .htaccess file in the site's root directory to handle all directory requests made by users exploring their files. So that may be why the design is "weird".

Edited by MutantJohn

Yes, but you should do a bit more than that. Like validate the path: make sure it doesn't go places you don't want it to go (like via a "../"), and make sure the path exists before trying to call scandir() because that's the polite thing to do.
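
A minimal sketch of that kind of checking, assuming the space problem comes from REQUEST_URI still being percent-encoded when PHP sees it (a space arrives as %20). The helper names decode_request_path() and is_dir_inside() are made up for illustration; rawurldecode() and realpath() are standard PHP functions.

```php
<?php
// Hypothetical helper: turn a raw REQUEST_URI into a decoded path.
function decode_request_path(string $requestUri): string
{
    // Drop the query string, if there is one.
    $path = explode("?", $requestUri, 2)[0];
    // rawurldecode() turns %20 back into a space.
    return rawurldecode($path);
}

// Hypothetical helper: true only if $candidate is an existing directory
// that lives at or below $base once symlinks and ".." are resolved.
function is_dir_inside(string $base, string $candidate): bool
{
    $realBase = realpath($base);
    $realCandidate = realpath($candidate);
    if ($realBase === false || $realCandidate === false || !is_dir($realCandidate)) {
        return false;
    }
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return $realCandidate === $realBase
        || strncmp($realCandidate, $prefix, strlen($prefix)) === 0;
}
```

With something like this, the script would call scandir() only after is_dir_inside() returns true for the decoded path.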

 

And to be honest,

$num_items = 3;
$split_uri = explode( "/", $uri, $num_items );
$uri = "./" . $split_uri[ $num_items - 1 ];

that bothers me. Why 3? What were the two items before it? Will those change? Can you just use $_GET or even the entire query string instead?
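
One way to avoid the magic number is to strip a named base prefix instead of counting slashes. This is only a sketch: strip_base() is a hypothetical helper, and the "/ditacms.com/" prefix in the usage note is an assumed value that would have to match the real setup.

```php
<?php
// Hypothetical helper: remove a known base prefix from a request path.
// Returns null if the path does not start with that prefix.
function strip_base(string $requestPath, string $base): ?string
{
    if (strncmp($requestPath, $base, strlen($base)) !== 0) {
        return null;
    }
    // Whatever follows the prefix is treated as relative to the script's directory.
    return "./" . substr($requestPath, strlen($base));
}
```

For example, strip_base("/ditacms.com/users/christian/", "/ditacms.com/") gives "./users/christian/", and a path outside the prefix gives null instead of a wrong directory.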

Ah, yes. I think I just suck at Apache XD

 

Okay, here's the whole shebang :

 

I'm using the basic LAMP stack because I'm stuck in 1974. I have a site in my web server directory. The root folder is ditacms.com.

 

In ditacms.com, I have my-awesome-php-script.php and a .htaccess file that looks like this:

 

DirectoryIndex index.html my-awesome-php-script.php

 

ditacms.com also contains a "users" directory which, guess what, contains a list of user directories and files therein.

 

No other sub-folder of ditacms.com contains an index.html file so instead, the PHP script is called. I'm trying to use this PHP script to generate the index listing. I want one awesome PHP script to handle all the building of the indexes and I only want this file to exist in one place.

 

So I was using REQUEST_URI, but it kept giving me this whenever I clicked a link to the users directory from the home index.html page in the site's root directory:

 

/ditacms.com/users/ (I can't remember if there was a slash at the end or not)

 

PHP kept telling me this directory didn't exist. I think this is because the script sees every path relative to where it's located. Since I suck at Apache's rewrite module, I decided to rewrite the URI in PHP with the explode() function instead. That's why there are 3 items: the URI is split twice (at the first slash and then at the second).

 

Using this, I just rewrite the URI to be this instead :

 

./users

 

This works. And it also works for further nested directories because I've limited the number of explosions.
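
To make the limit concrete, here is a small sketch of what explode() returns for a nested request. The URI string is just an example value.

```php
<?php
// With a limit of 3, everything after the second slash stays in one piece.
$parts = explode("/", "/ditacms.com/users/christian", 3);

// $parts[0] is "" (the text before the leading slash),
// $parts[1] is "ditacms.com",
// $parts[2] is "users/christian".
$relative = "./" . $parts[2];

echo $relative, PHP_EOL; // prints "./users/christian"
```

Note that a URI with fewer than two slashes would leave $parts[2] undefined, so this relies on the site always being served from below /ditacms.com/.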

 

I think this isn't the most elegant but it works.


Here's something that should work a little better.

 

1. Rewrite all requests to directories to go to your indexing script.

RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ my-awesome-php-script.php [L]

2a. Theoretically, as long as you don't do anything more complicated than that, DOCUMENT_ROOT + REQUEST_URI will be the directory requested. In practice it might not be - it could be my-awesome-php-script.php itself, for example. So check it first:

// REQUEST_URI is still percent-encoded (a space arrives as %20) and may carry a query string
$requestpath = rawurldecode(explode("?", $_SERVER["REQUEST_URI"], 2)[0]);
$path = $_SERVER["DOCUMENT_ROOT"] . $requestpath;
if (!is_dir($path)) {
    // invalid path
}

// show files in the $path directory

2b. It's possible someone could somehow (don't worry about trying to imagine exactly how they do it) manipulate the path you check. This is where you deal with . and .. and then make sure that the final resulting path is allowed. The easiest way is to work with just the REQUEST_URI: replace /./ with / (plain string replacement can do this), replace /foo/../ with just / (regular expressions make that much easier), and then make sure there aren't any ..s left.

3. After step 2, let's say you have

$requesturi = rawurldecode(explode("?", $_SERVER["REQUEST_URI"], 2)[0]); // drop the query string, decode %20 and friends
$requesturi = "/" . trim($requesturi, "\\/") . "/"; // clean up slashes
$requesturi = str_replace("\\", "/", $requesturi); // windows' slashes -> regular slashes
$requesturi = str_replace("/./", "/", $requesturi); // remove . components
$requesturi = preg_replace('#/[^/]+((?R)+|/)\.\./#', '/', $requesturi); // recursively remove .. components
$requesturi = preg_replace('#//+#', '/', $requesturi); // repeated slashes

// if there are any more ..s then the path is trying to go above the DOCUMENT_ROOT
if (strpos($requesturi, "/../") !== false) {
    // invalid path
}

// the path is relative to the DOCUMENT_ROOT
$path = rtrim($_SERVER["DOCUMENT_ROOT"], "\\/") . $requesturi;
if (!is_dir($path)) {
    // invalid path
}

// show files in the $path directory
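
The . and .. cleanup from step 2b can also be done without regular expressions, by walking the path one segment at a time and keeping a stack. This is an alternative sketch, not the approach in the code above; normalize_path() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: resolve "." and ".." in a slash-separated path.
// Returns the cleaned path with a leading and trailing slash,
// or null when the path tries to climb above its root.
function normalize_path(string $path): ?string
{
    $stack = [];
    foreach (explode("/", str_replace("\\", "/", $path)) as $segment) {
        if ($segment === "" || $segment === ".") {
            continue; // empty segments (repeated slashes) and "." change nothing
        }
        if ($segment === "..") {
            if (count($stack) === 0) {
                return null; // would go above the root
            }
            array_pop($stack); // step back one directory
            continue;
        }
        $stack[] = $segment;
    }
    return count($stack) === 0 ? "/" : "/" . implode("/", $stack) . "/";
}
```

A null result plays the same role as finding a leftover /../ in the string-replacement version: the request should be rejected.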

The "base" directory for links comes from the REQUEST_URI (which was cleaned up into $requesturi), so use $requesturi when constructing links.

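foreach (scandir($path) as $file) {
    $filepath = $path . $file;
    // trailing slash. not required but helps distinguish files vs directories
    // rawurlencode() so that names with spaces still make valid links
    $uripath = $requesturi . rawurlencode($file) . (is_dir($filepath) ? "/" : "");
    // ... output a link to $uripath here
}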

If you list . and .. then they should get special treatment: don't show .. in the root directory, leave the link alone for just ., and for .. link to the path with its last directory removed.
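
A rough sketch of that special treatment, assuming $requesturi is the cleaned, decoded directory URI with a leading and trailing slash. Both encode_uri_path() and entry_href() are hypothetical helper names.

```php
<?php
// Hypothetical helper: percent-encode each segment of a decoded path,
// leaving the slashes alone.
function encode_uri_path(string $path): string
{
    return implode("/", array_map("rawurlencode", explode("/", $path)));
}

// Hypothetical helper: href for one scandir() entry, or null to hide it.
function entry_href(string $requesturi, string $file, bool $isDir): ?string
{
    if ($file === ".") {
        return encode_uri_path($requesturi); // current directory, unchanged
    }
    if ($file === "..") {
        if ($requesturi === "/") {
            return null; // nothing above the root to link to
        }
        // Cut off the last directory: "/users/christian/" becomes "/users/".
        $trimmed = rtrim($requesturi, "/");
        $parent = substr($trimmed, 0, strrpos($trimmed, "/") + 1);
        return encode_uri_path($parent);
    }
    // Ordinary entry: append the encoded name, plus a slash for directories.
    return encode_uri_path($requesturi) . rawurlencode($file) . ($isDir ? "/" : "");
}
```

Inside the loop this would be called as entry_href($requesturi, $file, is_dir($filepath)), skipping the row whenever it returns null.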

 

(All the code is untested but should be at least close to accurate.)


Okay one thing, every time I put "../" anywhere in the URL, the PHP script doesn't seem to get called.

 

For example, if I try /ditacms.com/users/christian/.., the PHP script seems to be ignored and I'm brought back to /ditacms.com/users/christian

 

Is there a way to prevent that? Because I tried the code you posted and it doesn't seem to be working... It's like the path is resolved by the server before the PHP script is even invoked.


Apache is probably doing it (if not your browser). The exact behavior isn't as important as the fact that it is being handled without causing problems, even if that means by Apache and not the PHP script.