Getting Directory Names Loop


MoFish

Hi All,

 

I'm trying to create a dropdown that lists the directory names from a directory listing on another one of my servers (a remote URL).

 

I have a script working (ish); however, it keeps writing out "Parent Directory" as an option in my dropdown list.

 

Could someone advise me on how to remove this? I've been looking into it for a while now. The HTML of the listing looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
 <head>
  <title>Index of /myurl.com</title>
 </head>
 <body>
<h1>Index of /myurl.com</h1>
<ul><li><a href="/"> Parent Directory</a></li>
<li><a href="design1/"> design1/</a></li>
</ul>
</body></html>

OK, so first I fetch the URL and attempt to strip out the <ul> and the <li> containing "Parent Directory". This doesn't seem to work :)

<?php
$url = file_get_contents("http://www.myurlwhichhasdirectorylisting.co.uk");

// do some line removing
$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($newlines, "", html_entity_decode($url));

// attempt to remove the ul and li (doesnt work)

$start = strpos($content,"<ul><li><a href='/'>Parent Directory</a></li>");
$end = strpos($content,"</ul>",$start) + 8;
$table = substr($content,$start,$end-$start);
preg_match_all("|<li(.*)</li>|U",$table,$rows);
?>

Then I loop over the rows to write out the dropdown options:

<?php
foreach ($rows[0] as $row) {
    preg_match_all("|<li(.*)</li>|U", $row, $cells);
    $var = strip_tags($cells[0][0]);
    echo "{$var}\n";
?>
    <option value="<?=$var;?>"><?=$var;?></option>
<?php } ?>

I'm probably doing this in a very long-winded way :)

 

Thanks!
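
A likely reason the removal has no effect: the string passed to strpos() uses single quotes around the href and has no space before "Parent Directory", while the listing above uses double quotes and a leading space. strpos() then returns false, so the substring starts at the beginning of the page and every <li> is still matched. Below is a minimal sketch of one way around this, assuming the markup shown above. The listDirectoryNames() helper and the $listing sample are hypothetical names, not part of the original script.

```php
<?php
// Hypothetical helper: returns the entry names from a listing like the one
// above, skipping the "Parent Directory" link.
function listDirectoryNames(string $html): array
{
    // Capture the link text of every <li><a ...>text</a></li> item.
    preg_match_all('|<li><a[^>]*>(.*)</a></li>|U', $html, $matches);

    $names = [];
    foreach ($matches[1] as $text) {
        $text = trim(html_entity_decode($text));
        if ($text === 'Parent Directory') {
            continue; // skip the link back to the parent directory
        }
        $names[] = $text;
    }
    return $names;
}

// Sample markup copied from the listing in the post (assumed structure).
$listing = '<ul><li><a href="/"> Parent Directory</a></li>' . "\n"
         . '<li><a href="design1/"> design1/</a></li>' . "\n"
         . '</ul>';

$names = listDirectoryNames($listing);
foreach ($names as $name) {
    // Escape the value before writing it into the HTML attribute.
    $safe = htmlspecialchars($name, ENT_QUOTES);
    echo "<option value=\"{$safe}\">{$safe}</option>\n";
}
```

Filtering on the link text rather than on an exact HTML substring means small differences in quoting or whitespace no longer break the match.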


You could possibly remove it by just using a preg_replace.

 

$table = preg_replace('~.*<ul><li><a href="/">\s*Parent Directory</a></li>(.*)</ul>.*~s', '$1', $content);

 

I haven't tested this, so I have no idea whether it will work. The theory is that it matches all the text between the end of the first <li> and the closing </ul> tag and puts it into the $table variable; everything else is effectively removed.

 

Hopefully it gives you an idea

 

Denno

DOMDocument. Forget regular expressions, forget breaking it apart with string functions, and just use DOMDocument.

 

Use getElementsByTagName() to get all the links, then loop through those and grab their href attributes.
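
A minimal sketch of that approach, assuming the listing markup shown earlier in the thread. The sample $html string stands in for the result of file_get_contents() on the remote URL, and the rule used to skip the parent link (an href that starts with "/" or "..") is an assumption about this particular listing.

```php
<?php
// Sample markup copied from the listing earlier in the thread; in practice
// this string would come from file_get_contents() on the remote URL.
$html = <<<'HTML'
<html><body>
<h1>Index of /myurl.com</h1>
<ul><li><a href="/"> Parent Directory</a></li>
<li><a href="design1/"> design1/</a></li>
</ul>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them,
// since real-world listings are often loose HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$dirs = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    // Assumption: the parent link is the only href starting with "/" or "..".
    if ($href === '' || $href[0] === '/' || strpos($href, '..') === 0) {
        continue;
    }
    // Drop the trailing slash so "design1/" becomes "design1".
    $dirs[] = rtrim($href, '/');
}

print_r($dirs);
```

Reading the href attribute avoids dealing with the link text and its leading space altogether.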

Hi, thanks for your help.

 

I finally got rid of that 'Parent Directory'. It is now writing out the following:

<a href="design1/"> design1/</a>

Ideally I would like it to return only "design1", without the anchor tags or the trailing forward slash.

 

Could anyone help?

 

My code is:

// declare the folder
$html = file_get_contents("http://www.mywebsite.com/folderlist/");
preg_match_all('|<li>(.*)</li>|U', $html, $uu);
$files = $uu[1];
print_r($files[1]);

Thanks
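
One possible way to get the bare name from a captured item is to combine strip_tags(), trim() and rtrim(). This is a small sketch; the $item value is copied from the output quoted above, and the variable names are only illustrative.

```php
<?php
// One captured item, as printed by the code above.
$item = '<a href="design1/"> design1/</a>';

// strip_tags() drops the anchor markup, trim() removes the leading space,
// and rtrim() with '/' removes the trailing slash.
$name = rtrim(trim(strip_tags($item)), '/');

echo $name, "\n";
```

To clean the whole list, the same expression could be applied to each element of $files, for example with array_map().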

Archived

This topic is now archived and is closed to further replies.
