Hi All,

 

I'm trying to create a dropdown that lists the directory names from a directory listing on another one of my servers (a remote URL).

 

I have a script working (ish), but it keeps writing out "Parent Directory" as an option in my dropdown list.

 

Could someone advise me on how to remove this? I've been looking into it for a while now. The structure of the HTML is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
 <head>
  <title>Index of /myurl.com</title>
 </head>
 <body>
<h1>Index of /myurl.com</h1>
<ul><li><a href="/"> Parent Directory</a></li>
<li><a href="design1/"> design1/</a></li>
</ul>
</body></html>

OK, so first I fetch the URL and attempt to strip out the UL and the LI containing "Parent Directory". This doesn't seem to work :)

<?php
$url = file_get_contents("http://www.myurlwhichhasdirectorylisting.co.uk");

// do some line removing
$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($newlines, "", html_entity_decode($url));

// attempt to remove the ul and li (doesnt work)

$start = strpos($content,"<ul><li><a href='/'>Parent Directory</a></li>");
$end = strpos($content,"</ul>",$start) + 8;
$table = substr($content,$start,$end-$start);
preg_match_all("|<li(.*)</li>|U",$table,$rows);
?>

Then I attempt to loop over the rows to build the dropdown:

                	<?php 
	                	foreach ($rows[0] as $row){
	                		preg_match_all("|<li(.*)</li>|U",$row,$cells);
	                		$var = strip_tags($cells[0][0]);
	                		echo "{$var}\n";
	                ?>
	                	<option value="<?=$var;?>"><?=$var;?></option>
	                <?php } ?>

I'm probably doing this in a very long-winded way :)

 

Thanks!

Edited by MoFish

You could possibly remove it by just using a preg_replace.

 

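// The listing uses double quotes and a space before "Parent Directory", so the pattern must too.
// The "s" modifier lets "." match newlines in case they have not been stripped.
$table = preg_replace('~.*<ul><li><a href="/">\s*Parent Directory</a></li>(.*)</ul>.*~s', '$1', $content);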

 

I have no idea if that will work, I haven't tested it, but the theory is that it will match all text between the end of the first <li> to the start of the </ul> closing tag, and then put that into the $table variable. The other stuff will effectively be removed.

 

Hopefully it gives you an idea

 

Denno

DOMDocument. Forget regular expressions, forget breaking it apart with string functions, and just use DOMDocument.

 

getElementsByTagName() to get all the links, then loop through those and grab their href attributes.
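
A minimal sketch of that suggestion, assuming the listing looks like the sample HTML in the first post. The helper name listDirectoryNames() and the inline $sample string are made up for illustration; in practice the HTML would come from file_get_contents() on the remote URL. It also assumes the parent entry is labelled exactly "Parent Directory" and that sub-directory hrefs end with a slash.

```php
<?php
// Hypothetical helper: returns sub-directory names from an index-style listing page.
function listDirectoryNames(string $html): array
{
    // Collect parser warnings internally instead of printing them (the sample is old HTML 3.2).
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $names = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        // Assumption: the parent entry is identified by its link text.
        if (trim($link->textContent) === 'Parent Directory') {
            continue;
        }
        $href = $link->getAttribute('href');
        // Assumption: directories have hrefs ending in "/"; anything else is skipped.
        if (substr($href, -1) !== '/') {
            continue;
        }
        $name = rtrim($href, '/');
        if ($name !== '') {
            $names[] = $name;
        }
    }
    return $names;
}

// Sample listing copied from the first post.
$sample = <<<'HTML'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
 <head>
  <title>Index of /myurl.com</title>
 </head>
 <body>
<h1>Index of /myurl.com</h1>
<ul><li><a href="/"> Parent Directory</a></li>
<li><a href="design1/"> design1/</a></li>
</ul>
</body></html>
HTML;

// Build the dropdown options, escaping each name for use in HTML.
foreach (listDirectoryNames($sample) as $name) {
    $safe = htmlspecialchars($name, ENT_QUOTES);
    echo "<option value=\"{$safe}\">{$safe}</option>\n";
}
```

Reading the name from the href attribute rather than the link text avoids having to strip the leading space and the anchor markup separately.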

Hi, thanks for your help.

 

I finally got rid of that 'Parent Directory'. It is now writing out the following:

<a href="design1/"> design1/</a>

Ideally I would like it to return only "design1", without the anchor tags or the trailing forward slash.

 

Could anyone help?

 

My code is:

// declare the folder
$html = file_get_contents("http://www.mywebsite.com/folderlist/");
preg_match_all('|<li>(.*)</li>|U', $html, $uu);
$files = $uu[1];
print_r($files[1]);

Thanks
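
One possible way to get just "design1" out of each match, sketched under the assumption that every captured item looks like the anchor shown above. The helper name cleanListItem() and the inline $html string are hypothetical stand-ins; strip_tags(), trim() and rtrim() are standard PHP functions. Filtering on the "Parent Directory" text, rather than picking index 1, keeps it working when the listing has more than one folder.

```php
<?php
// Hypothetical helper: turns '<a href="design1/"> design1/</a>' into 'design1'.
function cleanListItem(string $item): string
{
    $text = strip_tags($item);           // drop the <a> markup, leaving " design1/"
    $text = html_entity_decode($text);   // decode entities such as &amp; if any are present
    return rtrim(trim($text), '/');      // remove surrounding whitespace, then the trailing slash
}

// Stand-in for the remote listing; the real code would use file_get_contents().
$html = '<ul><li><a href="/"> Parent Directory</a></li>' . "\n"
      . '<li><a href="design1/"> design1/</a></li>' . "\n"
      . '</ul>';

preg_match_all('|<li>(.*)</li>|U', $html, $matches);

$names = [];
foreach ($matches[1] as $item) {
    $name = cleanListItem($item);
    // Assumption: the parent entry is labelled exactly "Parent Directory".
    if ($name !== '' && $name !== 'Parent Directory') {
        $names[] = $name;
    }
}

print_r($names);
```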
