
Sockets and remote downloading


marklarah

Recommended Posts

I made a topic a while ago, in which some kind guru helped me out loads.

 

This is the code I got from that topic (I can't find the topic now though)

 

<?php
define('BUFFER_SIZE', 8192); //read chunk size, used by stream_get_contents() below
define('ALERT_SIZE', 1024); //alert each KB

//$sock = fsockopen('localhost', 80);
$sock = fsockopen('www.remotesite.com', 80);

//$headers = "GET /somefile.php HTTP/1.1
$filepath = "/files/brochure_1.zip";
$headers = "GET $filepath HTTP/1.1
Host: localhost
Connection: close
\r\n";

$hl = strlen($headers);

if(fwrite($sock, $headers) == $hl) {
//All content was written successfully.
$content = $line = '';
//First get the headers using fgets.
$length = -1;
while(!feof($sock)) {
	$line = trim(fgets($sock));
	if(empty($line))
		break;
	if(stripos($line, 'Content-Length:') !== false) {
		//Woo, there's a Content-Length header. (The haystack comes first in
		//stripos, and a match at position 0 is falsy, hence the !== false check.)
		$length = (int) preg_replace('~[^0-9]~', '', $line);
	}
}

//Done with headers....  Get content now.
if($length != 0) {
	//Uhmm... if length is 0, that means that Content-Length: 0 was sent, in which case there is no content.
	$size_so_far = 0; //number of bytes downloaded.
	$len = ($length > 0) ? $length : 'N/A';
	$factor = 0; //No better name for this x.x.
	while(!feof($sock)) {
		$tmp = stream_get_contents($sock, BUFFER_SIZE);
		$content .= $tmp;
		$size_so_far += strlen($tmp);
		if($length != -1 && $size_so_far >= $length) //>= in case the final read overshoots the reported length
			break;
		if(floor($size_so_far/ALERT_SIZE) > $factor) {
			$factor = floor($size_so_far/ALERT_SIZE);
			$bytes = $factor*ALERT_SIZE; //yeah, this could be wrong, but it will look prettier ;p.
			echo "{$bytes} bytes downloaded of {$len} bytes.\n";
		}
	}
	echo "Done downloading {$size_so_far} bytes.\n";
}

}
?>

 

Essentially, what needs doing is this: on one of my friend's websites there is a directory of files (brochures and things, nothing exciting, which change dynamically), and I need to be able to download those files through my server to the user's computer. A tunneler of sorts, I guess.
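
As a rough sketch of that tunnel idea (assuming allow_url_fopen is enabled on your server, and using the remote host and file path from the script above as placeholders), readfile() can stream the remote file straight to the visitor without holding it all in memory:

<?php
//Minimal sketch only; the URL is a placeholder and allow_url_fopen must be on.
$remote = 'http://www.remotesite.com/files/brochure_1.zip';

header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($remote) . '"');

//readfile() reads the remote file and echoes it to the browser in chunks,
//so the whole download never sits in PHP's memory at once.
readfile($remote);
?>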

 

I've been doing this with cURL for a while now, and whilst it sort of works, about a third of the time it corrupts the download, and it doesn't show the time remaining on the download.
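
For comparison, here is a hedged sketch of the cURL route (the original cURL code isn't shown, so this is only an illustration): handing each chunk to the browser through CURLOPT_WRITEFUNCTION avoids buffering the whole file, and stray output before or after curl_exec() (even whitespace outside the PHP tags) is a common cause of corrupted downloads.

<?php
//Hedged sketch only; the URL is a placeholder, not the original cURL code.
$remote = 'http://www.remotesite.com/files/brochure_1.zip';

function stream_chunk($ch, $data) {
	echo $data;           //pass the chunk straight through to the user
	flush();
	return strlen($data); //tell cURL the whole chunk was handled
}

header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($remote) . '"');

$ch = curl_init($remote);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_WRITEFUNCTION, 'stream_chunk');
curl_exec($ch);
curl_close($ch);
?>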

 

So this is the code I've got for doing it with sockets. I have been scouring the internet for ways to do this, but I've come up with very little. I've been trying to get this to work for ages now, but I'm just stuck.

 

So how would I get this to work?

 

Thanks.

 

Edit: The page (download.php) just does not stop loading, so there is no error message I can give you :(

Link to comment
Share on other sites

Okay, so how would I do this with wget?

 

I just tried using another sockets script, and I got a fatal "Allowed memory size exhausted" error after leaving it to load for bloody ages.
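
That error usually means the whole file is being built up in a string (like $content in the script above) until PHP's memory_limit is hit. Here is a minimal sketch of the download loop rewritten to stream each chunk straight to the browser instead, assuming $sock is already connected and the response headers have been read off as in the script above:

<?php
//Sketch only: assumes $sock is an open socket whose response headers have
//already been consumed, and BUFFER_SIZE is defined as in the script above.
$size_so_far = 0;
while(!feof($sock)) {
	$tmp = stream_get_contents($sock, BUFFER_SIZE);
	if($tmp === false || $tmp === '')
		break;
	echo $tmp;   //send the chunk straight to the browser...
	flush();     //...instead of appending it to a $content string
	$size_so_far += strlen($tmp);
}
fclose($sock);
?>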

 

I've tried googling this, but I've yet to find a script to do what I want.

Link to comment
Share on other sites

Don't hate me if this doesn't work. I'm currently using Linux (my OS), but I'm not sure how exec() will behave on your setup.

 

<?php
exec("wget http://localhost.com/FILENAME.ZIP");
?>

 

wget generally saves to whatever folder the process is currently in, so I would suggest doing something like this:

 

<?php
//Each exec() call runs in its own shell, so a separate cd won't carry over;
//change directory and run wget in the same command instead.
exec("cd /home/user/htdocs/downloads && wget http://localhost.com/FILENAME.ZIP");
?>

 

Try that (but of course change the directory in the command and the wget URL).
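
An alternative sketch (placeholder path and URL): wget's -P/--directory-prefix option, listed in the help output further down the thread, sets the download directory in a single call, so you don't need the cd at all.

<?php
//Placeholder path and URL; -P/--directory-prefix tells wget where to save.
exec("wget -P /home/user/htdocs/downloads http://localhost.com/FILENAME.ZIP");
?>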

Link to comment
Share on other sites

I get that bit (I found that on Google), but I want the user to be able to download the file, not have it download to the server.

 

I'd need the file to just get tunneled through the server, like it does with cURL...

 

<a href="http://OtherServer.com/file.zip">Download!</a>

 

What is the point of this script?

 

Just download the file to the server and link directly to it.

Link to comment
Share on other sites

You can think of wget as a command line browser that will get one page. It has numerous options to deal with the idiosyncrasies of website security, cookies, user agents, etc., but basically you run it as wget target_url.

 

 

[david@penny ~]$ wget --help
GNU Wget 1.10.2 (Red Hat modified), a non-interactive network retriever.
Usage: wget [OPTION]... [url]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version           display the version of Wget and exit.
  -h,  --help              print this help.
  -b,  --background        go to background after startup.
  -e,  --execute=COMMAND   execute a `.wgetrc'-style command.

Logging and input file:
  -o,  --output-file=FILE    log messages to FILE.
  -a,  --append-output=FILE  append messages to FILE.
  -d,  --debug               print lots of debugging information.
  -q,  --quiet               quiet (no output).
  -v,  --verbose             be verbose (this is the default).
  -nv, --no-verbose          turn off verboseness, without being quiet.
  -i,  --input-file=FILE     download URLs found in FILE.
  -F,  --force-html          treat input file as HTML.
  -B,  --base=URL            prepends URL to relative links in -F -i file.

Download:
  -t,  --tries=NUMBER            set number of retries to NUMBER (0 unlimits).
       --retry-connrefused       retry even if connection is refused.
  -O,  --output-document=FILE    write documents to FILE.
  -nc, --no-clobber              skip downloads that would download to
                                 existing files.
  -c,  --continue                resume getting a partially-downloaded file.
       --progress=TYPE           select progress gauge type.
  -N,  --timestamping            don't re-retrieve files unless newer than
                                 local.
  -S,  --server-response         print server response.
       --spider                  don't download anything.
  -T,  --timeout=SECONDS         set all timeout values to SECONDS.
       --dns-timeout=SECS        set the DNS lookup timeout to SECS.
       --connect-timeout=SECS    set the connect timeout to SECS.
       --read-timeout=SECS       set the read timeout to SECS.
  -w,  --wait=SECONDS            wait SECONDS between retrievals.
       --waitretry=SECONDS       wait 1..SECONDS between retries of a retrieval.
       --random-wait             wait from 0...2*WAIT secs between retrievals.
  -Y,  --proxy                   explicitly turn on proxy.
       --no-proxy                explicitly turn off proxy.
  -Q,  --quota=NUMBER            set retrieval quota to NUMBER.
       --bind-address=ADDRESS    bind to ADDRESS (hostname or IP) on local host.
       --limit-rate=RATE         limit download rate to RATE.
       --no-dns-cache            disable caching DNS lookups.
       --restrict-file-names=OS  restrict chars in file names to ones OS allows.
       --ignore-case             ignore case when matching files/directories.
  -4,  --inet4-only              connect only to IPv4 addresses.
  -6,  --inet6-only              connect only to IPv6 addresses.
       --prefer-family=FAMILY    connect first to addresses of specified family,
                                 one of IPv6, IPv4, or none.
       --user=USER               set both ftp and http user to USER.
       --password=PASS           set both ftp and http password to PASS.

Directories:
  -nd, --no-directories           don't create directories.
  -x,  --force-directories        force creation of directories.
  -nH, --no-host-directories      don't create host directories.
       --protocol-directories     use protocol name in directories.
  -P,  --directory-prefix=PREFIX  save files to PREFIX/...
       --cut-dirs=NUMBER          ignore NUMBER remote directory components.

HTTP options:
       --http-user=USER        set http user to USER.
       --http-password=PASS    set http password to PASS.
       --no-cache              disallow server-cached data.
  -E,  --html-extension        save HTML documents with `.html' extension.
       --ignore-length         ignore `Content-Length' header field.
       --header=STRING         insert STRING among the headers.
       --proxy-user=USER       set USER as proxy username.
       --proxy-password=PASS   set PASS as proxy password.
       --referer=URL           include `Referer: URL' header in HTTP request.
       --save-headers          save the HTTP headers to file.
  -U,  --user-agent=AGENT      identify as AGENT instead of Wget/VERSION.
       --no-http-keep-alive    disable HTTP keep-alive (persistent connections).
       --no-cookies            don't use cookies.
       --load-cookies=FILE     load cookies from FILE before session.
       --save-cookies=FILE     save cookies to FILE after session.
       --keep-session-cookies  load and save session (non-permanent) cookies.
       --post-data=STRING      use the POST method; send STRING as the data.
       --post-file=FILE        use the POST method; send contents of FILE.
       --no-content-disposition  don't honor Content-Disposition header.

HTTPS (SSL/TLS) options:
       --secure-protocol=PR     choose secure protocol, one of auto, SSLv2,
                                SSLv3, and TLSv1.
       --no-check-certificate   don't validate the server's certificate.
       --certificate=FILE       client certificate file.
       --certificate-type=TYPE  client certificate type, PEM or DER.
       --private-key=FILE       private key file.
       --private-key-type=TYPE  private key type, PEM or DER.
       --ca-certificate=FILE    file with the bundle of CA's.
       --ca-directory=DIR       directory where hash list of CA's is stored.
       --random-file=FILE       file with random data for seeding the SSL PRNG.
       --egd-file=FILE          file naming the EGD socket with random data.

FTP options:
       --ftp-user=USER         set ftp user to USER.
       --ftp-password=PASS     set ftp password to PASS.
       --no-remove-listing     don't remove `.listing' files.
       --no-glob               turn off FTP file name globbing.
       --no-passive-ftp        disable the "passive" transfer mode.
       --retr-symlinks         when recursing, get linked-to files (not dir).
       --preserve-permissions  preserve remote file permissions.

Recursive download:
  -r,  --recursive          specify recursive download.
  -l,  --level=NUMBER       maximum recursion depth (inf or 0 for infinite).
       --delete-after       delete files locally after downloading them.
  -k,  --convert-links      make links in downloaded HTML point to local files.
  -K,  --backup-converted   before converting file X, back up as X.orig.
  -m,  --mirror             shortcut for -N -r -l inf --no-remove-listing.
  -p,  --page-requisites    get all images, etc. needed to display HTML page.
       --strict-comments    turn on strict (SGML) handling of HTML comments.

Recursive accept/reject:
  -A,  --accept=LIST               comma-separated list of accepted extensions.
  -R,  --reject=LIST               comma-separated list of rejected extensions.
  -D,  --domains=LIST              comma-separated list of accepted domains.
       --exclude-domains=LIST      comma-separated list of rejected domains.
       --follow-ftp                follow FTP links from HTML documents.
       --follow-tags=LIST          comma-separated list of followed HTML tags.
       --ignore-tags=LIST          comma-separated list of ignored HTML tags.
  -H,  --span-hosts                go to foreign hosts when recursive.
  -L,  --relative                  follow relative links only.
  -I,  --include-directories=LIST  list of allowed directories.
  -X,  --exclude-directories=LIST  list of excluded directories.
  -np, --no-parent                 don't ascend to the parent directory.

Link to comment
Share on other sites
