
Help using sockets


NotionCommotion


Never used sockets, and would like some advice on whether I am on the right track.

 

PHP Client A (which may later be changed to Python or something else) is behind a firewall with reasonable settings, has a dynamic IP, and periodically makes cURL or socket requests to PHP Server B.

 

Later, Web client C makes a request to PHP Server B.

 

How can I forward Web Client C's request to PHP Client A?

 

Also, should I be using the stream_socket_* functions?

 

Thanks

PHP Client A
    |
FIREWALL
    |
PHP Server B
    |
Web Client C

Your options are basically

 

a) A polls B for information

b) A establishes a persistent connection to B and waits for activity

c) A opens up a port on the firewall and lets B connect

 

The stream functions are a bit higher-level than the raw socket functions so I suggest you try to use them.


Thanks requinix,

 

I agree with your three options, and wanted to investigate your option b (A establishes a persistent connection to B and waits for activity).

 

Would a "persistent connection" be a socket?  Should A establish a stream_socket_client() or stream_socket_client() connection?  Would some sort of endless loop on A be need to allow it to "wait for activity"?  When A sends a request to B, typical firewall configurations will allow B to reply to A, and will this behavior persist?

 

This is all very new territory for me, and I would appreciate any advice.


To be clear, a "stream" is what PHP calls its abstract representation of various things - mostly used with file handles, but it can be used with sockets too. They have a set of functions to operate on streams in a more-or-less general fashion regardless of the underlying entity. "Sockets" (BSD sockets at least) is the networking concept available in pretty much every language, where you deal with TCP/IP and connections and such. They have their own set of functions whose names tend to borrow from the C versions, and documentation for how to use them in one language tends to port over fairly well to other languages.

 

So,

Would a "persistent connection" be a socket?

Yes: a socket opened between two hosts that tries to stay open for pretty much ever. Realistically the A host would need to be able to reestablish the connection if/when it drops for whatever reasons.
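
For illustration, a minimal sketch of that reconnect behaviour on the A side might look like this (the address tcp://b.example.com:5000 and the delays are placeholders, not anything from this thread):

<?php
// Sketch only: keep a connection to the server open, reconnecting whenever it drops.
while (true) {
    $conn = @stream_socket_client('tcp://b.example.com:5000', $errno, $errstr, 30);
    if (!$conn) {
        sleep(10);                     // could not connect; wait a bit rather than hammering the server
        continue;
    }
    while (!feof($conn)) {
        $line = fgets($conn);          // blocks until a line arrives or the connection dies
        if ($line !== false) {
            echo 'Received: ', trim($line), PHP_EOL;   // handle whatever the server pushed down
        }
    }
    fclose($conn);                     // connection dropped; loop around and reconnect
}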

 

Should A establish a stream_socket_client() or stream_socket_server() connection?

Yes. In networking terminology, the "client" is the one doing the connecting and the "server" is the one being connected to.

 

Would some sort of endless loop on A be needed to allow it to "wait for activity"?

If it weren't for the fact that hardware allows for not needing to do that, yes. What you actually do is called a "select", which blocks (the function call doesn't return) until something about the connection changes - such as there being incoming data or the connection goes down. With an optional timeout, if you want to be able to do other stuff while waiting.
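
In PHP's stream API that would be stream_select(); a rough sketch of waiting with a timeout, assuming $conn is an already-connected socket stream:

<?php
// Sketch: wait up to 60 seconds for something to happen on the open socket $conn.
$read   = [$conn];   // streams to watch for incoming data or closure
$write  = null;
$except = null;
$ready  = stream_select($read, $write, $except, 60);   // blocks until activity or timeout

if ($ready === false) {
    // select itself failed
} elseif ($ready === 0) {
    // timed out: nothing happened, so do other work and then wait again
} else {
    $data = fread($conn, 8192);
    if ($data === '' || $data === false) {
        // the other side closed the connection
    } else {
        // process the incoming data
    }
}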

 

When A sends a request to B, typical firewall configurations will allow B to reply to A, and will this behavior persist?

Unless poorly configured, firewalls allow traffic to go both ways through previously-established connections. When people talk about blocking stuff with the firewall it really means blocking the initial connection attempt.

 

This is all very new territory for me, and I would appreciate any advice.

Read up on TCP/IP networking (to know about what's going on) and BSD sockets (to know about how to do stuff in code).

I created a chat server using sockets as an example for someone on IRC a few years back. It might be helpful as an example for you to see how to deal with sockets.

 

If you decide to create your own server/client setup you should probably look into some libraries to help with development rather than trying to do it all from scratch as in my example.


If it weren't for the fact that hardware allows for not needing to do that, yes. What you actually do is called a "select", which blocks (the function call doesn't return) until something about the connection changes - such as there being incoming data or the connection goes down. With an optional timeout, if you want to be able to do other stuff while waiting.

 

I've since read how http://php.net/manual/en/function.socket-accept.php will halt the code until a response is received.  Ah, this makes sense.  I assume http://php.net/manual/en/function.stream-socket-server.php is the same even though the documents do not say so?

 

But to implement this, there seems to be so much more required.  How are multiple connections dealt with?  If communication is lost, how is it reestablished?  How can the client continue operating as a client (through the use of a timeout?) instead of just entering an eternal state of listening?  Will I just be reinventing the wheel, or do you know if there are existing classes?


Oh. I didn't think to mention one pretty big drawback with using such long connections: there's only a finite number of connections that a server can maintain, so this solution does not work for a "large" number of clients. I didn't get the impression that was the case so maybe that's why I didn't remember it.

 

I've since read how http://php.net/manual/en/function.socket-accept.php will halt the code until a response is received.

You're going to need to start adopting the proper terminology lest people be confused about what you're saying and what you may know about what you're saying. "Response" is an HTTP thing that may be relevant to your application but isn't part of the general networking paradigm.

 

A server does a listen/accept loop: it listens for a new connection, accepts the new connection (which gives the server a new socket dedicated to that connection so the listening socket isn't tied up), does whatever it is supposed to do (which almost always involves handing the connection to a different process so the main server one can remain focused on accepting connections), and repeats.
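
A bare-bones version of that loop using the stream functions might look like the following (the port is a placeholder, and a real server would hand $conn off to a worker instead of answering inline):

<?php
// Sketch of a listen/accept loop; error handling and process hand-off omitted.
$server = stream_socket_server('tcp://0.0.0.0:5000', $errno, $errstr);
if (!$server) {
    die("Could not listen: $errstr ($errno)\n");
}
while (true) {
    // Blocks for up to 30 seconds waiting for a client to connect.
    $conn = @stream_socket_accept($server, 30);
    if (!$conn) {
        continue;   // nobody connected within the timeout; go back to waiting
    }
    // A real server would fork or hand $conn to another process here.
    fwrite($conn, "hello\n");
    fclose($conn);
}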

 

 

Ah, this makes sense.  I assume http://php.net/manual/en/function.stream-socket-server.php is the same even though the documents do not say so?

Uh no, it's not the same. The docs for stream_socket_server() even say that you need to then call stream_socket_accept().

 

s_s_s() basically wraps the "socket", "bind", and "listen" calls while s_s_a() wraps the "accept" call.

 

But to implement this, there seems to be so much more required.

Not a whole lot, actually. Check out the examples.

 

How are multiple connections dealt with?

Answered earlier.

 

If communication is lost, how is it reestablished?

That's up to your application. There is no automatic reconnect at this low of a level.

 

How can the client continue operating as a client (through the use of a timeout?) instead of just entering an eternal state of listening?

Sounds like you're confusing the client and server.

 

Will I just be reinventing the wheel, or do you know if there are existing classes?

The stream functions help, but if you want more then you'll probably have to find a third-party library for it. Personally, I'd just do it myself: beyond the initial learning experience there's not too much code involved. Like I said, check the examples. Maybe whip up a couple scripts to prototype the client/server communication process and run them on your own computer.

I created a chat server using sockets as an example for someone on IRC a few years back. It might be helpful as an example for you to see how to deal with sockets.

 

If you decide to create your own server/client setup you should probably look into some libraries to help with development rather than trying to do it all from scratch as in my example.

 

Hi kicken,

 

Your reply snuck by me and I didn't see it until now, but I do feel it will be useful to go over, and I appreciate you sending it over.

 

Why do you recommend using a library instead of creating from scratch, and if a library, any recommendations for a good one?  EDIT: I see you referenced http://reactphp.org/ in this post: https://forums.phpfreaks.com/topic/302371-php-client-to-periodically-make-curl-requests-to-other-servers/

 

Thanks


Thank you requinix for your comprehensive reply.  I also agree my confusion results in poor terminology.

 

You earlier indicated that in networking terminology, the "client" is the one doing the connecting and the "server" is the one being connected to.  I originally felt a client requests something and a server provides something, regardless of who connects.  So, if computerA connects to computerB and computerA waits for computerB to do something, is computerA still considered the client?

 

Also, I am still confused regarding HTTP servers and socket servers.  Originally, I was going to ask how sockets deal with virtual hosts.  But now I am starting to realize that a web server and a socket server can't both be bound to the same IP and port, or else a client would get two different responses (okay, not "response" since this is an HTTP thing, but what do I call it for the socket server?).  And then there are clients...  Yes, I understand that a web browser is an HTTP client.  I think that cURL is also an HTTP client, but it could also be a sockets client, right?  And then one can build a sockets client using stream_socket_client() or the like.  I guess this makes sense, but I am still confused about where one technology transitions into another.

 

Maybe if I better explain why I am doing this and what I am hoping sockets can provide, it will better uncover my misunderstandings.

  • I have ComputerLocal located behind my customer's firewall and ComputerRemote located outside of it.
  • ComputerLocal needs to regularly send data to ComputerRemote.
  • ComputerRemote needs, once in a great while, to send data to ComputerLocal.  Note that ComputerRemote also operates an HTTP server with several virtual hosts for other purposes.
  • The firewall is maintained by my customer's IT group, and while they are friendly, they are not willing to change any configuration settings as a matter of principle.  Outgoing traffic on port 80 is allowed; however, direct access to ComputerLocal from outside the firewall is not.

My first thought was to add a new virtual host to ComputerRemote.  ComputerLocal would periodically perform a cURL request to https://NewVirtualHostOnComputerRemote.com:80 and send the data, and if ComputerRemote happens to have something queued up which it needed to send to ComputerLocal, it would take the opportunity to send it in the reply.

 

Okay, this works, however, it is kind of a kludge.  Furthermore, ComputerRemote needs to wait to be polled by ComputerLocal before sending the data.

 

Is this a good opportunity for sockets?  By the way, a finite number of connections is okay.  How would one implement it?  For instance...

  1. ComputerLocal still periodically performs a cURL request to https://NewVirtualHostOnComputerRemote.com:80 and sends the data?
  2. ComputerRemote has a sockets server bound to port 80 that receives the request and stores the data?  Note that should an HTTP server be required, a new server will be required to host it since port 80 is now being used by the sockets server.
  3. Every time ComputerLocal sends the data, it ensures that a sockets server? client? is working and bound to port 80?
  4. ComputerRemote sends a message to ComputerLocal whenever necessary?

Yikes, those last 4 steps are confusing me more!


You earlier indicated that in networking terminology, the "client" is the one doing the connecting and the "server" is the one being connected to.  I originally felt a client requests something and a server provides something, regardless of who connects.  So, if computerA connects to computerB and computerA waits for computerB to do something, is computerA still considered the client?

Same words, multiple meanings. Computer A connecting to computer B could mean that A wants something from B, but that isn't always the case. The networking layer is only concerned about establishing connections, so it doesn't matter who wants what from whom but rather who is going out to connect to the other. As such "client" is connecting and "server" is receiving the connection. In higher layers than that, such as with HTTP or generic applications, the two terms still mean the same thing - they just have added connotations, such as "the client requests a URI from a server".

 

Client connects, server is connected to, and there may be more to it than that.

 

Also, I am still confused regarding HTTP servers and socket servers.  Originally, I was going to ask how sockets deal with virtual hosts.  But now I am starting to realize that a web server and a socket server can't both be bound to the same IP and port, or else a client would get two different responses (okay, not "response" since this is an HTTP thing, but what do I call it for the socket server?).

Correct! The operating system running on the server can't know whether a connection to port 80 should go to one application or another, thus the port can only be in use ("bound") by one at a time.

A few years ago Skype was using port 8080 (I think?) by default; however, some WAMP/XAMPP/et al. bundles had default configurations to use port 8080 as well, and that caused many new developers to ask why they couldn't start up Apache when they had Skype running.

 

And then there are clients...  Yes, I understand that a web browser is an HTTP client.  I think that cURL is also an HTTP client, but it could also be a sockets client, right?

It's both: it has code to deal with managing connections and it has the ability to act as an HTTP client.

 

And then one can build a sockets client using stream_socket_client() or the like.  I guess this makes sense, but I am still confused about where one technology transitions into another.

Client connects, server is connected to, maybe other stuff too.

 

Maybe if I better explain why I am doing this and what I am hoping sockets can provide, it will better uncover my misunderstandings. [...]

ComputerLocal can connect to ComputerRemote with no problem so that's fine. However if they don't want to change the firewall (a stance I agree with) then the only way to get a connection between Local and Remote is if Local initiates. This is the standard "polling" approach: if A cannot inform B of new information then B has to periodically ask A if there is anything.

 

My first thought was to add a new virtual host to ComputerRemote.  ComputerLocal would periodically perform a cURL request to https://NewVirtualHostOnComputerRemote.com:80 and send the data, and if ComputerRemote happens to have something queued up which it needed to send to ComputerLocal, it would take the opportunity to send it in the reply.

Yup.

 

Okay, this works, however, it is kind of a kludge.

Yeah, but your hands are tied.

 

Furthermore, ComputerRemote needs to wait to be polled by ComputerLocal before sending the data.

It doesn't really have to "wait" to send data. You can follow the normal HTTP approach here: Local connects, sends a request for updated information, Remote responds.

 

Is this a good opportunity for sockets?

Maybe. As far as I'm concerned it comes down to one question: How quickly does ComputerLocal need to know about new information after ComputerRemote has learned about it? Immediately? Within a few minutes? Sometime that day?

Thanks for your patience and all your help.  Sorry for being a little dense, but I've never needed to think through these workflows before and the inner workings of TCP/IP and BSD sockets are still a mystery to me (yes, just like regex, I know I need to master them one day).

 

It doesn't really have to "wait" to send data. You can follow the normal HTTP approach here: Local connects, sends a request for updated information, Remote responds.

Okay, I think I understand.  More later.

 

Maybe. As far as I'm concerned it comes down to one question: How quickly does ComputerLocal need to know about new information after ComputerRemote has learned about it? Immediately? Within a few minutes? Sometime that day?

In a perfect world, instant, however, a few minutes would be acceptable.  Current implementation....   A human utilizes a browser to interact with ComputerLocal indirectly by going through ComputerRemote.  ComputerLocal polls ComputerRemote every 60 seconds to either send some data or ask how things are going.  If ComputerRemote says the human wants to talk, the poll rate goes up to every 5 seconds until there has been no new news for 300 seconds.

 

----------------------------------------------

 

Maybe I don't need sockets, and this is simpler than I thought.

 

ComputerLocal makes a cURL POST request to ComputerRemote whenever it wants to send data, and ComputerRemote just replies back that it was received (and doesn't try to reply back that it has something new to say).

 

ComputerLocal also independently makes a cURL GET request every 5 minutes? to ComputerRemote to ask whether ComputerRemote wants ComputerLocal to do anything.  If ComputerRemote doesn't reply within 299 seconds, the request is somehow canceled, and the next GET request goes out one second later.  If ComputerRemote responds, ComputerLocal processes the response and then initiates another cURL GET request without waiting for the 5 minutes to expire.

 

Maybe???


Also, I am still confused regarding HTTP servers and socket servers.

Just to clarify something, as I'm not sure from your posts if you understand or not, an HTTP Server is a socket server. It's just a specific implementation that happens to use the HTTP protocol to communicate data between the client and the server.

 

 

(okay, not "response" since this is an HTTP thing, but what do I call it for the socket server?)

When dealing on the socket level all that exists is packets of data. Either data being sent or data being received. The protocol in use would determine what is considered a full request or response which might involve several actual data packets.
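
As a concrete illustration, one simple convention (an assumption for this sketch, not anything built into sockets) is to prefix each message with its length so the reader knows when it has received a complete one:

<?php
// Sketch: length-prefixed message framing over a socket stream $conn.
// TCP itself only delivers a byte stream; the application defines the message boundaries.

function sendMessage($conn, string $payload): void
{
    // 4-byte unsigned length in network byte order, followed by the payload.
    fwrite($conn, pack('N', strlen($payload)) . $payload);
}

function readMessage($conn): ?string
{
    $header = fread($conn, 4);
    if ($header === false || strlen($header) < 4) {
        return null;                       // connection closed or header incomplete
    }
    $length  = unpack('N', $header)[1];
    $payload = '';
    while (strlen($payload) < $length) {   // fread may return fewer bytes than requested
        $chunk = fread($conn, $length - strlen($payload));
        if ($chunk === false || $chunk === '') {
            return null;
        }
        $payload .= $chunk;
    }
    return $payload;
}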

 

In a perfect world, instant, however, a few minutes would be acceptable.  Current implementation....

As requinix said, what setup you choose mostly depends on what you'd consider to be an acceptable delay in either side receiving information. If you poll once every 5 minutes, for example, then you have to assume a full 5-minute delay for any data being sent between the computers. You might occasionally get data sooner, but you can't rely upon it.

 

If you maintain a persistent connection then you can get the data almost instantly, but you'll need to use extra resources and do some extra coding to maintain this persistent connection.

 

If you were dealing with mobile users there might also be data and/or battery usage to consider when choosing the best setup, but that's got its own list of pros/cons. Polling every X minutes may use extra data by constantly checking but conserve battery by allowing the device to enter low-power mode, for example.

 

Why do you recommend using a library instead of creating from scratch

Mostly just because it'll let you get going easier and faster. Handling sockets correctly can also be a bit involved, though, especially if you get into dealing with blocking vs non-blocking sockets. For example, using non-blocking sockets can help keep your program responsive by not waiting around for slow/hung clients, but it means attempts to read from or write to the socket may not transfer all the data and you have to try again later. My chat example sort of handles this using read/write buffers.
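
For example, the switch to non-blocking mode and the "try again later" part could look roughly like this (a sketch only; $conn is an open socket stream and the payload is a placeholder):

<?php
// Sketch: non-blocking writes with a per-connection output buffer.
stream_set_blocking($conn, false);          // reads/writes now return immediately

$buffer = "example payload\n";              // placeholder for data queued for this client
while ($buffer !== '') {
    $written = fwrite($conn, $buffer);
    if ($written === false) {
        break;                              // write error; drop or re-establish the connection
    }
    if ($written === 0) {
        // The socket can't take more data right now. A real server would keep $buffer
        // around and retry once stream_select() reports the socket as writable.
        break;
    }
    $buffer = substr($buffer, $written);    // keep only the part that hasn't been sent yet
}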

 

 

Essentially:

  • For maximum easy: Run a script every X minutes with cron that talks to some standard web server (see the sketch below)
  • For maximum control: Develop a custom server and client that can be run continuously.
There are various middle-ground approaches depending on how much easy you want to trade for control.
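
For the "maximum easy" option above, the moving parts can be as small as this sketch (the URL, id, and schedule are placeholders reused from earlier in the thread):

<?php
// poll.php - sketch of the cron-driven approach.
// Example crontab entry (every 5 minutes; the path is a placeholder):
//   */5 * * * * /usr/bin/php /path/to/poll.php

$ch = curl_init('http://example.com/remote.php?id=123');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT        => 30,
]);
$rsp = curl_exec($ch);
curl_close($ch);

if ($rsp !== false) {
    echo $rsp, PHP_EOL;   // act on whatever the server sent back
}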

Thanks kicken and requinix,

Good point kicken about an Apache server being a specialized socket server.  Maybe I knew that, but I didn't realize that :)

Both of you recommend initially just a simple polling strategy.  Okay, that is what I will do.  But why make the user wait?  I gave a narrative at the end of my last message, and started implementing it below.  Do you see any issues?
 

<?php
// Script running as a service on computer located behind firewall

$poll_time=300; //seconds

$url='http://example.com/remote.php';
$data=['id'=>123];
if ($data) {$url = sprintf("%s?%s", $url, http_build_query($data));}

$options=[
    CURLOPT_URL=>$url,
    CURLOPT_RETURNTRANSFER=>true,
    CURLOPT_CONNECTTIMEOUT => $poll_time,
    CURLOPT_TIMEOUT=> $poll_time
];
$ch      = curl_init();
curl_setopt_array( $ch, $options );
while (true) {
    // Compare against false explicitly: a 204 reply has an empty body, which is falsy.
    if(($rsp = curl_exec( $ch )) !== false) {
        if($error=curl_errno($ch)) {
            $rsp='Error: '.$error;
        }
        else {
            $code=curl_getinfo($ch, CURLINFO_HTTP_CODE);
            if($code==200) {
                //process results
                $rsp='Valid response from remote: '.$rsp;
            }
            elseif($code==204) {
                $rsp='No action required by remote';
            }
            else {
                $rsp="Error $code response from remote: $rsp";
            }
        }
    }
    else {
        $rsp='No response from remote';
    }
    echo($rsp);
    //curl_close( $ch );    //should I close?
}
<?php

// http://example.com/remote.php
// This script is run by the webserver and is located outside of the firewall
// If the user wants to send something to the computer located behind the firewall, the database will be updated with the appropriate content

function getCommand($id)
{
    /*
    Query DB and see if there is a command.
    If $id is invalid, return false
    If no commands, return NULL
    If command is pending, delete it from database and return it.
    */
    // Fake database for example
    $file=__DIR__.'/database.json';
    $database = json_decode(file_get_contents($file),true);
    if(!array_key_exists($id,$database)) {
        $rsp=false;
    }
    elseif(!$database[$id]){
        $rsp=null;
    }
    else {
        $rsp=$database[$id];
        $database[$id]=null;
        file_put_contents($file, json_encode($database));
    }
    return $rsp;
}

$poll_slow=300; //seconds
$poll_fast=250000; //microseconds
if(!empty($_GET['id'])) {
    $code=204;  //Maybe something else?
    $rsp=null;
    for ($i = 0; $i <= 1000000*$poll_slow/$poll_fast; $i++) {
        $rsp=getCommand($_GET['id']);
        if(!is_null($rsp)) {
            if($rsp===false) {
                $code=404;
                $rsp='Invalid identifier';
            }
            else {
                $code=200;
            }
            break;
        }
        usleep($poll_fast);
    }
}
else {
    $code=404;
    $rsp='Missing identifier';
}
header('Content-Type: application/json');
http_response_code($code);
echo json_encode($rsp);


Yes, it does have a couple of faulty =s.  Thanks.  I kinda came up with this strategy myself (with you guys pointing me in the right direction), and whenever I come up with something solely on my own, it always proves to have issues.  Is what I am attempting to do fairly common?

 

Never used inotify before, but it looks interesting.  I was just using a file as an example, however, and planned on using a database instead.  What do you think would be best?  I will have a maximum of 1,000 clients located behind firewalls and one server.  The server will receive AJAX requests which will contain the applicable client identifier and the desired action, and the server must get the desired action to the appropriate client.


Then rather than watching a file try message queues: process that receives the AJAX requests sends a message to a queue while the processes that talk to clients wait (again, with blocking) for a message to arrive on their particular queue. Software like Redis and (IIRC) Memcached can also do blocking queues, if you have anything like that already in place. If the data is small then the message could contain it, otherwise all it has to do is act as a signal that the receiving process needs to check the database.

 

Point is to avoid polling.
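
A rough sketch of that with Redis, assuming the phpredis extension (the key name, payload, and 30-second timeout are made up for illustration):

<?php
// Producer side (the script handling the AJAX request): queue a command for client 123.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
$redis->lPush('commands:123', json_encode(['action' => 'reload']));

// Consumer side (the long-running process that talks to client 123):
// blPop blocks for up to 30 seconds waiting for something to appear on the list.
$item = $redis->blPop(['commands:123'], 30);
if ($item) {
    $command = json_decode($item[1], true);   // $item is [listName, value]
    // hand $command to the open connection for that client, or flag it for the next poll
}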


Then rather than watching a file try message queues: process that receives the AJAX requests sends a message to a queue while the processes that talk to clients wait (again, with blocking) for a message to arrive on their particular queue. Software like Redis and (IIRC) Memcached can also do blocking queues, if you have anything like that already in place. If the data is small then the message could contain it, otherwise all it has to do is act as a signal that the receiving process needs to check the database.

 

Point is to avoid polling.

 

There is so much I don't know about PHP!  Thanks, let me look into it.


Someone may have wrapped memcache and created a queue implementation, but there's nothing specific in it that helps with queue creation, whereas Redis has the List datatype and push and pop operations that work with a List.

 

If you need a simple queue, then go for Redis rather than memcached.

