Get hash or id of socket connection

NotionCommotion · January 13, 2018

SocketServer and SocketClient1 have an established persistent connection and use JSON-RPC. SocketServer maintains an incrementing list of JSON-RPC IDs on a per-connection basis. SocketServer saves data and sends a message to SocketClient1.



class SocketServer
{
    private $storage;

    public function __construct()
    {
        $this->storage = new \SplObjectStorage();
    }

    public function start()
    {
        getSocketServer()->on('connection', function ($conn) {
            // Per-connection state: pending callbacks and the last JSON-RPC ID issued
            $this->storage->attach($conn, (object)['callbacks' => [], 'callbackCount' => 0]);
            //....
            // Send a message to a given client
            $jsonRpcId = ++$this->storage[$conn]->callbackCount;
            $this->storage[$conn]->callbacks[$jsonRpcId] = getSomeDataObject();
            //....
        });
    }
}

SocketClient1 gets the request and initiates an HTTP request to HTTPServer.

HTTPServer gets the request, but needs the data which SocketServer stored in SplObjectStorage when initiating the original request. To get this data, I am thinking that Client1's HTTP request needs to include both the jsonRpcId plus either:

Something to identify the callbacks array stored in SplObjectStorage. Maybe SocketServer could include the value of SplObjectStorage::getHash in its message to SocketClient1, which then gets forwarded in the HTTP request to HTTPServer?
Something to identify the connection between SocketServer and SocketClient1, and if so, would this need to be identified by SocketServer and passed to SocketClient1 or can it be identified by SocketClient1?

Any recommendations on how I should proceed? Thank you
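For reference, a minimal sketch of what SplObjectStorage::getHash gives you. By default it returns the same string as spl_object_hash() for the object, which is only meaningful inside the running PHP process and may be reused once the object is destroyed. The stdClass connection and the $connectionsByHash lookup array are stand-ins assumed for illustration; the storage is filled through array access, which has the same effect as attach().

```php
<?php
// Stand-in for a connection object; the real class depends on the socket library used.
$conn = new stdClass();

$storage = new SplObjectStorage();
$storage[$conn] = (object)['callbacks' => [], 'callbackCount' => 0];

// getHash() returns the same string as spl_object_hash() for the same object.
$hash = $storage->getHash($conn);
var_dump($hash === spl_object_hash($conn)); // bool(true)

// The hash cannot be used as an offset by itself, so a reverse lookup
// (hash => connection) has to be kept separately. Hypothetical helper array:
$connectionsByHash = [$hash => $conn];

// Later: resolve the hash back to the connection, then to its stored state.
$state = $storage[$connectionsByHash[$hash]];
$state->callbackCount++;
```

Because the hash is process-local, it would not survive a restart of SocketServer, so it is probably better suited as an internal key than as something handed to clients.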

NotionCommotion · January 15, 2018

You may be asking why not just use the persistent socket connection and eliminate the need for the separate http connection.

The application on client1 (and 2, 3, ...) will not directly be returning the applicable content, but will initiate a shell script which creates a file which must be sent to the server. This client application is not PHP but C++, and the developer feels that decoupling these scopes of work simplifies matters.

PS. I just realized I meant to title this post "Get hash or id of socket connection or SplObjectStorage".

Thanks

requinix · January 16, 2018

I think my answer would depend on the nature of the data that HTTPServer needs...

I'm inclined to say that SocketClient should get/request/otherwise obtain the data from SocketServer, then send it to HTTPServer. The idea being that HTTPServer should not have to know that data it needs is located through the socket it has and is not directly obtainable from the client.

If that doesn't make sense then I'd go for a shared secret: SocketServer issues a secret key to SocketClient which can be used to obtain whatever data it has stored, then SocketClient can send it to HTTPServer which can then request the data - even if it's on another socket.

The key can be something random that corresponds to an internal array that contains whatever information is necessary for SocketServer to find the data regardless of the socket that requested it. If all machines connecting to SocketServer are trusted then that key could simply be the RPC ID.
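A minimal sketch of the shared-secret idea, assuming a small in-memory registry. The class name CallbackTokens and its methods issue() and redeem() are hypothetical and not part of any library; the token is generated with random_bytes() so it cannot be guessed from the RPC ID.

```php
<?php
// Hypothetical registry: maps an opaque token to whatever SocketServer
// needs in order to find the stored data again.
final class CallbackTokens
{
    /** @var array<string, array{conn: object, rpcId: int}> */
    private array $byToken = [];

    // Called by SocketServer when it sends a message to a client.
    public function issue(object $conn, int $rpcId): string
    {
        $token = bin2hex(random_bytes(16)); // 32 hex characters
        $this->byToken[$token] = ['conn' => $conn, 'rpcId' => $rpcId];
        return $token;
    }

    // Called when HTTPServer presents the token; each token works once.
    public function redeem(string $token): ?array
    {
        $entry = $this->byToken[$token] ?? null;
        unset($this->byToken[$token]);
        return $entry;
    }
}

// Usage: the token travels SocketServer -> SocketClient -> HTTPServer.
$tokens = new CallbackTokens();
$conn = new stdClass(); // stand-in for the real connection object
$token = $tokens->issue($conn, 1);
$entry = $tokens->redeem($token); // ['conn' => $conn, 'rpcId' => 1]
```

With the connection and RPC ID in hand, the data can then be read from the SplObjectStorage entry for that connection.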

NotionCommotion · January 16, 2018

The idea being that HTTPServer should not have to know that data it needs is located through the socket it has and is not directly obtainable from the client.

Exactly! I think…. Maybe more background will help. Intent is to generate tshark report logs and there are three machines:

User web client (not previously described).
Main Server which contains HttpServer, SocketServer, and SocketClient.
PLC machine client(s) which contains the SocketClient and HttpClient.

Workflow is as follows:

PLC SocketClient connects to main SocketServer.
Web client initiates log generation for a given PLC (request #1).
HttpServer gets the web client's request and uses SocketClient to forward it to the SocketServer, which in turn sends the request to the appropriate PLC client via the persistent socket connection (request #2).
PLC client invokes a shell script to initiate the log generation, and responds back to SocketServer (response to #2), then ApiHttpServer, and eventually the web client (response to #1) that the request was received.
After log report is generated, shell script in PLC Machine Client uses its HTTPClient to send the log file to the Main HTTP Server (push request #3).
Results are stored in the db and maybe a websocket is used to update the webclient (not shown).

Somehow the main server must associate the log file with the initiating request. Does your "If that doesn't make sense" reply reflect this scenario? The RPC ID in itself can't be used, as each PLC Machine Client has its own RPC ID series (maybe I should have made it common to all?). Also, if the Main Server is restarted, the RPC ID series will all restart at 1 (maybe I need to store it in the DB and restore the series upon server restart?). Does this change your answer?

Thanks!

requinix · January 16, 2018

You have an inherent identifier for the PLC SocketClients: which PLC it corresponds to. If the Web Client is initiating log generation for a particular PLC then it must necessarily be able to identify that PLC, right? So that is the identifier by which you can associate requests and messages and whatnot.

The rest of the "associate the log file to the initiating request" part is that the HTTPServer must have knowledge that the Web Client's request creates (eventually) a log file somewhere. When it receives request #1 it internally (or not?) tracks that request and the identified PLC. Though the log generation is async to everything else, the HTTPServer will eventually learn of the log file at which point it can look up if there was a corresponding request for that PLC's log and, if so, do whatever it needs to do. This also means the HTTPServer would be able to prevent duplicate log requests from multiple clients and instead just track that multiple clients need the same log (when it's eventually ready).
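The tracking described above could be sketched roughly as follows. The class PendingLogRequests and its method names are made up for illustration, and client IDs are assumed to be strings such as session IDs. The first request for a PLC triggers the actual log generation; later ones only join the waiting list.

```php
<?php
// Hypothetical tracker: which web clients are waiting for which PLC's log.
final class PendingLogRequests
{
    /** @var array<string, array<string, bool>> PLC GUID => set of client IDs */
    private array $waiting = [];

    // Returns true when this is the first request for the PLC,
    // i.e. when a log request actually has to be sent to it.
    public function add(string $plcGuid, string $clientId): bool
    {
        $isFirst = !isset($this->waiting[$plcGuid]);
        $this->waiting[$plcGuid][$clientId] = true;
        return $isFirst;
    }

    // Called when the log file arrives: returns everyone who was waiting
    // for that PLC and clears the entry.
    public function complete(string $plcGuid): array
    {
        $clients = array_map('strval', array_keys($this->waiting[$plcGuid] ?? []));
        unset($this->waiting[$plcGuid]);
        return $clients;
    }
}

$pending = new PendingLogRequests();
$guid = 'd482ed56-55be-415e-9531-65f8fe31c4e8';
$pending->add($guid, 'alice'); // true: forward the request to the PLC
$pending->add($guid, 'bob');   // false: a log is already being generated
$notify = $pending->complete($guid); // ['alice', 'bob']
```

If a log file arrives for a PLC with no waiting clients, complete() returns an empty array and the server can decide whether to store or discard the file.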

NotionCommotion · January 16, 2018

You have an inherent identifier for the PLC SocketClients: which PLC it corresponds to. If the Web Client is initiating log generation for a particular PLC then it must necessarily be able to identify that PLC, right?

Yes, each PLC has a unique GUID. When the PLC connects to the SocketServer, it gives the server its GUID and the server associates it with the connection. Then, when the web client asks to interface with a given PLC, the server can identify which socket to forward the request to.
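One way this GUID-to-connection association might look, as a sketch only. ConnectionRegistry and its methods are hypothetical names, and the connection is represented by a plain object rather than a real socket.

```php
<?php
// Hypothetical two-way lookup between PLC GUIDs and live connections.
final class ConnectionRegistry
{
    /** @var array<string, object> GUID => connection */
    private array $byGuid = [];
    private SplObjectStorage $guidByConn; // connection => GUID

    public function __construct()
    {
        $this->guidByConn = new SplObjectStorage();
    }

    // Called once the PLC has announced its GUID on a new connection.
    public function register(string $guid, object $conn): void
    {
        $this->byGuid[$guid] = $conn;
        $this->guidByConn[$conn] = $guid;
    }

    // Used when a web client asks for a given PLC.
    public function find(string $guid): ?object
    {
        return $this->byGuid[$guid] ?? null;
    }

    // Called when the socket closes.
    public function drop(object $conn): void
    {
        if (isset($this->guidByConn[$conn])) {
            unset($this->byGuid[$this->guidByConn[$conn]]);
            unset($this->guidByConn[$conn]);
        }
    }
}
```

Keying by GUID rather than by an object hash means the lookup stays valid when a PLC reconnects on a new socket.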

So that is the identifier by which you can associate requests and messages and whatnot.

Thanks requinix. I think I might have just been confusing myself. Just something like the following?

Webclient sends HTTPServer a request to get and save a log. The request is routed to SocketServer, where the desired file name and websocket connection are saved under an identifier tied to the forthcoming RPC ID. Then the request, less the desired filename and websocket connection, is sent to the PLC; the PLC responds that the command has been accepted, and the reply is routed back to the user.
When the log has been generated, the PLC performs an HTTP request which includes both the GUID and the RPC ID. This is routed to the SocketServer, allowing the report to be associated with the original request.

While not much of a concern, my only tiny concern would be if the socket server was restarted and the RPC IDs were reset. Guess I could (should?) store the last RPC ID for each PLC/GUID in a DB?

Off topic, but you've already taken the time to understand the scenario, and I would appreciate your recommendations for an endpoint.

POST /logfiles
Host: example.com
Content-Length: 2740
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryWfPNVh4wuWBlyEyQ

------WebKitFormBoundaryWfPNVh4wuWBlyEyQ
Content-Disposition: form-data; name="guid"

d482ed56-55be-415e-9531-65f8fe31c4e8
------WebKitFormBoundaryWfPNVh4wuWBlyEyQ
Content-Disposition: form-data; name="rpcId"

1234
------WebKitFormBoundaryWfPNVh4wuWBlyEyQ
Content-Disposition: form-data; name="fileName"; filename="mylog.cap"
Content-Type: application/cap

[file content goes there]
------WebKitFormBoundaryWfPNVh4wuWBlyEyQ--

POST /logfiles/d482ed56-55be-415e-9531-65f8fe31c4e8/1234
Host: example.com
Content-Length: 2700
Content-Type: application/cap

[file content goes there]
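For the path-based variant, the server side might pull the GUID and RPC ID out of the URL along these lines. The function name parseLogfilePath is hypothetical, and the pattern assumes the usual 8-4-4-4-12 hexadecimal GUID layout shown in the example.

```php
<?php
// Extracts the PLC GUID and RPC ID from a path such as
// /logfiles/d482ed56-55be-415e-9531-65f8fe31c4e8/1234
// Returns null when the path does not have that shape.
function parseLogfilePath(string $path): ?array
{
    $pattern = '#^/logfiles/'
        . '([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})'
        . '/(\d+)\z#';

    if (preg_match($pattern, $path, $m) !== 1) {
        return null;
    }

    return ['guid' => strtolower($m[1]), 'rpcId' => (int) $m[2]];
}

$parts = parseLogfilePath('/logfiles/d482ed56-55be-415e-9531-65f8fe31c4e8/1234');
// $parts is ['guid' => 'd482ed56-55be-415e-9531-65f8fe31c4e8', 'rpcId' => 1234]
```

With this form the request body is just the file, so the PLC-side client does not have to build a multipart message.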

Edited January 16, 2018 by NotionCommotion

kicken · January 16, 2018

Replacing all your various Client/Server terms with some real names, describing the problem in plain English, and focusing on the connections you have can help one wrap their head around the problem.

It sounds like you're dealing with a setup like this:

So you can see a bit of what you need to track. Your web clients need to be identified, so they each have a name. This could be the client's session ID or some random ID you assign to them when they connect.

Your PLCs need to be identified, so they each have a name. You mention you already have GUIDs for this, so there you go.

You can see from the data conversations that the client ID needs to be passed along from the server to the PLC so that when it responds on the results connection it can indicate who the results are for and the server can pass them along accordingly.

Now just design your communications protocol with all that in mind. You can simplify things some; for example, you don't really need to pass along the client ID to the PLC, just a generic request identifier that the server could then map back to the appropriate client via a lookup table.

In some rough pseudocode, your main server's http connection would operate something like this.

<?php

// $serverHttp = main server http socket
// $serverPLC = main server PLC socket
// $request = your protocol data
$requestMap = [];
$serverHttp->on('connection', function ($client) use (&$requestMap, $serverPLC) {
    $request = $client->getRequest();
    if ($request['type'] == 'log-request') {
        // Remember which client asked, keyed by the generated request ID
        $plcRequest = new PLCRequest($client, $request['plc-id']);
        $requestMap[$plcRequest->getId()] = $plcRequest;
        $serverPLC->sendRequest($plcRequest);
    } else if ($request['type'] == 'log-result') {
        // Look up the original request and pass the log back to its client
        $plcRequest = $requestMap[$request['id']];
        $plcRequest->getClient()->sendResult($request['log']);
    }
});

PLCRequest::getId would just generate some random unique identifier, such as via uniqid.
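A possible shape for PLCRequest, purely as a sketch: only getId and getClient appear in the pseudocode above, so the property names and getPlcId are assumptions. This version swaps uniqid for random_bytes(), since uniqid is derived from the current time and the PHP manual warns that it does not guarantee uniqueness; either would fit the pseudocode.

```php
<?php
// Sketch of the request object used in the pseudocode above.
final class PLCRequest
{
    private object $client; // the web client that asked for the log
    private string $plcId;  // which PLC should produce it
    private string $id;     // opaque ID sent to the PLC and echoed back

    public function __construct(object $client, string $plcId)
    {
        $this->client = $client;
        $this->plcId = $plcId;
        // 16 hex characters from a CSPRNG instead of uniqid()
        $this->id = bin2hex(random_bytes(8));
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getClient(): object
    {
        return $this->client;
    }

    public function getPlcId(): string
    {
        return $this->plcId;
    }
}
```

The ID is fixed at construction, so repeated calls to getId() on the same request return the same value, which the lookup table relies on.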

As far as what to do if the server crashes/restarts between when a log is requested and when it is returned, you'll just have to put some thought into that. The easiest thing to do would be to simply ignore the result and make the client re-request the log. If the server restarts, all your persistent connections will have to be re-established anyway, so you can just have the web client repeat the request. If you let the web client know what the request ID is after they request logs, then you could have it poll for those results periodically, and when the server gets them it could pass them along.

You could move the storage of pending request from memory to a database or external cache server such as memcache / redis, but that sort of just shifts the problem (what if they crash/restart?).

I don't know if any of the above helps, I might just be rambling about stuff you already know. Even after all your threads connected to this I don't really understand your setup and feel like it's overly complicated compared to what it could be.

Some of how this needs to work depends on things that you seem to have not figured out yet, or just are not saying. For example

Results are stored in the db and maybe a websocket is used to update the webclient (not shown).

Whether or not your end user clients make their requests and get their results over a websocket or via ajax polling (or both) can affect how best to design the overall system quite a bit. If you don't know which you're doing yet you should figure it out, or design a system that can do both.

NotionCommotion · January 17, 2018

Thanks Kicken, I must say your images look way better than mine! What software did you use?

For the restarting/crashing server part, originally I was planning on having the PLC send only its own GUID plus the originating server-to-PLC RPC ID in the HTTP request. Potentially, however, the server can be restarted, new RPC IDs are generated, and then an old PLC HTTP request is received which happens to have the same RPC ID. Is it common to log the times a service is started in a DB? My new thought is to have the socket server also send this start count (an integer) to the PLC and have the PLC include it in the HTTP request, so the combination will always be unique. I will likely just keep the current $map[serverStartCount][guid][rpcId]=$commandMetaData in memory, and use a destructor to save it in the DB and a constructor to populate the map. Maybe the constructor will also take the opportunity to delete any old orphaned messages.
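A rough sketch of the start-count idea, with the database left out: export() stands in for whatever gets saved at shutdown, and the constructor's second argument for whatever is loaded at startup. The class name CommandMap and its methods are hypothetical. It flattens $map[serverStartCount][guid][rpcId] into one composite string key, which is an assumption made here to keep the example short.

```php
<?php
// Hypothetical in-memory map of pending commands, keyed by
// server start count + PLC GUID + RPC ID.
final class CommandMap
{
    private int $startCount;
    /** @var array<string, array> composite key => command metadata */
    private array $entries;

    public function __construct(int $startCount, array $restored = [])
    {
        $this->startCount = $startCount;
        $this->entries = $restored; // entries saved by a previous run
    }

    // Store metadata for a command sent during the current run.
    public function remember(string $guid, int $rpcId, array $meta): void
    {
        $this->entries["{$this->startCount}:{$guid}:{$rpcId}"] = $meta;
    }

    // Look up (and remove) the metadata named in the PLC's HTTP request.
    public function take(int $startCount, string $guid, int $rpcId): ?array
    {
        $key = "{$startCount}:{$guid}:{$rpcId}";
        $meta = $this->entries[$key] ?? null;
        unset($this->entries[$key]);
        return $meta;
    }

    // What would be written to the DB before shutdown.
    public function export(): array
    {
        return $this->entries;
    }
}
```

One caveat: a destructor does not run if the process is killed or crashes hard, so saving each entry as it is created may be safer than saving everything at shutdown.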

No, I am not "not saying", but just haven't figured out what I want to do with websockets. My intent would be to build it so it can do both, and initially implement polling.

Thanks for the help
