NotionCommotion Posted February 16, 2017

I have the following script. When echoing $data, it includes some strange characters at the beginning of the message.

    <?php
    require 'vendor/autoload.php';
    $loop = React\EventLoop\Factory::create();
    $socket = new React\Socket\Server($loop);
    $socket->on('connection', function (\React\Socket\ConnectionInterface $client) use ($loop) {
        $client->on('data', function ($data) {
            echo($data."\r\n");
        });
    });
    $socket->listen(1337, '0.0.0.0');
    $loop->run();

It turns out that $data has 4 bytes prepended to it that represent the length of the message. On the receiving connection, how can I get those 4 bytes and also remove them from $data?
ginerjm Posted February 16, 2017

If you are positive that it is always a 4 byte length field, you could use substr() on the string.
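As a rough illustration of the substr() suggestion, here is a minimal sketch. The sample string is made up, and it assumes the chunk starts exactly at a length field, which a stream does not guarantee. substr() counts bytes rather than characters, so an offset of 4 means exactly four bytes.

```php
<?php
// Hypothetical chunk: a 4-byte length field followed by a 5-byte payload.
$data = "\x00\x00\x00\x05hello";

// substr() works on bytes, so 4 here means four bytes, not four characters.
$prefix  = substr($data, 0, 4); // the raw length field
$payload = substr($data, 4);    // everything after it

var_dump(bin2hex($prefix));
var_dump($payload);
```

The prefix is still raw binary at this point, which is why echoing it shows unprintable characters; turning it into a number is a separate step.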
Jacques1 Posted February 17, 2017

I'm still not sure if you understand the concept of streaming. The data you receive is not the message. It's a portion of a byte stream, which means it could contain anything: a message fragment, a single complete message, multiple messages. You never know.

That's why the “strange length bytes” exist: They tell you where the messages end. You must process this information and write assembly logic for the messages.
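The assembly logic described here could look roughly like the sketch below. It assumes a 4-byte big-endian length prefix, which the two hosts would have to agree on, and the function name extractMessages is made up for illustration. The simulated chunks deliberately split messages at awkward places.

```php
<?php
/**
 * Pull every complete length-prefixed message out of $buffer.
 * Assumption: each message is preceded by a 4-byte big-endian length.
 * $buffer is passed by reference and keeps any incomplete tail.
 */
function extractMessages(string &$buffer): array
{
    $messages = [];
    while (strlen($buffer) >= 4) {
        $length = unpack('Nlen', substr($buffer, 0, 4))['len'];
        if (strlen($buffer) < 4 + $length) {
            break; // message not complete yet, wait for more data
        }
        $messages[] = substr($buffer, 4, $length);
        $buffer = substr($buffer, 4 + $length);
    }
    return $messages;
}

// Simulated chunks whose boundaries do not line up with message boundaries.
$buffer = '';
$received = [];
foreach (["\x00\x00\x00\x02h", "i\x00\x00\x00\x03a", "bc"] as $chunk) {
    $buffer .= $chunk;
    foreach (extractMessages($buffer) as $message) {
        $received[] = $message;
    }
}
var_dump($received);
```

The first chunk holds a fragment, the second completes one message and starts another, and the third finishes it, so the caller only ever sees whole messages.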
NotionCommotion Posted February 17, 2017 Author

Thanks ginerjm,

Would it be substr with 2 or 4? 2 bytes per character, right? Funny, but I seem to get the same results using either 2 or 4. I don't think I need http://php.net/manual/en/function.mb-substr.php, do you?

If I do the following and echo $firstFourBytes, it displays some unprintable text.

    $firstFourBytes=substr($data,2);
    $remainingContent=substr($data,0,2);

The four bytes are added with the following C++ code.

    uint32_t size = message.size();
    unsigned char mSize[sizeof(size)];
    memcpy(&mSize, static_cast<void*>(&size), sizeof(size));
    data.insert(data.begin(), mSize, mSize + sizeof(size));

How could I get the numerical value of those four bytes using PHP?
NotionCommotion Posted February 17, 2017 Author

Jacques1 wrote:
"I'm still not sure if you understand the concept of streaming. The data you receive is not the message. It's a portion of a byte stream, which means it could contain anything: a message fragment, a single complete message, multiple messages. You never know. That's why the “strange length bytes” exist: They tell you where the messages end. You must process this information and write assembly logic for the messages."

You're right, I still don't completely understand the concept of streaming, but I understand way more than I did a while ago.

What I have been doing is using the JSONStream class kicken posted under https://forums.phpfreaks.com/topic/302840-http-server-with-two-hosts-and-same-port/?p=1540924.

It is my understanding that I can either use the length to extract the message as you just indicated, or do it the kicken way, which looks for an end of line. If I have the length (which I do), do you think that is the better way? How do I actually get these four bytes as a number? Can you point me in the right direction to implement this?

Thanks
Jacques1 Posted February 17, 2017

The two hosts first need to agree on a byte order for the length. How integers are stored is machine-dependent, so there must be a common format on the wire (a good candidate is network byte order, i.e. big endian). C has the htonl() function for the conversion.

Then the PHP script can unpack the bytes:

    <?php
    // test: 16 in big endian
    $lengthField = "\x00\x00\x00\x10";
    $length = unpack('Nlen', $lengthField)['len'];
    var_dump($length);

Whether you use length prefixes or delimiters is a design choice. Apparently you (or whoever wrote the C++ code) decided against delimiters.
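To see why the byte order matters, the following sketch packs the same number both ways and then decodes the little-endian bytes as if they were big endian. The 'V' format is PHP's 32-bit little-endian code, and a plain memcpy of a uint32_t on a little-endian machine such as x86 would produce that layout; whether the C++ sender actually runs on such a machine is an assumption.

```php
<?php
$length = 16;

$bigEndian    = pack('N', $length); // network byte order
$littleEndian = pack('V', $length); // layout of a raw uint32_t on a little-endian host

var_dump(bin2hex($bigEndian));    // "00000010"
var_dump(bin2hex($littleEndian)); // "10000000"

// Decoding with the format the sender actually used gives 16 ...
$right = unpack('Nlen', $bigEndian)['len'];
// ... while reading little-endian bytes as big endian gives a huge number.
$wrong = unpack('Nlen', $littleEndian)['len'];

var_dump($right, $wrong);
```

If the sender is left unchanged, decoding with 'V' instead of 'N' on the PHP side would be the matching choice on such hosts, but agreeing on network byte order is the more portable route.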
NotionCommotion Posted February 17, 2017 Author

Jacques1 wrote:
"Whether you use length prefixes or delimiters is a design choice. Apparently you (or whoever wrote the C++ code) decided against delimiters."

I had just naively assumed delimiters, as the C++ author and I never discussed it. If necessary, the C++ code can be changed to use delimiters. Do you feel one design choice is better than the other, or does the better approach depend on the situation? If it depends on the situation, what aspects influence the decision?

Thanks
Jacques1 Posted February 17, 2017

Delimiters must be chosen specifically for the data format (e.g. JSON) -- and may not even be possible. Length prefixes are generic and always work.

On the other hand, it's slightly simpler to look for delimiters than to extract and process lengths.
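For comparison, delimiter-based framing might be sketched as follows. It assumes each message is a JSON document terminated by a single "\n", which works because json_encode() without pretty-printing escapes newlines inside strings; the function name extractLines is made up for illustration.

```php
<?php
/**
 * Pull every complete newline-terminated message out of $buffer.
 * Assumption: "\n" never occurs inside a message.
 * $buffer is passed by reference and keeps any unterminated tail.
 */
function extractLines(string &$buffer): array
{
    $messages = [];
    while (($pos = strpos($buffer, "\n")) !== false) {
        $messages[] = substr($buffer, 0, $pos);
        $buffer = substr($buffer, $pos + 1);
    }
    return $messages;
}

// The second message is split across two chunks.
$buffer = '';
$decoded = [];
foreach (['{"id":1}' . "\n" . '{"id"', ':2}' . "\n"] as $chunk) {
    $buffer .= $chunk;
    foreach (extractLines($buffer) as $line) {
        $decoded[] = json_decode($line, true);
    }
}
var_dump($decoded);
```

The loop is a little shorter than the length-prefix version, but it only works for formats where the delimiter cannot appear in the payload.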
NotionCommotion Posted February 17, 2017 Author

Well, then delimiters it is! I've gone through too much just to settle for "slightly simpler".

As far as implementation, I am thinking of something like the following. Before going down the path to invent something new, has this requirement been implemented many times before, resulting in a better solution?

    <?php
    require 'vendor/autoload.php';
    $loop = React\EventLoop\Factory::create();
    $socket = new React\Socket\Server($loop);
    $socket->on('connection', function (\React\Socket\ConnectionInterface $socket) use ($loop) {
        $superSocket = new DealWithSocket($socket);
        $superSocket->on('data', function ($message) use ($superSocket) {
            echo($message.PHP_EOL);
            $superSocket->send('thank you!');
        });
    });
    $socket->listen(1337, '0.0.0.0');
    $loop->run();

    <?php
    class DealWithSocket implements Evenement\EventEmitterInterface { // Should Evenement be used?
        use Evenement\EventEmitterTrait;
        private $socket, $buffer='', $messageLength, $messageLengthPointer=0;

        public function __construct(React\Stream\DuplexStreamInterface $socket) {
            $this->socket = $socket;
            $this->socket->on('data', function ($data) {
                $this->buffer .= $data;
                $this->parseBuffer();
            });
        }

        public function send($string) {
            $this->socket->write(strlen($string).$string); // I don't think this is right
        }

        private function getLength($string, $start=0) {
            // My understanding is that "N" represents an unsigned long (always 32 bit, big endian byte order)
            // and "len" is just whatever you want the array index name to be
            return unpack('Nlen', substr($string, $start, 4))['len'];
        }

        private function parseBuffer() { // And this needs help...
            if (is_null($this->messageLength)) { // Save the first time data is received?
                $this->messageLength = $this->getLength($this->buffer);
            }
            while (strlen($this->buffer) > ($this->messageLength+4)) {
                $message = substr($this->buffer, 4, $this->messageLength);
                $this->buffer = substr($this->buffer, $this->messageLength+4);
                $this->emit('data', $message);
            }
        }
    }
NotionCommotion Posted February 17, 2017 Author

A little better with my DealWithSocket class...

    <?php
    class DealWithSocket implements Evenement\EventEmitterInterface { // Should Evenement be used?
        use Evenement\EventEmitterTrait;
        private $socket, $buffer='', $messageLength;

        public function __construct(React\Stream\DuplexStreamInterface $socket) {
            $this->socket = $socket;
            $this->socket->on('data', function ($data) {
                $this->buffer .= $data;
                $this->parseBuffer();
            });
        }

        public function send($string) {
            $this->socket->write(strlen($string).$string); // I don't think this is right
        }

        private function getLength() {
            // My understanding is that "N" represents an unsigned long (always 32 bit, big endian byte order)
            // and "len" is just whatever you want the array index name to be
            return strlen($this->buffer)>=4 ? unpack('Nlen', substr($this->buffer,0,4))['len'] : 0;
        }

        private function parseBuffer() { // Is using string functions like strlen() appropriate?
            if (!$this->messageLength && strlen($this->buffer)>=4) {
                // Save the first time data is received or it happened to end perfectly at the end?
                $this->messageLength = $this->getLength();
            }
            while ($this->messageLength && strlen($this->buffer)>=($this->messageLength+4)) {
                $message = substr($this->buffer, 4, $this->messageLength);
                $this->emit('data', $message);
                $this->buffer = substr($this->buffer, $this->messageLength+4);
                $this->messageLength = $this->getLength();
            }
        }
    }

Edited February 17, 2017 by NotionCommotion
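On the sending side, concatenating strlen() as a decimal string would not produce the same 4-byte field that the reader unpacks. A minimal sketch of a matching writer, assuming the big-endian 'N' format is the agreed wire format and using a made-up helper name frame():

```php
<?php
/**
 * Prefix $payload with its byte length as a 4-byte big-endian integer.
 * Assumption: the receiver decodes the prefix with unpack('N', ...).
 */
function frame(string $payload): string
{
    return pack('N', strlen($payload)) . $payload;
}

$wire = frame('thank you!');

var_dump(bin2hex(substr($wire, 0, 4))); // the binary length field
var_dump(substr($wire, 4));             // the payload itself
```

Inside a class like the one above, the result of frame() is what would be handed to the stream's write() call. strlen() is appropriate here because it counts bytes, which is what the length field describes.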
kicken Posted February 17, 2017

So are you going to switch to delimiters or keep the length prefix? Your post suggests delimiters but your code is still using lengths.

If you want to use delimiters, refer back to the JSONStream class and see how it handles parsing out individual items. Use strpos to search for the delimiter and then extract the message.

If you want to continue with length processing then you could try looking at the Gearman code I wrote as an example. It doesn't use react but it will demonstrate parsing out a packet. Areas of interest may be Connection::readPacket and Packet::fromString.

    public function readPacket(){
        if (!$this->stream){
            $this->connect();
        }
        $header = $this->read(12);
        $size = substr($header, 8, 4);
        $size = Packet::fromBigEndian($size);
        $arguments = $size > 0?$this->read($size):'';
        return Packet::fromString($header . $arguments);
    }

The gearman protocol specifies that each packet begins with a 4-byte magic code, a 4-byte packet type and a 4-byte packet size (in network byte order). So the first 12 bytes are read (4*3), then the last 4-byte group is extracted and turned into an integer to determine how much additional data is read.

    public static function fromString($data){
        $magic = substr($data, 0, 4);
        $type = substr($data, 4, 4);
        $type = static::fromBigEndian($type);
        $size = substr($data, 8, 4);
        $size = static::fromBigEndian($size);
        $arguments = substr($data, 12, $size);
        $validSize = strlen($arguments) === $size;
        if (!$validSize){
            throw new UnexpectedPacketException;
        }
        $arguments = explode(chr(0), $arguments);
        $packet = new static($magic, $type, $arguments);
        return $packet;
    }

Once the packet is read in whole it's sent here and broken down into individual fields for easy consumption.
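The 12-byte header layout described above can also be decoded in a single unpack() call. This is only a sketch of the idea, not the code from the Gearman library mentioned here: the packet contents are invented, and the field names magic, type and size are just array keys chosen for the example.

```php
<?php
// Hypothetical packet: 4-byte magic, type 7, size 5, then 5 bytes of
// NUL-separated arguments.
$packet = "\0REQ" . pack('NN', 7, 5) . "a\0bcd";

// One format string for all three header fields:
// a4 = 4 raw bytes, N = 32-bit unsigned big-endian integer.
$header = unpack('a4magic/Ntype/Nsize', substr($packet, 0, 12));

// The size field says how many bytes of arguments follow the header.
$arguments = explode("\0", substr($packet, 12, $header['size']));

var_dump($header, $arguments);
```

Reading the fixed-size header first and only then the variable-size body is the same two-step pattern as in readPacket() above.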
NotionCommotion Posted February 17, 2017 Author

Thanks kicken,

Yeah, I've gone back and forth a couple times. The implications of the two approaches really didn't sink in until recently. I get the JSONStream class and appreciate your sharing it. I'll take a look at the Gearman code and ask questions if necessary.

Night!