Is there a function that will automatically count the bytes in a variable?

blurredvision · May 28, 2008

I'm in need of the ability to be able to count the bytes of a variable to use in an equation. I'd imagine something exists to do this, I just don't know what it is.

Thanks for any help!

MadTechie · May 28, 2008

try this *untested*

    /**
     * Count the number of bytes of a given string.
     * Input string is expected to be ASCII or UTF-8 encoded.
     * Warning: the function doesn't return the number of chars
     * in the string, but the number of bytes.
     *
     * @param string $str The string to compute number of bytes
     *
     * @return The length in bytes of the given string.
     */
    function strBytes($str)
    {
      // STRINGS ARE EXPECTED TO BE IN ASCII OR UTF-8 FORMAT
     
      // Number of characters in string
      $strlen_var = strlen($str);

      // string bytes counter
      $d = 0;
     
     /*
      * Iterate over every character in the string,
      * escaping with a slash or encoding to UTF-8 where necessary
      */
      for ($c = 0; $c < $strlen_var; ++$c) {
         
          $ord_var_c = ord($str{$d});
         
          switch (true) {
              case (($ord_var_c >= 0x20) && ($ord_var_c <= 0x7F)):
                  // characters U-00000000 - U-0000007F (same as ASCII)
                  $d++;
                  break;
             
              case (($ord_var_c & 0xE0) == 0xC0):
                  // characters U-00000080 - U-000007FF, mask 110XXXXX
                  // see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
                  $d+=2;
                  break;

              case (($ord_var_c & 0xF0) == 0xE0):
                  // characters U-00000800 - U-0000FFFF, mask 1110XXXX
                  // see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
                  $d+=3;
                  break;

              case (($ord_var_c & 0xF8) == 0xF0):
                  // characters U-00010000 - U-001FFFFF, mask 11110XXX
                  // see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
                  $d+=4;
                  break;

              case (($ord_var_c & 0xFC) == 0xF8):
                  // characters U-00200000 - U-03FFFFFF, mask 111110XX
                  // see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
                  $d+=5;
                  break;

              case (($ord_var_c & 0xFE) == 0xFC):
                  // characters U-04000000 - U-7FFFFFFF, mask 1111110X
                  // see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
                  $d+=6;
                  break;
              default:
                $d++;   
          }
      }
     
      return $d;
    }

blurredvision · May 28, 2008

Woah! Thanks MadTechie, didn't know that it would be so much work, I was hoping there would be a built-in function, otherwise I would've done more research myself instead of you typing all that out. I'll give that look, though, thanks.

Sign In

Is there a function that will automatically count the bytes in a variable?

Recommended Posts

blurredvision

Link to comment

Share on other sites

MadTechie

Link to comment

Share on other sites

blurredvision

Link to comment

Share on other sites

Join the conversation

Browse

Activity

Important Information