
Efficient array storage for PHP


3 replies to this topic

#1 btherl

btherl
  • Staff Alumni
  • Advanced Member
  • 3,893 posts
  • Location: Australia

Posted 05 September 2006 - 07:50 AM

PHP's arrays have high overhead, particularly for complex structures such as

$a[0] = array(
  'host' => 'www.phpfreaks.com',
  'path' => 'forums',
  'file' => 'index.php',
  'args' => 'action=post',
  'count' => 5,
  'potatoes' => true,
  // ... etc etc
);

Are there any extensions for PHP that allow more efficient data storage, at the expense of reduced flexibility? I am looking for something similar to a C structure, where data is tightly packed and the names of the elements do not need to be stored.

The structure needs to support:

  • Inserting at the end of the array
  • Fetching data from any location in the array (indexed by integers, like a C array)
  • Sorting (this could be complex)

My main concern is that the labels for each data item (such as 'host', 'path') are repeated for EVERY element in my array $a. This is a painful waste of space. My secondary concern is that I do not want to use an entire zval to store a simple boolean or a simple integer. I would like to pack these values more tightly, even if accessing them becomes more costly.
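To make the idea concrete, here is a rough sketch of C-structure-like storage in plain PHP, using the built-in pack() and unpack() on a single string buffer. The field widths (32-byte host, 16-byte path, 32-bit count, 1-byte flag) and the names RECORD_FORMAT, RECORD_SIZE, record_append() and record_fetch() are assumptions made up for illustration, not an existing extension. Strings longer than their field are silently truncated by pack().

```php
<?php
// Hypothetical fixed-width layout: host (32 bytes), path (16 bytes),
// count (unsigned 32-bit, big endian), flag (1 byte).
const RECORD_FORMAT = 'a32a16NC';
const RECORD_SIZE   = 32 + 16 + 4 + 1; // 53 bytes per record

// Append one record to the end of the packed buffer.
function record_append(string &$buffer, string $host, string $path, int $count, bool $flag): void
{
    $buffer .= pack(RECORD_FORMAT, $host, $path, $count, $flag ? 1 : 0);
}

// Fetch record number $i (0-based) and decode it into a normal array.
function record_fetch(string $buffer, int $i): array
{
    $raw = substr($buffer, $i * RECORD_SIZE, RECORD_SIZE);
    $row = unpack('a32host/a16path/Ncount/Cflag', $raw);

    // Strip the NUL padding that pack() added to the string fields.
    $row['host'] = rtrim($row['host'], "\0");
    $row['path'] = rtrim($row['path'], "\0");
    $row['flag'] = (bool) $row['flag'];

    return $row;
}

$buffer = '';
record_append($buffer, 'www.phpfreaks.com', 'forums', 5, true);
record_append($buffer, 'www.example.com', 'docs', 7, false);

$second = record_fetch($buffer, 1);
echo $second['host'], ' ', $second['count'], "\n";
```

The field names appear only once, in the unpack() format, and each record costs a fixed 53 bytes. The trade-off is the one described above: every access pays for a substr() and an unpack(), and sorting would need either decoding or a separate index.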

Thanks for any advice :)

#2 Jenk

Jenk
  • Members
  • Advanced Member
  • 778 posts

Posted 05 September 2006 - 07:52 AM

In a word: no.

Besides, PHP uses C directly for all primitives; the only exception to the rule is that string keys are allowed. (It uses a secondary array for the keys.)

Why is it such a concern? What exactly are you doing with arrays to make them so huge?

#3 Zane

Zane
  • Administrators
  • Advanced Member
  • 4,134 posts

Posted 05 September 2006 - 08:02 AM

I am looking for something similar to a C structure, where data is tightly packed, and names for the elements do not need to be stored.

Well, you don't HAVE to store the names of the elements.

I think the default structure of an array is sorted by index,
unless it happens to have a string key.


I don't know if I'm exactly on the right track with what you're asking, but
if you don't want an associative array (one with string indices)
and want to avoid the 'repeats',

maybe put a schema element at the beginning of the array to
keep your code aesthetically pleasing, or whatever.

But overall, you don't need the keys.
And as for your question about adding to the end of the array,
it should still work.
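As a minimal sketch of that suggestion: the column names can live in a handful of constants (the "schema") while each row is a plain list. The COL_* constant names and the second sample row are assumptions invented for this example.

```php
<?php
// The "schema": column positions are named once instead of in every row.
const COL_HOST  = 0;
const COL_PATH  = 1;
const COL_FILE  = 2;
const COL_COUNT = 3;

$a = [];

// Appending to the end of the array works as usual.
$a[] = ['www.phpfreaks.com', 'forums', 'index.php', 5];
$a[] = ['www.example.com', 'docs', 'index.php', 2];

// Rows are fetched by integer position, fields by schema constant.
echo $a[1][COL_HOST], "\n";
```

This removes the repeated string keys from each row, but every value is still a full PHP value, so it does not address the per-value overhead.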


#4 btherl

btherl
  • Staff Alumni
  • Advanced Member
  • 3,893 posts
  • Location: Australia

Posted 05 September 2006 - 09:00 AM

Thanks for the comments!

The reason it's a concern is that I want to process large amounts of data, and I want to sort the entire data set by various keys.  The larger data sets are 250k rows, and possibly larger.  The data would fit in memory, but the overhead of storing the data is currently much larger than the data itself.
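For reference, sorting the whole data set by an arbitrary column might look like the sketch below, assuming the rows are plain lists indexed by column position. The function name sort_by_column() and the sample rows are hypothetical.

```php
<?php
// Return a copy of $rows sorted in ascending order by the column at position $col.
function sort_by_column(array $rows, int $col): array
{
    usort($rows, function (array $x, array $y) use ($col): int {
        return $x[$col] <=> $y[$col];
    });

    return $rows;
}

$rows = [
    ['www.phpfreaks.com', 5],
    ['www.example.com', 9],
    ['www.example.org', 1],
];

$byHost  = sort_by_column($rows, 0); // sort by the host column
$byCount = sort_by_column($rows, 1); // sort by the count column
```

usort() works on the array in memory, so this only helps once the data set actually fits, which is the problem described above.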

Jenk, I'm not sure what you mean by "PHP uses C directly for primitives". As I understand it, PHP uses hash tables for arrays.

Zanus, there looks to be a small memory benefit to using integers instead of strings, based on small tests.  I will try it and see if it measurably improves memory usage, thanks :)

I don't think the underlying structure is different for numerically indexed arrays, however. It would be nice if it were :) But memory usage is the same for numerically indexed arrays and arrays indexed by short strings. I would expect lower memory usage if a simple array were being used.
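A small test of this kind could be written roughly as follows, using the built-in memory_get_usage(). The helper name measure_rows(), the sample values and the row count of 10000 are assumptions for illustration. The numbers it prints depend heavily on the PHP version and build, so no expected output is given.

```php
<?php
// Build $n rows that use the given keys and return the number of bytes
// allocated while they are held in memory.
function measure_rows(array $keys, int $n): int
{
    $before = memory_get_usage();

    $rows = [];
    for ($i = 0; $i < $n; $i++) {
        $rows[] = array_combine($keys, ['www.phpfreaks.com', 'forums', $i]);
    }

    // $rows is still alive here, so the difference covers the whole data set.
    return memory_get_usage() - $before;
}

$withStrings  = measure_rows(['host', 'path', 'count'], 10000);
$withIntegers = measure_rows([0, 1, 2], 10000);

printf("string keys: %d bytes, integer keys: %d bytes\n", $withStrings, $withIntegers);
```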



