
Gzz proposal: Make cell IDs lazy



Hi Gzz hackers,

I've just done a memory dump with Blackdown's xprof during client startup.
The results are interesting: by far most of the memory we allocate goes to
strings, byte arrays, and character arrays. And SpanSpacepart alone
allocates 35 MB of character arrays during space loading while creating
IDs for its cells, in 54'000 objects, of which only 2.5 MB or 3'500
objects remain alive after loading. Part of this comes from
PlainVStreamDim, but a lot comes from GZZ1SpaceHandler, too; remember
that a LOT of the connections we have to load are d.vstream connections.

To handle this, I propose that Cells should not have to contain String ids:

* Cell.id becomes private. If you need to access a Cell's String ID,
  you have to call Cell.getId(). (Virtually the only places where
  the IDs are really needed are saving and printing cells out, neither
  of which is a *truly* big task, so this should be fine.)
* When constructing a cell, you use one of two constructors,
  passing either a String ID or an inclusion object and index.
  As now, you have to be careful that for any given ID, you always
  create cells the same way: either always with a String ID or always
  with an inclusion object and index, never a mix of the two.
* Space and Spacepart get a new method, getId(Cell), that
  computes the ID if it wasn't passed to the constructor.
* If a String ID is passed, it's still interned.
* When comparing two cells, if they both have String IDs,
  these are compared by identity; if only one has a String ID,
  the cells are treated as not equal; if neither has a String ID,
  spaceparts, inclusion objects, and inclusion indices are
  compared. (Parts are compared by identity, inclusion objects
  by equality; it's the parts' responsibility to ensure that
  comparing their objects by equality is fast.)
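
To make the points above concrete, here is a minimal sketch of what such a
cell could look like. It is not the actual Gzz code: the names LazyCell,
IdSource and DemoPart are made up for illustration, IdSource stands in for
the getId(Cell) method proposed for Space and Spacepart, and the
"object:index" ID format is purely an assumption.

```java
import java.util.Objects;

// Hypothetical stand-in for Space/Spacepart: computes a cell's ID on demand.
interface IdSource {
    String getId(LazyCell c);
}

final class LazyCell {
    private final String id;        // interned String ID, or null if lazy
    final IdSource part;            // the part this cell belongs to
    final Object inclusionObject;   // null when a String ID was given
    final int inclusionIndex;

    // Constructor 1: explicit String ID, still interned.
    LazyCell(IdSource part, String id) {
        this.part = part;
        this.id = id.intern();
        this.inclusionObject = null;
        this.inclusionIndex = -1;
    }

    // Constructor 2: inclusion object and index; no ID string is allocated.
    LazyCell(IdSource part, Object inclusionObject, int inclusionIndex) {
        this.part = part;
        this.id = null;
        this.inclusionObject = Objects.requireNonNull(inclusionObject);
        this.inclusionIndex = inclusionIndex;
    }

    // The ID is only built when somebody asks for it (saving, printing).
    String getId() {
        return id != null ? id : part.getId(this);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LazyCell)) return false;
        LazyCell c = (LazyCell) o;
        // Both have String IDs: interned, so identity comparison suffices.
        if (id != null && c.id != null) return id == c.id;
        // Only one has a String ID: treated as not equal.
        if (id != null || c.id != null) return false;
        // Neither has one: parts by identity, inclusion objects by equality.
        return part == c.part
            && inclusionIndex == c.inclusionIndex
            && inclusionObject.equals(c.inclusionObject);
    }

    @Override
    public int hashCode() {
        return id != null ? id.hashCode()
                          : 31 * inclusionObject.hashCode() + inclusionIndex;
    }
}

// Hypothetical part whose IDs are "<inclusion object>:<index>".
final class DemoPart implements IdSource {
    public String getId(LazyCell c) {
        return c.inclusionObject + ":" + c.inclusionIndex;
    }
}
```

With this shape, loading a space only stores a reference and an int per
lazy cell; the character arrays for the ID are allocated only if getId()
is actually called. The sketch deliberately does not cache the computed
ID, since caching would bring back the memory cost for every cell that is
ever printed or saved.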

(Note on saving: only the IDs that appear in the diff should have to be generated.)