Re: Object query language?



On Sun, 19 Mar 2000, William Uther wrote:
> --On Saturday, March 18, 2000 4:51 PM +0100 Falko Braeutigam 
> <falko@softwarebuero.de> wrote:
> 
> > You can use any collection classes in ozone. In fact, ODMG collections
> > have exactly the same interface as java.util collection.
> 
> I agree.  I was expecting the ODMG and normal collections to have slightly 
> different semantics, however.
> 
> Normal collections are loaded into memory as a block.  They cannot be any 
> larger than available RAM on the server.  This is exactly the reason that 
> Ozone uses a disk-based map for its index.
Indeed.

> 
> I was expecting the ODMG collections to be disk-based and hence of 
> arbitrary size.

This is definitely a goal of the ozone collections, but it is not defined by
ODMG. In fact, it seems that no other ODMG vendor has truly scalable
(disk-based) ODMG collections.

> 
> > Is this the reason you tried to use proxies as keys, Will: to avoid
> > storing the key value twice if it is also a direct attribute?
> 
> I guess you could view it that way.  (In my case the key is not an 
> attribute.  The key is a value calculated using the attributes.  A separate 
> key would not be a duplicate of the attributes, but a cache of the 
> calculated value.)
> 
> > In most cases it should
> > be no problem to sync the index with the attribute values of the indexed
> > objects.
> 
> It is not too much extra work.  It would complicate the code though.
> 
> > This
> > is much faster *and* less space consuming (although you avoid doubling
> > the key) than having millions of 10-byte objects stored in the database.
> 
> In my case the objects used as keys are already proxied objects.  I 
> wouldn't be saving on the number of proxied objects.
> 
> The big speed hit is going to be when loading or unloading the hashtable 
> from memory, not accessing it once in memory.  The extra work of creating 
> the cache object is going to eat up any advantage you might have during 
> normal operation.
Did you actually compare both cases? I'm very interested in the results.

The work of creating the 'cache' object needs to be done only once, whereas
the proxy overhead affects every method call!
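
To make the difference visible, here is a rough sketch. It is not ozone code:
java.lang.reflect.Proxy only stands in for a database proxy (the real ozone
proxies work differently), and the names ProxyCallCount, Line and forwarded are
made up for illustration. It counts how many calls a plain HashMap put/get
pushes through the proxy.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a proxied database object; not the ozone API.
final class ProxyCallCount {
    interface Line { String text(); }

    // Number of method calls that had to go through the proxy.
    static final AtomicInteger forwarded = new AtomicInteger();

    static Line proxyFor(final String text) {
        InvocationHandler handler = (proxy, method, args) -> {
            forwarded.incrementAndGet();
            switch (method.getName()) {
                case "hashCode": return text.hashCode();
                case "equals":   return proxy == args[0];
                case "text":     return text;
                default:         return "Line(" + text + ")"; // toString
            }
        };
        return (Line) Proxy.newProxyInstance(
                Line.class.getClassLoader(), new Class<?>[] { Line.class }, handler);
    }

    public static void main(String[] args) {
        Line key = proxyFor("some line of text");
        Map<Line, String> index = new HashMap<>();
        index.put(key, "value"); // hashCode() goes through the proxy
        index.get(key);          // and again on every lookup
        System.out.println("calls forwarded through the proxy: " + forwarded.get());
    }
}
```

With a real database proxy each forwarded call would cost far more than a
counter increment, which is why I would like to see measured numbers.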

> 
> In terms of storage space, we are comparing an extra proxy (referencing an 
> already proxied object) to a new object duplicating a pair of strings each 
> up to 100 bytes long (say 30 bytes average).  (I moved from chars to lines 
> of text because char-by-char compression was too slow.)  I would be 
> extremely surprised if the extra proxy was costing me 60 bytes in storage.

If your keys are database objects (proxies) already, then of course the picture
changes. Adding one new proxy for an existing object costs less than 60 bytes.

However, if a database object is used as a key, its hashCode() value must not
change. So you could also compute the hash value once and use that
non-database object as the key.
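
Roughly what I mean, as a minimal sketch and not ozone code: CachedKey is a
made-up name, and the two strings stand for whatever attributes your key is
calculated from. The hash is computed once in the constructor and the object is
immutable, so the value can never change while it sits in a map.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type; the two strings are assumed to be the attributes
// the key value is calculated from.
final class CachedKey implements Serializable {
    private final String first;
    private final String second;
    private final int hash; // computed once, never changes afterwards

    CachedKey(String first, String second) {
        if (first == null || second == null) {
            throw new NullPointerException("key parts must not be null");
        }
        this.first = first;
        this.second = second;
        this.hash = 31 * first.hashCode() + second.hashCode();
    }

    @Override public int hashCode() { return hash; }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CachedKey)) return false;
        CachedKey k = (CachedKey) other;
        return hash == k.hash && first.equals(k.first) && second.equals(k.second);
    }

    public static void main(String[] args) {
        Map<CachedKey, String> index = new HashMap<>();
        index.put(new CachedKey("line one", "line two"), "some value");
        // An equal key built later finds the same entry.
        System.out.println(index.get(new CachedKey("line one", "line two")));
    }
}
```

The price is the one you mentioned: the strings are stored a second time inside
the key.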

I could add some code to a (possibly generated) hashCode() method of the
proxies that checks whether the object is just being serialized and returns
the default hashCode in that case. Sounds extremely hackish but should work.


Falko
-- 
______________________________________________________________________
Falko Braeutigam                         mailto:falko@softwarebuero.de
softwarebuero m&b (SMB)                    http://www.softwarebuero.de