Re: ozone & modularity

Jean-Marc, Andreas and all, this gets more and more related to the ozone
development. Should we take this to the ozone-dev list? (BTW: cross posting does
not work)

On Mon, 05 Feb 2001, Jean-Marc Vanel wrote:
> Falko Braeutigam wrote :
> > On Mon, 29 Jan 2001, Jean-Marc Vanel wrote:
> > > Ozone needs more documentation, especially a high-level description of
> > > the architecture.
> > I believe that it would be a
> > good thing if other programmers tried to understand the design and the
> > code, to find my mistakes and contribute their fresh, new ideas. Making the
> > requested documentation could be a good starting point for this. Of course, I
> > would be more than happy to help with this by answering related questions on
> > the list. This would start a lot of discussions about all aspects of ozone
> > which would be another good thing.
> I'll try to add my stone to the building.

> > We have 2
> > implementations of that (although one is not supported currently) which proves
> > that it is actually possible to plug-in new back-end stores.
> > The kernel just handles communication, transaction/thread management,
> > permissions and object invocation. Its entire interface is defined by the Db*
> > command objects, which represent a client->server request.
> > The client side layer does connection pooling and API related things (ODMG, XA,
> > etc.)
> > The XML storage works on top of this. It is just an application of ozone. No
> > changes to any ozone interface have been made in order to store XML in ozone!
> > ...
> OK, I'm convinced about the overall architecture. What worries me is a detail like
> the Env class that is ubiquitous and does too many things.
Oh yeah, the Env class has been there since the very first versions of ozone
(first CVS check-in: 1997/08/15 12:37:49 :) I don't think that it does too much
but at the same time I know that it is not the greatest design ever ;) I'm not
crazy about the current code. We may and should discuss all those issues. And if
the conclusion is to drop the Env class, or even half of the current code,
well, then let's do it!

> > > The limitations of the current implementation should be listed.
> > I agree. I will maintain this list. Just send your list of limitations and we
> > will discuss it.
> If ozone is to succeed, it must become more user-oriented. Listing current
> limitations (and functionalities) is part of this.
> - keeping all in memory during store of large object (see below)
Maybe I lost you here. Objects are always atomic.

> - total number of objects < 2^32 , because of ObjectID used ; 
not exactly true. 2^48 objects per ozone database node plus 2^16 nodes per
cluster. (clusters are of course not yet implemented)
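
To make the numbers concrete, here is a minimal sketch of how a 64-bit ID
could be split into a 16-bit node part and a 48-bit per-node part. The class
name, method names and bit layout are assumptions for illustration only; this
is not the actual ozone ObjectID code.

```java
// Hypothetical sketch, not the ozone ObjectID implementation.
// Layout assumed here: upper 16 bits = node number, lower 48 bits = local ID.
final class PackedObjectId {
    static final int LOCAL_BITS = 48;
    static final long LOCAL_MASK = (1L << LOCAL_BITS) - 1; // 2^48 - 1
    static final int MAX_NODE = (1 << 16) - 1;             // 2^16 - 1

    /** Combines a node number and a per-node ID into one 64-bit value. */
    static long pack(int node, long local) {
        if (node < 0 || node > MAX_NODE) {
            throw new IllegalArgumentException("node out of range: " + node);
        }
        if (local < 0 || local > LOCAL_MASK) {
            throw new IllegalArgumentException("local ID out of range: " + local);
        }
        return ((long) node << LOCAL_BITS) | local;
    }

    /** Extracts the node number (unsigned shift, so node 65535 stays positive). */
    static int node(long id) {
        return (int) (id >>> LOCAL_BITS);
    }

    /** Extracts the per-node part of the ID. */
    static long local(long id) {
        return id & LOCAL_MASK;
    }
}
```

With such a layout, PackedObjectId.pack(3, 42L) yields an ID whose node part
is 3 and whose local part is 42.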

> I know that it is
> easy to change; by the way, does every DOM node "consume" one ObjectID ?
Yes, every DOM node is a database object. But this introduces most of the
XML related problems we are facing currently. So this may and should be changed
in the future.

> > > There are many good free software projects, but API's are lacking, which
> > > would enable us to take the best of each.
> > Maybe I lost you here. Do you mean "standard" APIs, like ODMG and JDO, that
> > would (eventually) allow us to be vendor independent or do you mean APIs in
> > general?
> I mean standard ones , like SAX  , that have a big architectural role. Maybe soon
> TRAX and the xmldb effort can be added to the list. Also I remember a discussion on
> xml-dev about API's for indexing XML; I don't know what was the outcome.
Ok, but all this is already there in ozone.

> > > But anyway it seems that the Sax storage strategy should be enhanced to
> > > avoid memory limitations. Is it easy to change the storage policy so
> > > that memory problems are
> > > avoided during load of a large document? Perhaps by committing from time
> > > to time?
> > > Based on available memory, a chunk size for partial transactions could
> > > be computed.
> > > Ozone as it stands now can only store small XML files, about 1 MB, even
> > > lots of them. But with many files, you cannot do a global XPath request
> > > on all those documents without Prowler.
> > This is true. ozone/XML is currently not suited for XML files > 1 MB. As stated
> > on the toDo list this needs to be reworked. I do have some ideas in this
> > regard, and others hopefully do too; we should start discussing this effort.
> My above idea to do partial commits on the original object is obviously not good,
> because the integrity of the object could be lost. For large transactions, the
> whole transaction should be in a persistent storage before it is committed. Is it
> this sort of thing you mean by "paging server for store back-end" in the CHANGES
> file ?
No. By "paging server" I mean a store back-end that operates on pages of a big
file or a disk partition, instead of on distinct files as the current solution does.
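
As a rough illustration of the idea (not of any existing ozone code), a
page-oriented back-end could address fixed-size pages inside one big file
like this. The class name PageFile and the 4096-byte page size are
assumptions made up for this sketch.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch of a page-oriented store; not part of ozone.
final class PageFile implements AutoCloseable {
    static final int PAGE_SIZE = 4096; // assumed page size in bytes

    private final RandomAccessFile file;

    PageFile(String path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    /** Writes exactly one page at the position implied by its page number. */
    void writePage(long pageNo, byte[] data) throws IOException {
        if (data.length != PAGE_SIZE) {
            throw new IllegalArgumentException("a page must be " + PAGE_SIZE + " bytes");
        }
        file.seek(pageNo * PAGE_SIZE);
        file.write(data);
    }

    /** Reads one full page; fails with EOFException if the page does not exist. */
    byte[] readPage(long pageNo) throws IOException {
        byte[] data = new byte[PAGE_SIZE];
        file.seek(pageNo * PAGE_SIZE);
        file.readFully(data);
        return data;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

The point of the design is that an object's location becomes a page number
in one file rather than a file name of its own.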

> > ozone is an objectbase. It is possible to use it to store
> > XML. Today it doesn't do this in the most efficient way. I do have many ideas
> > how to improve this.
> Please tell us !

Currently ozone uses a straightforward solution to store XML: one DOM node is
one database object. We encountered several problems with this. The question
is, what is our goal with ozone/XML? If we just want to provide a foundation on
which all possible XML tools should be able to run, then a persistent DOM is a
must. If we decide to also provide query and other advanced features, then we
are free to choose whatever kind of storage seems suitable. Taking into
account the great development speed of XML and related specs, IMO option two is
not realistic. So we need a persistent DOM again. Am I right so far?

Today each node is a database object. The other extreme is to put one entire
document into one database object, which will lead to other problems IMO. So the
best way to go is to split up one document into clusters of DOM nodes. The
problem here is to find a way to make the crossover from one cluster to another
transparent to the consumer of the DOM.
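
One possible shape for this, purely as a sketch to discuss: children are
held as (cluster, slot) references, and a resolver loads the target cluster
on first access, so the caller never sees the boundary. All names below
(ClusteredNodes, NodeRef, Node) are hypothetical and have nothing to do with
the real DOM interfaces or with existing ozone classes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch: nodes grouped into clusters that are loaded on demand.
final class ClusteredNodes {

    /** Reference to a node: the cluster it lives in and its slot there. */
    static final class NodeRef {
        final int cluster;
        final int slot;
        NodeRef(int cluster, int slot) { this.cluster = cluster; this.slot = slot; }
    }

    /** A stored node: a name plus references to its children. */
    static final class Node {
        final String name;
        final List<NodeRef> children; // may point into other clusters
        Node(String name, List<NodeRef> children) { this.name = name; this.children = children; }
    }

    private final IntFunction<List<Node>> loader;            // fetches one cluster from the store
    private final Map<Integer, List<Node>> loaded = new HashMap<>();
    int loads = 0;                                           // how many clusters were fetched so far

    ClusteredNodes(IntFunction<List<Node>> loader) {
        this.loader = loader;
    }

    /** Resolves a reference, loading its cluster only if it is not in memory yet. */
    Node resolve(NodeRef ref) {
        List<Node> cluster = loaded.get(ref.cluster);
        if (cluster == null) {
            cluster = loader.apply(ref.cluster);
            loaded.put(ref.cluster, cluster);
            loads++;
        }
        return cluster.get(ref.slot);
    }

    /** Returns the i-th child; the caller cannot tell whether a cluster boundary was crossed. */
    Node child(Node parent, int i) {
        return resolve(parent.children.get(i));
    }
}
```

A real persistent DOM would also need eviction, write-back and transaction
handling, which this sketch leaves out entirely.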

> > But at the end this always will be an application of the
> > objectbase ozone. Of course it is possible to make a persistent DOM in other
> > ways, like IPSI PDOM (without transactions, client-server, proxies) for example,
> > but this has nothing to do with ozone then.
> I tried IPSI PDOM, it works well, but it doesn't seem to be opensource. And it
> doesn't implement XPath , just XQL, and I absolutely need the contains() XPath
> function.
Yes, IPSI is very fast. Anyway, IMO there is no way to convince them to make
it open source.

BTW: they use yet another approach. They have their own DOM implementation,
which surely gives the best integration of DOM requirements and persistency.
However, I always wanted to avoid this since it seemed to be too much work.

> > > After that, I could work on text indexation. Maybe for this it is
> > > possible to rely on Ozone infrastructure:
> > > create the words index as an extra XML document, this way:
> > > <index__><word1><id>objectID1</id>... etc
> > > and then translate internally an XPath request with contains() into
> > > another without, provided that we have internally a function
> > > objectID(node) .
> No reaction on this .... ?!
> Is it a realistic implementation?

we have to discuss the overall points above first.
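
Just so we talk about the same thing when we get there: the core of the
proposal is an index from words to object IDs. Below is a minimal in-memory
sketch of that structure. The class WordIndex and its methods are made up
for illustration; nothing like this exists in ozone today. Note that XPath
contains() is a substring test, while this index only finds whole words, so
it could at best narrow down the candidate nodes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a word -> object ID index; not part of ozone.
final class WordIndex {
    private final Map<String, Set<Long>> postings = new HashMap<>();

    /** Registers every word of the given text under the given object ID. */
    void add(long objectId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(objectId);
            }
        }
    }

    /** Returns the IDs of all objects whose text contains the word (case-insensitive). */
    Set<Long> lookup(String word) {
        Set<Long> ids = postings.get(word.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.<Long>emptySet();
        }
        return Collections.unmodifiableSet(ids);
    }
}
```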

> > > ______________
> > >
> > > Exploration of the code:
> > >
> > > org.ozoneDB.core.Env (environment of an ozone database server)
> > > instantiates an object of interface:
> > > org.ozoneDB.core.Store
> > > which is implemented by:
> > > org.ozoneDB.core.wizardStore.WizardStore
> > > which is implemented by:
> > > org.ozoneDB.core.wizardStore.ClusterStore
> > ClusterStore does not implement anything
> Yes, I just meant by this that WizardStore delegates its implementation to
> ClusterStore, since it does :
>  clusterStore = new ClusterStore( _env );

Yes, it does. WizardStore handles the ID table where ClusterStore deals with
the actual clusters on disk. Again, is something wrong with this?
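
In case the split is not obvious from the code, here is the division of
labour reduced to a toy example. The classes SketchStore and
SketchClusterStore and all their methods are invented for this mail, and an
in-memory map stands in for the disk; the real WizardStore and ClusterStore
signatures are different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the part that owns the clusters (a map replaces the disk).
final class SketchClusterStore {
    private final Map<Integer, Map<Long, String>> clusters = new HashMap<>();

    void put(int clusterId, long objectId, String value) {
        clusters.computeIfAbsent(clusterId, k -> new HashMap<>()).put(objectId, value);
    }

    String get(int clusterId, long objectId) {
        Map<Long, String> cluster = clusters.get(clusterId);
        return cluster == null ? null : cluster.get(objectId);
    }
}

// Hypothetical stand-in for the store front: owns the ID table, delegates cluster access.
final class SketchStore {
    private final Map<Long, Integer> idTable = new HashMap<>(); // object ID -> cluster ID
    private final SketchClusterStore clusterStore = new SketchClusterStore();

    void store(long objectId, int clusterId, String value) {
        idTable.put(objectId, clusterId);
        clusterStore.put(clusterId, objectId, value);
    }

    /** Looks up the cluster in the ID table, then asks the cluster store for the object. */
    String load(long objectId) {
        Integer clusterId = idTable.get(objectId);
        return clusterId == null ? null : clusterStore.get(clusterId, objectId);
    }
}
```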

Falko Braeutigam                              mailto:falko@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com