
RE: Handling large XML document



On Fri, 01 Dec 2000, you wrote:
> I think I may have misunderstood the problem.
> 
> For my issue, the worst case scenario I can think
> of is this. A typical organic chemical analysis can run into
> the thousands of results.  My understanding (and
> I did a fair amount of work programming o-chem lims)
> is that you would be running less than 100 different
> analyses on a given item. We should be safe
> with a 270,000 node limit. So I think in
> terms of data storage Ozone will safely
> handle 97%+ of all my needs.
Great!

Note that 270,000 is _not_ the limit. In my tests I ran up to 3M objects
without any problems. Anybody more? ;)
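
As a rough back-of-the-envelope check (not a measurement), the figures quoted
in this thread can be scaled linearly: the 4.4MB XML sample contained 270,000
nodes. The sketch below assumes that node density stays constant across
documents and that each XML node becomes one database object; the function
names are made up for illustration.

```python
# Figures taken from this thread: a 4.4 MB XML sample held 270,000 nodes.
SAMPLE_MB = 4.4
SAMPLE_NODES = 270_000


def estimate_nodes(size_mb: float) -> int:
    """Estimate the node count of an XML document of size_mb megabytes.

    Assumption: node density is the same as in the 4.4 MB sample.
    """
    return round(size_mb * SAMPLE_NODES / SAMPLE_MB)


def size_for_nodes(nodes: int) -> float:
    """Estimate the XML size in MB that would produce the given node count."""
    return nodes * SAMPLE_MB / SAMPLE_NODES


if __name__ == "__main__":
    for mb in (1, 4.4, 100):
        print(f"{mb:>6} MB -> ~{estimate_nodes(mb):,} nodes")
    print(f"3,000,000 nodes -> ~{size_for_nodes(3_000_000):.1f} MB")
```

Under these assumptions, 3M objects corresponds to roughly 49MB of similar
XML, and a 100MB document would produce about 6.1M nodes.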


Falko

> 
> Sorry for the confusion!
> RobK
> 
> -----Original Message-----
> From: Falko Braeutigam [mailto:falko@smb-tec.com]
> Sent: Friday, December 01, 2000 10:38 AM
> To: Knapp, Robert (CAP, CMC)
> Cc: 'ozone-users@ozone-db.org'
> Subject: RE: Handling large XML document
> 
> 
> On Fri, 01 Dec 2000, Knapp, Robert (CAP, CMC) wrote:
> > >----Original Message-----
> > >From: Falko Braeutigam [mailto:falko@smb-tec.com]
> > >Sent: Friday, December 01, 2000 8:59 AM
> > >To: Knapp, Robert (CAP, CMC); 'ozone-users@ozone-db.org'
> > >Subject: RE: Handling large XML document
> > 
> > 
> > >On Fri, 01 Dec 2000, Knapp, Robert (CAP, CMC) wrote:
> > >> Falko,
> > >> 	I'm still _very_ new to ozone so I'm going to 
> > >> ask a newbie question. Would it be possible for Adrian to split the
> > >> document into 4.4 meg segments. Sort of how infinite arrays
> > >>  or tags in a TIFF are done?  
> > >> 
> > >> 	I'm asking because this has a direct effect on my
> > >> evil plans for world domination err..... I mean my project,
> > >> GLIMS (http://glims.sourceforge.net).  Scientific data (especially
> > >> spectra or health-care
> > >> case histories) can run into 100's of megs in some cases. And
> > >> into Terabytes on long-term astronomy studies or litigation
> > >> studies.  
> > 
> > >Wait! What are we talking about? 100's of megs in _one_ XML document?
> > 
> > Well, I'm still in the design phase so it is hard to answer this
> > 100%.  The class of interest (GlimsItem) can contain 100's of
> > megs or more.  Depending on the implementation, there
> > could be references to other XML files for some of the properties.
> > In that case the finally constructed object may contain 100's of megs
> > of data in one object. [This obviously is a very extreme case,
> > typical would be 1M or so.] Even in this case, we may be looking
> > at instances where a single GlimsResult may be very large. 
> > 
> > Again, it's an implementation problem. The reason that I asked
> > the question is that it will need to be something I take into
> > account with my design.  
> > 
> > >> 
> > >> 	Ozone is the reference database for GLIMS, 
> > >Why then isn't it in the list of "ozone powered projects"??? ;)
> > 
> > Ummm, frankly I don't know. I wrote to you about it
> > privately on Oct 30th, the subject was "ozone usage."  I wasn't
> > aware that there was such a list.
> > 
> > >> it's set up
> > >> so that other dB's can be plugged in with (relative) ease.[Write
> > >> a driver] So I need to be able to either 1) have a workaround in place
> > >> or 2) be able to tell users when ozone is not the best choice.
> > 
> > >It depends on the answer to the questions above. Or in general, what are
> > >you actually going to store in ozone?
> > 
> > I think that the best answer to that is "Scarily large amounts of data."
> 
> Large amounts of data, of course. ;) The question is: are the objects as
> tiny as XML nodes?
> 
> The 4.4MB XML sample mentioned today in another thread contains 270,000
> nodes. 100MB of such XML data would then contain about 6.1M nodes....
> 
> > Due to the nature of LIMS, it could be anything from the bytecode
> > for a 15-minute mpeg4, to a TIFF, and so on and so forth.  While
> > 55% or so of the potential users may never exceed a meg per GlimsItem,
> > the remaining 45% or so may be storing large amounts of data per
> > GlimsItem.
> 
> Ok, but a BLOB to store a 100MB mpeg produces only a few database objects
> compared to XML.
> 
> 
> Falko
> -- 
> ______________________________________________________________________
> Falko Braeutigam                              mailto:falko@smb-tec.com
> SMB GmbH                                        http://www.smb-tec.com
-- 
______________________________________________________________________
Falko Braeutigam                              mailto:falko@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com