
RE: Handling large XML document



>----Original Message-----
>From: Falko Braeutigam [mailto:falko@smb-tec.com]
>Sent: Friday, December 01, 2000 8:59 AM
>To: Knapp, Robert (CAP, CMC); 'ozone-users@ozone-db.org'
>Subject: RE: Handling large XML document


>On Fri, 01 Dec 2000, Knapp, Robert (CAP, CMC) wrote:
>> Falko,
>> 	I'm still _very_ new to ozone so I'm going to
>> ask a newbie question. Would it be possible for Adrian to split the
>> document into 4.4 meg segments, sort of how infinite arrays
>> or tags in a TIFF are done?
>> 
>> 	I'm asking because this has a direct effect on my
>> evil plans for world domination err..... I mean my project,
>> GLIMS(http://glims.sourceforge.net).  Scientific data (especially spectra
>> or health-care
>> case histories) can run into 100's of megs in some cases. And
>> into Terabytes on long-term astronomy studies or litigation
>> studies.  

>Wait! What are we talking about? 100's of megs in _one_ XML document?

Well, I'm still in the design phase so it is hard to answer this
100%.  The class of interest (GlimsItem) can contain 100's of
megs or more.  Depending on the implementation, there
could be references to other XML files for some of the properties.
In that case the finally constructed object may contain 100's of megs
of data in one object. [This obviously is a very extreme case,
typical would be 1M or so.] Even in this case, we may be looking
at instances where a single GlimsResult may be very large. 

Again, it's an implementation problem. The reason that I asked
the question is that it will need to be something I take into
account with my design.  

>> 
>> 	Ozone is the reference database for GLIMS, 
>Why then isn't it in the list of "ozone powered projects"??? ;)

Ummm, frankly I don't know. I wrote to you about it
privately on Oct 30th; the subject was "ozone usage."  I wasn't
aware that there was such a list.

>> it's set up
>> so that other dB's can be plugged in with (relative) ease. [Write
>> a driver.] So I need to be able to either 1) Have a workaround in place or
>> 2) be able to tell users when ozone is not the best choice.

>It depends on the answer to the questions above. Or in general, what are
>you actually going to store in ozone?

I think that the best answer to that is "Scarily large amounts of data."
Due to the nature of LIMS, it could be anything from the bytecode
for a 15-minute mpeg4, to a TIFF, and so on and so forth.  While
55% or so of the potential users may never exceed a meg per GlimsItem,
the remaining 45% or so may be storing large amounts of data per GlimsItem.

Like I said I'm still in the design phase, and there are hooks
for using other dbs, but I would like to get ozone to
address as much of the bell-curve as possible. 

I think the most disturbing thing is that this project is
my hobby. :)  

Thanks,
RobK
>Falko