RE: Handling large XML document
On Fri, 01 Dec 2000, Knapp, Robert (CAP, CMC) wrote:
> Falko,
> I'm still _very_ new to ozone so I'm going to
> ask a newbie question. Would it be possible for Adrian to split the
> document into 4.4 meg segments, sort of like how infinite arrays
> or tags in a TIFF are done?
>
> I'm asking because this has a direct effect on my
> evil plans for world domination err..... I mean my project,
> GLIMS (http://glims.sourceforge.net). Scientific data (especially spectra or
> health-care case histories) can run into 100's of megs in some cases. And
> into Terabytes on long-term astronomy studies or litigation
> studies.
Wait! What are we talking about? 100's of megs in _one_ XML document?
>
> Ozone is the reference database for GLIMS,
Why then isn't it in the list of "ozone powered projects"??? ;)
> it's set up
> so that other dB's can be plugged in with (relative) ease. [Write
> a driver] So I need to be able to either 1) Have a workaround in place or
> 2) be able to tell users when ozone is not the best choice.
It depends on the answers to the questions above. Or, in general, what are you
actually going to store in ozone?
Falko
>
> Thanks,
> RobK
>
> -----Original Message-----
> From: Falko Braeutigam [mailto:falko@smb-tec.com]
> Sent: Friday, December 01, 2000 8:00 AM
> To: Adrian
> Cc: ozone-users@ozone-db.org
> Subject: Re: Handling large XML document
>
>
> Adrian,
>
> I did some more testing but wasn't able to really improve results. IBM
> jdk1.3.2, which in some cases runs ozone somewhat faster than Sun jdk1.3,
> was not able to handle the 4.4MB document. So here are my best results for
> storing.
>
> [2 * PII 350, 256MB, Sun jdk1.3]
>
> ozone server params:
> ozone -ddb -udaniela -DozoneDB.wizardStore.tableBufferSize=150
> -DozoneDB.wizardStore.clusterSizeRatio=100 -Xmx128000000
>
> client params:
> ojvm Client store 4.4M.xml
>
> results:
> store=127s; commit=58s ==============> 185s to store the 4.4MB XML document
>
> xpath:
> query="/Results/Result[@id=0]/Row/Field[@name='QTY_INV'][self::*='207']/../Field[@name='VENDOR_NAME']"
> 12s (warm database)
>
> query="/Results/Result[@id=0]/Row[@num='0']/Field[@name='QTY_INV']"
> 1.2s (warm database)
>
> On Fri, 01 Dec 2000, you wrote:
> > Dear Falko,
> >
> > Thank you very much.
> >
> > Actually this XML document is for testing the performance of Ozone in
> > handling large XML documents.
>
> Your document looks like an exported SQL database. Why do you try to use XML
> to handle such data? XML, and especially XPath, are not well suited to
> handling such data.
>
> > Later we will have some other large XML documents (the XPath query is
> > not yet known) to process. Anyway, does the XPath query affect the
> > required parameters of Ozone?
>
> Yes. In the case of the second XPath query (see above), a real "path" is
> selected out of the entire XML tree. This allows ozone to activate just the
> clusters that lie along this path. If your application always uses such
> XPath queries, it does not need to be able to hold the entire document in
> memory. The first XPath is something like a SQL selection (again, XPath is
> not suited to such queries). This results in many cluster activation
> events, so you need more memory to handle such queries efficiently.
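To make the difference between the two query shapes concrete, here is a small sketch using Python's standard xml.etree.ElementTree rather than ozone. The sample document is invented (only the element and attribute names come from the two queries quoted above), and the expressions are adapted to ElementTree's limited XPath subset: they are relative to the root element, attribute values are quoted, and [.='207'] stands in for [self::*='207']. It says nothing about ozone's cluster handling; it only shows what each query selects.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the test document: names are taken from the
# queries in this thread, the values are made up for illustration.
SAMPLE = """
<Results>
  <Result id="0">
    <Row num="0">
      <Field name="QTY_INV">207</Field>
      <Field name="VENDOR_NAME">Acme</Field>
    </Row>
    <Row num="1">
      <Field name="QTY_INV">15</Field>
      <Field name="VENDOR_NAME">Globex</Field>
    </Row>
    <Row num="2">
      <Field name="QTY_INV">207</Field>
      <Field name="VENDOR_NAME">Initech</Field>
    </Row>
  </Result>
</Results>
"""

root = ET.fromstring(SAMPLE)  # root is the <Results> element


def path_query(root):
    """Path-like query: every step is narrowed by an attribute, so only
    one branch of the tree has to be followed."""
    expr = "./Result[@id='0']/Row[@num='0']/Field[@name='QTY_INV']"
    return [field.text for field in root.findall(expr)]


def selection_query(root):
    """Selection-like query: every Row must be inspected to find those
    whose QTY_INV is 207, then the sibling VENDOR_NAME is returned."""
    expr = ("./Result[@id='0']/Row/Field[@name='QTY_INV'][.='207']"
            "/../Field[@name='VENDOR_NAME']")
    return [field.text for field in root.findall(expr)]


if __name__ == "__main__":
    print(path_query(root))
    print(selection_query(root))
```

The first function can stop at a single Row, while the second has to look at the QTY_INV field of every Row before it can answer, which is the access pattern that touches many parts of the stored document.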
>
>
> Falko
> --
> ______________________________________________________________________
> Falko Braeutigam mailto:falko@smb-tec.com
> SMB GmbH http://www.smb-tec.com
--
______________________________________________________________________
Falko Braeutigam mailto:falko@smb-tec.com
SMB GmbH http://www.smb-tec.com