
Re: Handling large XML document



On Wed, 29 Nov 2000, Adrian wrote:
> When I am trying to store a large XML document of 4.4MB using the
> "Client" example, the following messages appear in the console where
> ozone was started:
> 
> [error](315) Transaction: blockedBy()
>     java.lang.NullPointerException
>         at java.util.Hashtable.get(Hashtable.java:320)
>         at org.ozoneDB.DxLib.DxHashMap.elementForKey(DxHashMap.java:61)
>         at org.ozoneDB.core.wizardStore.WizardStore.containerForID(WizardStore.java:315)
>         at org.ozoneDB.core.Transaction.blockedBy(Transaction.java:344)
>         at org.ozoneDB.core.TransactionManager.checkDeadlocks(TransactionManager.java:578)
>         at org.ozoneDB.core.DeadlockThread.run(DeadlockThread.java:29)
> [error](315) Transaction: blockedBy()
>     java.lang.NullPointerException
>         at java.util.Hashtable.get(Hashtable.java:320)
>         at org.ozoneDB.DxLib.DxHashMap.elementForKey(DxHashMap.java:61)
>         at org.ozoneDB.core.wizardStore.WizardStore.containerForID(WizardStore.java:315)
>         at org.ozoneDB.core.Transaction.blockedBy(Transaction.java:344)
>         at org.ozoneDB.core.TransactionManager.checkDeadlocks(TransactionManager.java:578)
>         at org.ozoneDB.core.DeadlockThread.run(DeadlockThread.java:29)
> 
> Although the storing process eventually finishes after about an hour,
> querying the large document is very slow. Queries against other
> documents, even very small ones, are also slow after the large document
> has been stored.

ozone uses a 32MB heap size by default. This is not enough to keep your entire
document in the cache, which leads to tons of cluster activation/passivation
events, by far the most time-consuming operations. A very important rule for
getting good ozone performance is to have enough RAM available to keep the
"working set" of the data in the cache. (I think this is true for all ODBMSs.)

A related problem is the size of the central b-tree that maps object IDs to
objects. Since the performance of this b-tree is critical for overall ozone
performance, there are two cache levels for it. The first level is a small
direct-mapped cache. The second level caches the LRU pages of the b-tree in
memory. The size of both caches is adjustable via the following properties:

- ozoneDB.wizardStore.tableCacheSize: number of significant bits of a hash
value for the 1st-level direct-mapped cache; default = 12 -> 4096 entries

- ozoneDB.wizardStore.tableSubtableSize: number of significant bits for one
b-tree level; default = 11 -> 2^11/10 -> approx. 1700 entries per b-tree page

- ozoneDB.wizardStore.tableBufferSize: number of b-tree pages in the second
level cache; default = 15 -> 15 * 1700 = 25500 entries

Your 4.4MB document probably has many more nodes than 25500; at a rough 30-50
bytes of XML text per node, 4.4MB works out to on the order of 100000 nodes.
This leads to many b-tree disk accesses, which, like all serialization
operations, are slow.
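As a rough sketch, you could tune the table caches so the whole table of a
document that size stays in memory. The property names are the ones listed
above; the concrete values are assumptions you would have to check against
your available RAM, and where the properties get set depends on your setup
(e.g. a properties file or -D options on the server command line):

    # hypothetical tuning for a document with ~100000+ nodes
    # 1st-level direct-mapped cache: 2^14 = 16384 entries
    ozoneDB.wizardStore.tableCacheSize=14
    # keep the default page size: approx. 1700 entries per page
    ozoneDB.wizardStore.tableSubtableSize=11
    # 2nd-level cache: 60 pages * approx. 1700 = approx. 102000 entries
    ozoneDB.wizardStore.tableBufferSize=60

With those numbers the second-level cache alone covers about four times the
default 25500 entries, so the b-tree lookups for a ~100k-node document can
stay in RAM.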

Adrian, can you send me your (zipped!!!) document and a set of XPath queries
you want to run against it? I will try a few things and get back to you with
a set of tuned ozone parameters.


Falko
-- 
______________________________________________________________________
Falko Braeutigam                              mailto:falko@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com