
Re: Property objects bug fix

To: ozone-users@ozone-db.org
Subject: Re: Property objects bug fix
From: Falko Braeutigam <falko@softwarebuero.de>
Date: Sun, 5 Mar 2000 20:28:17 +0100
In-Reply-To: <1551952.3161183196@turtlecoral.rem.cs.cmu.edu>
Organization: SMB
References: <1551952.3161183196@turtlecoral.rem.cs.cmu.edu>

On Sun, 05 Mar 2000, William Uther wrote:
> Hi,
>   So I thought about this some more, and while the previous fixes I gave
> fix the problem, they might introduce a more subtle bug.  Here is the
> explanation:
> 
> The original problem was that data was being DEcoded from bytes into chars,
> using the platform's default encoding, and then stored as chars.  When it
> was being read back in, it was ENcoded from chars to bytes by just trimming
> the high-byte.  This mismatch in encoding styles was a problem.
> 
> My original fix was to just make the coding styles match up by using the
> platform's default encoding in each case.
> 
> This may have introduced a new problem.  Bytes are generally considered the
> lower level representation.  The java classes are there to encode char
> strings as byte strings and then decode the byte strings back into char
> strings.  The problem is that I DO NOT KNOW if all byte strings represent
> legal character strings (it will probably vary by the encoding used anyway,
> and the default encoding varies by platform).  My fix used the standard
> encoding backwards.  If it tried to store a byte string that does not
> correspond to the encoding of a char string it is unclear what will happen.
> 
> The fix is to make sure that the coding chosen has a char string for every
> byte string.  This is possible if we fix the original problem in the other
> direction.  Instead of making both encode and decode use the default
> character encoding, we should make neither use the default character
> encoding.

Yes Will, my fault. I missed that ByteArrayOutputStream.toString() uses the
default character encoding. A Base64 encoding would produce a 7-bit-safe stream,
but we have 8-bit bytes available and a Java char is 16 bits wide, so stuffing
two bytes into one char should be possible.
            
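            byte[] bytes = buf.toByteArray();
            // Round up so an odd trailing byte is kept (padded with a zero high byte);
            // the decoder then needs the original length to drop the padding.
            char[] chars = new char [(bytes.length + 1) / 2];
            for (int i=0; i<chars.length; i++) {
                // Mask with 0xFF: bytes are signed, and a negative byte would
                // otherwise sign-extend and overwrite the high half.
                int low = bytes[i*2] & 0xFF;
                int high = (i*2+1 < bytes.length) ? (bytes[i*2+1] & 0xFF) << 8 : 0;
                chars[i] = (char)(low | high);
                }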
            
            setStringProperty (_key, new String(chars));

and reverse when decoding. This should keep the string from passing through
any of Java's character encodings, and it is memory efficient. I have not
tested it yet. Other ideas?
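
For reference, the packing and its reverse can be sketched as a round trip.
This is only an illustration under assumptions: the class BytePacking and the
methods pack and unpack are hypothetical names, not part of ozone, and the
original byte count is passed to unpack explicitly because an odd-length
array is padded with one zero byte.

```java
import java.util.Arrays;

// Hypothetical helper, not part of ozone: packs two bytes into each 16-bit char.
class BytePacking {

    // Even-indexed bytes go to the low half of a char, odd-indexed bytes to the high half.
    public static String pack(byte[] bytes) {
        char[] chars = new char[(bytes.length + 1) / 2];
        for (int i = 0; i < chars.length; i++) {
            // Mask with 0xFF so negative bytes do not sign-extend.
            int low = bytes[2 * i] & 0xFF;
            int high = (2 * i + 1 < bytes.length) ? (bytes[2 * i + 1] & 0xFF) : 0;
            chars[i] = (char) (low | (high << 8));
        }
        return new String(chars);
    }

    // Reverse of pack; 'length' is the original byte count, which drops any padding.
    public static byte[] unpack(String packed, int length) {
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            char c = packed.charAt(i / 2);
            // Narrowing to byte keeps only the low 8 bits of the selected half.
            bytes[i] = (byte) ((i % 2 == 0) ? c : (c >> 8));
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, (byte) 0x80, (byte) 0xFF, 0x7F};
        byte[] restored = unpack(pack(original), original.length);
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

Note that the packed string can contain char values that are not valid text,
such as unpaired surrogates, so it only survives if whatever stores the
property keeps the chars verbatim and does not run them through a character
encoder again.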


Falko
-- 
______________________________________________________________________
Falko Braeutigam                         mailto:falko@softwarebuero.de
softwarebuero m&b (SMB)                    http://www.softwarebuero.de
