
Re: Property objects bug fix



On Sun, 05 Mar 2000, William Uther wrote:
> --On Sun, Mar 5, 2000 8:28 PM +0100 Falko Braeutigam
> <falko@softwarebuero.de> wrote:
> 
> > On Sun, 05 Mar 2000, William Uther wrote:
> > [snip - encoding problem]
> 
> > A Base64 encode would produce a 7 bit stream but we have 8 bit bytes
> > available. Stuffing two bytes in one char should be possible.
> [snip - sample code]
> > and reverse when decoding. This should prevent the string from being
> > encoded using one of Java's encoding styles and it is memory efficient. I
> > have not tested it yet. Other ideas?
> 
> So I was thinking about this some more.  You are converting bytes into
> chars, storing the chars in a property file, then reading them back in as
> chars and converting back to bytes.  This expands to the following:
> 
> 1) bytes into chars (using whatever encoding Ozone uses)
> 2) chars into bytes (using the default encoding.  This happens when the
> property file is written out.)
> 3) bytes into chars AND TRIM (conversion is using the default encoding as
> the property file is read back in.  The trim is because I believe leading
> spaces are ignored in a property file, but I'm not sure.  Certainly, if the
> string of chars contains '\r' or '\n' then there'll be problems.)
> 4) chars back into bytes (using whatever encoding Ozone uses)
> 
> Disk space: I think we can rely on the default encoding to encode the 0x00
> - 0xFF chars efficiently, so I don't think that is a concern.  It might be
> more of a concern in memory, but I'm not too worried by it.
> 
> The advantage of Base64 encoding is that
>  - it is standard,
>  - we know it avoids whitespace and other problematic characters.
> 
> We could do something fancier, but it is going to have to avoid problem
> chars.  Also, anything that uses the high-order byte might be encoded
> relatively inefficiently using the default encoding.  In fact, are the
> encodings guaranteed to encode all characters?  Or can they skip some?
> 
> I'm leaning more and more towards Base64 as it avoids these problems.

Ok Will, I just found the mozilla.org Base64 encoder/decoder somewhere in the
castor code and incorporated it. So far it works great for our properties
encoding/decoding. Last but not least, the state.properties file is human
readable. You are right Will, this is the best solution.
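
For reference, here is a minimal sketch of the round trip discussed above:
bytes are Base64 encoded, stored as a property value, read back, and decoded.
It is not the code that went into Ozone. It assumes java.util.Base64 (which
needs Java 8 or later) in place of the mozilla.org encoder/decoder, and the
class name Base64PropertyDemo and the key "state" are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;
import java.util.Properties;

public class Base64PropertyDemo {

    // Hypothetical property key, chosen only for this example.
    static final String KEY = "state";

    // Turn arbitrary bytes into a string of plain ASCII characters.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Reverse of encode().
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Write the encoded bytes to an in-memory property file, load that file
    // into a fresh Properties object, and decode the value again.
    static byte[] roundTrip(byte[] raw) {
        try {
            Properties out = new Properties();
            out.setProperty(KEY, encode(raw));
            ByteArrayOutputStream file = new ByteArrayOutputStream();
            out.store(file, null);

            Properties in = new Properties();
            in.load(new ByteArrayInputStream(file.toByteArray()));
            return decode(in.getProperty(KEY));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Includes the bytes that worried us: NUL, newline, carriage
        // return, a leading space, and values above 0x7F.
        byte[] raw = {' ', 0, '\n', '\r', (byte) 0x80, (byte) 0xFF};
        System.out.println(Arrays.equals(raw, roundTrip(raw)));
    }
}
```

Because the stored value contains only letters, digits, '+', '/' and '=', it
does not depend on the default encoding and the trim in step 3 has nothing
to remove.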


Falko
-- 
______________________________________________________________________
Falko Braeutigam                         mailto:falko@softwarebuero.de
softwarebuero m&b (SMB)                    http://www.softwarebuero.de