Catalina

I've written another library, Catalina. It started as an example of using the threading library Iris and turned into what I think is a useful library in its own right. Catalina is an object data-store for GLib and GObject, accessed through a natural key/value interface.

Serialization to and from storage is transparent for any type that can be stored in a GValue. A tight binary format is provided with the library. It supports basic types such as integers, doubles, floats and strings, as well as GObjects, in an endian-safe manner. That said, someone should double-check that claim and verify its correctness. A JSON serializer would be a quick hack if someone were interested.
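To make "endian-safe" concrete, here is a minimal sketch in Python (not Catalina's actual format, which isn't described in this post). It assumes a made-up layout of a one-byte type tag followed by a payload written in a fixed big-endian byte order, so the bytes decode the same way on any machine. The tag values and the `encode`/`decode` names are hypothetical.

```python
import struct

# Hypothetical one-byte type tags; Catalina's real format may differ entirely.
TAG_INT, TAG_DOUBLE, TAG_STRING = 1, 2, 3


def encode(value):
    """Encode an int, float or str as a tag plus a big-endian payload."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        # ">" forces big-endian with no padding, whatever the host CPU uses.
        return struct.pack(">Bq", TAG_INT, value)
    if isinstance(value, float):
        return struct.pack(">Bd", TAG_DOUBLE, value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        # Strings carry a 4-byte length prefix, then the UTF-8 bytes.
        return struct.pack(">BI", TAG_STRING, len(raw)) + raw
    raise TypeError(f"unsupported type: {type(value).__name__}")


def decode(buf):
    """Decode a buffer produced by encode()."""
    tag = buf[0]
    if tag == TAG_INT:
        return struct.unpack_from(">q", buf, 1)[0]
    if tag == TAG_DOUBLE:
        return struct.unpack_from(">d", buf, 1)[0]
    if tag == TAG_STRING:
        (length,) = struct.unpack_from(">I", buf, 1)
        return buf[5:5 + length].decode("utf-8")
    raise ValueError(f"unknown tag: {tag}")
```

Because the byte order is pinned in the format rather than inherited from the host, a value written on a little-endian machine reads back unchanged on a big-endian one.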

In addition to serialization, Catalina supports buffer transformations to and from storage. Included is CatalinaZlibTransform, which compresses buffers using zlib. It skips compression for buffers smaller than the watermark property. This helps with data-sets whose values are occasionally so small that compressing them would in fact enlarge them.
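The watermark idea can be sketched in a few lines of Python. This is an illustration of the technique only, not CatalinaZlibTransform itself: the one-byte flag prefix, the default watermark of 64 bytes and the function names are all assumptions made for the example.

```python
import zlib

# Hypothetical one-byte flags recording whether the payload was compressed.
RAW, DEFLATED = b"\x00", b"\x01"


def to_storage(buf, watermark=64):
    """Compress buf with zlib unless it is smaller than the watermark."""
    if len(buf) < watermark:
        return RAW + buf
    return DEFLATED + zlib.compress(buf)


def from_storage(stored):
    """Reverse to_storage(), using the flag byte to pick the path."""
    flag, payload = stored[:1], stored[1:]
    if flag == DEFLATED:
        return zlib.decompress(payload)
    if flag == RAW:
        return payload
    raise ValueError("unknown transform flag")
```

The reason for the cut-off is that a zlib stream carries a header and a checksum, so a two-byte buffer comes out larger after compression than it went in.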

Catalina is an asynchronous data-store by design, and the optimal way to access it is asynchronously as well.

Everything is built upon Trivial DB (TDB) from the Samba project. It was chosen over Berkeley DB (BDB) because of its license. Like Catalina, it is LGPL and, unlike BDB, does not impose extra restrictions on applications that link against it.

However, the one downside to using TDB is its lack of concurrent transactions. This means that if multiple threads are doing work and updating storage, their transactions would interleave. Since we are using Iris, we can use message passing to manage concurrent transactions. (This is done by queuing messages until the commit phase.)
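As a rough illustration of queuing until commit, here is a Python sketch using only the standard library. It does not use Iris or TDB, and the `Store` and `Transaction` classes are hypothetical stand-ins: each transaction buffers its writes locally and hands them over as a single message at commit, while one writer thread applies those messages one at a time.

```python
import queue
import threading


class Store:
    """Dict-backed stand-in for a store that allows one transaction at a time."""

    def __init__(self):
        self.data = {}
        self._inbox = queue.Queue()
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def _run(self):
        # Only this thread touches self.data, so batches never interleave.
        while True:
            batch = self._inbox.get()
            if batch is None:  # shutdown message
                break
            for key, value in batch:
                self.data[key] = value

    def begin(self):
        return Transaction(self._inbox)

    def close(self):
        # Queued after all earlier commits, so they are applied first.
        self._inbox.put(None)
        self._writer.join()


class Transaction:
    def __init__(self, inbox):
        self._inbox = inbox
        self._pending = []

    def set(self, key, value):
        self._pending.append((key, value))  # queued locally, not yet applied

    def commit(self):
        self._inbox.put(self._pending)  # one message per transaction
        self._pending = []


def worker(store, n):
    txn = store.begin()
    for i in range(3):
        txn.set(f"w{n}-k{i}", n)
    txn.commit()


store = Store()
threads = [threading.Thread(target=worker, args=(store, n)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
store.close()
```

Worker threads never touch the underlying storage directly, so no lock around the store is needed. The trade-off in this sketch is that a write is not visible until the writer thread has processed its commit message.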

Here is a short example using Vala to asynchronously open, serialize and store a bunch of "Person" GObjects, all the while compressing each buffer with zlib. Don't be scared by the mutex/cond; it's there to negate the need for a main loop.

I intend to add indexes soon, but that is going to take a bit of planning.

So there you have it, my newest hack.

git clone git://git.dronelabs.com/catalina

Comments (13)

  1. Ed Ropple wrote:

    Might want to consider a rename; the name “Catalina” is used by Apache Tomcat. :)

    Sunday, June 21, 2009 at 11:33 pm #
  2. chergert wrote:

    they are quite different, so i think ill keep it for now.

    Sunday, June 21, 2009 at 11:53 pm #
  3. davidcl wrote:

    Is there any relation between glib-couchDB and your own library ?

    Monday, June 22, 2009 at 7:00 am #
  4. chergert wrote:

    @davidcl

    There is no relation. CouchDB glib really doesn’t do much for you at this point other than handle getting/setting data elements (no serialization management or anything).

    Monday, June 22, 2009 at 8:38 am #
  5. James Mansion wrote:

    Surely tdb is GPL, not LGPL.

    Monday, June 22, 2009 at 10:18 am #
  6. chergert wrote:

    @James

    TDB userspace utilities are GPL’d. However the library itself is LGPL.

    Monday, June 22, 2009 at 10:52 am #
  7. James Mansion wrote:

    Hmm – well, silly me, I was looking here:

    http://tdb.cvs.sourceforge.net/viewvc/tdb/tdb/tdb.h?revision=1.27&view=markup

    But the sources on Samba do indeed seem to have been relicensed. That’s good news!

    Monday, June 22, 2009 at 12:35 pm #
  8. method wrote:

    These are really great libraries. I have FLOSS use cases for ethos and now this library. Keep it up, but don’t overstretch yourself :) .

    Monday, June 22, 2009 at 12:48 pm #
  9. Adam Tauno Williams wrote:

    Anyone know of something similar for .NET (Mono)? I have a Gtk# app that gets [lots] of data from a server and keeps a local cache. A thread-safe data-store of some kind would be immensely useful; currently I just use a Hashtable and LINQ (for searching). I don’t want to require a SQL database and that would also mean I’d have to map to an SQL schema which really isn’t optimal. I tried DB4o but found it gets REALLY slow, a big Hashtable and LINQ is much faster.

    Monday, June 22, 2009 at 1:16 pm #
  10. Seth wrote:

    Oh but I _am_ scared by the mutex/cond. Next time write a main loop. See if you can get it right.

    Use IDisposable for storage resources.

    Use try finally for unlocking mutexes.

    *ONLY* use cond.wait(…) in a loop _and_ check the exit code.

    Your code is not threadsafe (or at least relies on known calling patterns – robust code comes from adhering to good standard practice at all times. Always wear a hard hat at construction sites).

    It is not exception safe (by a mile). That’s ok, but maybe you should not be creating the illusion by having exceptionhandling in the code.

    I, however, am _not_ tempted to try Catalina for anything storage related if the sample itself is … sketchy like this.

    Monday, June 22, 2009 at 7:23 pm #
  11. Seth wrote:

    Important correction: I should have made it clear I was referring to the *SAMPLE* code. I haven’t looked at the library.

    Monday, June 22, 2009 at 7:25 pm #
  12. chergert wrote:

    @Adam

    I wrote something similar for people using .NET last winter.

    http://github.com/chergert/adroit/tree/src/Adroit.Data/Adroit.Data

    Monday, June 22, 2009 at 7:38 pm #
  13. chergert wrote:

    @Seth

    Thanks for the warm and fuzzy comments.

    There is no IDisposable in Vala. How would locking/unlocking in a finally block help? They don’t throw exceptions (in Vala). Perhaps you have confused this with a runtime such as .NET.

    For more information on Vala and what it is, see http://live.gnome.org/Vala.

    Please point out what you believe is not thread-safe rather than just saying it.

    Monday, June 22, 2009 at 7:50 pm #
