BDB46 and GtkTreeModel

Over the vacation, I've been hacking on a few different projects, mostly Mono related this time. I've updated Joshua Tauber's BDB wrapper from 4.3 to 4.6. It has many hacks to make DB_RECNO databases more friendly. If you truly are interested in using it with 4.6, I suggest you diff the two versions and remove a few hacks I have.

Now, to the GtkTreeModel.

Having implemented many GtkTreeModels over the last few years, I'm certainly not blind to the fundamental flaws it has for large data sets. However, we work with what we have, so I decided to make a BdbListStore which implements GtkTreeModel and fronts a BDB DB_RECNO database.

The reason I chose to front a DB_RECNO instead of DB_BTREE, which many people use, is so that GtkTreeIter iteration and GtkTreePath conversions are simple and fast. You can see how badly GtkTreeIter iteration can kill you when you initially attach a GtkTreeModel to a GtkTreeView. Anyway, DB_RECNO is still a btree; it just uses record numbers instead of user-defined keys. The other caveat is that you use the DB_RENUMBER flag. This keeps the keys within the tree contiguous.
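To make the point about cheap conversions concrete, here is a minimal sketch. It does not touch Berkeley DB at all: `RenumberingStore` is a hypothetical stand-in backed by a plain list, and the helper names are made up for illustration. The only thing it assumes is what the paragraph above says, namely that record numbers are 1-based and stay contiguous after a delete.

```python
class RenumberingStore(object):
    """Toy stand-in for a DB_RECNO database opened with DB_RENUMBER.

    Record numbers are 1-based and always contiguous. Nothing here
    uses the real Berkeley DB API.
    """

    def __init__(self, records=()):
        self._records = list(records)

    def __len__(self):
        return len(self._records)

    def get(self, recno):
        return self._records[recno - 1]

    def delete(self, recno):
        # Later records shift down by one, which is the renumbering effect.
        del self._records[recno - 1]


def path_to_recno(path_index):
    """A list-model path index is 0-based; a record number is 1-based."""
    return path_index + 1


def recno_to_path(recno):
    return recno - 1


def iter_next(store, recno):
    """Return the next record number, or None at the end of the list."""
    return recno + 1 if recno < len(store) else None


store = RenumberingStore(["a", "b", "c", "d"])
store.delete(2)                     # remove "b"; "c" becomes record 2
row = store.get(path_to_recno(1))   # second row of the view
```

With user-defined keys you would need a lookup to find the nth row; here both directions are a single addition or subtraction.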

To keep deserialization overhead down, I used the generics based LRUCache found on Code Project. BDB already rocks at cache management, so this really was simply to not have to do the deserialization of objects on every render.
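The idea is easier to see in a few lines of Python than in the C# generics. This is only a sketch of the technique, not the Code Project LRUCache or my BdbCache<T>: the class name, the `load` callback and the use of `pickle` as the serializer are all assumptions made for the example.

```python
from collections import OrderedDict
import pickle


class LRUCache(object):
    """Keep the most recently used deserialized objects, up to capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, load):
        if key in self._items:
            # Re-insert so the key moves to the most-recent end.
            value = self._items.pop(key)
            self._items[key] = value
            return value
        value = load(key)  # cache miss: pay for deserialization once
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
        return value


# Pretend these bytes came out of the database, keyed by record number.
raw = dict((recno, pickle.dumps({"recno": recno})) for recno in range(1, 5))
loads = []


def load(recno):
    loads.append(recno)
    return pickle.loads(raw[recno])


cache = LRUCache(capacity=2)
cache.get(1, load)
cache.get(1, load)  # served from the cache, no deserialization
cache.get(2, load)
cache.get(3, load)  # evicts record 1
cache.get(1, load)  # record 1 has to be deserialized again
```

The `loads` list records every real deserialization, which is the cost the cache is there to avoid on each render.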

The original hack was in C which can be found here.

The C# version (BdbListStore<T>) can be found here. You will also need this for BdbCache<T>, this and this for the LRUCache implementation.

As an aside, I added highlighting to my gitweb. Email me if you want the horrible hack it contains. It's based on another hack floating around the web, with some fixes to handle non-highlightable files.

GTask

A few weeks ago I started writing an asynchronous toolkit for GObject, inspired by NSOperation, CCR, Threading Building Blocks, and mostly, Python Twisted. It's implemented in C and has bindings for Python and Vala. I'll be adding Mono bindings in a minor release or two like I did with rss-glib.

You can get the tarball here.

I've written some documentation that covers how and why you would use the API. While doing so, I discovered how awesome txt2tags is. I highly recommend it for anyone who hasn't used it. (Thanks Dan for pointing it out).

Of course, there is the gtk-doc based API reference as well.

Let's take a quick look at the python bindings for the sake of brevity. Of course, you could always just use Twisted in Python, but I've written bindings nonetheless.

    import gtask
    import urllib
    import gtk
    import webkit

    def worker(url):
        return url, urllib.urlopen(url).read()

    win = gtk.Window()
    win.connect('destroy', gtk.main_quit)
    web = webkit.WebView()
    win.add(web)
    win.show_all()

    task = gtask.Task(worker, 'http://google.com')
    task.add_callback(lambda (url, data): web.load_html_string(data, url))
    gtask.schedule(task)

    gtk.main()

The default scheduler (which can be overridden, mind you) performs the work on a regular GThreadPool. I hope to add a work stealing scheduler as soon as I complete the revamp. I'd like to pull thread management out of the scheduler so it can concentrate on what's important.

Why does the scheduler need to be tunable? Think about operations that use a certain resource. It might be beneficial to tag the task with an ID so that your scheduler can pin it to a given CPU, thus maximizing potential for a cache hit.
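As a rough illustration of that tagging idea (and nothing more), here is a toy in Python. `AffinityScheduler` is a hypothetical name, not part of GTask, and it does not really pin anything to a CPU: each queue just stands in for a worker that would be bound to one.

```python
import zlib


class AffinityScheduler(object):
    """Route tasks carrying the same tag to the same worker queue."""

    def __init__(self, n_workers):
        # One plain list per worker; a real scheduler would have threads.
        self.queues = [[] for _ in range(n_workers)]

    def worker_for(self, tag):
        # crc32 is stable across runs, unlike hash() on strings.
        checksum = zlib.crc32(tag.encode("utf-8")) & 0xffffffff
        return checksum % len(self.queues)

    def schedule(self, tag, func):
        index = self.worker_for(tag)
        self.queues[index].append(func)
        return index


scheduler = AffinityScheduler(n_workers=4)
first = scheduler.schedule("thumbnail-cache", lambda: "resize a.png")
second = scheduler.schedule("thumbnail-cache", lambda: "resize b.png")
```

Both tasks land on the same queue, so whatever that worker already has warm for the resource is likely still around for the second one.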

You might think that updating the GUI from the callback isn't safe, as you do not own the GDK thread lock. However, I assure you it is. By default, callbacks and errbacks are performed from the main loop so that you do not need to worry about it. You can disable this with the "main-dispatch" property on the scheduler instance.
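If you have not seen this pattern before, the sketch below shows the general shape of it without GLib. It is not how GTask implements main dispatch; it simply assumes a queue that the loop thread drains, with `pending`, `callback` and `worker` being names invented for the example.

```python
import threading

try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2

loop_thread = threading.current_thread()  # the thread that owns the "GUI"
pending = queue.Queue()
seen = []


def callback(value):
    # Record the value and whether we really ran on the loop thread.
    seen.append((value, threading.current_thread() is loop_thread))


def worker():
    result = 6 * 7
    # Never call the callback here; hand it to the loop thread instead.
    pending.put((callback, result))


thread = threading.Thread(target=worker)
thread.start()

# One iteration of a very small "main loop".
func, arg = pending.get(timeout=5)
func(arg)
thread.join()
```

The worker only ever touches the queue, so the callback is free to poke at widgets without taking any lock.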

You may also return a new task from a callback or errback. This will pause the post-processing chain until that task has finished, at which point the result of the new task becomes the new result for the task which yielded it.
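Twisted users will recognize this from Deferreds. For everyone else, here is a toy model of the semantics. `MiniTask` is hypothetical and synchronous, so "pausing" is modelled by running the returned task to completion before the chain continues; errbacks are left out entirely.

```python
class MiniTask(object):
    """Toy model of a task with a callback chain (not the GTask API)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.callbacks = []

    def add_callback(self, callback):
        self.callbacks.append(callback)
        return self

    def run(self):
        result = self.func(*self.args)
        for callback in self.callbacks:
            result = callback(result)
            # A callback returned a task: finish it first and adopt its
            # result before moving on to the next callback.
            while isinstance(result, MiniTask):
                result = result.run()
        return result


def measure(text):
    # Returns a new task instead of a plain value.
    return MiniTask(len, text)


task = MiniTask(lambda: "hello")
task.add_callback(measure)
task.add_callback(lambda length: length * 2)
final = task.run()
```

The last callback never sees a task object, only the length that the inner task produced.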

I've also added task dependencies, so a task will not execute until the tasks it depends on have completed. I'll be building some neat helpers around this later to do things such as: go do these three tasks, and let me know when they are done. This is a good idea for web applications: if you need to make multiple database calls, they should be done in parallel.
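Here is roughly what such a helper could look like, sketched with plain Python threads rather than GTask. `when_all` is a hypothetical name, and the three lambdas stand in for database calls.

```python
import threading


def when_all(funcs, on_done):
    """Run every function in its own thread; call on_done with all results.

    Assumes funcs is non-empty, otherwise on_done is never called.
    """
    results = [None] * len(funcs)
    remaining = [len(funcs)]
    lock = threading.Lock()

    def runner(index, func):
        value = func()
        with lock:
            results[index] = value
            remaining[0] -= 1
            last = remaining[0] == 0
        if last:
            # Whichever thread finishes last delivers the notification.
            on_done(results)

    threads = [threading.Thread(target=runner, args=(i, f))
               for i, f in enumerate(funcs)]
    for thread in threads:
        thread.start()
    return threads


collected = []
finished = threading.Event()


def on_done(results):
    collected.extend(results)
    finished.set()


when_all([lambda: "users", lambda: "orders", lambda: "stock"], on_done)
finished.wait(5)
```

Results come back in the order the tasks were given, not the order they finished in, which is usually what the caller wants.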

Most of what I've written here is available in the docs, go check them out.

monodevelop-python revisited

Development on my python addin for monodevelop has gone so quickly that I've gotten pretty excited about working on it. It has come from realization to working code in just a few short days. So I thought I'd put together a little overview of what I have added this week and where I'd like to go.

As an aside, I've noticed frequently that people do not understand that monodevelop can be used for more than writing .NET. Like many other IDE shells, it is a framework that facilitates development tools. It happens to be written and deployed on Mono. I think because of this, we may want to alter the naming of addins in a way that makes it less confusing. Michael Hutchinson suggested PyDevelop/RubyDevelop. Of course, RubyDevelop doesn't exist yet. IronRuby's AST should make that doable in a relatively short time period if someone is interested.

It's worthwhile to note that I only have support for python2.5. However, the core is already built to support more versions of the python runtime. I just need to do it.

Compiling of Code

Oftentimes I forget that python code is compiled because it works so well. You can pre-compile your code to save a marginal amount of time during startup. It also lets you be sure that your code is syntactically correct. Debug mode will generate .pyc's by default and Release mode will generate .pyo's.
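The syntax check is the part you can try at home in a few lines. This sketch uses the standard `py_compile` module under Python 3 (where the bytecode lands in `__pycache__`); it is not the addin's compiler step, and `check_and_compile` is a name invented for the example.

```python
import os
import py_compile
import tempfile


def check_and_compile(path):
    """Return (bytecode_path, None) on success or (None, message) on error."""
    try:
        return py_compile.compile(path, doraise=True), None
    except py_compile.PyCompileError as exc:
        return None, exc.msg


workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "good.py")
bad = os.path.join(workdir, "bad.py")

with open(good, "w") as handle:
    handle.write("x = 1\n")
with open(bad, "w") as handle:
    handle.write("def broken(:\n")  # deliberate syntax error

good_output, good_error = check_and_compile(good)
bad_output, bad_error = check_and_compile(bad)
```

A build step can run this over every source file and turn each message into an error entry before anything is executed.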

Parsing of Python

The parsing of python happens in a python subprocess. I originally did the interaction with the host process over stdin/stdout. I moved away from this because it became a source of contention: a single pipe inherently has no concurrency. Both python and .NET happen to come with HTTP utilities, and I get concurrency for free! Therefore, the python process sets up an embedded HTTP server and the .NET code POSTs to it. The result of the parsing comes back as an XML representation of the python AST along with some extra sauce. For example, I also analyze the code with pyflakes if it is available, allowing for warnings to be displayed inline with errors. You can see this in the image below. Currently, warnings are highlighted in red, the same as errors, but I believe this will be changed in monodevelop before too long. Any of yellow, green, or blue would be a good choice in my opinion.
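For the curious, the python side of such a setup can be sketched with nothing but the standard library. This is a simplified illustration under Python 3, not the addin's actual server: the XML element names, `parse_to_xml`, `ParseHandler` and `make_server` are all made up here, the output is flat rather than nested, and pyflakes is left out.

```python
import ast
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_to_xml(source):
    """Parse python source and return a small XML summary as bytes."""
    root = ET.Element("module")
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        ET.SubElement(root, "error", line=str(exc.lineno), message=exc.msg)
        return ET.tostring(root)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            ET.SubElement(root, "class", name=node.name, line=str(node.lineno))
        elif isinstance(node, ast.FunctionDef):
            ET.SubElement(root, "function", name=node.name,
                          line=str(node.lineno))
    return ET.tostring(root)


class ParseHandler(BaseHTTPRequestHandler):
    """Answer a POST containing source code with its XML summary."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        source = self.rfile.read(length).decode("utf-8")
        body = parse_to_xml(source)
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    # Port 0 lets the OS pick a free port; call serve_forever() to run it.
    return HTTPServer(("127.0.0.1", port), ParseHandler)


xml_ok = parse_to_xml("class Hi:\n    def hello(self):\n        pass\n")
xml_bad = parse_to_xml("def broken(:\n")
```

Because each request is independent, the IDE can have several parses in flight at once, which is exactly what a single stdin/stdout pipe could not give me.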

Python Source Regions

Now that we have an AST representation of the python source code we can start to do fancy things with it. We know how big blocks of code are, where they are located, and what modules they interact with. Using this knowledge, we build DomRegions to be used by the editor. This gives us code folding as you can see below.

You will notice throughout the rest of this overview, how this AST is fundamental to providing language support.
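To give a feel for how little is needed to get fold regions out of an AST, here is a sketch using Python's own `ast` module. It assumes Python 3.8 or newer for `end_lineno`; the real addin builds DomRegions in .NET from the XML, and `fold_regions` is a name invented for this example.

```python
import ast

SOURCE = '''import os

class Greeter:
    def hello(self):
        return "hi"

def main():
    Greeter().hello()
'''


def fold_regions(source):
    """Return (name, first_line, last_line) for each class and function."""
    regions = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            regions.append((node.name, node.lineno, node.end_lineno))
    return sorted(regions, key=lambda region: region[1])


regions = fold_regions(SOURCE)
```

Each tuple is all an editor needs to draw a fold marker: where the block starts and where it ends.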

From the AST, I currently have information about:

  • imports
  • classes
  • functions (and their arguments)
  • attributes
  • locals
  • pydocs
  • comments (well, not done through the AST, yet)
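Most of the items in this list fall out of a single walk over the tree. The sketch below shows that with the standard `ast` module under Python 3; the dictionary layout and the `summarize` name are assumptions for the example, and attributes, locals and comments are left out to keep it short.

```python
import ast

SOURCE = '''import os
from sys import path

class Greeter:
    """Says hello."""
    def hello(self, name):
        return name
'''


def summarize(source):
    """Collect imports, classes, functions with arguments, and docstrings."""
    info = {"imports": [], "classes": [], "functions": [], "docs": {}}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            info["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x".
            info["imports"].append(node.module or ".")
        elif isinstance(node, ast.ClassDef):
            info["classes"].append(node.name)
        elif isinstance(node, ast.FunctionDef):
            arguments = [argument.arg for argument in node.args.args]
            info["functions"].append((node.name, arguments))
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            doc = ast.get_docstring(node)
            if doc:
                info["docs"][node.name] = doc
    return info


info = summarize(SOURCE)
```

Comments are the odd one out because the parser throws them away, which is why they need a separate pass.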

Code Completion

With both regional and contextual information available, we can take the current editor cursor position to know where we currently are within the AST. This allows us to provide code completion for locals, attributes, functions, modules and whatnot. Note that I just have basic functionality here, and I intend to really beef this up in the short term. The proof that it works is there, so that's a decent start.
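A stripped-down version of that lookup might look like the following. It is only a sketch using Python 3.8+'s `ast` module, and it assumes the buffer still parses, which a half-typed line often will not. The function names are invented here and the addin's own logic lives on the .NET side.

```python
import ast

SOURCE = '''import os

total = 0

def tally(items):
    count = len(items)
    tot
    return count
'''


def top_level_names(tree):
    names = set()
    for statement in tree.body:
        if isinstance(statement, (ast.FunctionDef, ast.ClassDef)):
            names.add(statement.name)
        elif isinstance(statement, (ast.Import, ast.ImportFrom)):
            names.update((alias.asname or alias.name).split(".")[0]
                         for alias in statement.names)
        elif isinstance(statement, ast.Assign):
            names.update(target.id for target in statement.targets
                         if isinstance(target, ast.Name))
    return names


def enclosing_function(tree, line):
    """Return the innermost function whose span contains the line."""
    best = None
    for node in ast.walk(tree):
        if (isinstance(node, ast.FunctionDef)
                and node.lineno <= line <= node.end_lineno):
            if best is None or node.lineno >= best.lineno:
                best = node
    return best


def local_names(function):
    names = set(argument.arg for argument in function.args.args)
    for node in ast.walk(function):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
    return names


def complete(source, line, prefix):
    tree = ast.parse(source)
    candidates = top_level_names(tree)
    function = enclosing_function(tree, line)
    if function is not None:
        candidates |= local_names(function)
    return sorted(name for name in candidates if name.startswith(prefix))


inside = complete(SOURCE, 7, "t")   # cursor on the "tot" line
locals_only = complete(SOURCE, 7, "c")
outside = complete(SOURCE, 3, "c")  # cursor at module level
```

The cursor line decides which scope's names are offered: `count` shows up inside `tally` but not at module level.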

Smart Indentation

It's important for your text editor to not slow you down from your current thought process. One of the easiest ways to keep your flow moving is to always have your cursor in the right position. The editor will flow with your movements for blocks. This is really quite simple in python, as blocks are prefixed with a line ending in a colon, such as “class Hi:”. My implementation of this may not be the correct way to go, but “it works”. Also note, until per-project code formatting arrives in monodevelop, you should set your tabs vs. spaces mode to 4 spaces.
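The colon rule really is as small as it sounds. Here is a sketch of it in Python, with `next_indent` being a name made up for the example; it ignores tabs, trailing comments after the colon, and dedenting after things like `return`.

```python
def next_indent(previous_line, indent_width=4):
    """Return the indentation (in spaces) for the line after previous_line."""
    stripped = previous_line.rstrip()
    current = len(stripped) - len(stripped.lstrip(" "))
    if stripped.endswith(":"):
        # The previous line opened a block, so go one level deeper.
        return current + indent_width
    return current


after_class = next_indent("class Hi:")
after_method = next_indent("    def hello(self):")
after_statement = next_indent("    x = 1")
```

Anything smarter than this needs to look at more than the previous line, which is where the AST comes back in.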

Class Browser

Monodevelop includes an extensible class browser pad. The implementation really is a piece of art. I've been fortunate enough to write code for it a few times in the past. Anyhow, we use the AST objects directly within this tree to render a hierarchy of the modules within your project. I'd like to include imported modules outside your project as well.

Document Outline

In similar spirit to the class browser, the document outline pad provides a module hierarchy for the currently open source file. The implementation here is currently hackish due to the interfaces being very .NET specific. It sounds like this may change before monodevelop 2.0 is released.

Editor ComboBox JumpTo

Many developers have come to love the combobox at the top of the editor to jump to regions within their source file. I personally like to keep my individual sources as succinct as possible, but I understand why it's useful. Therefore, I support that as well. Again, this is sort of hackish due to the current interfaces.

Where would I like to go?

I have lots of ideas and diminishing time to implement them. So if anyone feels like helping, it's time to speak up.

The debugger interfaces appear to be stabilizing, so this sounds like a feature worth implementing. The bundled python debugger should help us a lot here. We can again work as a subprocess to perform the necessary hooks for step/locals/watches and whatnot. While we are at it, how about remote debugging over ssh?

I'm also excited to start on a profiler. The interfaces for the profiling API are thankfully generic enough to support this relatively easily. We again simply need a subprocess that writes out the profile snapshot to a file. Using the output from the profiler, I will build a view that includes (Module/Function, N Calls, Avg per call, Total time, and Total % of time). I haven't used profilers too much, so if you have suggestions on what you want to see, please don't hesitate to chime in.
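Python's bundled `cProfile` and `pstats` modules already get you most of the way to those columns. The following is a sketch of the snapshot-to-rows step only, assuming Python 3; the row layout and the `profile_to_file`, `load_rows` and `work` names are invented for the example, and it profiles in-process rather than in a subprocess.

```python
import cProfile
import os
import pstats
import tempfile


def work():
    return sum(i * i for i in range(1000))


def profile_to_file(func, path):
    """Run func under the profiler and write the snapshot to path."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    profiler.dump_stats(path)


def load_rows(path):
    """Turn a snapshot into rows matching the columns of the planned view."""
    stats = pstats.Stats(path)
    grand_total = stats.total_tt or 1e-12  # avoid dividing by zero
    rows = []
    for (filename, lineno, name), entry in stats.stats.items():
        primitive_calls, calls, total_time, cumulative, callers = entry
        rows.append({
            "module": filename,
            "function": name,
            "calls": calls,
            "avg": total_time / calls if calls else 0.0,
            "total": total_time,
            "percent": 100.0 * total_time / grand_total,
        })
    return sorted(rows, key=lambda row: row["total"], reverse=True)


snapshot = os.path.join(tempfile.mkdtemp(), "snapshot.prof")
profile_to_file(work, snapshot)
rows = load_rows(snapshot)
```

Since the snapshot is just a file, the IDE never has to share a process with the code being profiled.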

An interactive python shell is a must. I will most likely defer to Vte and python/ipython for this.

UML generation is really quite easy when you have the AST already available. I would love to see someone make a gtk canvas that can read graphviz dot files. Making it look sexier than graphviz is a must. I remember using omnigraffle years ago and it looked incredible.

Templates. We need a bunch, simple as that. I'm thinking gtk examples, unit tests, twisted plugins, qt demos, clutter, GNOME, and more.

Code coverage is also useful and simple enough to record during runtime. We can use the output of this to render over the icons on the class pad, essentially lighting up code paths that are never executed.
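To back up the "simple enough" claim, here is a bare-bones line recorder built on `sys.settrace`. It is a sketch under the assumption that tracing a single function is enough to show the idea; `record_lines` and `classify` are hypothetical names, and a real tool would track whole files.

```python
import sys


def record_lines(func, *args):
    """Return line offsets (relative to the def line) that func executed."""
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the old trace hook
    return executed


def classify(number):
    if number > 0:
        return "positive"
    return "non-positive"


executed = record_lines(classify, 5)
never_run = set([1, 2, 3]) - executed
```

The leftover offsets are the lines to light up; here that is the final `return`, which a positive input never reaches.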

Setuptools integration could be worthwhile, as lots of good projects in python are available from the python cheese shop (“easy_install processing”).

Refactoring seems doable as the python language makes that fairly easy. I haven't looked at the refactoring APIs within monodevelop yet, so it's hard to say. At the minimum, I'd think renaming of methods, classes, and modules should be doable.

This is the end of me wasting your time, continue with your hacking and thanks for reading this far.

Update

git://git.dronelabs.com/git/users/chris/monodevelop-pybinding.git

If you have monodevelop from trunk, you can install from the Addin Manager using http://audidude.com/python-addin/

Update

This has been merged into monodevelop trunk. It is in extras/PyBinding/.

monodevelop-python

Back in February, while at Brainshare, I started putting together python support for monodevelop. It was just basic project file support and a compiler. After Eric mentioned to me that we aren't running enough of our own software every day, I figured I should make this something I can use.

Therefore, I added a couple of new features over the last few days. The core of the new features comes down to a python subprocess which builds ASTs for the python code and serializes them to XML. Back in Mono-land I can convert that to the internal format for parsed data within MonoDevelop.

Feature 1 based on the python parser: Code Completion

Features 2 & 3 based on the python parser: Code Folding and Error Highlighting

The code is in a git tree private to the addin for the time being. I will most likely move this to github in the near future.

git://git.dronelabs.com/git/users/chris/monodevelop-pybinding.git

Walker, FileSystem Ranger

As a way to learn a bit of GIO from the green pastures of Python, I threw together a functional programming replacement to Python's os.walk(). The WalkerTexasRanger class takes three functions: one for results, one to clear the results, and one for when the walk is finished.

Let's look at how we use this.

    import texas
    import gtk

    def onResult(walker, dirname, dirs, files):
        print dirname
        print 'dirs:', dirs
        print 'files:', files

    # 3 funcs, onResult, onClear, onFinish
    walker = texas.WalkerTexasRanger(onResult, None, lambda *a: gtk.main_quit())
    walker.walk('/tmp')

    gtk.main()

You can get the module from here.

python decorator and gtk mainloop

I just put together a neat little decorator to force a method to be called via the default gobject main loop. It does this by wrapping a method call via an immediate timeout of the main loop.

    import gobject

    def callFromMain(f):
        def dispatch(items):
            args, kwargs = items
            f(*args, **kwargs)
            # Returning False removes the timeout, so f runs exactly once
            # even if it returns a true value.
            return False
        def wrapper(*args, **kwargs):
            gobject.timeout_add(0, dispatch, (args, kwargs))
        return wrapper


This works well for callbacks that come from other threads.

    @callFromMain
    def resultCallback(*args):
        print args

gnome-do relevance for python

I ported the text relevance engine from gnome-do to Python. You can download it here.

>>> import relevance
>>> relevance.score('hi there dude', 'hi dude')
0.53480769230769232
>>> relevance.formatCommonSubstrings('hi there dude', 'hi dude')
'<b>hi </b>there <b>dude</b>'

Meme(me)

Via Chris Blizzard.

1. Take a picture of yourself right now.
2. Don’t change your clothes, don’t fix your hair…just take a picture.
3. Post that picture with NO editing.
4. Post these instructions with your picture.

Lenovo x300

So I don't rave about hardware often, but I've had this x300 thinkpad for about a month now. I would like to mention how awesome it is to have working suspend in Linux that is damn near flawless. Oh, 802.11n, gig-e, usb2, 4 hours battery with lightweight 6 cell, dvd-rw, solid state disk, led backlight, and low-voltage totally rocks too. (and right about 3 lbs)

One of those things …

that I always liked about OS X was fading between backgrounds on the desktop. So I quickly hacked up a 20 line prototype for libeel to do something similar. Screencast below:

Cairo sure does make life easier these days.