Adventures trying to use open-source libraries

It seems that whenever I want to use an open-source library, I run into problems caused by its dependencies. I’ve run into this with Java and C++ libraries. Most recently, I had one of these adventures with a Common Lisp library.

Babel is an excellent open-source Common Lisp library for converting between string representations, such as the different encodings of Unicode, as well as EBCDIC and so on. It’s portable and efficient. We use it to decode UTF-8 into full 32-bit Unicode.

Recently we suspected that it might be running more slowly than we’d like, and that we might be able to get a measurable speedup by optimizing it. So I thought I’d write a simple benchmark and try some changes that might speed it up, such as adding fixnum declarations.
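
For readers who have not seen them, fixnum declarations look roughly like the sketch below. This is an illustration only, not Babel’s code: the function name count-utf-8-code-points is made up for the example, and whether such declarations actually buy any speed depends on the compiler and on where the time is really going.

```lisp
;; Hypothetical example, not taken from Babel: count the code points in a
;; vector of UTF-8 octets by counting every octet that is not a
;; continuation byte (continuation bytes have the bit pattern 10xxxxxx).
(defun count-utf-8-code-points (octets)
  (declare (type (simple-array (unsigned-byte 8) (*)) octets)
           (optimize (speed 3) (safety 1)))
  (let ((count 0))
    ;; Promise the compiler that the counter stays within fixnum range,
    ;; so it can use machine arithmetic instead of generic arithmetic.
    (declare (type fixnum count))
    (dotimes (i (length octets) count)
      (declare (type fixnum i))
      (unless (= (logand (aref octets i) #xC0) #x80)
        (incf count)))))

;; "A" followed by the two-octet encoding of U+00E9: two code points.
(count-utf-8-code-points
 (make-array 3 :element-type '(unsigned-byte 8)
               :initial-contents '(#x41 #xC3 #xA9)))
```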

Babel includes a regression test. Obviously, I needed to make sure any speedups that I put in would not break Babel, so running the regression test would be important. This is where the fun began.

Babel’s regression test depends on a Common Lisp unit test framework called stefil (which I’d never heard of). I found stefil on the web, but there was no source distribution. The only way to get it was to use darcs.

The machine on my desktop uses an old version of Linux. (The reasons are too boring to go into here, and it’ll be upgraded soon.) It does not have darcs already installed on it. No problem, I said to myself, and proceeded to obtain darcs. It turns out that darcs comes in source form, so you have to compile it.

Darcs is written in Haskell, and my Linux machine does not already have the Haskell compiler. So I downloaded the compiler (GHC), and tried to compile it. But I got weird error messages about missing C header files. I could not figure this out, because the build mechanism for GHC is rather complicated, using tools that I would have had to figure out, etc.

Finally I gave up, and found someone with a more modern version of Linux that already had darcs. He got stefil for me.

Next, I found that stefil depends on several other Common Lisp libraries: Swank, alexandria, iterate, and metabang-bind. We already had Swank (which is part of Slime), and alexandria, so I found and downloaded iterate and metabang-bind.

I got error messages trying to compile stefil. It eventually turned out that stefil depends on a non-standard version of Swank, and will not compile with any other version. Since I did not need the feature that integrates stefil with Slime/Swank, I commented out the dependency on Swank in stefil’s asdf file (which is like a makefile).

Compiling stefil still failed. It uses the iterate library, which includes a Common Lisp code walker, and in the version of Clozure Common Lisp that we use at ITA, assert macroexpands to a non-portable form that the code walker does not understand. This feature in assert was added for us in order to make the code coverage tool know that it’s OK that we do not cover assert forms, but, of course, iterate’s code walker didn’t know about it. (A code walker must know about every Lisp “special form”.) I fixed this by learning how the code walker is organized, and extending it to know about assert as a primitive special form.
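
To make the failure mode concrete: a code walker macroexpands each form until only special operators and function calls remain, so it helps to look at what assert expands into on a given implementation. The sketch below uses only standard operators. The helper name special-operators-in is made up for the example, and the printed expansion is implementation-specific, so no output is shown.

```lisp
;; Print one step of the expansion of an ASSERT form. What appears here
;; differs between implementations.
(pprint (macroexpand-1 '(assert (plusp x))))

;; A walker has to handle every operator for which SPECIAL-OPERATOR-P
;; returns true in the running implementation.
(defun special-operators-in (symbols)
  "Return the members of SYMBOLS that name special operators here."
  (remove-if-not #'special-operator-p symbols))

;; IF, LET and PROGN are special operators everywhere; CAR never is.
;; ASSERT shows up only if the implementation treats it as one.
(special-operators-in '(if let progn assert car))
```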

Finally, the babel regression tests turned out to have bugs. They depend on char-code always returning a fixnum, which is a violation of the Common Lisp standard. I had to fix various things and comment out other things in order to make the unit test work properly with Clozure Common Lisp (which was not at fault).

After all this, I was able to run the regression test, and so I could proceed to make changes to Babel with some assurance that I didn’t introduce bugs. But it all took so much time that I fell behind in my work schedule, which was, to say the least, annoying.

The problem partly lies with my using such an old version of Linux, but this kind of problem seems to be common with open source libraries in all languages and domains. If they’re not used very widely, and not maintained, they often don’t work well together.

34 Responses to “Adventures trying to use open-source libraries”

  1. attila lendvai Says:

    hey Dan,

    let me add a few bits’n’pieces to the story…

    first of all, library/module handling in CL is awful, ASDF is really “like a makefile”… :) and CL’s package system does not advertise obvious ways to load two versions of the same codebase. this is the source of most of the CL library problems, but i’ll stop whining without having an alternative in store… in light of this and some other things, we’ve chosen to live on the bleeding edge when it comes to CL libs: use the repo checkouts of each and every dependency, and it has been working so much better than using releases that we don’t waste time packaging versioned releases of our libraries either.

    about stefil (i’m a co-author): after you’ve written a mail to the list, i’ve pushed a patch (the same day) that disables the (non-crucial) swank integration completely until we’ll move back to slime head (if ever). one “darcs pull” away from your desktop machine (although i think i have forgotten to drop you a mail about this, sorry for that. i have a google reader entry for the repos of the more important dependencies we use). try to ask for such support from a closed-source vendor… :)

    and i can’t see your patches in the mentioned repos nor on the mailing lists yet… when it comes to opensource, you are always the one who spares the upcoming knights the suffering… ;)

  2. John Cowan Says:

    I’m familiar enough with this sort of problem, which is still preferable to binary DLL hell (of which it is the analogue). In the proprietary software world, I’m told, upgrading software frequently downgrades your system libraries to the particular version the upgrade was bundled with.

    But good Ghu, using a whole big library just to translate UTF-8 to UTF-32?? That’s a couple of lines of bit-twiddling. Pseudo-code looks like this:

    if byte < #x80
    then byte
    else if byte < #xC2
    then error
    else if byte < #xE0
    then ((byte & #x1F) << 6) + (nextByte & #x3F)
    else if byte < #xF0
    then ((byte & #xF) << 12) + ((nextByte & #x3F) << 6) + (nextnextByte & #x3F)
    else if byte < #xF5
    then ((byte & #x7) << 18) + ((nextByte & #x3F) << 12) + ((nextnextByte & #x3F) << 6) + (nextnextnextByte & #x3F)
    else error

    Insert the necessary casts between character and integer and go for it.
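
    The pseudo-code above can be sketched in Common Lisp roughly as follows. The function name decode-utf-8-code-point and its calling convention (an octet vector plus a start index, returning the code point and the index of the next unread octet) are assumptions made for this example. Like the pseudo-code, it does not validate continuation bytes, overlong forms, or surrogates, which a production decoder has to do.

    ```lisp
    ;; Sketch only: decode one code point starting at index START.
    ;; Returns two values: the code point and the index of the next octet.
    (defun decode-utf-8-code-point (octets start)
      (flet ((cont (offset)
               ;; Low six bits of the continuation byte at START + OFFSET.
               (logand (aref octets (+ start offset)) #x3F)))
        (let ((byte (aref octets start)))
          (cond ((< byte #x80)          ; one octet, plain ASCII
                 (values byte (+ start 1)))
                ((< byte #xC2)          ; stray continuation or overlong lead
                 (error "Invalid UTF-8 lead byte: #x~X" byte))
                ((< byte #xE0)          ; two octets
                 (values (logior (ash (logand byte #x1F) 6)
                                 (cont 1))
                         (+ start 2)))
                ((< byte #xF0)          ; three octets
                 (values (logior (ash (logand byte #x0F) 12)
                                 (ash (cont 1) 6)
                                 (cont 2))
                         (+ start 3)))
                ((< byte #xF5)          ; four octets
                 (values (logior (ash (logand byte #x07) 18)
                                 (ash (cont 1) 12)
                                 (ash (cont 2) 6)
                                 (cont 3))
                         (+ start 4)))
                (t (error "Invalid UTF-8 lead byte: #x~X" byte))))))

    ;; The euro sign, U+20AC, is encoded as E2 82 AC.
    (decode-utf-8-code-point #(#xE2 #x82 #xAC) 0)   ; => 8364, 3
    ```

    Returning the next index as a second value lets a caller loop over a whole buffer without recomputing how long each sequence was.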

  3. Stelian Ionescu Says:

    How can char-code not return a fixnum on Clozure CL? On the 32-bit CCL version I have here, (integer-length char-code-limit) => 21 and (integer-length most-positive-fixnum) => 29; therefore, given this, the definition of CHAR-CODE («char-code returns the code attribute of character») and 13.1.3 («A character’s code is a non-negative integer»), I’d say that char-code always returns a fixnum on CCL.
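
    The same argument can be checked directly on whatever implementation is at hand. A small sketch using only standard constants (the results are not shown because they depend on the implementation, although both are true on the 32-bit CCL figures quoted above):

    ```lisp
    ;; Every value CHAR-CODE can return is strictly below CHAR-CODE-LIMIT,
    ;; so the largest possible character code is (1- CHAR-CODE-LIMIT).
    (<= (1- char-code-limit) most-positive-fixnum)

    ;; The same question asked through the type system.
    (typep (1- char-code-limit) 'fixnum)
    ```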

    Anyway, I’m one of the developers of Babel: we’d be very interested in a bug report.

  4. Faré Says:

    Upgrading Linux.

    How much time did you waste because of an old Linux installation? Installing a new one takes about an hour. Replicating one of your colleagues’ recent installations on the same hardware is even quicker.

  5. Dan Weinreb’s blog » Blog Archive » Adventures trying to use open … | Open Hacking Says:

    [...] the original post here: Dan Weinreb’s blog » Blog Archive » Adventures trying to use open … This entry was posted on Sunday, April 5th, 2009 at 9:54 am and is filed under Linux, Software, [...]

  6. Shane Says:

    @Faré, upgrading Linux is not the answer. That wouldn’t have solved this guy’s problems, save perhaps the darcs one. I’ve run into the problems he describes, in various incarnations, when trying to do just about anything with open source software. These problems are preferable to the problems one finds when trying to use closed source software, but they’re still problems that you don’t get to wave away with a magic upgrade wand.

  7. Marcin Says:

    Do you have a feeling as to the kinds of conventions that might help make this sort of thing less common?

  8. guillaume Says:

    You could have downloaded a static version of darcs from their wiki: http://wiki.darcs.net/index.html/Binaries

  9. lele Says:

    When you talk about other languages, that’s OK.

    When you talk about CL, that’s very funny. How come a tool targeting AI can’t manage to track down and download required libraries on its own? AI really failed.

  10. Jered Says:

    Dan,

    I’ve had this sort of problem frequently, although I’ve also found it to be prevalent with non-free libraries too. The only difference with the latter is that usually I can get a developer to help solve the issue… although it usually takes longer than if I went and did it myself. And, of course, with a closed library sometimes that’s not possible.

    Mostly, free or pay, it comes down to software maturity. Developers put different levels of effort into packaging, especially of libraries, and more widely used libraries are more likely to have the kinks worked out. Clisp puts you in a tough spot, in that it’s a relatively small community compared to the C or Java world…

  11. Dan Weinreb Says:

    @Attila: What you say about packages and asdf is true, although in this particular scenario, those were not the issues. Thank you for making the fix! I didn’t submit anything since I didn’t think that there was anything I knew that the maintainers did not already know.

    @John: No, we use babel for many kinds of translating. Hunchentoot uses it to be able to handle arbitrary character codes. So as long as we had it anyway, we use it for a bunch of purposes. It was just the UTF-8 decoding that seemed to be a performance issue. We figured that adding the fixnum declarations was pretty easy, and also suitable for everyone else and therefore something to submit to the maintainers for inclusion in the official sources (I’ll do that shortly).

    @Fare: (Fare is in the office next to mine at work): You know the issues. Now that our internal support people have a path for a more modern Linux that they’ll support, I’ll be upgrading to it shortly.

  12. Robert Goldman Says:

    FWIW, I use a Mac, and had a similar unpleasantness with darcs. At the time I needed it last, ghc didn’t compile on MacPorts, and I had to grovel around until I finally found a binary copy that someone had kindly left around… I’m not convinced that forcing every possible user to compile things with tall dependency trees is a great idea….

    A second point is that this suggests that open source libraries that are not commonly used (alas, this means CL libraries in particular) are especially vulnerable in their dependency trees. I try to avoid using libraries that have long dependency trees for this very reason — it raises the odds that someone will have broken something below the level of the library I need. CL libraries are particularly vulnerable to this, because ASDF doesn’t have the clearest way of specifying which version you need; because there are many libraries that are not versioned carefully or versioned at all (e.g., you have to pull them from revision control); etc. At the expense of forgoing some possibly useful libraries, I simply avoid ones that have big dependency trees. And whenever I build them into a system, I pull a copy into our own version control system and work from there.

    Even with this, though, it’s often hard to tell which libraries are going to save you work, and which are going to cost you work….

    Best

  13. Dan Weinreb Says:

    @Stelian: Sorry, I meant code-char. I’ll send you email with the details. Thanks very much for your comment and reply!

  14. Mark Hoemmen Says:

    On Linux, I usually have a lot more trouble with closed-source libraries than open-source libraries, because the closed-source libraries usually come as a binary blob that doesn’t work with my package manager. If it does work, then the software itself usually doesn’t, because it requires some system library whose version is different than mine, and it’s not willing to be flexible about it.

    Anyway, an easy way for library vendors to solve the “I don’t want to / am not able to install your version control software” problem is to post tarballs of releases. It’s a friendly thing to do for people in your situation, who don’t have the freedom or time to install arbitrary software on your computer. (When I’m in that situation, it’s usually not _my_ computer, but an account on a cluster for which I most emphatically do NOT have root.)

  15. David House Says:

    Compiling GHC is generally considered to be an “only if you really know what you’re doing” kind of thing. It itself is written in Haskell, so unless you have an older version of GHC already installed, you need to do a complicated installation involving semi-compiled C bootstrapping files which are architecture-specific. This is mentioned on the GHC’s download page.

    Compiling an entire compiler (a huge sprawling application), let alone one which is as complicated to install from source as GHC, just to get a single application seems like huge overkill. As another commenter pointed out, the sensible thing to do would have been to use a binary distribution of darcs.

    As for the rest, it sounds like a sane module distribution system (e.g. Ruby’s gems or Haskell’s Cabal) which can install a module and all its dependencies with a single command would have simplified the process significantly. I’m not familiar with Common Lisp but I’m somewhat surprised something like that doesn’t exist.

  16. Mark Hoemmen Says:

    @lele: It’s a problem that’s hard even for humans to solve. It’s like trying to use an industrial standards manual written in East Elbonian, to determine how many mm to shave off your metric bolts to make them compatible with Elbonian bolts measured in prime-numbered fractions of the current Elbonian Emperor’s left thumb.

  17. Jeff Read Says:

    Haskell is cool and all, but Darcs being written in it was what drove me to a) drink; and b) embrace Git completely, albeit the motivating factors were slightly different. I wanted a modern RCS that I could use on the OLPC. Darcs relies on a bunch of Haskell libraries and compiling the lot of them taxes that poor CPU and meager memory allotment (256MB) like you wouldn’t believe. Git compiles fine.

  18. Anton Says:

    > No, we use babel for many kinds of translating. Hunchentoot uses it to be
    > able to handle arbitrary character codes. So as long as we had it anyway,
    > we use it for a bunch of purposes.

    In fact Hunchentoot uses flexi-streams for encoding handling. Hunchentoot depends on babel via cffi.

  19. Bob Says:

    I see the point here.

    Proprietary libraries always come with an excellent installer, full source, all the regression tests you could hope for, and the legal right to actually modify the code and distribute the modified version.

    In other news, dependencies suck, as does terminal not-invented-here syndrome. So you can’t really win either way. Either you have a stack of dependencies, or you have to wait another few years until they finish reimplementing the bits they needed.

    That said, try using a library written by a Linux nerd on a Windows machine – even if the port is maintained and official and generally perfect, the dependencies probably still include “install cygwin”. :)

  20. Dave Moon Says:

    Why does the world need dozens of incompatible unit test frameworks? Don’t programmers who create these have anything better to do?

    Actually, even though I am a big fan of unit testing, in my opinion even one “unit test framework” is one too many. Let the flames ignite.

  21. Daniel Weinreb Says:

    @Anton: I stand corrected. Flexi-streams has its own decoders and encoders. I was mis-remembering. We use babel in the layer of code that we use on top of Hunchentoot. For example, if we are looking at some XML, it starts with a preamble that tells you what character set the XML is in. To read that preamble we have to convert an octet string to an ASCII string at the beginning of the XML. Here’s another:

    (defmethod get-request-body ((handler qres-xml-http-handler))
      "Determine the content encoding by considering (in this order) the charset= specification in the
    incoming HTTP header, the Byte Order Mark in the incoming XML, the encoding= specification in the
    XML preamble of the incoming XML. If no encoding can be determined, fall back to US-ASCII."
      (let ((octets (hunchentoot:raw-post-data :force-binary t)))
        (babel:octets-to-string
         octets
         :encoding (get-request-external-format (or (hunchentoot:header-in* :content-type)
                                                    "text/xml")
                                                octets))))

    We use it for URL encoding, X.891 (FastInfoSet), reading various fields out of the database, and lots of other miscellaneous places.

  22. attila lendvai Says:

    @Dave Moon

    we investigated several other unit test frameworks and used one (5am) for half a year.

    in that time we constantly discussed ideas about how it could be more helpful, then i got annoyed and started to implement them. then i realized that the extensive changes would be an arrogant rewrite if i pushed my changes into the 5am repo, so i created the nth+1 framework instead.

    > Actually, even though I am a big fan of unit testing,
    > in my opinion even one “unit test framework” is one too many.
    > Let the flames ignite.

    this is silly, no need for ignition… although stefil is not the biggest lib we wrote, and not the most complex one either (it’s basically an instrumented defun*), it does have essential features (imho, obviously).

  23. Slobodan Blazeski Says:

    @Dave Moon
    Because it’s relatively easy to write a new hackbrary in lisp. http://tourdelisp.blogspot.com/2008/01/common-lisp-libraries-victims-of-drive.html Anyway, the situation is much better than a year ago when the post was written. Lisp is still a marginal language and there is not enough critical mass, nor are there sugar daddy companies, so lisp authors are doing the best they can. If it wasn’t for them we would all have to write from scratch or use something else.

  24. Dan Weinreb Says:

    @Slobodan: The blog entry is interesting, but I don’t think the fundamental problem is specific to Lisp. Lots of people write small, quick tools, that gradually grow into big, complicated tools. To turn that into something suitable to share with others, you’d often have to re-architect it, and rewrite a lot of it, as well as document it. Lots of people don’t do those things, Lisp or no Lisp. You can hack up quick libraries in lots of languages other than Lisp; Perl and shell-script meisters do this all the time.

    @Moon, @Attila: At ITA, my co-workers wrote their own little unit test framework. (One thing about those frameworks is that it’s easy to write a very little one, and so your initial motivation for seeking out a library is low.) There are some things about it that I don’t like: you define a test with a define-test macro, but that macro does not actually define a function with that name; instead it puts the function on some property of the name’s property list. The designers had a justification for this, but I didn’t buy it; I no longer remember exactly what it was. If you write a test with define-test and use assert-xxx macros in it, you can’t just turn it into a defun because then the assert-xxx’s don’t work. So you can ONLY run the test via the framework, which is a little bit annoying. Not annoying enough for me to try to get it changed.

    But having the framework, even a small one, IS useful, so that the higher-level facilities that call the tests know which tests to call, know how to interpret the results, etc.
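
    The define-test design described above can be sketched in a few lines. This is a hypothetical reconstruction, not ITA’s framework: only the name define-test comes from the description, while the property name test-function and the runner run-test are made up for the example.

    ```lisp
    ;; Hypothetical sketch; TEST-FUNCTION and RUN-TEST are made-up names.
    (defmacro define-test (name &body body)
      "Store BODY as a closure on NAME's property list; NAME gets no function definition."
      `(progn
         (setf (get ',name 'test-function) (lambda () ,@body))
         ',name))

    (defun run-test (name)
      "Run the test stored under NAME by DEFINE-TEST."
      (funcall (get name 'test-function)))

    (define-test addition-works
      (= 4 (+ 2 2)))

    (run-test 'addition-works)   ; the only way to invoke the test
    (fboundp 'addition-works)    ; false: no function of that name exists
    ```

    Because the test lives only on the property list, calling (addition-works) directly signals an undefined-function error, which is the annoyance described above.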

  25. Dave Moon Says:

    You start with a little unit test framework that took 5 minutes to write, and then you “improve” it, and before you know what you have done you end up with a total cancer like junit. Better to have stopped after 5 minutes. That’s my opinion anyway. I guess Common Lisp is fortunate not to have anything as bad as junit yet.

  26. Slobodan Blazeski Says:

    @Dan
    I agree that the problem is not specific to lisp, and I take my hat off to the authors of common lisp libraries (King, Weitz, Battyani, Antoniotti, Costanza and many others, homage to you). But we must understand that we live in a lisp world. Those little things that make the difference between a barely usable hackbrary and a quality library take a lot of time and grunt work to write. The payoff is that once they are written, users can concentrate on making something useful instead of wasting time making the damn thing work, or raising the white flag and starting a new hackbrary.
    Also it sends a signal to other lispers that barely usable is OK. I know that lispers are smart people and that for every solution they can think of a dozen architectural solutions that would be far better. But unless you’re planning to spend the time to turn your dream architecture into a real library, please, please swallow your pride and help the second-best-available become more polished.
    We know the joke about *real* programmers who are 50% done after 1 week, 90% done after 1 month, 95% done after 3 months, and leave the project after 6 months at 99% because the remaining 1% needs 99% of the work. That’s just my humble opinion, or I’m getting old :)

  27. Ben Hyde Says:

    Yeah, there certainly is a worse is better syndrome seen across open source software stacks.

    Clbuild is treating me reasonably well these days; but I’m currently using sbcl.

    @Faré you slay me! The further down your installed software stack you upgrade, the more things upstream must be upgraded, and the greater the chance one of them is going to surprise you. If that wasn’t true then you’d start each day by installing the latest version of the OS, its libraries, et al. Now that might be the best option but it’s not how most of us work.

  28. Ascription is an Anathema to any Enthusiasm › the briar patch Says:

    [...] wrote up his travails getting some Lisp libraries working on.  Boy have I been [...]

  29. Richard Tibbetts Says:

    The technical term for this exercise of dependency following and unwanted technical exploration is “yak shaving”. See the original http://projects.csail.mit.edu/gsb/old-archive/gsb-archive/gsb2000-02-11.html and the more modern http://en.wiktionary.org/wiki/yak_shaving

    Part of the reason the problem persists in free software is because the community is so tolerant of it they don’t even notice it. See the comment above that you should just install a new version of linux, because it will “only take an hour”.

    The upside of yak shaving is that you generally learn some things you never would have otherwise, and often they come in handy later.

  30. Kaleberg Says:

    That’s hilarious, but horribly typical.

    It is no different in the C world. I remember trying to compile the gtk library and wound up having to download a dozen libraries, all needing to be recompiled from source. Amazingly, I got the entire thing to compile and link, but it never worked quite right, and I was damned if I was going to debug it.

  31. Rob Says:

    On Linux install taking an hour: Sure, if you include installing everything else you had before and completely from scratch, but I have found the basic Ubuntu install takes maybe 20 minutes at most. If you have a distro with good package management then upgrading to the newest version is trivially fast, maybe 5-10 minutes, and you still have your entire configuration intact.

  32. Dan Weinreb Says:

    @Rob: I’m sure you’re right. What will take some time is peripheral activities such as relocating my home dir from one place to another, getting my environment set up regarding networking and apps and files in the right place, and all that. It’s probably all straightforward but it will surely require some learning, trying out, and time. Fortunately I have many helpful and friendly co-workers who will be forgiving about helping a stumbling newbie like me.

  33. gwern Says:

    @David House: well, there used to be asdf-install http://www.cliki.net/ASDF-Install but I don’t know how polished or used it is compared to cabal-install.

  34. Scott L. Burson Says:

    There is a CL darcs client, cl-darcs, which looks like it would have been adequate for this purpose (I think it can pull but not push). I haven’t tried it, though.