My Summer Vacation in Ecuador

September 1st, 2008

I just returned from a two-week vacation in Ecuador, visiting the rain forest and the Galapagos.  There’s a lot you can read on the web about visiting Ecuador, so I won’t repeat any of that.  Here are some of my own experiences and hints.

The trip was booked by REI and they did a great job.  They were very helpful and informative, and all of their recommendations were good.

In Quito (the capital of Ecuador, where we spent a few days), we stayed at a hotel called Hotel de la Rabida. We loved it.  It’s small and has pleasant, cozy public spaces.  The rooms are very clean and functional, the food is good, the people are helpful, and the owners are very friendly. In the USA, we often are faced with the choice between a cheap chain motel, and a super-expensive business hotel; this is the kind of place we’re always looking for.

We spent several days at La Selva Jungle Lodge in the rain forest.  It’s as advertised: great!  Our guide has a degree in Ecotourism and has been doing it for nine years.  He was very friendly and helpful.  He spotted all kinds of animals and birds, and explained a lot about all the life in the jungle.  Our best sighting was of an armadillo, which was very exciting.  They’re rarely seen, particularly since they’re usually nocturnal.  We also visited a site where parrots and parakeets come to eat clay (scientists aren’t totally certain why they do that), and it was amazing to see hundreds (perhaps a thousand) of them warily and very gradually come down from the trees to where the clay is.

The rainforest is hot and very, very humid.  (We were there during the dry season.)  I had trouble sleeping.  I also had some sickness that I attribute to a reaction to Malarone, an anti-malaria medicine.  (Although my wife and son did fine on both scores.)  When I was there, I was told that there haven’t been any malaria cases in many years, so the Malarone wasn’t really needed.

Despite the economic problems in Ecuador of the past decade, the capital city of Quito looked to me as if it’s in very good shape.  It’s very clean, the roads are in excellent repair, there’s all kinds of business, and some very nice public sculpture, indicating a good amount of municipal care. We found the Ecuadoran people (not just the tourist-industry people but even those we just ran into) to be pleasant, and forgiving about our ignorance of Spanish. The only troublesome thing was that there were a lot of police and security guards around, with guns and bullet-proof vests. The area we were in didn’t look dangerous, but we were told not to walk around after dark.

Before I left, I heard varying reports about the water temperature around the Galapagos islands, and whether wetsuits were needed for snorkeling.  Most of us turned out to need full wetsuits.  The cruise boat I was on provided these, so it was no problem.  Once a sea lion decided to play with me, swimming straight at me and veering off at the last moment (twice), which was very cool.  A giant sea turtle swam below me, only about a foot down!  We also saw small sharks and, of course, lots of very pretty tropical fish.

We were very happy to see that the Galapagos National Park is being quite careful about taking care of the islands. These days there are a lot of visitors, and it’s important to make sure that they don’t disturb or harm things.  There are lots and lots of good rules: no food, no touching the animals, and so on. Visitors are always in small groups led by official naturalists, who make sure that the rules are followed. Explicit paths are everywhere, to make sure you don’t trample nests and such.  The Ecuadorans are working hard to be good stewards, eradicate introduced species (every one of which damages the ecology), and do good science to understand the plants and animals better.  (Obviously they have a strong economic motivation here, as tourism is one of their major industries, but they’re doing well by doing good.)

The cruise operation was run by Ecoventura (also see the very accurate review in the New York Times), and the guides were very helpful, friendly, and experienced.

The only really annoying part of the trip was going home.  Our flight had six legs (six takeoffs and landings)!  (This is mainly because the international part and the national part are set up separately.)  I’m not able to sleep on redeyes, and at the moment I have been awake for over 36 hours.  But it was worth it.

These Are a Few of my Favorite Sites and Applications

August 11th, 2008

The Web abounds with valuable free services. Here are some of my favorites, entirely free unless otherwise noted.

Jott.com: You call their toll-free number. A voice says “Who do you want to Jott?” You say “myself”, or a name that you have registered on their web site. The voice says “Jott yourself” or the name of the recipient. You speak a message. That’s it. Jott sends email to the recipient, containing a transcript of what you said, plus a link to the audio recording in case the transcription isn’t good enough. My car has a voice-activated feature to make phone calls; it talks Bluetooth to my cell phone. So when I’m listening to the radio on my commute, and I hear something interesting that I want to follow up on, I just press the “speak” button on my steering wheel, say “Dial Jott”, and talk to Jott. It sure beats trying to scribble notes during red lights.

AdBlock: The AdBlock Plus extension for Firefox really works. I block ads because they take too long to download (making my effective browser response time much worse), and the animated ones are much too distracting. It’s easy to turn ad blocking off selectively.

Google Toolbar for Firefox: This Firefox extension has a command called AutoFill, that can fill in my name and address and such in most web pages that ask for it. It saves a lot of tedium.

Google Calendar: I use this to track all my meetings and appointments. I can get at it from work and from home. Sometimes it seems to be somewhat “down”, not allowing new entries to be made, but this is rare enough to be acceptable. The Ajax UI is done very well. (I used to use the Lightning plugin for Thunderbird, but sharing over the web is important to me.)

Pandora: You tell it what music you like, and it provides a “radio station” that plays the sort of music you like. It’s amazingly good at choosing what to play; I would never have believed it. It has found new artists that I like a lot and otherwise would never have heard of. I almost always have this on when I’m working at home.

Xconomy: A news magazine covering hi-tech in the Boston (and now Seattle) area, with very high quality reporting. I read it every day to keep up with what’s going on around here. (Full disclosure: Xconomy is a Common Angels company.)

LinkedIn: This is the only “social network” that I value. I use it to keep track of where all my old friends and co-workers are, what they’re doing professionally, and what their latest email address is. And when I hear about someone in hi-tech, I often look them up to learn more about them.

TimeBridge: A free service that helps you set up meetings between many people, finding times that are available for everyone. It’s very easy to use.

Carbonite: Automatic backup over the web. It’s very, very easy to use. (Full disclosure: Carbonite is a Common Angels company. The best competitor is Mozy, which my friends say is also very good.) It’s not free, but it’s well worth it.

AxCrypt: Simple file encryption and decryption. I mainly use this to send encrypted email to a friend with whom I share a passphrase. One of these days I’ll learn how to use the OpenPGP facility provided by the Enigmail Thunderbird add-on, if I find anyone else who is using it and with whom I have secrets to discuss.

MPEG Streamclip from Squared 5: This one is a free utility for MacOS X. I use my Mac to edit video (with Final Cut) into DVD’s, for the North Cambridge Family Opera Company. MPEG Streamclip can “rip” video off (unencrypted) DVD’s and produce virtually any format, including the one that YouTube likes.

Kindle tips: There are lots of sources of free (legal) books and other resources for the Amazon Kindle on this page. My family is about to leave for a vacation trip on which we can only bring a limited amount of luggage. We usually bring big piles of books on vacations, but it’s impossible this time. So we got a Kindle. In fact, we got two (his and hers).

xkcd.com: My favorite web comic, and the only one I follow. “A webcomic of romance, sarcasm, math, and language.” Computer hacking, too. Randall has more profound or funny things to say about the intersection of science/math/technology and romance/relationships than anyone else. There’s an archive of all the past comics. He has three comics about Lisp, all hilarious. Buy stuff from his store: that’s his only source of income (no ads!).

Enjoy!

Advice About Sending Email

July 11th, 2008

I’ve been using email on the Internet (and its predecessor, the ARPAnet) for 32 years.  Here’s some advice from my experience.

The Prime Directive: Never send email when you’re angry.  Never, ever.  It always backfires and you always regret it. Trust me on this.

The rest of these recommendations apply primarily when you’re sending mail to anyone who isn’t a close friend.

Do not use sarcasm on mailing lists.  Remember your tone of voice is not available to indicate that what you’re saying is sarcasm.  Inevitably, a few people on the list will take what you say literally, and then you’ll have to undertake the boring job of correcting everyone’s misimpression.

Be very polite.  You almost can’t be too polite.  Because your facial expressions and tone of voice are not present, it’s easy to write something that will seem demanding or commanding.

Know the difference between “Reply” and “Reply All”, and be careful to always use the appropriate one.

Be careful to address your mail to the right person!  The automatic name-completion feature in many of the good mail clients can sometimes complete to a name that’s not what you expected.

Some people have separate home and company email addresses.  Send personal mail to the home address.

Be careful about giving out someone else’s personal email address.  Some people do not like to have their email addresses be well-known.  So treat anyone else’s email address as if it were confidential information, until you get permission to distribute it.

When sending mail to many individuals, address the mail to yourself, and BCC it to everyone else.  This way, the recipients cannot see the email addresses of the other recipients, thus protecting their privacy.

Make your subject lines descriptive and clear.  If you’re replying, keep the same subject line (don’t worry about the “Re:”) so that mail readers can see which mail is grouped with which.  If you’re communicating with friends, clever subject lines can be quite an art form and source of innocent merriment.

Save away all of your interesting email.  It’s very handy to be able to refer to it when you subsequently communicate with the same person, or company.

Keep your email on your own computer.  Leaving it on a net server is too much of a risk to your privacy.  Even if you like Google (which I do) and trust them (which I pretty much do), you never know if conditions will change in the future, and by then it’s too late.

For very sensitive email, encryption is a good idea.  Sadly, there isn’t an easy-to-use standard.  I use the free version of AxCrypt from Axantum; it’s only for Windows, unfortunately.  There are plenty of others.  Of course, the person to whom you are sending mail must also install the software, and you must have a shared passphrase.  As long as you’re going to the trouble to encrypt, use a long passphrase for better security.

Please feel free to use the Comments below to add other good advice.

Perst, An Embedded Object-Oriented Database Management System

June 8th, 2008

I just learned of Perst, which is described as an open-source embedded object-oriented Java database (meaning database management system, of course) which claims ease of working with Java objects, “exceptional transparent persistence”, and suitability for aspect-oriented programming with tools such as AspectJ and JAssist. It’s available under a dual license, with the free non-commercial version under the GPL. (There is a .NET version aimed at C# as well, but I’ll stick to what I know; presumably the important aspects are closely analogous.)

I have some familiarity with PSE Pro for Java, from what is now the ObjectStore division of Progress. Its official name is now Progress ObjectStore PSE Pro. I’ll refer to it as PP4J, for brevity. I was one of its original implementors, but it was substantially enhanced after I left. I also have a bit of familiarity with Berkeley Database Java Edition, a more recently developed embedded DBMS, which I’ll refer to as BDBJE.

PP4J provides excellent transparency, in the sense that you “just program in Java and everything works.” It does this by using a class-file postprocessor. However, Perst claims to provide the same benefit without such a postprocessor. It also claims to be “very fast, as much as four times faster than alternative commercial Java OODBMS.” Of course, “as much as” usually means that they found one such micro-benchmark; still, it would be uninteresting had they not even claimed good performance. And it has ACID transactions with “very fast” recovery.

Those are very impressive claims. In the rest of this post, I’ll examine them.

Who Created Perst?

I always like to glance at the background of the company. In particular, I like to know who the lead technical people are and where they worked before. Unfortunately, the company’s management page only lists the CEO, COO, and Director of Marketing, which is rather unusual. They’re in Issaquah, WA; could the technical people be ex-Microsoft? It’s important to note that McObject’s main product line, called eXtremeDB(tm), is technically unrelated to Perst.

But I found a clue. The Java package names start with org.garret. It’s usually hard to retroactively change Java package names, so they’re often leftovers from an earlier naming scheme. By doing some searches on “garret”, I found Konstantin Knizhnik, a 36-year-old software engineer from Moscow with 16 or so years of experience, who has written and is distributing an object-oriented database system called “GOODS” (the “G” stands for “Generic”). His most recent release was March 2, 2007. He has a table that compares the features of GOODS with those of other systems, including Perst. At the bottom it says: “I have three products GigaBASE, PERST and DyBASE build on the same core.” He also has an essay named Overview of Embedded Object Oriented Databases for Java and C# which includes an extensive section on the architecture of Perst. This page also has some micro-benchmark comparisons including Perst, PP4J, BDBJE, and db4o, but not GOODS. Perst comes out looking very good.

He even has a table of different releases of several DBMS’s, including GOODS and Perst, saying what changes were made in each minor release! But at no point does he say that he was involved in creating the Perst technology.

He mentions the web site perst.org. There’s nothing relevant there now, but Brewster’s Wayback machine shows that there used to be, starting in October, 2004. It’s quite clearly the same Perst. And the “Back to my home page” link is to Knizhnik’s home page. Aha, the smoking gun! By December, 2005, the site now mentions the dual license, and directs you to McObject LLC for a non-GPL commercial license. In 2006, the site changes to the McObject web site. McObject has several other embedded database products and was founded in 2001. This strongly suggests that McObject bought Perst from Knizhnik in 2006.

I joined the Yahoo “oodbms” group, and there’s Knizhnik, who is apparently maintaining FastDB, GigaBASE, and GOODS. He also wrote DyBASE, based on the same kernel as GigaBASE. He announced Perst Lite in October, 2006. Postings on this group are sparse, mainly consisting of announcements of new minor releases of those three DBMS’s.

The Tutorial

The documentation starts with a tutorial. Here are the high points, with my own comments in square brackets. My comparisons are mainly with PP4J, which likewise provides transparent Java objects. BDBJE works at a lower level of abstraction. [Update: BDBJE now has a transparent Java object feature, called DPL.] I don’t know enough about the low-level organization of BDBJE or of the current PP4J to make well-informed qualitative comparisons.

Perst claims to efficiently manage much more data than can fit in main memory. It has slightly different libraries for Java 1.1, 1.4, and 1.5, and J2ME. There is a base class called Persistent that you have to use for all persistent classes. [This is a drawback, due to Java's lack of multiple inheritance of implementation. PP4J does not have this restriction.] They explain a workaround in which you can copy some of the code of their Persistent.java class. [That sounds somewhat questionable from a modularity point of view, and doesn't help you for third-party libraries unless you want to mess with their sources.]

Databases can be stored compressed, encrypted, split across several physical files, or in no file at all for in-memory use, and there’s an interface allowing you to add your own low-level representation. Each database has a root object in the usual way. They use persistence-by-reachability [like PP4J]. There is a garbage collector for persistent objects. However, there is also explicit deletion; they correctly point out that this can lead to dangling pointers. [The fact that they have it at all suggests that the garbage collector is not always good enough.]

There are six ways to model relationships between objects. To their credit, they have a “(!)” after the word “six”. You can use arrays, but they explain the drawbacks to this. The Relation class is like a persistent ArrayList. The Link class is like Relation but it’s embedded instead of being its own object [huh?]. The IPersistentList interface has implementing classes that store a collection as a B-tree, which is good for large collections but has high overhead for small ones. Similarly, there is IPersistentSet. And finally there is a collection that automatically mutates from Link to a B-tree as the size gets larger. [PP4J, I believe, offers equivalents of the array, the list, and the set, and the list and set do the automatic mutation.]
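The last of those options, a collection that starts out small and flat and mutates into a tree as it grows, can be sketched in plain Java. This is not Perst's implementation or API: the class name AdaptiveSet, the threshold, and the use of ArrayList and TreeSet as stand-ins for Link and a B-tree are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Perst or PP4J.
class AdaptiveSet<T extends Comparable<T>> {
    private final int threshold;                     // size at which the representation switches
    private Collection<T> items = new ArrayList<>(); // small case: a cheap flat list

    AdaptiveSet(int threshold) {
        this.threshold = threshold;
    }

    boolean add(T item) {
        if (items.contains(item)) {
            return false;                            // set semantics in both representations
        }
        items.add(item);
        if (items instanceof List && items.size() > threshold) {
            items = new TreeSet<>(items);            // large case: switch to a sorted tree
        }
        return true;
    }

    boolean contains(T item) { return items.contains(item); }

    int size() { return items.size(); }

    boolean isTreeBacked() { return items instanceof TreeSet; }
}

class AdaptiveSetDemo {
    public static void main(String[] args) {
        AdaptiveSet<Integer> set = new AdaptiveSet<>(3);
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        System.out.println(set.size() + " elements, tree-backed: " + set.isTreeBacked());
    }
}
```

The caller never sees the switch, which is the point: small collections avoid the tree's overhead, and large ones avoid linear scans.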

How can they do transparent loading of objects, i.e. following pointers? They give you two choices, which can be set as a mode for each object: load everything reachable from the object, or make the programmer explicitly call the “load” method. They claim that this is usually sufficient, since your Java data structures usually consist of clusters of Java objects that are reachable from one head object, with no references between such clusters.

They assume that you always want to read in the whole cluster when you touch the head object [often true, but not always]. Also, when you modify an object, you must explicitly call the “modify” method, unless this is one of Perst’s library classes, whose methods call “modify” on themselves when needed. They say “Failure to invoke the modify method can result in unpredictable application behavior.”

[This is not what I would call "transparent"! PP4J is truly transparent, in that there is neither a "load" nor a "modify". PP4J always does these automatically. The Perst tutorial does not say what happens if you forget to call "load" when you were supposed to. Not all Java data follows their cluster model. PP4J depends for its transparency on the class postprocessor. As I recall, the postprocessor runs quickly enough that it doesn't seriously impact the total compile time. The only problem I had with it, as a user, was that it doesn't fit with the model assumed by the interactive development environments such as IntelliJ, requiring some inelegance in setting up your project.]
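To make the hazard concrete, here is a toy model in plain Java of a store that writes back only the objects on which “modify” was called. It does not use the Perst library; ToyStore, Account, and their methods are hypothetical names, and the “disk” is just a map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; not the Perst or PP4J API.
class Account {
    final String id;
    int balance;

    Account(String id, int balance) {
        this.id = id;
        this.balance = balance;
    }
}

class ToyStore {
    private final Map<String, Integer> disk = new HashMap<>(); // stands in for the database file
    private final Set<Account> dirty = new HashSet<>();        // objects flagged via modify()

    // The programmer must remember to call this after changing an object.
    void modify(Account a) {
        dirty.add(a);
    }

    // Commit writes back only the objects that were flagged as modified.
    void commit() {
        for (Account a : dirty) {
            disk.put(a.id, a.balance);
        }
        dirty.clear();
    }

    Integer stored(String id) {
        return disk.get(id);
    }
}

class ToyStoreDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        Account a = new Account("a", 100);
        store.modify(a);
        store.commit();                  // the balance of 100 is now "on disk"

        a.balance = 50;                  // changed in memory, but modify() was forgotten
        store.commit();
        System.out.println("after forgotten modify: " + store.stored("a"));

        store.modify(a);                 // the fix: flag the object, then commit
        store.commit();
        System.out.println("after modify: " + store.stored("a"));
    }
}
```

Nothing fails at the point of the mistake; the stale value only shows up later, when the data is read back. A postprocessor-based design avoids this by inserting the equivalent of the modify call for you.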

Perst has collections with indexes implemented as B-trees and allowing exact, range, and prefix searches. The programmer must explicitly insert and delete objects from indexes; if you make a change to some object that might affect any indexes, you have to remove the object from the index and re-insert it. [So you need to know in advance which things might ever be indexed, or pessimistically assume that they all are, and so remove and re-insert whenever the object is changed in any way. I am pretty sure that PP4J does this automatically.] You can index on fields directly, or you can create your own index values (since you’re inserting explicitly) that could be any function of the indexed object. [That's very useful, and I cannot remember whether PP4J provides this.] Keys can be compound (several fields). They provide R-tree indexes and KD-tree indexes, useful for 2-D operations such as finding a point within certain constraints. They also provide Patricia Tries, bit indexes, and more. [Wow, how do they fit all that into such a small footprint?]
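The remove-and-re-insert discipline can be sketched with java.util.TreeMap standing in for a B-tree index. This is an illustration under stated assumptions, not Perst's index API: Person, NameIndex, and all method names are hypothetical, and keys are assumed unique to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration; TreeMap stands in for a B-tree index.
class Person {
    String name;

    Person(String name) {
        this.name = name;
    }
}

class NameIndex {
    private final TreeMap<String, Person> byName = new TreeMap<>();

    void insert(Person p) { byName.put(p.name, p); }

    void remove(Person p) { byName.remove(p.name); }

    // Exact search.
    Person exact(String name) { return byName.get(name); }

    // Range search: every key k with from <= k < to.
    List<Person> range(String from, String to) {
        return new ArrayList<>(byName.subMap(from, to).values());
    }

    // Prefix search, expressed as a range that ends just past the prefix.
    // (An approximation: a key containing '\uffff' right after the prefix is missed.)
    List<Person> prefix(String prefix) {
        return range(prefix, prefix + Character.MAX_VALUE);
    }

    // Changing an indexed field: remove under the old key, update, re-insert.
    void rename(Person p, String newName) {
        remove(p);
        p.name = newName;
        insert(p);
    }
}

class NameIndexDemo {
    public static void main(String[] args) {
        NameIndex index = new NameIndex();
        Person ann = new Person("Ann");
        index.insert(ann);
        index.insert(new Person("Anna"));
        index.insert(new Person("Bob"));

        System.out.println("prefix An: " + index.prefix("An").size());
        index.rename(ann, "Zed");
        System.out.println("Ann still indexed: " + (index.exact("Ann") != null));
    }
}
```

If the name were assigned directly, without going through rename, the index would keep the object filed under its old key, which is exactly the bookkeeping burden noted above.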

Transaction recovery is done with a shadow-object mechanism and can only do one transaction at a time. (So ACID really means AD.) [Like PP4J, at least in its first version.] The interaction of transaction semantics with threads, always a sticky issue, can be done in several ways, too extensive to go into here. [This looks very good.] Locks are multiple-reader single-writer. Locking is not transparent in basic Perst [Bad!], but there’s a package called “Continuous” which does provide transparency, although it’s not described in the tutorial. [So beginning users also have to remember to do explicit locking?] Two processes can access a single database by locking the whole database at the file system level; it works to have many readers.
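For readers who haven't used multiple-reader single-writer locking, here is what the explicit (non-transparent) style looks like, sketched with the standard java.util.concurrent.locks.ReentrantReadWriteLock. This is not Perst's locking API; LockedCounter and its methods are hypothetical names chosen for the example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of explicit multiple-reader single-writer locking.
class LockedCounter {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    // Any number of threads may hold the read lock at the same time.
    int read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: no other readers or writers while it is held.
    void increment() {
        lock.writeLock().lock();
        try {
            value++;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

class LockedCounterDemo {
    // Runs several writer threads and returns the final count.
    static int runWriters(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.read();
    }

    public static void main(String[] args) {
        System.out.println(runWriters(4, 1000));
    }
}
```

Every access path has to remember to take the right lock, which is why I'd rather the system did it for me.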

There is a class called “Database” that provides semantics more like a conventional DBMS. It maintains extents of classes. [Note: that means instances of the class can never be deallocated.] It can create indexes automatically based on Java annotations, but you still must do manual updates when the indexed object changes. It uses table-level locking. It has a query language called JSQL, but it’s unlike SQL in that it returns objects rather than tuples, and does not support joins, nested selects, grouping, or aggregate functions. You can “prepare” (pre-parse) JSQL queries, to improve performance if you use them many times, just as with most relational DBMS’s. A JSQL query is like a SQL “where” clause, and it uses whatever existing indexes are appropriate.

Schema evolution is automatic, and done per-object as the object is modified. It can’t handle renaming classes and fields, moving fields to a descendant or ancestor class, changing the class hierarchy, nor changing types of fields that aren’t convertible in the Java language. There’s an export/import facility that you’d use for those changes. You can physically compact the database files. Backup and restore are just like files [you have to back up the whole thing, not just changes, which is probably true of PP4J as well.] You can export to XML.

Perst supports automatic database replication. There’s one master, where all writes are performed, and any number of slaves, where reads can be performed. This lets you load-balance reads. It’s done at page granularity. You can specify whether it’s synchronous or asynchronous. You can add new slaves dynamically. For high-availability, it’s your responsibility to detect node failure and choose a new master node. [PP4J did not have this, the last time I looked.]

Recent Press Releases from McObject

Version 3.0 has new features. There is a full-text search, much smaller in footprint than Lucene and geared specifically to persistent classes. The .NET version supports LINQ. They list CA’s Wily Technology as a successful reference customer, for a real-time Java application.

Perst is used in Frost, which is a client for Freenet. “Frost is a newsgroup reader-like client application used to share encrypted messages, files and other information over Freenet without fear of censorship.” They switched from “the previous used SQL database” [it turns out to be something called McKoi] because its recovery was unreliable (leaving corrupt databases), Perst’s schema evolution is easier to use, the databases are smaller (because Perst can store strings in UTF-8 encoding), and because they could get better performance from Perst as the database size grew.

Perst has been verified as compatible with Google’s Android. They provide a benchmark program comparing performance of Perst against Android’s bundled SQLite. It’s a simple program that makes objects with one int field and one String field, and an index on each field. It inserts, looks up, etc. [It would be easy to recode it for PP4J or for BDBJE.]

The Download

The basic jar file is 530KB. “continuous” (see above) is another 57KB.

There’s more documentation, which goes into great detail about the internal implementation of how objects are represented and how atomic commit works. [It's extremely similar to PP4J. (The original version, anyway; it might have changed since.)]

There are other features, for which I could not find documentation. For instance, each persistent class can have a “custom” allocator that you supply. You could use this to represent very large objects (BLOB/CLOB) by putting them in a separate file. In the database, you’d store the name of this file, and perhaps an offset or whatever. Also, there is an implementation of the Resource Description Framework (RDF, used by the Semantic Web to represent metadata).

There are lots of parameters that you can set from environment variables. I was not able to find documentation for these. The one that interests me most lets you control the behavior of the object cache. The default is a weak hash table with LRU behavior. Other possibilities are a weak table without the LRU, a strong hash table (if you don’t want to limit the object cache size), and a SoftHashTable which uses a Java “soft” hash table.
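The LRU part of such a cache is easy to picture with the standard library. The sketch below uses java.util.LinkedHashMap in access order; it illustrates only the eviction policy, holds its entries strongly (so it does not model the weak or soft reference variants), and is not Perst's cache. LruCache and its capacity parameter are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of LRU eviction; not Perst's object cache.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // true = iterate in access order, least recent first
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<>(2);
        cache.put(1, "one");
        cache.put(2, "two");
        cache.get(1);             // touch 1, so 2 becomes the least recently used
        cache.put(3, "three");    // evicts 2
        System.out.println(cache.keySet());
    }
}
```

A weak table with LRU behavior would, presumably, keep strong references only for the most recently used objects and let the garbage collector reclaim the rest once the application drops them.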

The code is clearly written except that it’s extremely short on comments.

Overall Evaluation

Perst is a lot like PP4J. To my mind, the most important difference is the degree of transparency. I greatly prefer PP4J’s approach of providing complete transparency, i.e. not requiring the use of methods such as load and modify. This has two advantages. First, your code is clearer and simpler if it isn’t interrupted by all those calls to load and modify. Second, without transparency, it’s far too easy to forget to call load or modify, which would cause a bug, in some cases a bug that’s hard to find. Another problem is that the reference documentation is clearly incomplete and needs work. The tutorial, though, is quite clear and professionally written, and very honest about the tradeoffs, pros, and cons of the product design. Personally, if you want my respect, that’s how to do it!

However, it has a bunch of features and packages that PP4J doesn’t (as far as I know).

I don’t know anything about the pricing of either product.

On the whole, for what it’s aiming for, Perst appears to be very good, and a real competitor in this space.

This Is Your Brain On Music

April 30th, 2008

This Is Your Brain On Music, by Daniel J. Levitin, is the most exciting science book that I’ve read in a long time. It’s all about music: what is music, how do we perceive music, why do we care about music, and, primarily, what do we know about how the mind and brain react to, process, and create music.

Some facts that I learned:

If I put electrodes in your visual cortex, and then I showed you a red tomato, there is no group of neurons that will cause my electrodes to turn red. But if I put electrodes in your auditory cortex and play a pure tone in your ears at 440 Hz, there are neurons in your auditory cortex that will fire at exactly that frequency, causing the electrode to emit electrical activity at 440 Hz. For pitch, what goes into the ear comes out of the brain! I find this amazing.

If you’re familiar with the phenomenon of restoration of the missing fundamental, in which you perceive the fundamental pitch if you are played only overtones of the pitch: it turns out that you can put in an electrode, play music with the fundamentals missing, and the electrode actually shows energy at the fundamental frequency! The very fact that we can know things like this is exciting.

Ordinary people, when asked to sing a song (of which there is one well-known canonical recording, such as most pop songs), will sing back the song at almost the exactly correct tempo! (They are accurate within 4%, which is as good as most people can perceive anyway.) They also often get the key right, even though few people have “perfect pitch” per se. I would never have guessed this.

The brain stem and the dorsal cochlear nucleus — structures so primitive that all vertebrates have them — can distinguish between [musical] consonance and dissonance; this distinction happens before the higher level, human brain region — the cortex — gets involved.

The book is extremely readable and fun. It teaches you all the music theory you need to know. In fact, his basic music theory section is the best quick introduction to music theory I’ve ever read. The author has been a professional producer, so he knows a lot about how modern music recordings are made. He currently runs the Laboratory for Musical Perception, Cognition, and Expertise at McGill University, and has published a lot in serious scientific journals. That’s a combination of expertise that may be unique. He knows several well-known musicians and quotes from them; what Stevie Wonder and Joni Mitchell have to say is quite interesting. The book is available in trade paperback for only $15 US.