
The Technology and Business of ObjectStore

Monday, December 31st, 2007

This is a follow-up to my previous article about the success of OODBMS’s, and ObjectStore in particular. For people interested in the more technical story behind the ObjectStore object-oriented database management system, here are some stories that you might enjoy. You’ll see why it was harder to do than we had originally anticipated. There are also stories about problems with the business, with some cautionary tales that you could take into account the next time you start a company.

I’ve been involved with or heard about many high-tech startups. Nearly always, the product turns out to appeal to a set of customers who aren’t the ones the founders originally had in mind. Smart founders dynamically adjust. We found that our customers’ technical requirements varied somewhat, and we had to make a lot of improvements and changes to the product to meet these new requirements. That took a lot of engineering talent.

This essay includes very substantial contributions by my colleagues, which I have tried to organize into a cogent whole. Contributors, in alphabetical order:

Gene Bonte: Co-founder, CFO
Sam Haradhvala: Co-founder
Guy Hillyer: Senior engineer
Charles Lamb: Co-founder
Benson Margulies: Senior engineer and head of porting (after Ed)
Dave Moon: Senior engineer
Jack Orenstein: Co-founder
Mark Sandeen: Senior salesperson
Ed Schwalenberg: Senior engineer and head of porting
Dave Stryker: Co-founder, VP of Engineering

Porting Was Hard

We knew that porting ObjectStore was going to be hard. Dave Stryker recalls: “That was the thing we talked about most during the crucial first three months when we were working out the implications of the architecture.” However, by the time all was said and done, it turned out to be more work than we had originally anticipated.

We ported ObjectStore to an amazing number of architectures: many versions of Windows, many flavors of Unix, OS/2, you name it. I can hardly remember them all. Worse, we often had to do a port simply because a vendor produced a new C++ compiler! So we’d have a version for Solaris on the SPARC with C++ version 4, and another for Solaris on the SPARC with C++ version 5, and so on. We did ports to hardware that never made it big, like the NeXT, and hardware that never even reached the market. (What, you don’t remember the Canon workstation? As Mark Sandeen, one of our best salespeople, points out: “We never should have spent the time to port to platforms with minuscule market share.”) And every so often our sales force would book a sale on a platform that we didn’t actually support. Quick guys, get to work! Our porting group pulled off miracles, but all this took up a lot of engineering talent.

Ed Schwalenberg reminds me that “another bane of our porting existence was the set of orthogonal choices to be made in compiling a library: threads vs. non-threaded, shared vs. static libraries, 32- vs. 64-bit instructions, exceptions vs. non-exceptions, etc. All of those were in addition to the choice of compilers.”

By the way, the first thing that would happen whenever we did an ObjectStore port is that we would discover bugs in the vendor’s C++ compiler. Every single time! As Ed Schwalenberg says: “We were the world’s C++ compiler quality assurance department for a decade.”

Dave Moon points out: “A lot of the early technical problems in ObjectStore were caused by our building on very immature products from other vendors. Since they weren’t open source, we could not work around problems, and had to wait for the vendors to fix them. This is inherent in working at the bleeding edge.”

Fun fact: In the early days of C++, the designers at Bell Labs came up with a specification for the first version of parameterized types. This was of great interest to us, since we wanted to support “a set of Transistors” so that we could query over such a set, and so on. At that time, there was only one C++ implementation, from Bell Labs, known as “cfront”, which translated C++ to C. The guys at Bell Labs apparently were not good enough compiler hackers to implement parameterized types in cfront. So we did it for them (I believe Sam Haradhvala did the work) and gave the code back to them and the world, in an early instance of de facto open source collaboration. We got a nice press release out of it. We were very much among the world’s C++ experts at the time.

We also kept finding operating system bugs. ObjectStore needed to be able to create a “cache” file, and map each page, page by page, into the appropriate virtual address, and control its access permissions, using the Unix “mmap” and “mprotect” and “munmap” system calls. Then the application program would attempt to read or write a no-access page, or write a read-only page, causing a SIGSEGV fault. Our SIGSEGV handler would then, analogously to a page fault handler, figure out what had occurred, and do whatever needed to be done: fetch the page from the server if necessary, map the page into address space if necessary, set the access permissions, wait for locks when necessary, and so on, finally resuming the program. This was supposed to work in Unix, but Ed Schwalenberg says: “Recovering from a SIGSEGV did not work in any of the first dozen or so platforms we tried it on: Sun’s SunOS, IBM’s AIX, HP’s HP-UX, Digital Unix, OS/2, and the analogous thing on Win16, Win32s, and Windows NT. Every last one of these required a conversation with the relevant kernel development team to get the operating system fixed. Win16 and Win32s didn’t even have the concept of user-mode interception of memory faults, so we had to write kernel-level device drivers to add that capability. Also, SIGSEGV handling did not work recursively: anything that had to work inside a SIGSEGV handler could not, itself, take a SIGSEGV (this is fixed in modern versions of Unix and Windows).”
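The control flow of that fault handler can be sketched as a toy model. This is a pure Python simulation, not ObjectStore code and not real mmap/mprotect: the class FaultingCache, the Access enum, and the dictionary standing in for the server are all hypothetical names made up for illustration. Each page starts with no access; a read or write that lacks permission "faults", and the handler fetches the page if needed and raises the permission before the access is retried.

```python
from enum import Enum


class Access(Enum):
    NONE = 0   # page not accessible: any touch faults
    READ = 1   # page readable: a write faults
    WRITE = 2  # page readable and writable


class FaultingCache:
    """Toy stand-in for a client cache driven by access faults."""

    def __init__(self, server_pages):
        self.server_pages = server_pages  # page number -> bytes held by the "server"
        self.cache = {}                   # page number -> bytearray in the local cache
        self.access = {}                  # page number -> current Access level
        self.faults = 0

    def _handle_fault(self, page, wanted):
        # Plays the role of the SIGSEGV handler: fetch the page if it is
        # not cached yet, then grant the permission the access needed.
        self.faults += 1
        if page not in self.cache:
            self.cache[page] = bytearray(self.server_pages[page])
        self.access[page] = wanted

    def read(self, page, offset):
        if self.access.get(page, Access.NONE) is Access.NONE:
            self._handle_fault(page, Access.READ)
        return self.cache[page][offset]

    def write(self, page, offset, value):
        if self.access.get(page, Access.NONE) is not Access.WRITE:
            self._handle_fault(page, Access.WRITE)
        self.cache[page][offset] = value


cache = FaultingCache({0: b"\x01\x02", 1: b"\x03\x04"})
first = cache.read(0, 1)   # faults: page 0 is fetched and made readable
second = cache.read(0, 0)  # no fault: page 0 is already readable
cache.write(0, 0, 9)       # faults: page 0 is upgraded to writable
```

In the real system the check is done by the hardware and costs nothing once the page is mapped; the explicit permission test here only exists because Python has no way to trap the access itself.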

Here’s a story of an operating system bug. Solaris writes out all modified pages every N seconds. The ObjectStore “cache” file could get pretty big, and had lots of modified pages, but there was no need to write them out to the disk, since the file was discarded after a crash anyway. We acquired a customer, Telstra in Australia, who needed real-time response: ObjectStore was invoked after a customer dialed a special phone number, to look up another phone number, and the phone switch had unforgiving time limits. Sun suggested that we put the cache into a special “tmpfs” file system. Files in “tmpfs” aren’t written out, because they’re known to be temporary. That made perfect sense. Unfortunately, we got rare and unrepeatable weird bugs, which finally turned out to be because the SIGSEGV/mmap/mprotect feature almost worked on “tmpfs” file systems, but not quite. We got around it somehow, but I can no longer remember how.

We found that Solaris was taking a very long time to execute mprotect system calls. It turns out that the architects of Solaris had apparently assumed that there would be very few mapped regions of memory. They had not anticipated our architecture, which mapped a huge number of pages independently. So they were using a simple linear search. Guy Hillyer wrote an improvement to Solaris, using skip lists to make the search run in O(log n) time. The hard part was the politics getting Sun to accept our changes to Solaris! We only did this for Solaris, which was then our primary platform. (Maybe it should be done for Linux?)
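To make the difference between the two lookups concrete, here is a small sketch. It is an assumption-laden illustration, not the Solaris code: the function names are hypothetical, and it uses binary search over a sorted list (Python's bisect module) rather than a skip list. Both give an O(log n) lookup, but they are different data structures; a skip list also supports cheap insertion and removal, which a sorted array does not.

```python
import bisect

PAGE = 4096


def find_region_linear(regions, addr):
    """O(n): walk every mapped region until one contains addr."""
    for start, end in regions:
        if start <= addr < end:
            return (start, end)
    return None


def find_region_sorted(starts, regions, addr):
    """O(log n): binary search on region start addresses.

    Assumes regions is sorted by start address and non-overlapping,
    and starts holds the start address of each region in the same order.
    """
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and regions[i][0] <= addr < regions[i][1]:
        return regions[i]
    return None


# 10,000 independently mapped single pages with an unmapped gap after each.
regions = [(n * 2 * PAGE, n * 2 * PAGE + PAGE) for n in range(10000)]
starts = [start for start, _ in regions]

hit = find_region_sorted(starts, regions, 5 * 2 * PAGE + 10)
miss = find_region_sorted(starts, regions, PAGE + 1)
```

With thousands of separately mapped pages, the linear version examines thousands of entries per call on average, while the sorted lookup needs about 14 comparisons for 10,000 regions.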

When the new Windows technology (which was OS/2 at the time; IBM and Microsoft were still working together on it) came out, it was crucial for us that it be able to support memory mapping. Dave Stryker and Tom Atwood flew out to meet with Bill Gates in September of 1989. Dave Stryker recalls: “We originally had a 45-minute appointment, but Gates extended the meeting to a couple of hours, and called in Dave Cutler [the architect of OS/2]. At Tom’s urging, we told Gates and Cutler everything they wanted to know about ObjectStore. Gates was complimentary of the Object Design approach, but said, in a nice enough way, that if the Microsoft Empire ever needed such a thing, they would build it themselves. Still, Gates told Cutler to make sure that the OS/2 equivalent to mmap was powerful enough to run ObjectStore, and there were some changes made to make it so.” Later, this OS/2 technology turned into Windows NT. (Dave Moon adds that it turned out to have a bug: it doesn’t free up disk space when it ought to. For some reason Microsoft hasn’t fixed this, even after many years. We found a way around it.)

Speaking of industry luminaries, we also met with Steve Jobs when he was at NeXT, and Jobs made a big announcement praising our technology, which resulted in a nice press release. There was some discussion that NeXT might buy Object Design, but that never went anywhere.

It turned out to be hard to support customers who wanted to use the same ObjectStore database from many different client architectures. We had to support what we called “heterogeneity”. First there was “architecture hetero”: some machines have big-endian numbers and some have little-endian numbers, and we’d have to convert, for example. Much worse was “compiler hetero”: different C++ compilers represented C++ objects differently in memory, due to run-time compiler “dope”, padding, and so on. Objects were not even the same size in different compilers, which was a huge problem. We had to know every last thing about how objects were laid out, where the compiler put padding, where the compiler put “dope” information such as “vtbl pointers” and various displacement offsets, etc. Our engineers came up with clever solutions to these problems, but it was hard and used up a lot of engineering talent. I think if we had realized that we’d run into this problem, originally, we might have never started the company at all, thinking the technical issues too daunting. It’s a good thing we didn’t think about it then!
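Both kinds of heterogeneity can be seen in miniature with Python's struct module. This is only an illustration of the underlying problem, not of how ObjectStore solved it; the helper read_u32 is a hypothetical name. The first part shows the same 32-bit integer stored in the two byte orders, and the second shows that the same two fields (a char followed by an int) occupy a different number of bytes depending on whether alignment padding is applied.

```python
import struct

# "Architecture hetero": the same 32-bit value in the two byte orders.
value = 0x01020304
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first


def read_u32(raw, stored_byte_order):
    """Decode 4 raw bytes, given the byte order they were written in."""
    return int.from_bytes(raw, stored_byte_order)


correct = read_u32(little, "little")  # reader knows the writer's byte order
garbled = read_u32(little, "big")     # reader wrongly assumes its own order

# "Compiler hetero" in miniature: a char followed by an int.
packed_size = struct.calcsize("=ci")  # standard sizes, no padding: 1 + 4
native_size = struct.calcsize("@ci")  # native alignment; padding is platform-dependent
```

On many common platforms native_size is 8 because three padding bytes follow the char, but that number depends on the machine and compiler, which is exactly the problem described above.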

The Virtual Memory Mapping Architecture

Was the page-mapping, virtual-memory mapping architecture worth it? Mark Sandeen says: “In competitive situations, against the other OODB companies, we sold on performance, performance, performance. Plus the fact that you got that performance by using an elegant architecture that was fundamentally different from anything our competitors had or ever would have. We used our incredible engineering team to win the benchmark wars, and then told our customers that the reason we won the benchmarks was the 2nd generation OODB architecture.” Sam Haradhvala says: “I still find the architecture almost as appealing as on day one of the company, and feel very lucky that we had a chance to see it realized in a product.”

It would have been easier to port had we not gone for transparent persistence, and the goal that dereferencing a pointer was done in one instruction, exactly as in a non-persistent program. None of our competitors did this; for C++, they used the “overloaded operator ->” approach, in which dereferencing a pointer did a software operation that usually consisted of going through an indirection in an object table. Our justification was that CAD people would never tolerate a slowdown in the time it took to redisplay a drawing. So once the pages were faulted in, C++ operations would run at full speed. This led to all kinds of pros and cons. Concurrency control was totally transparent and foolproof; on the other hand, it was at page granularity, causing unnecessary conflicts sometimes. We didn’t expect this to be a problem in the classic CAD scenario since we imagined designers would usually not be working on the very same drawing at the same time. But other scenarios did run into this sometimes. However, difficulty of porting was our own problem, not our customers’ problem, so they didn’t know or care.

Dave Stryker recalls some more reasons we stuck with our original idea of using the memory-mapping architecture. “First, our competitors had staked out the strategy of overloading C++ dereferences. Object Design came into existence after Versant (then Object Sciences) and Objectivity, and needed to be differentiated from the competition. Second, our approach was really clever, and won many ideological converts based on cleverness alone. We could usually count on the smartest guy in the room being an ally, because using faulting was such an impressive intellectual accomplishment. Third, we had really smart engineers who enabled us to undertake obligations, particularly porting obligations, that with more prudence we might have avoided. With engineers that talented, you need really disciplined, far-sighted top management, because in the short term it’s perfectly clear that engineering can work miracles. It’s only in the longer term that the cumulative miracles sap all the capacity of engineering.” (Mark Sandeen also remembers that we were the last of these three startups, whereas Gene Bonte says we all started at the same time and remembers a lot of details about it.)

Dave Stryker says: “As you say, for the largest early customers database meant concurrency, and at that point at least it was difficult to avoid concurrency conflicts among simultaneous users. In my memory, it seemed that there were often fires burning because customers had trouble getting ObjectStore to work concurrently. I know this got a lot better in the years after I left Object Design.” A main technical problem is that locks were at the granularity of pages, so sometimes ObjectStore thought that there was a concurrency conflict even though there really wasn’t, and that would hold up processing until the other transaction was finished. This is inherent in the virtual memory mapping architecture. Our competitors often pointed out this drawback.
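The false-conflict problem is easy to show with a toy lock manager. This is a hypothetical sketch (LockManager and its method are invented for illustration, and real lock managers also handle read locks, waiting, and release): two transactions write two different objects that happen to sit on the same 4096-byte page. With page-granularity locks the second transaction is blocked; with object-granularity locks it is not.

```python
PAGE_SIZE = 4096


class LockManager:
    """Grants exclusive write locks; lock_key decides the granularity."""

    def __init__(self, lock_key):
        self.lock_key = lock_key  # function: address -> thing that gets locked
        self.owners = {}          # locked thing -> owning transaction

    def try_write_lock(self, txn, address):
        key = self.lock_key(address)
        owner = self.owners.setdefault(key, txn)
        return owner == txn       # False means txn would have to wait


page_locks = LockManager(lambda address: address // PAGE_SIZE)
object_locks = LockManager(lambda address: address)

# Two distinct objects at addresses 100 and 200, both on page 0.
a_page = page_locks.try_write_lock("txn A", 100)
b_page = page_locks.try_write_lock("txn B", 200)   # conflict on page 0

a_object = object_locks.try_write_lock("txn A", 100)
b_object = object_locks.try_write_lock("txn B", 200)  # no conflict
```

Careful clustering reduces how often unrelated hot objects share a page, which is one reason tuning mattered so much.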

He goes on: “I’ve certainly wondered if the architectural choice of page faults and native-format on-disk objects was the right one. I was an enthusiastic booster of the page fault architecture, but it certainly made porting, multi-architecture access, schema evolution and so on much, much harder. [Ken Rugg says that Dave Moon has made huge improvements in schema evolution in the latest releases.] Certainly, the page fault / native object on disk architecture was instrumental in many of the CAD industry wins.” And, “The open-source industry makes me wonder what’s the future of software products like ObjectStore. At Multiverse where I work now, the large majority of libraries and development tools we use are open-source. The only things we buy right now are Microsoft licenses for Windows boxes and 3D modeling tools. The database is mySQL, and it’s going to be a fine solution for a fairly long time, because gaming isn’t hugely database intensive, even though the gaming objects would map naturally to an object database. In many product areas today, the best and/or most successful products are open source.” The whole concept of open source wasn’t around when we started in 1988. (Neither were Unix threads. Nor was Windows. ObjectStore was aimed at the class of computers then known as “workstations”, primarily the Sun-3.)

Sam Haradhvala says: “I have often wondered like Dave and others on this thread whether the use of page faults and native on-disk representation was the correct one. It seems that it was the right choice at that time and conferred some rather unique advantages. Given the current state of technology and the hot issues of today, the limited flexibility inherent in the approach might very well dictate a different set of choices.” But he also says: “I still find the architecture almost as appealing as on day one of the company, and feel very lucky that we had a chance to see it realized in a product.”

Ed Schwalenberg also points out that our architecture, by doing so many things transparently, avoided huge numbers of bugs, much as languages with automatic storage management (e.g. garbage collection) save you from bugs in storage allocation and deallocation.

Ken Rugg notes that we had always intended to do some kind of declarative mechanism to help support clustering and reclustering, since that’s so crucial for delivering ObjectStore’s performance advantages. That still hasn’t been done yet, and perhaps never will be as the importance of C++ continues to decline.

Fun story: Ed Schwalenberg reminds me of a truly vexing case we ran into with the virtual-memory mapped architecture. The program went into a mysterious infinite loop. Guy Hillyer figured out that it had a single machine instruction that had both source and destination operands in ObjectStore-managed persistent memory, in two different “versions” (when we were trying to support a very sophisticated database versioning feature). Fetching from the source, in one version, was making the destination, in a second version, out of reach. Retrying would fault in the destination, putting the source out of reach, and so the single instruction could never make progress.

Performance

The high performance that we designed ObjectStore for really did come out as we expected it to. If your data had good spatial and temporal locality, and especially if concurrent access was relatively rare, it was extremely fast.

However, it turned out that it was not so easy to anticipate the performance that would result from using it in certain ways. Sometimes customers would come to us literally a week before they wanted to deploy their product. They had just tried running it under heavy load, or with multiple users, for the very first time (yes, a week before they planned to deploy!), and all of a sudden ObjectStore was becoming a bottleneck. We had some amazingly competent consultants, who could fly in and fix these problems for the customers very quickly, but not before there was some anger from the customers. Mark Sandeen goes so far as to say that few of our customers were able to build a deployable application without help from our consultants, which limits the scalability of the business model.

Charles Lamb points out: “I think this happens in any database company.” Indeed, there is a whole industry of Oracle experts; we have engaged several at my current company. Ed Schwalenberg says: “ObjectStore made it easy — too easy — for any C++ programmer to write a “database application”, while being ignorant of concepts like lock contention, database hot spots, etc. It was folks like that, who never tested more than one user until a week before launch, who sometimes gave us a bad name.” Everybody out there, take heed: do testing under serious performance load way, way before you’re going to release your product!

Sam Haradhvala, who has had extensive real-world experience with relational databases in the last few years, remembers: “ObjectStore was characterized as being like a Ferrari, which if tuned right by the experts could be made to run like one. Tuning an application, almost as an afterthought, is a common practice even in the relational database world. ObjectStore did make it easy for people to write database applications, without worrying about lock contentions, database hot spots, etc, but so do SQL and PL/SQL. So what was it about ObjectStore that made it a harder problem? If it had been possible in ObjectStore to use object level locks the way relational programs use row level locks, it would probably not have been as much of an issue, but this is one of those areas where the architecture puts you at a disadvantage.”

There were many competitive benchmarks. ComputerVision wrote an early one, aimed at determining OODBMS performance for CAD systems, and we spent a lot of time winning this. The one that took the most effort was the OO7 benchmark, described in my previous posting. We spent a huge amount of time improving our performance on OO7. From the engineering point of view, this was very helpful. The OO7 crew at Wisconsin found many interesting performance problems that we didn’t even know about, many of which were easy to fix. I particularly remember how much benefit we got from setting the TCP_NODELAY flag. Meanwhile, the sales forces of every OODBMS company were using OO7 as a sales tool, each claiming to have gotten the best results! OO7 wasn’t really designed to compare competing products, but rather to act as an X-ray to analyze the systems and illustrate how they worked, and the researchers were rather unhappy to see it used in sales situations. Meanwhile tension developed as the benchmark was revised in order to make it a better X-ray. The problem was that each revision favored some vendors and disfavored others. Sadly, Ken Marshall decided that the OO7 team was intentionally trying to make Object Design look bad (because one of the researchers was on the technical advisory board of one of our competitors), and Object Design pulled out of the benchmark, invoking the clause in our license saying that customers could not distribute benchmark results. As you can imagine, the Wisconsin team was pretty upset about this. Charlie Lamb and I eventually published our own OO7 numbers, with complete instructions for anyone about the exact procedure that we had used, so that they could duplicate it. In my opinion, we did the best overall, though not on every test, but it was never official because of Object Design’s having withdrawn from the study.

Gene Bonte says: “I remember Ken Marshall [the CEO] telling me that in his days at Oracle (which he left to join us), 80-90% of the significant sales depended on benchmarks. For a new market like ours, this was the same or higher. Our salespeople and pre-sales engineers spent a lot of time trying to get customer benchmarks written so that it would favor our VMMA approach. Our competitors did the same for their approaches. Given there were almost no concurrent user engineering applications in existence, this was always a weak to non-existent part of the benchmark and we were always strong in these situations. Thus we won most of the benchmark wars.”

An important thing that we never got around to implementing was putting more data processing on the server side. In formulating the architecture, I was heavily influenced by work at Xerox PARC on database systems in which the server just stored pages of data without interpreting them. This matched ObjectStore’s needs very well; the server side wasn’t where we knew the C++ data layouts and database schema. But sometimes this meant that you had to read a lot of data into the client side in order to search for small amounts of data in the database. We had originally hoped that this would not be a problem on the grounds that local area networks are awfully fast. That was a good answer for many cases, but not all. I have only recently (in my present job) worked with sophisticated Oracle experts who have shown me more about how to improve performance by processing (in PL/SQL, in their case) on the server side; I didn’t appreciate that well enough back when we designed ObjectStore.

Dave Stryker points out: “One thing that has made it harder for OODBMS’s is ever-growing memory and CPU power in PCs. ObjectStore database sizes were typically just a few gigabytes or less. Our original Sun-3 workstations had 8 Mbytes of RAM, I believe, and if you’re going to search a couple of gigabytes on an 8 Mbyte machine, you’re going to need a database system with indexes. In contrast, today even my laptop has 2 Gbytes of memory, and lots of workstations have 8 Gbytes or more. It’s completely practical and common to slurp up a couple of gigs of information into memory and search it in memory on a machine like that. So the ‘object database cache’ of the past gets done now, most of the time, using in-memory data structures. Even when a database is the right answer, the extra overhead of translating from an on-disk representation to an object representation happens 100 times faster on today’s CPUs than on the 50 MHz CPUs of 1990. So the performance advantages of not translating are much smaller.”

Looking into the future, Dave Moon says: “The illusion of random access memory is becoming increasingly unconvincing on modern hardware. Although dereferencing a pointer takes only one instruction, when the target of the pointer is not cached in the CPU that instruction can take as long to execute as 1000 ordinary instructions executed at peak speed. It’s not clear that other approaches to database navigation are able to execute at peak speed, i.e. with no cache misses and no delays due to resource conflicts within the CPU, but if they were able to execute that fast, they would be able to expend hundreds of instructions to do what pointer dereferencing does and still come out equally fast, in the random access case where the target is not cached. Thus, the advantage of ObjectStore’s architecture is being eroded by hardware evolution. But at the same time, the advantage of C++ and other conventional programming languages is being eroded in the same way. It is not unreasonable to predict that we will see widespread abandonment of the illusion of random access memory in the next two decades. The IBM Cell processor used in video games is the first crack in the dam.”

Standards

Many customers wanted an industry standard, to avoid vendor lock-in. There was never a real standard for OODBMS’s. There was an attempted standardization effort called ODMG. Unfortunately, it was run by the vendors, not by the customers. So every vendor tried to adjust the standard to benefit his own technical approach and make life hard for the other companies’ technical approaches. It was really not done in good faith, and we were just as bad as anyone else, perhaps even worse. Unfortunately, there wasn’t any other OODBMS that worked the way ours did, so our customers really did have a vendor lock-in problem, which we never succeeded in addressing.

Ken Rugg points out that there wasn’t even a common understanding of what an object database even is! “If you looked under the covers, the actual persistence mechanisms behind Versant and ObjectStore, let alone something like Cache, are very different. Also, these differences are much more visible to the user than differences in the engines of RDBMS products.”

Complexity

Several key customers wanted support for versioning, e.g. so a CAD system could easily keep track of earlier versions of a design. But our highly sophisticated versioning system involved such complex semantics and such a complicated implementation that it made the whole ObjectStore client side mind-bogglingly complex. I remember Dave Andre and I reporting to Dave Stryker that it was almost working, but it made the product unmaintainable! We eventually had to rip it out. It was a huge waste of engineering resources and a good lesson in the virtues of simplicity, one of the hardest and most important lessons to learn in all of software engineering.

Java, PSE Pro for Java, and Smalltalk

(Thanks to Sam Haradhvala for help with this section.)

When Java came to prominence, we had to figure out how to turn ObjectStore into a Java OODBMS. Again, we went for transparency: persistent Java objects. You program with them just the way you regularly program in Java, except that you put in transaction boundaries and so on. Objects are persistent if they are reachable from any object designated as a persistent root object.

To do this, we used a novel trick: we took the Java class files, and added new JVM instructions before a read or write, to check whether the object being accessed had been read in. If not, we’d read it in on demand. As Sam Haradhvala points out, this can be thought of as a two-level faulting architecture. It used object-level faulting to fault in the contents of individual Java objects, while using VMMA to fault in the underlying C++ object representation, implement scalable collections, etc. This architecture could have provided the underpinnings for object-granularity locking and increased flexibility in other areas.
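The object-level half of that scheme can be sketched as a “check before access” barrier. This is a Python analogy with hypothetical names (Store, Account, _ensure_loaded), not the actual class-file postprocessor or its generated bytecode: the object starts out hollow, and the injected check loads its contents from the store the first time a field is touched.

```python
class Store:
    """Stand-in for the database: object id -> field values."""

    def __init__(self, records):
        self.records = records
        self.fetches = 0

    def fetch(self, oid):
        self.fetches += 1
        return dict(self.records[oid])


class Account:
    """A persistent object that starts hollow and is filled in on demand."""

    def __init__(self, store, oid):
        self._store = store
        self._oid = oid
        self._loaded = False
        self._balance = None

    def _ensure_loaded(self):
        # The kind of check a postprocessor would insert before each
        # field read or write in the original class.
        if not self._loaded:
            self._balance = self._store.fetch(self._oid)["balance"]
            self._loaded = True

    def get_balance(self):
        self._ensure_loaded()
        return self._balance


store = Store({1: {"balance": 50}})
account = Account(store, 1)
fetches_before = store.fetches   # creating the hollow object reads nothing
balance = account.get_balance()  # first access triggers the fetch
balance_again = account.get_balance()  # already loaded: no second fetch
```

Because the check runs per object rather than per page, this is the level at which object-granularity locking could have been hooked in.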

The PSE Pro for Java product had its own specialized, lightweight, small-footprint storage engine, which used just object-level faulting. It provided atomicity and durability: committed changes happened all-or-nothing, even in the face of system crashes. However, it did not support concurrent access between separate Java processes. It was targeted at an entirely different market segment than the ObjectStore Java product, but had the same API, so that you could, e.g., use it as scaffolding.

There was even an ObjectStore Smalltalk product which used the VMMA architecture, with special hooks built into the Smalltalk ParcPlace VM, so that it could co-exist with the Smalltalk GC. This was built by a team of very smart people on the West Coast. Unfortunately, they didn’t communicate tightly with the key developers on the East Coast, and so they didn’t fit into the architecture properly. The code became too hard to maintain, and the demand for Smalltalk turned out to be a fad in those particular years, so we discarded this.

Object-Relational Mapping

Jack Orenstein was very interested in object-relational mapping, which he describes as “my quixotic mission at Object Design”. “The idea was to bring relational database features to ObjectStore: collections, queries over them, and mappings to and from the relational model. A relational interface to ObjectStore would have expanded the pool of ObjectStore users, and opened up the product to off-the-shelf relational tools, e.g. Crystal Reports. The opposite direction (ObjectStore programming model on top of a relational database) would have opened up the ObjectStore API to other kinds of databases. These projects were of interest to a small number of customers (e.g. USWest, Credit Suisse), but for various reasons, some due to internal company politics, they were never internally funded and supported to the point where we came out with a product.” That was probably a mistake, perhaps a big one.

Objects can be stored in RDBMS’s using object-relational mapping tools. Relational databases have become so successful in exploiting hardware, and are such a ubiquitous component of the computation infrastructure, that a vast number of applications map their objects to relational tables. Hibernate is a very popular system for doing this in Java. You use Java annotations and XML configuration files to specify the mapping, which can be pretty sophisticated. Hibernate is clever at generating efficient SQL. It’s widely used and well-documented. A big advantage of mapping tools is that they let you share data with other, relation-oriented applications. On the other hand, this approach is not appropriate for the kind of CAD-like applications at which ObjectStore is aimed. Sun’s Entity Enterprise Java Beans (particularly the EJB 3.0 standard) is another mapping tool. See here for a paper by Mick Jordan about other Java approaches to orthogonal persistence.

Benson Margulies says: “The idea of persistent storage of an object data model, is, in fact, ever-more-common … in the form of object-relational middleware. Relational databases have become so successful in exploiting hardware, and are such a ubiquitous feature of the computation infrastructure, that a vast number of applications map their objects to relational tables and go home for a nice lunch. The trio of ObjectStore, Object Design, and the OODBMS concept can claim much credit for this. We wouldn’t have Hibernate, not to mention 15 incomprehensible Java standard initialisms, if not for what we did. And we had to do it. Ironically, if we had set out to build the object-relational product, I think that we would have failed. It couldn’t have been fast enough. We identified and exploited a gap, and we had a relatively successful run in that gap.”

Query Optimization

This entire section is by Jack Orenstein, regarding the “Third Generation Database System Manifesto” claim that query optimizers can always do better than a programmer can do by hand.

The relational side of this debate relies on an assumption that applications navigate to data of interest, and then, after the query, process that data. (Or, in a few cases, process the data inside the query, e.g. simple arithmetic, simple forms of aggregation, simple updates.) But in many applications, that separation of navigation and processing is impossible or not feasible.

I’ve implemented polygon overlay, which I think is typical of such applications. In polygon overlay, you need to traverse linked lists of vertices and edges making up polygons. You don’t navigate to an edge and then retrieve some of its data for later processing (after the database query). Instead, the navigation and processing of the data are tightly intertwined. Yes, with enough work you might be able to separate the implementation into navigation and processing parts, express the navigation part in a query language, and then have the query optimizer generate an execution plan better than the one implicit in your original code. An approach like this would obviously be completely alien to developers.

But if you really did write your application this way, separating navigation from processing, then the optimizer could, in principle, come up with an execution plan that reduces the number of disk reads compared to your original implementation.

But only if data is clustered in a predictable way. A relational optimizer uses a cost model to estimate the number of page accesses required to implement a query using a candidate execution plan. That cost model makes assumptions about how data is organized on disk, and uses some observations of actual data (e.g. key frequency distributions). If ObjectStore data were clustered as in a relational database, then the relational argument might have some merit. The optimizer would take estimates of page reads into account, something the low-level, data structure navigating C++ code is obviously not doing. But if the ObjectStore data is clustered intelligently, then that argument falls apart. In other words, a programmer can easily beat an optimizer if the programmer is also responsible for clustering the data. (The tools for clustering data in relational systems are extremely limited.)
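The effect of clustering on page reads can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a "page" holds four objects, every page is read at most once, and the two placement schemes and the pages_touched helper are invented for the example. It does not model ObjectStore's storage layer or any real optimizer's cost model.

```python
PAGE_SIZE = 4  # objects per page; deliberately tiny for the example


def pages_touched(traversal_order, placement):
    """Count the distinct pages read while visiting objects in order.

    placement maps an object id to its slot number on disk;
    slot // PAGE_SIZE is the page that holds it.
    """
    return len({placement[obj] // PAGE_SIZE for obj in traversal_order})


polygons = range(4)
vertices = range(4)

# Layout A: the programmer clusters each polygon's vertices together.
clustered_by_polygon = {
    (p, v): p * len(vertices) + v for p in polygons for v in vertices
}

# Layout B: vertices are placed round-robin, so each polygon is scattered.
interleaved = {
    (p, v): v * len(polygons) + p for p in polygons for v in vertices
}

# The application walks the vertex list of polygon 0.
walk = [(0, v) for v in vertices]

reads_clustered = pages_touched(walk, clustered_by_polygon)
reads_interleaved = pages_touched(walk, interleaved)
```

With these toy layouts the same traversal touches one page when the polygon is clustered and four when it is scattered. The programmer who controls placement gets the low page count without any optimizer involved.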

Multiple Applications

ObjectStore was organized around providing persistence for a particular application. However, Ken Rugg points out that even in non-traditional market areas, some customers needed a DBMS that could be shared across multiple applications with different access patterns. In such cases, it was hard to optimize one of them without hurting the other, since much of the performance depends on the way the data is clustered, and it can’t be clustered two different ways at the same time in the same database.

Ken says: “One area that we are working on is how to synchronize data in ObjectStore with relational data so you can ‘have your cake and eat it too’. I think having multiple special purpose stores that are optimized for each consumer and synchronized and consistent with each other (assuming you can manage them all in a reasonable way) is better than a single ‘least common denominator’ store that is shared by all the applications in an enterprise. Of course, doing this synchronization isn’t an easy problem.”

Business Problems at Object Design

From time to time, I, and others, would lobby management to provide post-sales technical support, to help the customers learn how to best use ObjectStore. The pre-sales engineers tried to do this when they could, but they were usually too busy doing their pre-sales job. Periodically, one management regime or another would agree, and set up post-sales technical support. Life was good. But not for long, because management would see how valuable the customers thought post-sales technical support was, and they’d get the bright idea that we should charge for it and make it a profit center, making these guys into more consultants. (We always had consultants who could be hired.) Well, that was a big mistake. Lots of customers can’t pay for consultants. In some corporate cultures, for you to hire a consultant from the vendor tacitly implies that you are incompetent. What Object Design needed was successful customers to use as reference accounts when we tried to sell to new customers. Post-sales technical support was a long-term investment. But management would often lose sight of this and go for the short-term profit.

Mark Sandeen says: “The fact that we needed this level of technical support resulted in an interesting situation. Every now and then we’d hire the best and the brightest engineers from our customers, leaving our customers without the talent to architect their systems appropriately.” He and I can remember at least five of these, including several of our most awesome.

Our sales force faced obstacles. One of our sales reps, Ben Bassi, told me that the moment he walked in the door and said that he was there to sell a “database”, many customers would say “We already have Oracle: go away”, without giving us a chance to explain what we were about. (But Mark Sandeen says: “I never had that happen to me personally. And I trained all the staff that worked for me to never go anywhere near a prospect that was using Oracle (or RDBMS’s in general). In the early days we followed leads from folks who had purchased C++ compilers and tools, and after we had some wins in GIS, network management, etc., we would target those folks directly. We’d sell high performance, concurrent, persistence solutions to application developers.”)

We even thought of not calling it a database system at all: maybe it’s an “application data management” product, or something. Unfortunately our marketing department never really solved this problem. Our early salespeople were great. Later management regimes felt that you didn’t really need salespeople who understood the product; they were too hard to find and cost too much. Wrong. Some of the best salespeople left when that policy started to take over.

If you took any of our CEO’s and locked him in a room with the product, he’d not have the faintest idea how to use it. It was a technical product aimed at programmers. Our first CEO, Ken Marshall, was very good at delegating, and his own lack of technical background wasn’t much of a problem. But after he left, the next CEO considered himself much more technically competent than he really was, and he made a lot of bad decisions, and he hadn’t really wanted to be CEO anyway, and he was only interested in wild ideas that would make the company grow super-fast, but those ideas never worked. The third CEO, acquired from a merger, was a good guy but, in my opinion, totally unfamiliar with how to run a software product company, and he pretty much ignored the advice of the technical people (particularly Ken Rugg, who was CTO and VP Engineering) even though he originally solicited it. That was when I finally threw in the towel. Fortunately, Progress Software bought the company, and the original ObjectStore part was put under a new general manager who was apparently quite good. So life is good again over there, and they’ve actually hired back a lot of very talented people who had left the company earlier!

Here’s a real life example of why it’s so hard to escape Oracle and embrace ObjectStore. I currently work at ITA Software, Inc., where we are building a new airline reservation system. We’re using Oracle RAC for the database system. Our rules say that all persistent mutable information must be stored in Oracle. Why? Because we are using Oracle Dataguard to copy data to our disaster recovery site(s), and to copy all online data to an archive, and our operations department wants data for disaster recovery handled uniformly across the system. We might use ObjectStore as a cache, but the place where we’d probably benefit most from a cache is a big module that’s written in Common Lisp, and there isn’t a good interface from Common Lisp to ObjectStore. It’s often for reasons like this that it’s hard for ObjectStore to get a foothold. However, there’s another product being developed at ITA for which ObjectStore, using its Java interface, looks like it might be a great fit.

Ken Rugg notes that the company took a big hit when the bubble burst in 2000. Object Design primarily sold to high-tech companies, since the users of the product were very technical and leading-edge. In particular, one of the major markets for ObjectStore was telecommunications companies, who were particularly hard-hit in that period. This contributed to a decline in revenues and eventual acquisition.

Caveats and Thanks

Everything here is my own personal opinion, and should not be taken as a statement by Object Design or Progress Software!

Much of this is in the past tense because I’ve been gone so long, and because things have changed, but ObjectStore is still alive.

Thanks to all the contributors named above, particularly Benson Margulies, whose highly cogent criticism compelled me to substantially reorganize the whole essay. I have made small edits to the contributions. Of course, I take responsibility for all errors.

Object-Oriented Database Management Systems Succeeded

Monday, December 31st, 2007

Object-oriented database management systems (OODBMS’s) have been harshly criticized, especially by Prof. Michael Stonebraker, who has called them a “failure”. As a co-founder of what was the leading OODBMS company, Object Design, I take issue with this judgment. As I see it, we did what we set out to do and had a lot of success. As you’ll see, though, you have to distinguish between the hype, and what the product was really about.

There are many OODBMS products, not all alike. Here I focus almost entirely on Object Design’s product, ObjectStore, since that’s the one I know about. Some of what I say applies to other OODBMS products and companies, and some does not.

This essay includes very substantial contributions by my colleagues, which I have tried to organize into a cogent whole. Contributors, in alphabetical order:

Gene Bonte: Co-founder, CFO
Sam Haradhvala: Co-founder
Charles Lamb: Co-founder
Benson Margulies: Senior engineer and head of porting (after Ed)
Dave Moon: Senior engineer
Jack Orenstein: Co-founder
Mark Sandeen: Senior salesperson
Ed Schwalenberg: Senior engineer and head of porting
Dave Stryker: Co-founder, VP of Engineering

What Is An OODBMS?

In the late 1980’s, the implementors of CAD (computer-aided design, both electrical and mechanical) and CASE (computer-aided software engineering) wanted database management systems, but found that relational database systems (RDBMS’s) did not serve their needs. RDBMS’s had been developed for business data processing. They were sold by Oracle, Informix, IBM, and Sybase (and later Microsoft). They had become a big business with a big market. The CAD and CASE practitioners published papers and had conferences explaining why they needed a whole new approach to data management, based on object-oriented technology.

Several startup companies were formed around 1988 to meet these needs. The first object-oriented languages that the CAD and CASE community could use had just emerged and started to gain popularity, particularly C++. That made the time ripe to build and sell commercial OODBMS’s.

At Symbolics, I had led a project that built an OODBMS for Lisp, which we intended to use in many applications such as email, program development tools, and so on. Statice 1.0 was released in 1988. However, it only ran on Symbolics hardware. The team wanted to port it to run on conventional hardware (still in Lisp), but Symbolics wasn’t interested. Meanwhile Dave Stryker, who had been VP of Engineering at Symbolics, had left, and joined entrepreneur Tom Atwood, who was working on starting a new OODBMS company. Charlie Lamb, Sam Haradhvala, and I resigned, and with Jack Orenstein of CCA, and Gene Bonte, joined them to found Object Design.

Object Design built an OODBMS called ObjectStore, which was released in 1989. ObjectStore focused on persistence of programming language objects. It would be easy to learn. It would not make you learn a new language or reorganize all your data. You could write your program in the way you were familiar with: as an ordinary C++ program. All you had to do was change some “new” statements to add a parameter saying that you wanted the object to be persistent (and what database or cluster to put it in), and add transaction boundaries, and voila, your program had persistence. Rather than being oriented around SQL-style queries, your program could navigate from object to object just the way any C++ program does: by following pointers. We also added collections (sets, lists, etc.) and a simple query language for them, along with indexes and a simple query optimizer. ObjectStore made applications fast because they could do direct navigation, and operate on the data without having to go through things like network connections, API’s, JDBC, and so on. Once a page had been used in a transaction for the first time, navigation took only one instruction.

Gene Bonte says: “Our ‘persistent language’ model was good because we focused on C++ developers. This was a new phenomenon and a technical guru was often a key decision maker in the initial buy decision. We bought marketing lists from all the C++ publications and targeted developers in our seminar programs. We were very much on the bleeding edge. I did the initial Object Design marketing plan and we sized our target markets in CAD, GIS, etc. in the $100’s of millions. Nobody had products for this market.”

We disputed the central dogma of relational databases: that all data should be in tables. The underlying assumption behind the relational model is that many applications come and go, but data is forever, and you have to organize the data as if you don’t have any idea what application might show up someday. This is sometimes called the principle of “data independence”. ObjectStore was much more organized around providing persistence for a particular application, so little of the dogma of relations had any relevance. The engineering customers did not want tables: this was clear from their published papers, and even The Economist endorsed us in this regard, publishing a story containing the metaphor of storing the design of an airplane as an alphabetical list of parts.

ObjectStore’s architecture is described in our CACM paper: Charles Lamb, Gordon Landis, Jack Orenstein, and Daniel Weinreb, “The Object Store Database System,” Communications of the ACM, vol. 34, No. 10, Oct. 1991, pp. 50-63. For more background on object databases, see the Wikipedia article. This discusses all object databases, some of which were quite different from ObjectStore. The article is generally excellent, including the section on “Advantages and Disadvantages”. (The one thing I disagree with is the business about their lacking “a formal mathematical foundation”. Actual RDBMS products are very, very far from the mathematical foundation of relational theory. None of our customers ever complained about a lack of a mathematical foundation. It’s just not an issue.)

For a technical look at the kind of scenarios that ObjectStore was designed to handle, see the OO7 benchmark from the University of Wisconsin. The benchmark is intended to model typical CAD/CAM/CASE applications and contains several hierarchical structures and 1-1, 1-many and many-many relationships between objects. The benchmark can be configured in a variety of ways and comes with a set of standard configurations. OO7 defines a number of different traversal, query and update operations. (Carey, M., DeWitt, D. and Naughton, J. The OO7 Benchmark. Proceedings of ACM SIGMOD Int. Conf. on Management of Data, pp. 12-21, Washington DC, 1993. Also Carey, M., DeWitt, D., Kant, C. and Naughton, J. A Status Report on the OO7 Benchmarking Effort. Proceedings of ACM OOPSLA ’94, pp. 414-426, Portland, OR, October 1994.)

Gene Bonte says: “We went after engineering applications and we found an interested audience. These customers were doing C++ for the first time and in general did not know how to do real OO development. Further, most had no experience with databases as they never worked for their applications. Early on, I remember we found that many of our early customers had tried Oracle or some other RDBMS at the insistence of management and it did not work. This gave me confidence that this was our real market where we had a competitive advantage.”

The Hype

Tom Atwood, the original founder and chairman of Object Design, as well as the rest of top management, often made grandiose claims for OODBMS’s. They said that OODBMS’s were the “next generation” after RDBMS’s, and would take over their whole market. Jack Orenstein remembers that Tom Atwood had a slide called “three waves”. The first wave was ISAM, the second wave was the hierarchical, network, and relational data models, and the third wave was object-oriented. Mark Sandeen also remembers our claim that “there were two or three orders of magnitude more unstructured data than structured (rows and columns) data in the world, and ObjectStore would be the preferred way that data was stored.” Object Design’s second management team said in the mid-1990’s that ObjectStore was “the database of the Internet” or “the database of the World Wide Web”, hopping aboard the new bandwagon. You can see that latter hype in the initial public offering document. This kind of hype was used to attract investors and salespeople to Object Design.

The founding engineers never believed this. Our target market was not the relational database business-data processing users, but rather users who didn’t have any database solution that would work for them. We did our best to ignore the hype, and get on with the software development and customer support.

Prof. Stonebraker’s criticisms

The harshest criticisms of OODBMS’s have come from Professor Michael Stonebraker, one of the most renowned figures in the database world. He was an early proponent of RDBMS’s and the inventor of a prominent RDBMS called Ingres, developed at U.C. Berkeley. He formed a company to commercialize it in 1982 (it has fallen from prominence and was open-sourced in 2004). He later returned to academia and started the Postgres project, which supported extensible types and was released as open source. He formed a company in the early 1990’s, Illustra, to commercialize it. More recently he has become a professor at M.I.T., and has formed more companies to produce novel database technology.

Criticisms from so important a leader, in both the academic and commercial spheres, were widely reported:

“OO systems have not focused on bread-and-butter traditional business-data processing applications where high performance, reliability, and scalability are crucial. This is a large market where relational systems excel and have enjoyed wide adoption.”

“Companies are justifiably loath to scrap such systems for a different technology, unless it offers a compelling business advantage, which has rarely been demonstrated by object-oriented systems. As such, relational systems and their object-relational descendants continue to be the market leaders.”

“A much bigger problem is that the vendors behind ODMG represent zero billion [dollars] in revenue while the vendors behind SQL . . . represent several billion in revenue. Hence, it is not a standard with critical mass in the marketplace.”

“Relational vendors realized that objects are important and added them, producing object-relational systems. However, the failure of OODBMS vendors to realize the importance of SQL and the needs of business-data processing has hurt them immensely.”

“ODBMSs occupy a small niche market that has no broad appeal. The technology is in semi-rigor mortis, and ORDBMS’s [object-relational DBMS's] will corner the market within five years.”

As you can see, these comments are directed towards the hype rather than the reality. They all assume that there is one “database market”, namely the “traditional business-data processing applications”. So while many of them are essentially correct, taken as they were meant, from my point of view they are entirely beside the point.

As he says, the fate of ObjectStore had nothing to do with the object-oriented features of the object-relational hybrids, in which “object-oriented” meant something almost completely different, and which were aimed at the traditional business data processing market. Those systems were never appropriate for the applications that ObjectStore was designed for. Similarly, the object-oriented features added to the relational database systems (each quite different from the next and violating the whole relational “mathematical foundation”) had nothing to do with what ObjectStore was about.

Mark Sandeen says: “We never tried selling to the traditional data processing organizations. Our sales training specifically forbade our sales teams from pursuing traditional MIS applications. You can’t be unsuccessful at something you haven’t tried to do.”

Prof. Stonebraker has also asserted that OODBMS’s do not support queries. That’s a pretty strong statement, and as written it’s incorrect. I think that what he meant is that they don’t support anything like SQL, with fancy query optimizers. Now, the OQL (Object Query Language) in the ODMG standard is every bit as much of a query language as SQL, and some of the OODBMS’s really implemented it well (O2 did, if I remember correctly). ObjectStore did not have a sophisticated query language and optimizer, but we certainly did have queries, indexes, and a simple query optimizer: this is what our target market needed.

Prof. Stonebraker was one of the authors of the “Third Generation Database System Manifesto” (1990), which spends a lot of time attacking OODBs. (As you can see from the URL itself, this was really the “object-relational manifesto”.)

First, it says that the navigational (pointer-following) orientation of OODB’s is a “step backward” to CODASYL (network) databases, which have been discredited. They say: “First, when the programmer navigates to desired data in this fashion, he is replacing the function of the query optimizer by hand-coded lower level calls. It has been clearly demonstrated by history that a well-written, well-tuned, optimizer can almost always do better than a programmer can do by hand. Hence, the programmer will produce a program which has inferior performance.” This is incorrect in the context of the way ObjectStore is actually used. Navigation takes one single machine instruction: how is your query optimizer going to beat that? C++ programmers know how to write fast C++ code and ObjectStore is basically persistent C++. (There’s a Java version now, but I’m referring to the original concept.) (More about this in the later essay about ObjectStore technology.)

Next, the Manifesto says that schema evolution is a “killer”, on the grounds that when you change indexing or clustering, you have to modify the program. This is also incorrect. When doing queries, ObjectStore did have an optimizer that dynamically adjusted to the availability of indexes. Clustering was entirely transparent to the application. Finally, most such operations were done by navigation anyway.
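The idea that a query can adapt to whichever indexes exist, with no change to the calling program, can be sketched in a few lines of Python. This is an illustration only: the Collection class, its method names, and the plan labels are invented here and are not ObjectStore's actual collection or query API.

```python
class Collection:
    """A toy collection whose query method uses an index when one exists."""

    def __init__(self, items):
        self.items = list(items)
        self.indexes = {}  # field name -> {value: [matching items]}

    def add_index(self, field):
        index = {}
        for item in self.items:
            index.setdefault(item[field], []).append(item)
        self.indexes[field] = index

    def drop_index(self, field):
        self.indexes.pop(field, None)

    def query(self, field, value):
        """Return (plan, results); the caller's code never names an index."""
        if field in self.indexes:
            return "index lookup", list(self.indexes[field].get(value, []))
        return "full scan", [item for item in self.items if item[field] == value]


parts = Collection(
    [
        {"name": "bolt", "layer": 1},
        {"name": "nut", "layer": 1},
        {"name": "bracket", "layer": 2},
    ]
)

plan_before, found_before = parts.query("layer", 1)
parts.add_index("layer")
plan_after, found_after = parts.query("layer", 1)
```

The call `parts.query("layer", 1)` is identical before and after the index is added; only the plan chosen at run time differs. That is the sense in which adding or dropping an index need not force a program change.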

Third, they say that although many programmers want to do navigation, they are wrong and “simply require education”, comparing them to “programmers who resisted the move from assembly language to higher level programming languages”. Their point is that optimizers can do better than navigation, just as compilers can do better than hand-written assembly language. I don’t know if they thought this kind of condescending attitude was likely to win them converts!

Jack Orenstein adds: “I think that the additions made to SQL to support OLAP applications (basically expanding what can be done inside “groups”) are an admission that this original argument about re-education was wrong. If early-1990s SQL was good enough for re-educated application developers, then the later extensions would not have been necessary.”

Later on, they make the usual argument that OODBMS clustering is not an advantage since RDBMS’s can theoretically do all kinds of clustering within the relational data model. While that’s theoretically true, Oracle’s ability to actually do this is quite limited, even after many years of Oracle development. I have often heard RDBMS enthusiasts talk about the hypothetical abilities of ideal relational database systems, without honestly admitting how far those ideals are from what you can really buy. ObjectStore provided tremendous control over clustering, which was crucial to its ability to provide high performance.

Stonebraker’s more recent criticism is more modulated. In his database column of Sept 2007, he says: “OODBs failed for other reasons than the inclusion of OO technology in RDBMS. First, OODB’s were designed and built for the engineering database market. The technology’s main focus was on persistence of programming language objects and not on business data processing features such as a strong transaction system and SQL support. OODB vendors were unsuccessful in selling to this market for a variety of reasons — reasons that are too lengthy to go into here. However, the main reason for their lack of market success was their inability to construct a value proposition with sufficient return on investment for the engineering customer. The demise of OODB has little to do with the inclusion of OO features in RDBMS, an effort that my Postgres system was in the forefront of.”

As you can see, he’s now changed his story substantially. He’s a lot closer now, except for the part starting “lack of market success”, as I’ll show below.

In an interview in ACM Queue Vol 5, no. 4, May/June 2007, Prof. Stonebraker basically says that the CAD guys didn’t have enough pain to consider switching, and they had “mountains of proprietary code” that was already fast enough. “They failed because the primary market they were going after didn’t want them.”

Again, this is a lot closer to what really happened, but ObjectStore only “failed” at this very narrow goal.

Technically, he’s incorrect about transactions: ObjectStore does totally bona fide ACID transactions. He’s also incorrect about high performance: ObjectStore performed far better than RDBMS’s or ORDBMS’s in ObjectStore’s target markets. And he’s also incorrect about reliability: plenty of products were based on ObjectStore and were quite reliable. And our users did not especially want SQL. As for “scalability”, that can mean a lot of things, but with no specific claim and no data whatsoever, it’s not a very convincing criticism. ObjectStore can handle databases that are quite large.

But what about his claim that we were “unsuccessful in selling to this market”? Did we really have a “lack of market success”? Did we “fail”?

ObjectStore and the CAD Market

At the very beginning, we had hoped that the major electrical and mechanical CAD companies would build or re-engineer their major products on ObjectStore. This mostly did not happen. Why?

We talked to Mentor Graphics, one of the leading ECAD vendors, in 1988. They were very excited and said that if we could make a product such as we were describing, they’d buy it right away. We had dinner with a group of people from Mentor at the OOPSLA ’88 conference, who had just heard Prof. Stonebraker’s keynote address in which he made his usual points, but they said that those points did not apply to them because their needs were different. Their constraints were too complex and application-dependent for relational database systems (Sept 28, 1988). We had a big meeting (Feb 3, 1989) in which they explained that their main interest in OODB’s was actually for sharing between their tools, configuration management and version control, and concurrency control.

Sadly, the people we were talking to were Mentor Graphics’s advanced product group, who turned out to be distinct from the real product people. The real product people already had a file-based approach that they were using, and although they could see the benefits of OODBMS’s, the problems we addressed were not their highest priority. Using ObjectStore would have required some big changes to already-mature software that was already deployed at many customers. So Mentor’s main product never got re-hosted on ObjectStore.

Ken Rugg also points out that “they like to code and think they can do it better themselves. I believe that currently there is still a large percentage of that market that uses home-grown file-based storage for managing their model data. I think the fact that we were ‘closer’ to the application code than a relational system actually made them more likely to try to do it themselves.”

It’s almost certainly this failure and a few others that Prof. Stonebraker was referring to. My guess is that he stopped paying attention to Object Design after this, and that’s why he bases his recent comments only on these particular customers.

Later, we did sell ObjectStore to many CAD companies, for other products than their existing mainstream products. We had three or four different sales to Cadence, who built several tools based on ObjectStore. We worked closely with ViewLogic, who were building a new CAD system and architected it using ObjectStore; it all worked well technically, but unfortunately they never really hit it big. We even sold to Mentor Graphics, eventually, for other applications. We sold to many CASE companies, although CASE didn’t turn out to be such a big commercial success.

Mark Sandeen points out: “It turns out that the CAD market is a pretty small market when compared to the total universe of folks writing C++ applications. Even if we had won Mentor [for their main application], they would have been a relatively small portion of the market.” In other words, what Prof. Stonebraker is talking about didn’t make very much difference.

How Was ObjectStore Used?

ObjectStore is, in fact, good for CAD, and there were many CAD applications, just not the particular ones we had initially contemplated. There were general-purpose and special-purpose CAD systems. I remember one whose job was to help you design (configure) complicated phone switches, and it worked very well.

ObjectStore was also very strong for geographical information systems, network management, configuration management, and many financial applications.

ObjectStore makes a great application-specific persistent cache in front of relational databases. Modern transaction processing systems often access data at a very rapid rate that would overwhelm the relational database system, and so need caches to relieve the database load. ObjectStore does this extremely well, and since it’s persistent, you still have a nice warm cache even when you are recovering from a system failure. Perhaps the highest-profile customer is Amazon.com, which uses ObjectStore as a cache for its inventory data. Yes, when Amazon says “we have 3 copies of this book”, that came out of ObjectStore.
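The "warm cache after a restart" point can be shown with a small read-through cache that keeps its entries in a file. This is a rough sketch, not a description of how ObjectStore or any customer's system works: the PersistentCache class, the JSON file format, the slow_inventory_lookup function standing in for the relational database, and the sample key are all made up for the example.

```python
import json
import os
import tempfile


class PersistentCache:
    """Read-through cache whose entries survive a process restart."""

    def __init__(self, path, load_from_database):
        self.path = path
        self.load_from_database = load_from_database
        self.entries = {}
        if os.path.exists(path):
            # Warm start: reuse whatever the previous process cached.
            with open(path) as f:
                self.entries = json.load(f)

    def get(self, key):
        if key not in self.entries:
            # Miss: ask the backing database once, then remember the answer.
            self.entries[key] = self.load_from_database(key)
            with open(self.path, "w") as f:
                json.dump(self.entries, f)
        return self.entries[key]


database_reads = []


def slow_inventory_lookup(item_id):
    # Stand-in for an expensive relational query (hypothetical).
    database_reads.append(item_id)
    return {"item": item_id, "copies": 3}


cache_path = os.path.join(tempfile.mkdtemp(), "inventory_cache.json")

cache = PersistentCache(cache_path, slow_inventory_lookup)
first = cache.get("book-123")

# Simulate a crash and restart by building a fresh cache on the same file.
restarted = PersistentCache(cache_path, slow_inventory_lookup)
second = restarted.get("book-123")
```

After the simulated restart the second lookup is answered from the file, so the backing database is queried only once. Rewriting the whole file on every miss is far too naive for real use; it is here only to keep the sketch short.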

A May, 2006 white paper from Monash Information Services called “Memory-Centric Data Management” talks about several systems, including ObjectStore, explaining what they excel at, looked at in modern terms. (Curt Monash is very sharp and his papers are fun to read.) “Progress’s ObjectStore, for example, provides complex query performance that wouldn’t be realistic or affordable from relational systems, no matter what the platform configuration. Most notably, ObjectStore answers Amazon’s million-plus queries per minute; it also is used in other traditionally demanding transaction environments such as airplane scheduling and hotel reservations. ObjectStore’s big difference vs. relational systems is that it directly manages and serves up complex objects. A single ObjectStore query can be the equivalent of dozens of relational joins. Data is accessed via direct pointers, the ultimate in random access — and exactly the data access method RAM is optimized to handle. On disk, this approach can be a performance nightmare. But in RAM it’s blazingly fast.”

ObjectStore is a great “kit” for building special-purpose highly-optimized database systems. For example, the British Ordnance Survey makes a cool multi-layer digital map product based on ObjectStore, with data structures highly optimized for representation of 2-D cartographic data. They could have built it in Oracle but it would have taken literally orders of magnitude more disk space and been substantially slower. With ObjectStore, you can build indexes that are, for example, K-D trees, suitable for representing distances and other 2-D concepts. (See papers on the EXODUS project at U. of Wisconsin, which used technology very much like ObjectStore for being such a “kit”.)
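For readers who haven't met K-D trees, here is a minimal two-dimensional sketch with a nearest-neighbor search. It is an in-memory illustration only, not ObjectStore's API and not the Ordnance Survey's data structure; the names Node, build, dist2, and nearest are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point
    axis: int  # 0 = this node splits on x, 1 = splits on y
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points, depth=0):
    """Build a balanced tree by splitting on the median, alternating axes."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(
        points[mid],
        axis,
        build(points[:mid], depth + 1),
        build(points[mid + 1:], depth + 1),
    )


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or dist2(node.point, target) < dist2(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting line is closer than the best match.
    if diff ** 2 < dist2(best, target):
        best = nearest(far, target, best)
    return best
```

The point of the example is the pruning step at the end: most of the tree is never visited, which is the kind of saving a general-purpose B-tree index on a single column cannot give you for 2-D queries.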

Ken Rugg adds: “Most of the capabilities of an enterprise DBMS are there and have been for a long time. In fact, I am often surprised to find some feature that is missing in the Progress OpenEdge database, but has been in ObjectStore since long ago.”

Did Object Design succeed as a business?

We introduced ObjectStore in 1990, on time. Object Design’s revenues, earnings, and growth were excellent and matched or exceeded our original business plan. In 1994, we were Number 1 on the Inc. 500 (The Fastest Growing Private Companies In America), at which time we had 200 employees and $24.6M in revenue. In 1995, Oracle made a serious attempt to buy the company. In 1996, we had a highly successful IPO (initial public offering) of stock. Our venture capital investors, not to mention the co-founders, were quite happy. (Of the 23 “Inc. Number 1” companies between 1982 and 2005, only three went public.) Revenues were $62M in 1998, $61M in 1999, $70M in 2000, and $49M in 2001 (post-bubble).

By 1996, Object Design had 204 employees, including 51 in research and development, and over 700 customers. Among the companies we sold to were ABB, AT&T, Abbott Laboratories, Alcatel, Aldus, Ameritech, Apertus, Australia Telecom, Autotrol, Avanti, Bankers Trust, BayNetworks, Bell Northern Research, Bellcore, Bellsouth, Boeing, British Telecom, CADAM, Cabletron, Cadence, Canon, Cellnet Data Systems, Computervision, Credit Suisse, DEC, Delphon, EDA (integrated design systems), Ericsson, Fidelity Investments, Ford, Fuji Xerox, GE Daytona Beach (flight simulation GIS), GTE Directories Corporation, General Electric (several sites), Goodyear, Hewlett-Packard, Honeywell, Hughes Information Systems, Hyperdesk, IBM Poughkeepsie (CAD), IDD Information Systems, Independence Tech, Intel, Intergraph, Long Term Credit Bank of Japan, Loral, Lucid, MCI, MIT, Manugistics, Matra, McDermott, Mead Data Central, Mentor Graphics, Mitsui, NASA, NSA, NeXT, New York Stock Exchange, Nomura Securities, Oberon, Objective Spectrum, Olivetti, Pitney Bowes, Platinum Technology, PowerFrame, PriceWaterhouse, RoadNet (owned by UPS), Sandia National Laboratory, Schlumberger, Sema Software (text processing in SGML), Sherpa, Siemens AG (telephone switching), Southwest Airlines, Sprint, Sterling Software, Sun Microsystems, Synopsys, Texas Instruments, Toyota, U.S. West, Universal Oil Products, Vodafone, Wildfire Communications, Xerox, and Zuken. (This list was compiled by me, my notebooks from the time, Mark Sandeen, Tom Kincaid, and the public offering document.)

Sun Microsystems was building a new object-oriented platform called Distributed Objects Everywhere, for which we provided the object database. IBM formed a strong strategic alliance with Object Design: they purchased equity, they bought lots of product, they set up a joint marketing program, and built it into their whole software product road map. We also got useful technical advice from the high database wizards at IBM’s Almaden Research Center such as C. Mohan and (I think) Bruce Lindsay. Microsoft modified their new operating system technology in order to allow ObjectStore to run on it.

ObjectStore, and OODBMS’s in general, have often been criticized in sentences using the word “niche”. What does that mean? A niche simply means a particular market: a kind of customer with certain kinds of needs. To sell to a particular set of markets is exactly what we intended all along. Sometimes “niche” is supposed to connote “small niche”, but you can read the above and make your own judgement.

What Happened With Prof. Stonebraker’s Products?

The object-relational database system that Prof. Stonebraker spent so many years praising and selling, known at various times as Miro, Montage, and Illustra, with its “Datablades” architecture to support extended datatypes, reached the market in 1992. Then it was bought by Informix, which back-burnered it and put its team to work adding object-relational technology to Informix, resulting in Informix IUS. Then IBM bought the database part of Informix, and to the best of my knowledge, no longer sells Illustra. Far from “cornering the market”, it was gone after only four years. It’s hard to see how to construe Illustra as a success and ObjectStore as a failure. And so much for his claim about “their OR descendants continuing to be the market leaders”. They were not market leaders in any market. Furthermore, the “object-oriented” features that were added to the RDBMS’s never turned out to be important or widely used, to the best of my knowledge. They certainly didn’t take over the business data processing market.

Prof. Stonebraker’s latest pitches for his new technologies and new startups say that they’re posing a major challenge to Oracle. And they’re targeted at markets other than mainstream business data processing (dare I say “niches”?). Prof. Stonebraker published a paper in 2005 called “One Size Fits All: An Idea Whose Time Has Come and Gone”, about how RDBMS’s such as Oracle can be beaten by new kinds of DBMS’s for specific application areas. That’s what we’ve been saying all along! His new startups look promising to me. At OOPSLA 2007, I had a long discussion with Richard Tibbetts, co-founder and architect of Prof. Stonebraker’s StreamBase (www.streambase.com), which sounded pretty impressive. Prof. Stonebraker himself came to my workplace to tell us about Vertica (www.vertica.com), which we’re very interested in. There’s also HBase (which I think is the same thing as Horizontica), a newer effort. It’ll be educational to see what happens to these companies and products over the next twenty years.

What Happened With Other OODBMS’s?

What became of Object Design’s competitors? They’re doing fine, selling their OODBMS’s. Versant went public with over $21M in revenues and over $7.6M in profits. Objectivity, still private, is also selling their OODBMS. Gemstone, who was there before the rest of us (named Servio Logic in the earlier days), is still selling too.

DB4O is a popular embeddable open-source dual-license OODBMS, supported by a venture-backed company called db4objects of San Mateo CA. It’s aimed at Java and .NET. It supports ACID transactions, although I don’t know the specifics. Object Design also has a simple embeddable language-transparent Java database called “PSE Pro for Java”, which does have transactions and some good scalability (it only reads in the objects that you actually use).

There’s also Cache from InterSystems, who are intensively marketing their OODBMS. I’ve heard it’s popular and very fast. Their paper talks about using Cache for persistent Java objects. It does navigation, but also provides a JDBC/SQL interface. The paper is satisfyingly technical, including code samples, comparing Hibernate, DB4O, and Cache.

These days a lot of value can be had from database systems that don’t have query languages at all, such as the various versions of the Berkeley Database from SleepyCat (now part of Oracle). Sometimes just looking up a record by key, ISAM style, is exactly what you need. Charlie Lamb and Sam Haradhvala are working on the Java version of Berkeley Database. These products have been extremely successful.
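As a rough illustration of what "looking up a record by key, ISAM style" means, here is a small in-memory sketch: records are kept in key order behind a sorted index, so you get both exact lookups and ordered range scans without any query language. This is not the Berkeley Database API; the IsamTable class and its method names are invented for the example, and a real system would keep the index and records on disk.

```python
import bisect


class IsamTable:
    """Records kept in key order, with a sorted key index for lookups."""

    def __init__(self):
        self._keys = []     # sorted keys (the "index")
        self._records = []  # records, stored in the same order as the keys

    def put(self, key, record):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._records[i] = record  # key already present: overwrite
        else:
            self._keys.insert(i, key)
            self._records.insert(i, record)

    def get(self, key, default=None):
        """Exact-match lookup by binary search on the index."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._records[i]
        return default

    def scan(self, low, high):
        """Yield (key, record) pairs with low <= key < high, in key order."""
        i = bisect.bisect_left(self._keys, low)
        while i < len(self._keys) and self._keys[i] < high:
            yield self._keys[i], self._records[i]
            i += 1
```

A typical use is a get for a single key, or a scan over a key range when the application wants neighboring records in order.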

What Happened To ObjectStore?

I have been away from the company for a long time, so most of my knowledge is second-hand. In this section I won’t attribute my sources, in order to avoid getting anyone into trouble.

Object Design, later renamed eXcelon Corporation, was acquired in 2002 by Progress Software, which put in place an excellent new top manager and has retained the technical staff. In fact, several former Object Design employees rejoined the company. Many of my long-time friends and co-workers are still there.

ObjectStore is still being actively maintained. There is a new product manager. There have been some large deals recently. A new major release, 7.0, came out earlier this year, with support for Windows 64-bit, Microsoft Vista, Visual Studio .NET 2005 SP1, and Red Hat Linux 4.0 Update 4. It also has a new Data Services Administrator tool, new support for Java 5, and other improvements. Release 7.1 is coming next summer, in time for the twentieth-anniversary party. I just got an evaluation copy of ObjectStore for my current employer; we may be using it in an upcoming product.

Here are some of the key reasons it’s not as big a seller as it used to be (from me, Dave Moon, and Ken Rugg):

  • ObjectStore is still difficult to sell, because it’s not a solution, and it’s not even a tool for directly making a solution. Rather, it’s a framework on top of which a sophisticated software engineer can build a tool that they can then make into a solution.
  • Object Design developed a very powerful sales force, but during various corporate turmoil, most of it disbanded, and it has been very hard to recreate it.
  • Progress has developed other products that have been very successful and bring in more revenue for the effort than ObjectStore itself. The division selling ObjectStore has limited resources and is focusing on these products.
  • When we started, C++ was the exciting new object-oriented language being used for new applications, and ObjectStore was designed primarily for C++. It’s been twenty years, and these days fewer new applications are using C++.
  • In Java, there are now persistent-object packages based on object-relational mapping that are far better than they used to be, almost as easy to program in as ObjectStore (for Java), and free.
  • Computers have gotten very, very much faster over the last twenty years. ObjectStore’s performance advantage became less important, and therefore more applications could make the conventional choice of using a relational system with the new object-relational mapping technology.
  • ObjectStore’s performance advantage in Java is considerably less than in C++: you get much less control over the clustering, and there is significant per-object overhead.
  • There are fewer software developers who have a good grounding in data structures and algorithms. The Java generation assumes this comes as part of the language or DBMS, and that they don’t have to figure this out for themselves. With ObjectStore, many of the most compelling applications are a result of someone building a custom data structure to store and index the information in a unique way.
  • There was a negative reaction to the hype (see above) which gave OODBMS’s a bad name. Prof. Stonebraker’s comments may have contributed, as well; he’s very prominent and often quoted by the press.
  • Another way things have changed since twenty years ago is that software developers are much less open to buying substrate software. They are accustomed to getting it free (or very cheaply).
  • ObjectStore’s markets never got quite big enough for third-party vendors to make support products, and there weren’t a lot of outside consultants who were ObjectStore experts.

Gene Bonte says: “ODBMS technology continues today in the marketplace 20 years, and 8-10 technology cycles, after its founding. How many other technologies have come and gone in this time frame? Also, two of the top five OODBMS firms went public, which is a very high percentage for venture-backed firms and speaks to a real success. If you tell a venture firm that 40% of their investments will go public, you will have a smiling VC. Meanwhile, another 40% are still around doing business. It is probably safe to say that the overall return on investment in VC investments in OODBMS technology is quite positive. From a commercial perspective, and from a technical and longevity perspective, OODBMS technology has been a success. Relational versus object-oriented was never the real issue from a business point of view. The issue was building a successful company and producing shareholder value. This was accomplished.”

Caveats and Thanks

Everything here is my own personal opinion, and should not be taken as a statement by Object Design or Progress Software!

Much of this is in the past tense because I’ve been gone so long, and because things have changed, but ObjectStore is still alive.

Thanks to all the contributors named above, particularly Benson Margulies, whose highly cogent criticism compelled me to substantially reorganize the whole essay. I have made small edits to the contributions. Of course, I take responsibility for all errors.

Notes on the book: Dreaming in Code

Thursday, December 27th, 2007

I just finished reading an amazing book: “Dreaming in Code” by Scott Rosenberg. Like many good, recent non-fiction books, it alternates between a specific narrative with colorful real people, and general background information. In this case, it’s the story of Chandler, a personal information management tool, and the team who are building it, led by Mitch Kapor.

The general background explains far more about real, contemporary software, how it is built, and what it’s all about, than anything I’ve read before. Everyone learning to be a software engineer, or who wants to understand what software engineers actually do, should read this book.

In only 355 pages, Rosenberg discusses, in clear language that’s easy to follow, at least the following:

  • What working on a software project in a team is like, the subjective experience
  • Open software, and the “Cathedral vs. Bazaar” concept
  • Doug Engelbart’s ideas (very germane to Chandler)
  • Famous software fiascoes
  • Computer languages, especially Python and how it compares to others
  • Reusable software, software libraries, build versus buy
  • What “geek” really means
  • CVS, Bugzilla, and Wikis
  • Why user interfaces are so hard to design
  • Dependencies between parts of a system and how they block work
  • Release management and scheduling
  • Specifications and their nature
  • Layers of abstraction
  • Scaffolding
  • Code reviews
  • WebDAV and CalDAV
  • Microsoft FUD
  • Requirements analysis
  • Methodologies: waterfall, agile
  • The gist of No Silver Bullet and The Mythical Man-Month
  • Ruby on Rails
  • Software engineering, its history and what it means
  • Complexity
  • Late binding
  • Object-oriented programming
  • Recursion
  • The halting problem

The story of Chandler and the team is compelling and instructive. On page 173 of the book, he says: “By now, I know, any software developer reading this volume has likely thrown it across the room in despair, thinking, ‘Stop the madness! They’re making every mistake in the book!’” I did indeed feel that way by page 173. Here’s my sense of what went wrong, based on the account in the book:

  • They did not have one architect (Brooks makes a very good point about why there should be a single person)
  • They didn’t work out the architecture in advance, and they went back and changed it many times
  • They had a very flexible data concept/model, in which items change type frequently in a user-visible way, which they didn’t work out until quite late
  • They kept changing their mind about their UI substrate: wxWidgets? Mozilla internals?
  • The software ecosystem changed around them after all those years, and using a Web UI now made sense, but it was too late for them
  • They could not figure out what database technology to use (they finally decided not to use the Zope Object Database, although their reasons for that decision don’t impress me)
  • It was originally supposed to be peer-to-peer, but they could not figure out how to make that work, so they changed it to be server-based, a major change very late in the design
  • They had to design a security model for all this
  • It was all extensible, which is great but takes a lot of work to do right
  • There were complicated semantic issues with sharing, “chain-sharing”, etc. which were not worked out early.
  • They wanted to have extensional and intensional collections, like iTunes, but also wanted to combine the two (the so-called “exclude Bob Marley” feature), which makes the semantics a lot harder
  • Their internal terminology was inconsistent, symptomatic of a lack of architectural integrity
  • They did serious requirement analysis only late in the project
  • It was putatively open-source, but it was much too immature to really get outside developers involved
  • They were too focused on doing “the right thing” instead of getting something out fast; see Gabriel’s “Worse is Better” paper
  • They released much too early, partly because of the glare of publicity due to Mitch Kapor’s involvement
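To show why combining extensional and intensional collections (the “exclude Bob Marley” item in the list above) complicates the semantics, here is a small sketch. It is not Chandler's design: the Collection class, its field names, and the rule that an explicit exclusion always wins are all assumptions made for illustration.

```python
class Collection:
    """A collection defined by a rule, plus explicit additions and removals."""

    def __init__(self, rule=None):
        # Intensional part: a predicate that decides membership.
        self.rule = rule or (lambda item: False)
        # Extensional part: ids added or removed by hand.
        self.included = set()
        self.excluded = set()

    def contains(self, item):
        # Assumed policy: an explicit exclusion beats both the rule
        # and an explicit inclusion.
        if item["id"] in self.excluded:
            return False
        return item["id"] in self.included or self.rule(item)

    def members(self, items):
        return [item for item in items if self.contains(item)]
```

Even this toy version has to pick answers to awkward questions: what happens when an excluded item stops matching the rule and later matches again, or when the same item is both included and excluded. Those choices are user-visible, which is why they need to be settled early.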

I see that they are still in “preview” releases. This has been going on for six years now! They have no projected release date for 1.0. It will be free, under the Apache license.

I have always wanted a good personal information manager, and a lot about Chandler looks very promising. Someday I may be a happy user. Right now, I think I’ll wait until release 1.0.

I hope they have moved beyond the problems illustrated in the book and are running smoothly now. Kudos to the whole Chandler team for letting Rosenberg be so involved, being so honest with him, and letting him produce this unique, spectacular book.

Footnoted: News from the Footnotes about American Corporations

Thursday, December 27th, 2007

I heard on Marketplace (American Public Media’s radio show) about footnoted.org, where Michelle Leder exposes juicy information about corporations found in the footnotes of their reports. If you’re a shareholder, you might be interested to know how your management is spending your corporation’s money. Here are some fun ones:

  • Fred J. Kleisner, Interim President and CEO of Morgans Hotel Group, had to relocate to New York City. The company is paying all his expenses, including a housing allowance of up to $30,000 per month. “Even in Manhattan, that buys you some nice digs.” He also gets a $750K salary with a bonus up to 200% if he meets performance targets.
  • For their CEO Michael McGrath, I2 Technology paid nearly $1 million flying him between the company’s offices in Dallas and his home in Maine.
  • Countrywide gave CEO David Sambol a $2.7 million promotion bonus shortly before their stock imploded.
  • Qwest Communications CEO Edward Mueller’s stepdaughter attends high school in California, but Qwest is based in Denver. But no problem; she’s allowed to use the company’s Falcon 2000 private jet for her commute to school. This could cost Qwest as much as $600K, assuming normal charter rates. In fact, more than half of the CEO’s in a recent study are able to use their corporate jets for personal trips.
  • David Peterschmidt is leaving as CEO of Openwave Systems after three years, for which he’ll get a lump-sum payout of $1.5 million and full vesting of his 175,000 shares of restricted stock, also worth about $1.5 million. This was while Openwave’s shares fell by 17%; the stock has been falling and falling ever since. Meanwhile there has been a shareholder lawsuit, involving Peterschmidt and others, claiming that the stock price dropped because of the company’s options backdating scheme, which encompassed seven years and led the company to restate its financials. In 2007, they lost $197M on revenues of $290M.

It’s good to be CEO.

European Common Lisp Meeting

Wednesday, December 26th, 2007

The European Common Lisp Meeting of 2008 will take place on Sunday, April 20, 2008, with optional dinners on Saturday and Sunday evening. I’ve been to Amsterdam and totally loved it. I’d very much like to attend; I’ll have to see whether it’s possible.