Come to New England Database Day!


New England Database Day is a one-day mini-conference where participants from the research community in the New England area can come together to present ideas and discuss their research.  I highly recommend this event if you’re interested in cutting-edge database technology.  There will be eight talks plus poster sessions.

(I’ll say “database” to mean “database management system”, as is often done for brevity.)

Last year’s conference (the first one) was great. Here’s my very belated report.

David DeWitt presented his paper on Clustera, a system that controls and runs large batch operations on a big cluster of machines.  There are three prominent classes of such systems, exemplified by (a) Condor, for running things like circuit simulations and weather models; (b) Gamma, for doing parallel database queries; and (c) Map/Reduce.  Clustera is designed to be able to do all three of these things reasonably well.  In fact, these three are just particular points in a whole space of possibilities that Clustera can cover.  Also, Clustera is simpler and smaller than other such systems, because it builds on a J2EE application server and a small relational database.  Prof. DeWitt has been in charge of the amazingly productive database system research at the University of Wisconsin, Madison, although now he’s going to a new Microsoft research center to be created in Madison.

Gerome Miklau discussed policy issues surrounding archival storage, such as privacy, accountability, retention policies, subpoenas, and redaction.  He also talked about how technological decisions affect these things.

Stavros Harizopoulos of HP Labs described an experiment that demonstrates why main memory databases can be so fast, analyzing the costs of various modules that can be omitted, such as logging (most kinds), locking, latching, buffer management, and other overhead.  No single one of these takes the lion’s share of the time, it turns out.  You have to eliminate all of them to get the best performance improvements.  A major point is that a database system designed to be column-oriented can be a lot faster than a general-purpose database acting as if it were column-oriented.

Ryan Johnson of CMU talked about many issues involved in executing queries in parallel on multi-core processors.  As you’d expect, this is a hot area, since the multi-core processors are becoming so widespread, and the number of cores is going up.  He examined work sharing, pipelining, working set size, and of course caching issues.  He presented experimental results as well.

Daniel Abadi of Yale (formerly of MIT) (not to be confused with Martín Abadi of Microsoft Research) gave a talk called “How To Create a New Column-Store Database in a Week”.  The point was that you can do it by building on a regular row-store database, but he explained why this won’t work well.  A good column-store database must be built that way from the start.

Anastasia (Natasha) Ailamaki of Ecole Polytechnique Federale de Lausanne had the honor of being the last speaker; she has won many awards and is a rising star in the database community.  Her talk was “Multi-Core: Friend or Foe?”  She explained a lot about how the memory/caching systems of multi-core processors work.  She also explained some of the major design tradeoffs that the hardware designers can make: fewer, more complex cores, or the opposite, and whether hardware threads are used.  Then she talked about how all this particularly affects database systems.

This year’s event will be in Cambridge, Mass. at MIT, in the Stata Center, room 32-123 (the big lecture hall on the first floor).  It’ll be this Friday (January 30, 2009) from 9 am to 6 pm.  It’s free, but they’d like you to register so they’ll know how many people are coming.  I hope to see you there!
