Choosing a database for your project
April 29th, 2008 17:36When you are designing a project with complex storage requirements and some demands on reliability and performance, a few options come to mind.
Even though at OpenPanel we have strong feelings about NIH, we didn’t think writing our own database store was the way to go. Many smart people have already written many different database backends, which means that in both quantity and quality the database area is well-covered. So, writing our own was right out.
The second option that comes to mind is to use the most basic kind of database available: a key/value store like Berkeley DB or GDBM. Combining a few key/value stores together yields a lot of flexibility, but there’s a lot of glue you need to write, then. In programming language terms, the key/value API is not powerful.
The only way out then seems to be SQL. The common choice seems to be MySQL, and with good reason - it’s robust, fast, flexible (supporting most of the SQL standard), comes pre-packaged for any distribution, and just about any sysadmin or PHP-developer you run into knows his way around it more or less. A close second would be Postgres, less popular with the common PHP developer crowd, more popular with seasoned developers in other languages, and understandably so.
However, we did not go for the path well-traveled. After evaluating our requirements, we realized we barely needed the power of a client/server model database with high concurrency support. Also, we figured choosing a slightly less popular implementation would deter people from messing with the database by hand.
We chose SQLite. It’s extremely lightweight, reliable, robust and surprisingly fast. Choosing SQLite means our database is just a file (directory, to be honest) in /var/opencore, where users and admins can’t run into it by accident while mucking about in phpMyAdmin - but when they really need to mess with the database, they can, with their familiar SQL idioms.
SQLite has most the features a developer would expect from a database; transactions, subqueries, decent indexing support, triggers, and room for extension with user defined functions. The only thing sorely missing is foreign key support, but that’s easily implemented as a bunch of triggers.
Even though we now had this powerful database engine, we put in a lot of effort mapping our idea of an object revision model onto it (more about that in a later post) - but SQLite made it a lot less painful.
SQLite’s only real limitation seems its lack of concurrency, which is made worse by the locking model used that seems to invite polling for access instead of forming some kind of queue. For this reason, OpenPanel keeps just one handle to the database file and manages exclusive and shared access to it via the normal pthread mechanisms.
Incidentally, the current (unpublished) version of our automated web application installer tries to shoehorn its data into a key/value store and it’s hurting - even from Python! We’ll probably rewrite those bits to use SQLite as well (but, of course, separate from the OpenPanel database).
Summarizing, if you are looking for a data store for your software project, consider SQLite. It’s as lightweight as most key/value store libraries but throws in a hell of a lot more featurewise.











