Cassandra Database System
May 24, 2011 by Chad Brake
At Healthx, we are all about using the best technology to provide engagement solutions for our clients - technology that ultimately translates to a better experience for our customers and users. Over the years, we have adopted a lot of new programming languages in our secure healthcare solutions, but the SQL relational database behind the scenes has remained unchanged. The primary reason is that no new database technology had caught our eye the way new web programming technologies have ... until we met Cassandra.
Cassandra is a different kind of database system, designed to handle very large amounts of data spread across many commodity servers, which makes it ideal for high-volume cloud portals and applications. That setup, along with the way Cassandra works, provides a highly available service with no single point of failure that is easy to scale to massive levels. By contrast, most enterprise-level SQL databases require extremely large and expensive servers, can be quite difficult to scale, and can be very costly to make redundant.
The two types of database systems take very different approaches to how they store and interact with data. Relational databases are highly structured in how tables are laid out and how those tables work together. They reduce the amount of data stored by keeping duplication to a minimum, and the various pieces of information needed are pulled from related tables. Because the data is relational, the server has to do a lot of work to keep data integrity intact across the various tables. As a result, relational databases usually have a high cost for writing data.
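To make the relational side concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the MS SQL we actually run. The members and claims tables and their columns are hypothetical, made up purely for illustration. It shows the two behaviors described above: information is pulled together from related tables with a join, and every write is checked against the related table before it is accepted.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with two related (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE claims (
               id INTEGER PRIMARY KEY,
               member_id INTEGER NOT NULL REFERENCES members(id),
               amount REAL NOT NULL)"""
    )
    conn.execute("INSERT INTO members (id, name) VALUES (1, 'Pat Example')")
    conn.execute("INSERT INTO claims (member_id, amount) VALUES (1, 125.0)")
    return conn


def claims_with_names(conn):
    """The member name is stored once; a join pulls it in for each claim."""
    return conn.execute(
        "SELECT m.name, c.amount "
        "FROM claims c JOIN members m ON m.id = c.member_id"
    ).fetchall()


def try_orphan_claim(conn):
    """Try to write a claim for a member that does not exist.

    Returns True if the write was accepted, False if the database rejected it.
    """
    try:
        conn.execute("INSERT INTO claims (member_id, amount) VALUES (999, 50.0)")
    except sqlite3.IntegrityError:
        return False
    return True
```

Calling try_orphan_claim(build_demo_db()) returns False: the database looks up the members table before accepting the claim, and that lookup on every write is a small-scale version of the integrity work that makes relational writes expensive.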
Cassandra is a completely different animal. It is a structured key-value store with column-oriented data storage. What that really means is that it stores data far more flexibly than SQL’s highly structured table design, which gives developers an easy way to expand tables and allows for very fast data writing. A healthcare solution like the one Healthx provides deals with large amounts of health benefit information, so Cassandra’s write performance is very valuable. Another factor in Cassandra’s appeal is that it is decentralized, so there is no single point of failure. It is also highly fault-tolerant: data is automatically replicated across many nodes, so a failed node can be replaced with no downtime and no data loss. Cassandra was originally built at Facebook and is now becoming the database system of choice for other large-scale web applications like Twitter and Digg.
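The ideas in that paragraph can be sketched in a few lines of Python. This is a toy model and not how Cassandra is actually implemented: the ToyCluster class, its method names, and the node and row names are all hypothetical, and real Cassandra adds much more (tunable consistency, repair of replicas that missed writes, on-disk storage). It only illustrates three points: each row is a key pointing at its own set of columns, a write is a simple append-style update with no cross-table checks, and every row is copied to several nodes so that reads survive a node failure.

```python
import hashlib
import itertools


class ToyCluster:
    """Toy model of a replicated, column-oriented key-value store."""

    def __init__(self, node_names, replication_factor=3):
        self.node_names = sorted(node_names)
        self.replication_factor = min(replication_factor, len(self.node_names))
        # Each node maps row_key -> {column_name: (value, write_stamp)}.
        self.nodes = {name: {} for name in self.node_names}
        self.down = set()  # names of nodes that are currently unavailable
        self._clock = itertools.count()  # stand-in for write timestamps

    def replicas_for(self, row_key):
        """Pick replication_factor consecutive nodes, starting from a hash of the key."""
        digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.node_names)
        return [
            self.node_names[(start + i) % len(self.node_names)]
            for i in range(self.replication_factor)
        ]

    def write(self, row_key, columns):
        """Store columns on every reachable replica; rows need not share columns."""
        stamp = next(self._clock)
        for name in self.replicas_for(row_key):
            if name in self.down:
                continue  # this toy simply skips nodes that are down
            row = self.nodes[name].setdefault(row_key, {})
            for column, value in columns.items():
                row[column] = (value, stamp)

    def read(self, row_key):
        """Merge the row from all reachable replicas; the newest write wins per column."""
        merged = {}
        for name in self.replicas_for(row_key):
            if name in self.down:
                continue
            for column, (value, stamp) in self.nodes[name].get(row_key, {}).items():
                if column not in merged or stamp > merged[column][1]:
                    merged[column] = (value, stamp)
        return {column: value for column, (value, _) in merged.items()}
```

For example, after cluster = ToyCluster(["a", "b", "c", "d"]) and cluster.write("member:1", {"name": "Pat", "plan": "gold"}), adding any single node name to cluster.down still leaves cluster.read("member:1") returning the full row, because two other replicas hold a copy.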
So what does this Cassandra business really mean for Healthx? That is a question we have been working to answer, because it would be a ton of work to move our entire database over to Cassandra. That’s not really how we do things at Healthx, though. We have an “Agile” mentality, which means we always look for small incremental improvements over big monolithic projects. We believe it is necessary to crawl before we walk and walk before we run. It has taken us years to master MS SQL, and it would take just as long to master Cassandra. So our crawl approach will be to find a pilot client and move some of their data to a Cassandra database, then expand to other clients and more data as we learn.
Stay tuned as we begin our relationship with Cassandra.