Reddit joins Digg, Twitter on Cassandra
I have read that Cassandra needs good IO performance to do well. So I am wondering what kind of performance you guys are getting on EC2, and what specific things you have done to improve it (like RAID).
We are looking into Cassandra for thecadmus.com and any info would be greatly appreciated. Thanks.
I spent several hours Thursday night re-reading the Cassandra docs and doing another test install. I am trying to decide between MongoDB (which I have a fair amount of experience with) and Cassandra for a 2-node setup where I care less (at least right now) about latency than about redundancy and convenience. I want to (initially) use an EC2 instance and a non-Amazon VPS. Since I can't tell from reading the docs which would do better (and I have not seen any published comparisons), I am going to have to set up both a 2-node Cassandra install and a MongoDB replica pair to see which copes better with having one node in an east coast Amazon availability zone and the other in a data center in Texas. If I am ever fortunate enough to have more than a few users, so that performance becomes an issue, I would like to add another EC2 instance (or two) to the mix, but still keep a service running in a non-Amazon data center.
If anyone has any links to Cassandra/MongoDB comparisons for my desired setup I would appreciate seeing them.
What exactly is the difference between Cassandra and a document store like CouchDB? Or are they similar?
Just trying to get my head around this NoSQL thing and the different approaches...
It sort of looks like Cassandra is where all the people with A LOT of data are going. Any particular reason(s) for that?
(Edit: as compared to other 'non-traditional' database systems).
Reddit also investigated (and maybe is still investigating) Riak, as they asked some detailed questions on the list. It will be interesting to see another high-profile deployment of Cassandra, but I want them to start putting things that actually matter in there.