At Mixpanel, where our hardware lives and the platform we use to help us scale have become increasingly important. Unfortunately (or fortunately), our data processing doesn’t always scale linearly. When we get a brand new customer, we sometimes have to scale in a step function: capacity has to jump all at once instead of growing gradually. This has been a problem in the past, but we’ve gotten better at it.
So what’s the short of it? We’re unhappy with the Rackspace Cloud and love what we’re seeing at Amazon.
Over our history we’ve used quite a few “cloud” offerings. First was Slicehost, back when everything was on a single 256MB instance (yeah, that didn’t scale). Second was Linode, because it was cheaper (money mattered to me at that point). Lastly, we moved over to the Rackspace Cloud because they cut a deal with YCombinator (one of the many benefits of being part of YC). Even with all the lock-in we have with Rackspace (we have 50+ boxes, and we’re hiring if you want to help us move them!), it’s really not about the money but about the features and the product offering. Here’s why we’re moving:
IO is a huge scaling problem that we have to think about very carefully. We’ve since deprecated Cassandra from our stack but Rackspace is a terrible provider if you’re using Cassandra in general. Your commit log and data directory should be on two different volumes–Rackspace does not make this easy or affordable. EBS is a godsend.
What happens when you need more disk space? You’re screwed -> resize your box and go down. Need more than 620G of space? You can’t do it.
EBS lets you mount volumes onto any node. This is awesome if you ever need to move your data off a bad node: you reattach the volume somewhere else instead of having to scp everything over.
Edit: Nobody is saying you get better IO performance on Amazon, simply that EBS solves IO challenges that Rackspace does not. IO is basically terrible everywhere on the cloud.
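To make the "move a volume instead of scp" point concrete, here is a minimal sketch in Python. It assumes the boto3 SDK's EC2 client, which is not something we run today, and the volume ID, instance ID, and device name are hypothetical placeholders. Treat it as an illustration of the workflow, not our actual tooling.

```python
def move_volume(ec2, volume_id, to_instance_id, device="/dev/sdf"):
    """Detach an EBS volume from its current node and attach it to another.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The target instance must be in the same availability zone as the volume.
    """
    # Ask EC2 to detach the volume from the failing node.
    ec2.detach_volume(VolumeId=volume_id)

    # Block until the volume reports the "available" state.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # Attach it to the healthy node; the data comes along with the volume.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=to_instance_id, Device=device)

    # Block until the volume reports the "in-use" state.
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])
    return device
```

After this returns you would still need to mount the filesystem on the new box, and ideally you would unmount it on the old one first if the node is healthy enough to let you.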
We’re super excited about the variety of instances that Amazon offers. We expect the biggest money savers for us to be Amazon’s standard XL and the high-CPU instances. Rackspace offers a more granular range of sizes, which is a benefit if you need to be thrifty, but it bottlenecks fast as you begin to scale and realize what kind of hardware you need.
Rackspace Cloud has had pretty atrocious uptime over the past year: there have been two major outages where half the internet broke. Everyone has their problems, but the main issue is that we see really bad node degradation all the time. We’ve had months where a node in our system went down every single week. Fortunately, we’ve always built in the proper redundancy to handle this. We know this will happen on Amazon too from time to time, but we feel more confident about Amazon’s ability to manage it since they also rely on AWS.
Rackspace’s control panel is the biggest pain in the ass thing to use. Their interface is clunky, bloated, and slow. In my experience, I can’t count how many times I’ve seen their Java exceptions while frantically trying to provision a new node to help scale Mixpanel.
Amazon has awesome and very well vetted command line tools that blow Rackspace out of the water. I can’t wait to write a script and start up a node. I believe Rackspace has an SDK and command line tools now, though they are very early and still in beta.
Probably the most frustrating thing on Rackspace is their insane requirement to post a ticket to get a higher memory quota. We’ve had fires where we needed to add extra capacity, only to get an error telling us we couldn’t create a new node. Once you post a ticket, you have to wait up to 24 hours for their people to answer it. Now we just ask Rackspace for +100G increments way before we ever need them. I know Amazon doesn’t impose these limits to the same (annoying) extent.
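The "ask way before we need it" rule can be written down as a small check. This is only a sketch of the reasoning, not a tool we run: the function names, the safety factor, and all of the numbers in the usage note are made up for illustration. Only the 100G increment comes from how we actually file requests.

```python
import math

def should_request_quota(used_gb, quota_gb, growth_gb_per_day,
                         ticket_lead_days=1.0, safety_factor=2.0):
    """Return True if remaining headroom could run out while a ticket is pending."""
    headroom_gb = quota_gb - used_gb
    # Memory we expect to add while waiting on the ticket, padded for spikes.
    needed_gb = growth_gb_per_day * ticket_lead_days * safety_factor
    return headroom_gb <= needed_gb

def increments_needed(used_gb, quota_gb, target_headroom_gb, increment_gb=100):
    """Number of fixed-size quota increments needed to reach the target headroom."""
    shortfall_gb = target_headroom_gb - (quota_gb - used_gb)
    return max(0, math.ceil(shortfall_gb / increment_gb))
```

With a hypothetical 400G quota, 390G in use, and 8G of growth per day, `should_request_quota(390, 400, 8)` says to file the ticket now, and `increments_needed(390, 400, 150)` says to ask for two increments.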
Amazon has a CDN and servers distributed globally. This is important to Mixpanel as websites all over the world are sending us data. There’s nothing like this on Rackspace. We have lots of Asian customers and speed matters.
Rackspace has a limit on their automatic backups: 2G. Our databases aren’t weak, 2G memory-bound machines, and nobody’s are at scale. S3 is a store for everything, and EBS is just useful for this kind of thing. Cloud Files on Rackspace is still in its infancy.
Backups on Amazon will be so much cleaner and more straightforward.
We’ve done a very methodical pricing comparison for our own hardware and have determined that the pricing is actually about the same across both services. We don’t know how well the hardware will scale on Amazon so we over-estimated to make up for crazy issues. Amazon came to about 5-10% cheaper but take that with a grain of salt. It’s probably closer to equal.
Here’s one huge thing, though: in the long run Amazon will be drastically cheaper for our business thanks to Reserved Instances and bid-priced (Spot) instances. That’s extremely sexy to us.
Also? Amazon constantly reduces its prices. I’ve never seen Rackspace do that.
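To show why reserved pricing matters in the long run, here is a minimal sketch of the break-even arithmetic. The model assumes a one-time upfront fee plus a discounted hourly rate. The function names and every price in the usage note are hypothetical, not Amazon’s actual price list or the figures from our comparison.

```python
def break_even_hours(on_demand_hourly, reserved_upfront, reserved_hourly):
    """Hours of use after which a reserved instance becomes cheaper than on-demand.

    Returns None if the reserved hourly rate is not lower than on-demand,
    because the upfront fee is then never recovered.
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        return None
    return reserved_upfront / saving_per_hour

def term_cost(hourly, hours, upfront=0.0):
    """Total cost of running one instance for `hours` hours."""
    return upfront + hourly * hours
```

With made-up prices of $0.50/hour on-demand versus $1,000 upfront plus $0.25/hour reserved, `break_even_hours(0.50, 1000, 0.25)` is 4,000 hours, or about five and a half months of continuous use. Over a full year (8,760 hours) that works out to $4,380 on-demand against $3,190 reserved for a box that never shuts off.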
What’s the main reason for us?
Amazon just iterates on their product faster than anyone else and has the best one. We expect people to use Mixpanel for the same reason in the long run, and we’ve taken note. Rackspace is extremely slow, and as the person in charge of infrastructure and scalability, I’m going to use the platform that knows how to keep guys like me happy by moving fast and anticipating my needs.
Rackspace cloud’s products:
If you have an opinion, express it. Tell us what you hate about Amazon and problems you’ve seen. We haven’t moved yet.