“Cloud Computing” is a ubiquitous buzzword (buzz-phrase?) in IT news these days. You can’t swing a LOLcat without hitting some article touting specious promises of the cloud’s scalability, availability, or efficiency.
When confronted with these promises, it is natural for IT managers to ask whether the highly touted advantages of cloud computing are worth the pain and cost of a migration. Likewise, it’s natural for engineers, admins, and other in-the-trenches workers to wonder if it’s worthwhile to tackle the learning curve associated with a given cloud computing product.
Whether you are a front-line sysadmin or a CTO, it’s important to understand how cloud computing is different when it comes to common, everyday tasks. Boomcycle has been in the cloud computing software business for years and would like to help you understand some of the tradeoffs involved and help you acclimate to the different procedures.
Cloud Computing – The Advantages
Perhaps the most brilliant concept of cloud computing is that a “server” is a virtual abstraction rather than an actual metal box of hardware. This abstraction of computing resources, coupled with an API and user-friendly consoles, streamlines the requisitioning process enormously. Anyone who has requisitioned a real dedicated server has probably spent a fair amount of time on the phone with the sales department determining the CPU, RAM, disk drives, and other parameters of the machine. Then comes a fair amount of time waiting for a tech to march into the data center, assemble the beast, bring it online, assign an IP address, and deliver some sort of credentials so that the real work can begin. Once a dedicated hardware box is up, you might have to run package updates, install the appropriate software, and configure the machine. If you fire up another box, you typically have to repeat the same updates and software installation on that machine too.
Acquiring new machines in a cloud computing scenario (with a provider such as Amazon or Rackspace) is quite different. The “server” or “compute instance” that you requisition is really just a time slice on hardware that is already running. Your cloud provider has set up some seriously beefy hardware that runs a hypervisor like Xen or KVM, and they have created tools that let you easily create a virtual machine running the operating system of your choice. Rather than speaking to a sales team, you can use your account credentials to summon up a brand new virtual machine in a couple of minutes via a browser-based web console or by calling an API from a script. This virtual machine has a unique address you can use to access it and behaves almost exactly like dedicated hardware.
This machine virtualization first reveals its brilliance when you must create a backup or when you wish to bring more servers online, such as a development server identical to your latest production server. With a few clicks in your web browser, you can take a snapshot of a running machine, allocate additional storage, and make an exact copy of the machine with a different IP address. The ability to create a machine image from a running machine makes it fast and trivially easy to bring new machines online without the need for extensive configuration or software updates. The ability to allocate computing power via API means that you can automate the process in response to fluctuations in traffic.
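To make the idea of automated allocation concrete, here is a minimal sketch of the decision logic such a script might contain. It is not any provider’s API: the function names, the per-instance capacity figure, and the instance limits are all assumptions made up for illustration, and the actual launch or terminate call would go through whatever API your cloud provider exposes.

```python
import math


def desired_instance_count(requests_per_second, capacity_per_instance,
                           min_instances=1, max_instances=20):
    """Return how many instances are needed for the current load.

    capacity_per_instance is a hypothetical figure you would measure
    yourself: the requests per second one instance handles comfortably.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Keep at least min_instances running and never exceed the budget cap.
    return max(min_instances, min(max_instances, needed))


def scaling_action(current, desired):
    """Translate the gap between current and desired counts into an action."""
    if desired > current:
        return ("launch", desired - current)
    if desired < current:
        return ("terminate", current - desired)
    return ("none", 0)


if __name__ == "__main__":
    desired = desired_instance_count(requests_per_second=450,
                                     capacity_per_instance=100)
    print(scaling_action(current=2, desired=desired))
```

A script like this would typically run on a schedule, read the current traffic from your monitoring system, and pass the resulting action to the provider’s API. The upper limit matters: without it, a traffic spike (or a bug) could run up a large bill.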
The cloud’s virtualization continues to prove itself when the time comes to scale.
For a single compute instance, it is trivially easy to “resize” your virtual machine — i.e., allocate more RAM and CPU power. Gone are the days when a tech must open your box and put in RAM or add hard drives. And when one machine is no longer enough, the easy requisitioning features mentioned above and a variety of other cloud server offerings — like load balancers and managed database services — make it easy to move to a clustered architecture.
One Boomcycle client, after a catastrophic data disaster on their old-school managed, dedicated server, waited 24 hours before their hosting company could get a new machine online that matched the original server. In the gut check that followed, we determined that a multi-node architecture in the cloud would improve availability and data recovery and also better handle future growth. In such an architecture, a managed database service provides redundant storage managed by the cloud provider. Optionally, a virtual machine in the cloud can serve as a hot spare by replicating the live database.
Multiple virtual machines can be brought online to respond to periodic data crunching needs and then de-allocated to save money when the work is complete.
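The savings from de-allocating machines are easy to estimate with a little arithmetic. The sketch below compares keeping a group of machines running around the clock with running them only while the work lasts. The hourly rate, machine count, and hours are invented numbers for illustration, not any provider’s pricing, and amounts are kept in whole cents to avoid rounding surprises.

```python
def monthly_cost_cents(instances, hours_per_day, hourly_rate_cents, days=30):
    """Cost in cents of running `instances` machines for `hours_per_day`."""
    return instances * hours_per_day * days * hourly_rate_cents


if __name__ == "__main__":
    # Assumed figures: 10 machines at 10 cents per machine-hour.
    always_on = monthly_cost_cents(10, 24, 10)   # running all day, every day
    on_demand = monthly_cost_cents(10, 2, 10)    # a 2-hour nightly crunch
    print(f"always on: ${always_on / 100:.2f}")
    print(f"on demand: ${on_demand / 100:.2f}")
    print(f"saved:     ${(always_on - on_demand) / 100:.2f}")
```

Under these assumed numbers, the on-demand approach costs one twelfth as much, simply because the machines are billed for 2 hours a day rather than 24.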
The virtualization of computing resources offered by the cloud yields benefits for the entire life cycle of your web service. From the very beginning it makes it extremely inexpensive and easy to get a simple website up and running immediately. Cloud-allocated computing resources are more adaptable and eliminate traditional barriers to architectural reconfiguration during development. Backups and versioning are a snap.
And, when it comes time to go live — or scale up in response to overwhelming demand — you can summon essentially unlimited computing resources to do your bidding.
Cloud Computing – The Disadvantages
The cloud, for all its advantages, is not without rough edges.
Perhaps the most conspicuous of these are the highly publicized outages experienced by Amazon in April 2011 and October 2012. Cloud providers have colossal, enormously complex systems involving potentially hundreds of thousands of computers. From Boomcycle’s perspective as consumers of cloud services, these systems are largely a black box, the true inner workings of which are not known to the outside world. When outages occur, the nebulous nature of the cloud makes it hard to understand where the problem lies and this uncertainty can be quite unsettling. It can be difficult to choose an appropriate course of action when you have no information about the nature of an outage.
Cloud computing also leaves a lot to be desired when it comes to the accountability of the service provider. When a traditional dedicated server has a problem, the hosting company typically must fix it. If the dedicated server is a managed server, there are augmented levels of service that extend to backups, security patches, and other software-related duties. The service paradigms associated with traditional dedicated servers follow fairly well-worn patterns, and the realms of responsibility are pretty clearly defined for provider and customer. Because cloud services are relatively new, the delineation of such duties is less clear. It’s typical for a customer to shoulder all responsibility for managing their virtual machines and backups. Helpful, prompt customer service and quality technical documentation can be hard to come by. When there is an outage and important data is lost, it is unlikely that you will get an apologetic call or any compensation from your cloud service provider. There is some variation among cloud providers in this respect. With Amazon, it’s usually sink-or-swim, so you should be prepared to bring a trusted sysadmin. Rackspace offers a managed Cloud Server product and their tech support is friendly and helpful.
Email can also be a challenge. The address space allocated to cloud-based servers is almost invariably on a spam block list. Obviously, this must be the case or every spam lord from here to Cyprus would be firing up virtual machines and spamming the world. Getting your IP address removed from a spam block list can be a hopeless endeavor. Cloud providers do offer mail delivery services: Amazon has SES and Rackspace has a partnership with SendGrid. It’s helpful to have a postfix jockey or mail-savvy sysadmin on your team.
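If you want to check whether a new machine’s address is already listed, DNS-based block lists are conventionally queried by reversing the octets of the IPv4 address, appending the list’s DNS zone, and looking the resulting name up: an answer means the address is listed, and a “no such name” response means it is not. The sketch below shows that convention. The zone dnsbl.example.org is a placeholder, not a real list; substitute the zone of a list you are permitted to query, and note that some lists restrict or rate-limit lookups.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a block list."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the lookup resolves, i.e. the address appears listed.

    Caveat: any resolution failure, including a network problem, is
    reported as "not listed" here; a production check would tell the
    two cases apart.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Running this check right after a machine comes up tells you early whether you will need a mail delivery service before your first message bounces.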
Security is also a concern. Similar to how a shared host runs many websites, cloud computing systems run multiple virtual machines on the same hardware. The lengths to which cloud providers go to ensure isolation between different virtual machines vary by provider, and detailed information is not always easy to find. Amazon in particular has in the past (http://aws.amazon.com/articles/1697) offered significant detail about their security measures and also claims a variety of certifications (http://aws.amazon.com/security/) for their EC2 instances. Rackspace, on the other hand, discourages the use of their Cloud Servers or shared hosting for applications that require PCI compliance. While their product lineup continues to evolve, as of this writing they recommend a traditional dedicated server if PCI compliance is a requirement. While cloud products can certainly provide sufficient security for nearly any web application you can imagine, we strongly recommend you take the time to assess a cloud product’s security practices before jumping in, as practices and certifications can vary widely from provider to provider.
Is the Future Cloudy?
Overall, Boomcycle firmly believes in cloud computing. We love the way it allows us to efficiently and dynamically allocate computing power to our clients’ tasks. We love the way that IT masters like Amazon work tirelessly to keep their machines running. We love the way it helps us reconfigure a server architecture. The Cloud is still a bit of a frontier, however, and it pays to have an experienced guide to help you find the right path.