Last month I was invited to give a couple of talks about Cloud computing at the wonderful C3RS (Cisco Cloud Computing Research Symposium). The slides are available online, if you want to check them out. Although the audiences were quite heterogeneous, one question kept recurring among the participants of these events: how can I set up my private cloud? Let me briefly summarize the motivation of the people asking this:
I told these people to take a look at OpenNebula. OpenNebula is a distributed virtual machine manager that allows you to virtualize your infrastructure. It also offers integrated management of your virtual services, including networking and image management. Additionally, it ships with EC2 plug-ins that allow you to deploy virtual machines simultaneously in your local infrastructure and in Amazon EC2.
OpenNebula is modular by design to allow its integration with other tools, such as the Haizea lease manager, or Nimbus, which gives you an EC2-compatible interface in case you need one. It is healthy open source software that is being improved in several projects, such as RESERVOIR, and it has a growing community.
When I consider my microwave, telephone, or television I see fairly sophisticated applications that I simply plug into service providers and get useful results. If I choose to switch between individual service providers I can do so easily (assuming certain levels of deregulation of utility monopolies of course). Most importantly, while I understand how these appliances work, I would never want to build one myself. Yet I am not required to do so because the providers use standardized interfaces that appliance manufacturers can easily offer: I buy my appliances as I might any other tool. Consequently, I can switch out the manufacturer or models for each of the services I use without interacting with the provider. I use these tools in a way that makes my work and life more efficient.
Nobody listens in on my conversations, nor do they receive services at my expense. I can use these services how I wish and, because of competition, I can expect an outstanding quality of service. At the end of the month, I get a bill from my providers for the services I used. These monetary costs are far outweighed by the convenience these services offer.
It is this sort of operational simplicity that motivated the first call for computational power as a utility in 1965. Like the electrical grid, a consumer would simply plug in their favorite application and use the compute power offered by a provider. Beginning in the 1990s, this effort centered around the concept of Grid computing.
Just like the early days of electricity services, there were many issues with providing Grid computing. The very first offerings were proprietary or narrowly focused. The parallels with the electric industry are easily recognized. Some providers might offer street lighting, whereas others would provide power for home lighting, still others for transportation, and yet another group for industrial applications. Moreover, each provider used different interfaces to deliver the power. Thus switching between providers, not a rare occurrence in a volatile industry, was no small undertaking. This was clearly very costly for the consumer.
It took an entrepreneur to come to the industry and unify electrical services for all applications while also creating a standardized product (see http://www.eei.org/industry_issues/industry_overview_and_statistics/history for a quick overview). Similarly several visionaries had to step in and define what a Grid computer needed to do in order to create a widely consumable product. While these goals were largely met and several offerings became very successful, Grid computing never really became the firmly rooted utility-like service that we hoped for. Rather, it seems to have become an offering for specialized high-performance computing users.
This market is not the realm of service that I started thinking about early in this post. Take television service: this level of service is neither for a single viewer nor for a small business that might want to repackage a set of programs for its customers (say a sports bar). Rather it is for large-scale industries whose service requirements are unimaginable by all but a few people. I cannot even draw a parallel to television service. In telecommunication it would be the realm of a CLEC.
Furthermore, unlike my microwave, I am expected to customize my application to work well on a grid. I cannot simply plug it in and get better service than I can from my own PC. It would be the equivalent of choosing to reheat my food on my stove or building my own microwave. You see, my microwave, television service, and phone services are not just basic offerings of food preparation, entertainment, and communication. Instead, these are sophisticated systems that make my work and life easier. Grid computing, while very useful, does not simplify program implementation.
So in steps cloud computing: an emerging technology that seems to have significant overlap with grid computing while also providing simplifying services (something as a service). I may still have to assemble a microwave from pre-built pieces but everything is ready for me to use. I only have to add my personal touches to assemble a meal. It really isn't relevant whether the microwave is central to the task or just one piece of many.
When I approach a task that I hope to solve using a program, how might I plug that in just as easily? Let's quickly consider how services are provided for television. When I plug my application (TV) into the electricity provider as well as a broadcaster of some sort, it just works. I can change the channel to the streams that I like. I can buy packages that provide me the best set of streams. In addition, some providers will offer me on-demand programming as well as internet and telephone services. If anything breaks, I call a number and they deal with it. None of this requires anything of me. I pay my bill and I get services.
Okay, how would that work for a computation? Say I want to find the inverse for a matrix. I would send out my data to the channel that inverted matrices the way I like them. The provider will worry about attaining the advertised performance, reliability, scalability, security, sustainability, device/location independence, tenancy, and capital expenditure: those characteristics of the cloud that I could not care less about. Additionally, the cloud properties that Rich Wellner assembled don't interest me much either. Certainly they may be differentiators, but the actual implementation is somebody else's problem in the same way that continuous electrical service provision is not my chief concern when I turn on the TV. What I want and will get is an inverse to the matrix I submitted in the time frame I requested deposited where I requested it to be put. I may use the inverted matrix to simultaneously solve for earthquake locations and earth properties or for material stresses and strains in a two-dimensional plate. That is my recipe and my problem.
After all, I should get services "without knowledge of, expertise with, or control over the technology infrastructure that supports them," as the cloud computing wiki page claims. Essentially the aforementioned cloud characteristics are directed towards service providers rather than towards the non-expert consumer that the wiki definition highlights. Isn't the differentiator between the Cloud and the Grid the concealment of the complex infrastructure underneath? If the non-expert consumer is expected to worry about algorithm scalability, distributing data, starting and stopping resources and all of that, they certainly will need to gain some expertise quickly. Further, once they have that skill, why wouldn't they just use a mature Grid offering rather than deal with the non-standardized and chaotic clouds? Are these provider-specific characteristics not just a total rebranding of Grid?
As such, I suggest that several consumer-based characteristics should replace the rather inconsequential provider-internal ones that currently exist.
A cloud is characterized by services that:
Now that sounds more like Computation-as-a-Service.
There is a controversy in the cloud community today about whether the market is going to be one based on value or price. Rephrased, will cloud computing be a commodity or an enablement technology?
A poster on one of the cloud computing lists asserted that electricity would be a key component of pricing. He was then jumped on by people saying that value would be the key.
It seems like folks are talking past one another.
His assertion is true if CC is a commodity.
Now that said, there are precious few commodities in IT. Maybe internet connectivity is one. Monitors might be another. Maybe there are a few more.
But very quickly you get past swappable components that do very nearly the same job and into the realm of 'stuff' that is not easily replaceable. Then the discussion turns to one of value.
Amazon recognized the commodity of books and won the war over people who were trying to sell value. They appear to be attempting to do the same with computer time, which makes the battle they will fight over the next few years with Microsoft (and the increasing number of smaller players) extra interesting.
There is also the problem of making sweeping statements like "the market will figure things out". There is no "the market". Even on Wall Street. The reason things happen is because different people and institutions have different investment goals. Those goals vary over time and create growing or shrinking windows of opportunity for other people and institutions.
I've made my bet on how "the market" for cloud computing will shake out in the short to medium term. Now I'm just hoping that there are enough of the people and institutions my bet is predicated on in existence.
Hype aside, clouds (i.e. a service for the on-demand provision of virtual machines; others would say IaaS) are making utility computing a reality; see, for example, the Amazon EC2 case studies. This new model, and virtualization technologies in general, are also being actively explored by the scientific community. There are quite a few initiatives that integrate virtualization with a range of computing platforms, from clusters to Grid infrastructures. Once this integration is achieved, the next step is natural: jump to the clouds and provision the VMs from an external site. For example, recent work from UNIVA UD has demonstrated the feasibility of supplementing a UNIVA Express cluster with EC2 resources (you can download the whitepaper to learn more).
This cloud provision model can be further integrated with the in-house physical infrastructure when it is combined with a virtual machine (VM) management system, like OpenNebula. A VM manager is responsible for the efficient management of the virtual infrastructure as a whole, by providing basic functionality for the deployment, control and monitoring of VMs on a distributed pool of resources. The use of this new virtualization layer decouples the computing cluster from the physical infrastructure, and so extends the classical benefits of VMs to the cluster level (i.e. cluster consolidation, cluster isolation, cluster partitioning and elastic cluster capacity).
Architecture of an Elastic Cluster
A computing cluster can be easily virtualized by putting the front-end and worker nodes into VMs. In our case, the virtual cluster front-end (SGE master host) is deployed on the local resources with Internet connectivity so that it can communicate with the Amazon EC2 VMs. This cluster front-end also acts as the NFS and NIS server for every worker node in the virtual cluster.
The virtual worker nodes communicate with the front-end through a private local area network. The local worker nodes are connected to this vLAN through a virtual bridge configured in every physical host. The EC2 worker nodes are connected to the vLAN with an OpenVPN tunnel, which is established between each remote node (OpenVPN clients) and the cluster front-end (OpenVPN server). With this configuration, every worker node (either local or remote) can communicate with the front-end and can use the common network services transparently. The architecture of the cluster is shown in the following figure:
Figure courtesy of Prof. Rafael Moreno
Deploying a SGE cluster with OpenNebula and Amazon EC2
The latest release of OpenNebula includes a driver to deploy VMs in the EC2 cloud, and so it integrates the Amazon infrastructure with your local resources. EC2 is managed by OpenNebula just like any other local resource, with a configurable fixed size that limits the cluster capacity (i.e. SGE worker nodes) that can be allocated in the cloud. In this set-up, your local resources would look as follows:
The last line corresponds to EC2, currently configured to host up to 5 m1.small instances.
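The effect of capping EC2 like any other host can be pictured with a minimal Python sketch. This is not OpenNebula's scheduler or driver code: the `Host` class, the `place` function, the host names and the local capacities are all hypothetical, and only the EC2 limit of 5 instances comes from the set-up described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Host:
    """One entry in the resource pool; capacity is the maximum number of VMs."""
    name: str
    capacity: int
    running: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.running) < self.capacity


def place(vm_name: str, hosts: List[Host]) -> Optional[str]:
    """Put the VM on the first host with free capacity and return that host's name.

    Hosts are tried in list order, so listing EC2 last makes it the overflow
    target. Returns None when every host, including capped EC2, is full.
    """
    for host in hosts:
        if host.has_room():
            host.running.append(vm_name)
            return host.name
    return None


# Hypothetical pool: two local hosts (capacities assumed) plus EC2 capped at 5.
hosts = [Host("local01", 2), Host("local02", 2), Host("ec2", 5)]
placements = [place(f"sge_workernode_{i}", hosts) for i in range(10)]
```

With these assumed capacities, the first four worker nodes land on the local hosts, the next five overflow to EC2, and the tenth request is refused because the cap has been reached.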
The OpenNebula EC2 driver translates a general VM deployment file into an EC2 instance description. The driver assumes that a suitable Amazon machine image (AMI) has been previously packed and registered in the S3 storage service. So when a given VM is to be deployed in EC2, its AMI counterpart is instantiated. A typical SGE worker node VM template would look like this:
NAME = sge_workernode
Once deployed, the cluster would look like this (SGE master, 2 local worker nodes and 2 EC2 worker nodes):
>onevm list
You can get additional info from your EC2 VMs, like the IP, using the onevm show command.
So, it is easy to manage your virtual cluster with OpenNebula and EC2, but what about efficiency? Besides the inherent overhead induced by virtualization (around 10% for processing), the average deployment time of a remote EC2 worker node is 23.6s while a local one takes only 3.3s. Moreover, when executing an HTC workload, the overhead induced by using EC2 (VPN and a slower network connection) can be neglected.
This is joint work with Rafael Moreno and Ignacio M. Llorente.
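As a back-of-the-envelope check on the deployment figures, the following Python sketch compares the two averages and estimates how much of a worker node's lifetime is spent deploying. The one-hour busy period is an assumption chosen for illustration, not a measurement from this work, and the function name is hypothetical.

```python
LOCAL_DEPLOY_S = 3.3   # average local deployment time quoted in the text
EC2_DEPLOY_S = 23.6    # average EC2 deployment time quoted in the text
BUSY_S = 3600.0        # assumed: the worker then runs jobs for one hour


def deploy_overhead_fraction(deploy_s: float, busy_s: float) -> float:
    """Fraction of a worker node's lifetime spent deploying rather than working."""
    return deploy_s / (deploy_s + busy_s)


# How many times slower an EC2 deployment is than a local one.
ratio = EC2_DEPLOY_S / LOCAL_DEPLOY_S

ec2_overhead = deploy_overhead_fraction(EC2_DEPLOY_S, BUSY_S)
local_overhead = deploy_overhead_fraction(LOCAL_DEPLOY_S, BUSY_S)

print(f"EC2 deployment is about {ratio:.1f}x slower than local")
print(f"Deployment share of lifetime: EC2 {ec2_overhead:.2%}, local {local_overhead:.2%}")
```

Under that assumption, an EC2 deployment is roughly seven times slower than a local one, yet it still accounts for well under 1% of the node's lifetime, so the gap matters mostly for short-lived workers.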
I've written here about the importance of SLAs for useful cloud computing platforms on a few occasions in the past. The idea behind clouds, that you can get access to resources on demand, is an appealing one. However, it is only part of the total picture. Without an ability to state what you want and go to bed, there isn't much value in the cloud.
Think about that for a minute. With the cloud computing offerings currently available there are no meaningful SLAs written down anywhere. Yet people, every day, run their production applications on an implicit SLA that is internalized as something like "Amazon is going to give me N units of work for M price".
There are two problems with this.
The idea here is that you should not simply accept the bill your cloud provider sends you at the end of the month: the world of cloud computing is complex enough that a reasonable set of runtime information must be made available to substantiate the provider's claim for compensation.
This is particularly true in the world of SLAs. If my infrastructure is regularly scaling up, out, down or in to meet demands it is essential to be able to verify that the infrastructure is reacting the way that was contracted. Without that, it will be very hard to get people to trust the cloud.
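To make the verification idea concrete, here is a minimal Python sketch of the kind of check a customer could run if the provider exposed scaling events. Everything in it is hypothetical: the `ScalingEvent` record, the `sla_violations` function, the sample timestamps and the 60-second contracted reaction time are illustrative assumptions, not any provider's actual log format or SLA.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScalingEvent:
    requested_at: float   # seconds: when demand triggered a scaling action
    completed_at: float   # seconds: when the capacity change took effect

    @property
    def reaction_s(self) -> float:
        return self.completed_at - self.requested_at


def sla_violations(events: List[ScalingEvent], max_reaction_s: float) -> List[ScalingEvent]:
    """Return the events whose reaction time exceeded the contracted maximum."""
    return [e for e in events if e.reaction_s > max_reaction_s]


# Assumed sample log and an assumed contract of 60 seconds per scaling action.
events = [
    ScalingEvent(0.0, 30.0),
    ScalingEvent(100.0, 220.0),
    ScalingEvent(300.0, 345.0),
]
late = sla_violations(events, max_reaction_s=60.0)
```

In this made-up log only the second event, which took 120 seconds, breaks the assumed contract. The point is that a check like this is only possible when the provider publishes the runtime information in the first place.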
There is a growing number of posts and articles trying to show how cloud computing is a new paradigm that supersedes Grid computing by extending its functionality and simplifying its exploitation, some even announcing that Grid computing is dead. It seems that new technologies and paradigms always have the mission of replacing existing ones. Some of these contributions do not fully understand what grid computing is, focusing their comparative analysis on simplicity of interfaces, implementation details or basic computing aspects. Other posts define Cloud in the same terms as Grid or create a taxonomy which includes Grid and cluster computing technologies.
Grid is an interoperability technology, enabling the integration and management of services and resources in a distributed, heterogeneous environment. The technology provides support for the deployment of different kinds of infrastructures joining resources which belong to different administrative domains. In the special case of a Compute Grid infrastructure, such as EGEE or TeraGrid, Grid technology is used to federate computing resources spanning multiple sites for job execution and data processing. There are many success cases demonstrating that Grid technology provides the support required to fulfill the demands of several collaborative scientific and business processes.
On the other hand, I do not think there is a single definition of cloud computing, as it has multiple meanings for different communities (SaaS, PaaS, IaaS...). In my view, the only new feature offered by cloud systems is the provision of virtualized resources as a service, with virtualization as the enabling technology. In other words, the relevant contribution of cloud computing is the Infrastructure as a Service (IaaS) model. Virtualization, rather than other less significant issues such as the interfaces, is the key advance. At this point, I should remark that virtualization had been used by the Grid community before the arrival of the "Cloud".
Now that I have clearly stated my position on Cloud and Grid, let me show how I see Cloud (with virtualization as enabling technology) and Grid as complementary technologies that will coexist and cooperate at different levels of abstraction in future infrastructures.
There will be a Grid on top of the Cloud
Before explaining the role of cloud computing as a resource provider for Grid sites, we should understand the benefits of virtualizing the local infrastructure (Enterprise or Local Cloud?). How can I access a cloud provider on demand if I have not previously virtualized my local infrastructure?
Existing virtualization technologies allow a full separation of resource provisioning from service management. A new virtualization layer between the service and the infrastructure layers decouples a server not only from the underlying physical resource but also from its physical location, without requiring any modification within service layers from both the service administrator and the end-user perspectives. Such decoupling is the key to supporting the scale-out of an infrastructure in order to supplement local resources with cloud resources to satisfy peak or fluctuating demands.
Getting back to the Grid computing case, the virtualization of a Grid site provides several benefits, which overcome many of the technical barriers for Grid adoption:
If you are interested in more details about how virtualization and cloud computing can support compute Grid infrastructures you can have a look at my presentation "An Introduction to Virtualization and Cloud Technologies to Support Grid Computing" (EGEE08). I also recommend the report "An EGEE Comparative study: Clouds and grids - evolution or revolution?".
There exists technology that supports the above use case. The OpenNebula engine enables the dynamic deployment and re-allocation of virtual machines on a pool of physical resources, and supports on-demand access to Amazon EC2 resources. On the other hand, Globus Nimbus provides a free, open source infrastructure for remote deployment and management of virtual machines, allowing you to create compute clouds.
There will be a Grid under the Cloud
There is a growing interest in the federation of cloud sites. Cloud providers are opening new infrastructure centers at different geographical locations (see IBM or Amazon Availability Zones) and it is clear that no single facility/provider can create a seemingly infinite infrastructure capable of serving massive amounts of users at all times, from all locations. David Wheeler once said, "Any problem in computer science can be solved with another layer of indirection… But that usually will create another problem". In the same vein, the federation of cloud sites involves many technological and research challenges, but the good news is that some of them are not new and have already been studied and solved by the Grid community.
As stated above, Grid is not only about computing. Grid is a technology for federation. In recent years, there has been a huge investment in research and development of technological components for sharing resources across sites. Several middleware components for file transfer, SLA negotiation, QoS, accounting, monitoring... are available, most of them open source. As Ian Foster also predicted in his post "There's Grid in them thar Clouds", those are the components that could enable the federation of cloud sites. On the other hand, other components have to be defined and developed from scratch, mainly those related to the efficient management of virtual machines and services within and across administrative domains. That is exactly the aim of the Reservoir project, the European initiative in Cloud Computing.
Conclusions
To conclude this post, let me venture some predictions about the coexistence of Grid and Cloud computing in future infrastructures:
In summary, let's try to forget about the hype and concentrate on the complementary functionality provided by both paradigms. My message to the user community: the relevant issue is to evaluate which technology meets your requirements. It is unlikely that a single technology will meet all needs. My message to the Grid community: please do not see Cloud as a threat. Virtualization and Cloud are needed to overcome many of the technical barriers to wider Grid adoption. My message to the Cloud community: please try to take advantage of the research and development performed by the Grid community in the last decade.