
Secure Workload Migration Within and Between Data Centers

This was a very busy week for me at EMC.  It marked the announcement of a solution we put together with a lot of hard work by people from EMC, RSA, Intel, and HyTrust.  This was the first project I was handed after coming on board with EMC and I’ll have to admit that I’m extremely satisfied with the results and with the overall success the project has seen.

The solution was announced this week at Intel IDF 2012 in San Francisco.  We started off the week giving an overview at the Solution Provider Summit for the Open Data Center Alliance (ODCA).  We then got air time during Renee James’ keynote on Wednesday, and I gave a technical presentation that day.  We also got the press releases out, shot about two hours of video footage with the Intel film crew that will be posted later this year to Intel’s Cloud Builders site, and spent a collective 12 hours on the demo floor giving live demos of what we had done.  Needless to say, I am ready to get on my plane tomorrow morning and come home to North Carolina!

If you want the general overview for the solution and don’t care too much about the details, you can click here.  If you care about the details and want to learn more, continue reading.

First, let’s talk about the challenges we are trying to address.  In this solution, we’re trying to address two major issues that service providers face.  While these are targeted at service providers, it’s important to realize that any enterprise building out their own private cloud will face similar challenges.

The first is workload migration, or moving your VMs from one data center to another to avoid a disaster, recover from one, perform maintenance, or carry out any other activity that requires you to move your workloads.  For example, let’s say you have a data center on the east coast and a hurricane is approaching.  You may want to plan for the worst and move your VMs from that data center to a data center outside the path of the hurricane.  The challenge here is how to do this without causing downtime to your applications and users, thus avoiding potential lost revenue or productivity.

The second issue we’ve helped address here is guaranteeing the integrity of the underlying hosts that your workloads are running on.  As enterprises move their workloads to cloud providers or build their own, they are asking questions like:

  • How will the cloud infrastructure be verified?
  • How do I know the hosts that my VMs are running on are secure?
  • Will I be able to satisfy my audit and compliance requirements in this environment?  Ultimately an enterprise owns their audit process but when parts of their infrastructure or applications are managed by a service provider, that provider has to be a partner in the audit process.

Look at it this way.  If a single hypervisor is compromised using something like a BIOS rootkit, the attacker could compromise dozens of systems in an enterprise cloud or dozens of customers in a multi-tenant cloud provider environment.  Attacks at this level are designed to evade typical runtime security software such as anti-virus.

Based on those challenges, we wanted to design a joint solution to help address the needs.  First, we wanted to show how EMC VPLEX could be used to enable non-disruptive workload migration within and between data centers.  Second, we wanted to show how Intel Trusted Execution Technology, or TXT, could be used to help add security to a service provider cloud environment by allowing security policy enforcement based on TXT trust status.  Finally, we wanted to show how the TXT control procedures could be used in overall compliance reporting, providing an end to end view of the trust status for the hosts running your cloud environments.

So here is an overview of the solution we built.  We are representing a single service provider who has a cloud infrastructure stretched across two data centers.  We run ESXi 5.1 on all but one host; we deliberately left that one host out of compliance, as we’ll discuss later.  We have Intel TXT enabled on the hosts.  We’re using EMC VNX storage arrays and EMC VPLEX Metro, which is a storage virtualization appliance that we’ll also discuss later.  We are using HyTrust Appliance for the active security policy enforcement and we are pulling both security logs and TXT host trust status into RSA Archer eGRC to provide overall compliance reporting.

Let me start off by giving a brief overview of EMC VPLEX.  The bottom line with VPLEX is that it is an in-band storage virtualization device that sits between your hosts and your storage and gives us the ability to export virtual volumes from the underlying storage arrays simultaneously.  To my hosts, VPLEX appears as a target and to my storage targets, it appears as a host.

At the base you have the physical storage layer.  Next is the virtual storage layer, where VPLEX supports heterogeneous storage arrays and can create virtual volumes across these different arrays.  You then have the physical host layer with VMs on top of that.

The really interesting bits of this solution come up when we introduce the second site.  VPLEX’s AccessAnywhere technology allows you to export a single virtual volume from the VPLEX clusters at both sites simultaneously, which gives me a copy of my data at both locations.  From the perspective of workload mobility, my data is already at both sites.  So when it comes time to move VMs from one site to another, all I have to do is a simple vMotion.  I no longer have to worry about Storage vMotion or replicating large amounts of data from one site to another because my data is already there.  This eliminates the time needed for your data to be moved from one site to another, and there are a ton of interesting solutions that can be built on top of this technology.
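
To get a feel for the time that bulk data movement would otherwise take, here is a rough back-of-the-envelope sketch in Python.  The 2 TB data set and the 1 Gbps inter-site link are illustrative assumptions, not figures from this project, and the `transfer_hours` helper is hypothetical.  It assumes an ideal link with no protocol overhead and decimal units.

```python
def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Ideal time in hours to copy size_tb terabytes over a link_gbps link.

    Assumes decimal units (1 TB = 8e12 bits, 1 Gbps = 1e9 bits/s) and
    ignores protocol overhead, contention, and latency.
    """
    bits = size_tb * 8e12
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


# Illustrative assumption: 2 TB of VM data over a 1 Gbps inter-site link.
hours = transfer_hours(2, 1)
print(f"Copying the data first would take roughly {hours:.1f} hours")
```

With the volume already present at both sites, that copy step drops out and only the vMotion of the running VM remains.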

So we now have our baseline infrastructure.  We have our two data centers with storage, network, and compute and we are using storage virtualization to enable our non-disruptive migration between sites.  Now let’s focus on the security aspect.

For those who are not familiar with Intel TXT, let me give a brief overview of it for reference.  TXT is a hardware-based security technology that is built into current Intel chipsets.  The bottom line is that it allows me to specify a known good configuration for the hosts in my cloud environment, and then measure every host in the environment against that known good configuration each time a system is booted.  During that process, parts of the BIOS and hypervisor are measured and if they match the known good values, that host is given a label of “trusted.”  If the values don’t match, that means something has changed that should not have changed and a “not trusted” label is applied to the host.  We can then take that trust status and bring it into our security applications.
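
The measure-and-compare idea can be sketched in a few lines of Python.  This is only an illustration of the concept, not Intel’s attestation software: the `KNOWN_GOOD` white list, the component names `bios` and `vmm`, and the `measure` and `trust_label` functions are all hypothetical, and a SHA-256 digest of a byte string stands in for the hardware measurements that TXT takes at boot.

```python
import hashlib


def measure(component_image: bytes) -> str:
    """Stand-in for a boot-time measurement: SHA-256 digest of the component."""
    return hashlib.sha256(component_image).hexdigest()


# Hypothetical known good configuration (the "white list").
KNOWN_GOOD = {
    "bios": measure(b"bios-build-A"),
    "vmm": measure(b"ESXi 5.1"),
}


def trust_label(measurements: dict) -> str:
    """Return "trusted" only if every white-listed component matches exactly."""
    for component, expected in KNOWN_GOOD.items():
        if measurements.get(component) != expected:
            return "not trusted"
    return "trusted"


# One host that matches the white list and one with a different hypervisor.
good_host = {"bios": measure(b"bios-build-A"), "vmm": measure(b"ESXi 5.1")}
old_host = {"bios": measure(b"bios-build-A"), "vmm": measure(b"ESXi 5.0")}

print(trust_label(good_host))
print(trust_label(old_host))
```

The point of the sketch is that the label is all or nothing: any single mismatch, or a missing measurement, is enough to mark the host as not trusted.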

The first thing we are going to do with our TXT trust status is bring it into RSA’s Archer eGRC platform and specifically RSA’s Solution for Cloud Security and Compliance.  This solution is based on the Archer eGRC platform.  As of the current release, over 130 VMware-specific controls have been added to Archer to enable VMware security policy implementation and management tied directly to regulations, such as PCI and HIPAA.  This RSA solution does two things.  It discovers new virtual infrastructure devices and it interrogates those devices against the control procedures to verify VMware security controls have been implemented correctly. The results of these automated discovery and configuration checks are fed directly into Archer for continuous monitoring across the cloud environment.  For this solution, we have now brought in Intel TXT related control procedures on top of the existing controls for cloud environments.  This allows us to gain a high level view of our overall hardware compliance, in addition to all the benefits we’ve previously had with our GRC system.

In addition to simply using that trust status for overall reporting of our cloud infrastructure integrity, we can also bring the TXT trust status into HyTrust Appliance (HTA).  HTA sits between the administrators and vCenter and gives me the ability to create administrative policies based on that trust status.  In our solution, we have set up policies that prevent an admin from moving a virtual machine, or workload, from a trusted host to an untrusted host.  If that is attempted, HTA will block it and generate real-time security events that can also be fed into RSA Archer’s Incident Management view so that actions can be taken to mitigate that risk.
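
As a rough sketch of the kind of rule described above, the Python below denies a move from a trusted host to an untrusted one and records an event for downstream reporting.  It does not use or represent the HyTrust Appliance API; the `Host` class, the `MigrationBlocked` exception, the `check_migration` function, and the event fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    trusted: bool


class MigrationBlocked(Exception):
    """Raised when the policy denies a requested migration."""


def check_migration(vm: str, source: Host, target: Host, events: list) -> None:
    """Deny a move from a trusted host to an untrusted one and log an event."""
    if source.trusted and not target.trusted:
        events.append({"vm": vm, "from": source.name, "to": target.name,
                       "action": "blocked"})
        raise MigrationBlocked(f"{vm}: target {target.name} is not trusted")


events = []
host_a = Host("host-a", trusted=True)
host_b = Host("host-b", trusted=True)
host_d = Host("host-d", trusted=False)

check_migration("vm-01", host_a, host_b, events)      # allowed, no event
try:
    check_migration("vm-01", host_a, host_d, events)  # denied by policy
except MigrationBlocked as err:
    print("Blocked:", err)

print(len(events), "security event(s) recorded")
```

The `events` list plays the role of the security event feed: in this sketch it is just an in-memory list, but it marks the place where a blocked action would be handed off to an incident management system.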

So now we have the complete solution.  We have our data centers with EMC storage.  We are using EMC VPLEX to export distributed virtual volumes from those data centers to present to our hosts so that we can enable non-disruptive workload migration within and between the data centers.  We have enabled Intel TXT on all of our hosts, we have created our white list server, and we are measuring each cloud host against that known good configuration.  We are then taking that TXT trust status and creating policies that prevent our workloads from being moved to a host that is not trusted.  And finally, we have wrapped everything into RSA Archer for both high-level compliance views and reporting as well as real-time incident event management.

Now let’s take a walk through some screenshots so you can get an idea of what this looks like in the real world.  This first screenshot is the trust attestation server, or verification server.  This is what polls vCenter for the hosts and stores the overall trust status of the hosts.  This is an application developed by Intel so companies can take advantage of TXT on their server platforms.  Notice that three of the four hosts in the demo environment have an overall trust status of green.  One host is untrusted, and that’s because its VMM status is negative.  On this host, we installed ESXi 5.0, which does not match our white list server running ESXi 5.1.

This next shot is a view of the Cloud Security and Compliance view in RSA Archer.  As you can see from the graph in the upper left corner, our overall compliance rating is 75% which represents 3 out of our 4 hosts with a label of “trusted.”
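
The 75% figure is simply the share of hosts carrying the “trusted” label.  A minimal sketch of that arithmetic follows; the host names are hypothetical, and this is not how Archer itself computes or stores the rating.

```python
# Trust labels for the four demo hosts (host names are hypothetical).
host_status = {
    "host-1": "trusted",
    "host-2": "trusted",
    "host-3": "trusted",
    "host-4": "not trusted",
}

trusted = sum(1 for label in host_status.values() if label == "trusted")
compliance_pct = 100 * trusted / len(host_status)

print(f"{trusted} of {len(host_status)} hosts trusted: {compliance_pct:.0f}%")
```

This prints `3 of 4 hosts trusted: 75%`.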

Next, I’ll show a shot of vCenter after we have attempted to migrate a VM from one of our trusted hosts to that untrusted host.  When I attempt to do that, HyTrust Appliance blocks the migration and I get an error in vCenter.

We can then view the specifics of that log entry by going to the HyTrust tab within vCenter.  From here I can perform all administrative functions for the HyTrust Appliance and view my logs for the enforcement actions.

Finally, I can show those logs being brought into RSA Archer for my real time security event management.  Once I have them coming in, I can trigger other actions to help mitigate my risk.

So that’s it end to end.  Overall it was a great project and we are targeting Q1 of 2013 for the general availability of all the parts.  As part of the release, we are planning to produce an EMC Proven Solution Guide around the solution as well as an Intel Cloud Builders document.  We will also have a complete video demo of the solution available at that time.

