Author: Industry Perspectives

  • Migrating to the Cloud: Top 3 Best Practices

    Jake Robinson is a Solutions Architect at Bluelock. He is a VCP, a former CISSP, and a VMware vExpert. Jake’s specialties are infrastructure automation, virtualization, cloud computing, and security.

    JAKE ROBINSON
    Bluelock

    Working at an Infrastructure-as-a-Service provider, I see a lot of IaaS application migration. Migration runs in every direction: from physical servers to the cloud, from private cloud to public cloud, and from public cloud back to private cloud.

    Though it occurs often, migration shouldn’t be rushed. A poor migration strategy can be responsible for costly time delays, data loss and other roadblocks on your way to successfully modernizing your infrastructure.

    Each scenario is different based on your application, where you’re starting from, and where you’re going.

    Best Practice: Pick Your Migration Strategy.

    • Option 1: Just data migration. This is typically the correct choice for Tier 1 and 2 applications. Even if you choose to migrate your VM or vApp, it is still going to be constantly changing during the move. If it’s a Tier 1 application you won’t be able to afford much downtime, so typically we’ll recommend invoking some sort of replication. Replication is a complex, detailed subject in itself, but the key to understanding it is to identify the size of the data, the rate of change, and the bandwidth between the source and target. As a general rule, if your rate of change is greater than or equal to your bandwidth, your migration will likely fail. The rate of change covers everything coming in to the app; the data gains gravity as changes accumulate. Bandwidth is like the escape velocity required to get off the ground, or migrate: you need enough bandwidth to “overtake” that rate of change (a back-of-envelope sketch of this rule follows this list).
    • Option 2: Machine replication. This is best for Tier 1 and 2 applications that can afford some downtime, and it involves migrating the whole stack. There is less configuring in this scenario, but more data to migrate. Option two is best if you’re moving to an internal private cloud, where you have plenty of bandwidth to replicate the entire stack. It’s also worth noting the portability of VMware-based technology: VMware lets you package the entire VM or vApp, the entire stack, into an OVF, which can then be transported anywhere if you’re already running on a virtualized physical server.
    • Option 3: P2V migration. You typically see this for Tier 2 and 3 apps that are not already virtualized. The concept involves taking a physical app and virtualizing it. VMware Converter does P2V, and it’s very easy to go from physical to a private cloud using it. P2V comes with an entirely different set of best practices, however, and you should do some extended research to make sure you have the latest updates, best practices, and suggestions. In option three there is no replication; however, those apps can be shipped off to a public cloud provider to run in the public cloud after being virtualized.
    • Option 4: Disaster Recovery. A final path some companies take is to treat the migration as a Disaster Recovery (DR) scenario: set up replication from one machine to another, replicate the entire stack from point A to point B, and then click the failover button.
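
    To make the rate-of-change rule concrete, here is a minimal back-of-envelope sketch in Python. The function name and the example numbers are hypothetical rather than drawn from any particular replication product; it simply treats the effective copy rate as bandwidth minus rate of change, which is the intuition behind the rule above.

        # Back-of-envelope replication feasibility check. Illustrative only;
        # the function and example numbers are hypothetical.
        def replication_time_hours(data_gb, change_rate_mbps, bandwidth_mbps):
            """Estimate hours for an initial sync to converge.

            Effective copy rate = bandwidth - rate of change. If the rate
            of change meets or exceeds bandwidth, the sync never catches up.
            """
            effective_mbps = bandwidth_mbps - change_rate_mbps
            if effective_mbps <= 0:
                return None  # migration will likely fail
            data_megabits = data_gb * 8 * 1000
            return data_megabits / effective_mbps / 3600

        # Example: 2 TB of data, 40 Mbps of incoming changes, 100 Mbps link.
        eta = replication_time_hours(2000, 40, 100)
        print(f"{eta:.1f} hours" if eta else "will not converge")  # ~74.1 hours

    In practice you would also budget for protocol overhead, compression, and change-rate spikes, which is why replication planning deserves careful measurement.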

    Now, let’s say you have identified the best vehicle and path to migrate your application. Before you actually get to work there is still quite a bit of information to evaluate and incorporate.

    Best Practice: Understand the Gravity of Your Data.

    When moving Tier 1 applications from a physical data center to a private or public cloud, we have to take data gravity into account, and the data itself will be the weightiest part.

    There’s no easy way to shrink the data down, so you need to evaluate the weight of the data in the app you’re considering migrating. If you’re a high-transaction company, or it’s a high-transaction application, there will be a lot of data to replicate. The data of the app constitutes 99 percent of the data gravity of the application.

    Another aspect to evaluate as part of your pre-migration plan is how connected your VM or vApp is to other apps. If you have a lot of applications tightly coupled to the application you want to migrate, the cloud might not be an option for that application, at least not for that application on its own.

    Best Practice: Identify How Your Apps Are Connected.

    Does your application have data that other applications need to access quickly? If so, an “all or nothing” philosophy of migration is your best option. If you have an application that is tightly coupled to two or three others, you may be able to move them all to the cloud together. Because they are still tightly coupled, you won’t experience the latency that would occur if your cloud-hosted application needed to access a physical server to get the data it needs to run.

    Going a step beyond identifying how many apps are tied to the application you wish to migrate, work next to identify which of those applications will be sensitive to latency problems. That sensitivity should factor into whether you migrate the app at all.

    To be able to check this best practice off your list, be very sure you understand everything your application touches so you won’t be surprised later, post-migration.

    Each application, and migration strategy, is unique, so there is no detailed instruction manual that works for everyone.

  • Keep Your Servers Cool, And Your Business Hot

    Gary Bunyan is Global DCIM Solutions Specialist at iTRACS Corporation, a Data Center Infrastructure Management (DCIM) company. This is the 10th in a series of columns by Gary about “the user experience.” See Gary’s previous columns on Turning DCIM’s Big Data into Actionable Insight and Unlock Your Capacity By Unplugging Your Ghost Servers.

    GARY BUNYAN
    iTRACS

    With more and more computing resources being deployed in denser and denser data center spaces, it’s no wonder data center professionals are focused on cooling. The data center is now at the epicenter of today’s hottest businesses, essential to their competitive positioning and market success. But no matter how dense the data center gets, its servers, blades, and other IT assets must be kept cool.

    Cooling is an expensive necessity in many data centers. Customers – like the one in the Middle East I was just visiting – are always looking to provide just the right amount of it.

    What’s the “right” amount of cooling? It’s the perfect balance between under- and over-cooling in a consistent flow across all of your assets. Not enough cooling and the servers are at risk. Too much cooling and you’re spending more on energy than you need. Uneven cooling and you get thermal inconsistencies – areas either too hot or too cold. The right amount of cooling – and you’re running the most efficient infrastructure possible, minimizing both your cooling costs and your risk.

    The trick is to use a DCIM solution that offers you live data about thermal conditions at the individual server level and sends you real-time alerts when a threshold is about to be breached. Instead of guessing about server temperatures, DCIM gives you actual inlet and outlet readings at the device level in your dashboards. Armed with this knowledge, you can run a “lean machine” that minimizes cooling while keeping your assets safe. This tightrope can only be walked safely if you have real-time information at your fingertips. Guesswork is way too dangerous.
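
    As a rough illustration of what such a device-level check looks like, here is a minimal sketch. The server names, readings, and thresholds are all hypothetical; in a real deployment the readings would stream from your DCIM data feeds (Intel DCM, RF Code, and the like) and the thresholds would come from your own operating envelope.

        # Minimal server-level thermal threshold check. All readings and
        # thresholds below are hypothetical examples.
        INLET_MAX_C = 27.0   # assumed safe inlet ceiling
        OUTLET_MAX_C = 45.0  # assumed safe outlet ceiling

        readings = [
            {"server": "rack07-u38", "inlet_c": 26.1, "outlet_c": 41.0},
            {"server": "rack07-u40", "inlet_c": 29.4, "outlet_c": 47.2},
        ]

        for r in readings:
            breaches = []
            if r["inlet_c"] >= INLET_MAX_C:
                breaches.append(f"inlet {r['inlet_c']} C")
            if r["outlet_c"] >= OUTLET_MAX_C:
                breaches.append(f"outlet {r['outlet_c']} C")
            if breaches:
                print(f"ALERT {r['server']}: " + ", ".join(breaches))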

    Beware of Thermal Hot Spots

    Using a DCIM tool with visualization lets you identify, manage and resolve potential hot spot issues on your floor before they turn into problems. The key is to be proactive, not reactive. Here’s how it works:

    Forensics: Identifying the Source of the Hot Spot

    (1) A thermal hot-spot alert goes off in your DCIM environment. If you’re using a visualization tool, you can instantly see the problem area highlighted in red. The alert is being fed by real-time data from any number of sources depending on your environment – Intel DCM, Power Assure, RF Code, or other data feeds. With a few clicks, you interrogate the alert and learn that rising temperatures in the top U positions of 3 racks are about to go critical.

    Real-time thermal readings indicate an issue.

    (2) You run forensics at the device level, looking at a live data feed of inlet and outlet temperatures from each affected server. You confirm that server inlet/outlet temperatures have exceeded thresholds.

    Inlet/outlet temps in the server pool have exceeded thresholds.

    (3) With a few clicks – still within the DCIM system – you review maintenance schedules and reports. You learn that a CRAC unit is still offline, past its scheduled repair window.

    (4) With a few more clicks, you interrogate the servers under threat and identify which business unit owns them – you confirm they are revenue-generating applications with direct impact on profitability. So you must take action immediately.

    Resolution: Migrating the Applications to a Safer Environment

    (5) Using DCIM’s predictive what-if scenarios, you quickly determine where you can move the applications running on the endangered servers – you need to migrate them to cooler IT assets with the appropriate power and connectivity (a simplified version of this placement filter is sketched after these steps).

    (6) You confirm your migration strategy in the safety of the DCIM software and give your technicians a clear set of instructions – complete with automatically-generated 3-D diagrams – so they know exactly what to do.

    (7) You confirm with the business unit and secure approval to move the applications, then dispatch your technical teams to execute the move.

    Applications are temporarily moved to safeguard the business until the hot spot is resolved. (Images courtesy of iTRACS.)

    (8) Once you’ve confirmed that the CRAC maintenance has been completed and the CRAC unit is back online, you migrate the applications back to the original servers.
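
    For step (5), a simplified sketch of the placement filter follows. A real DCIM tool weighs many more constraints; the rack data, power draw, and inlet ceiling here are hypothetical.

        # Simplified what-if placement filter. All rack data is hypothetical.
        candidate_racks = [
            {"rack": "A-03", "inlet_c": 22.5, "spare_power_kw": 1.2, "has_network": True},
            {"rack": "B-11", "inlet_c": 24.0, "spare_power_kw": 4.5, "has_network": True},
            {"rack": "B-12", "inlet_c": 31.0, "spare_power_kw": 6.0, "has_network": True},
        ]

        needed_power_kw = 3.0   # assumed draw of the migrating workload
        inlet_ceiling_c = 25.0  # assumed safe inlet temperature

        viable = [
            r["rack"] for r in candidate_racks
            if r["inlet_c"] < inlet_ceiling_c
            and r["spare_power_kw"] >= needed_power_kw
            and r["has_network"]
        ]
        print(viable)  # ['B-11'] – the only rack with cooling, power, and connectivity headroom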

    The Bottom Line – Uninterrupted Revenue for the Business

    Identifying and resolving hot spot issues is relatively easy when you have the right DCIM tools.  And the benefits are quantifiable:

    • Optimum business continuity – the organization’s revenue-generating applications continue operating with no impact on customer service or revenue streams.
    • Uninterrupted service levels – you continue to meet your SLAs.
    • Mitigated risk associated with maintenance – you use the incident to improve maintenance scheduling and minimize potential future risk to operations.

  • Meet the Future of Data Center Rack Technologies

    Raejeanne Skillern is Intel’s director of marketing for cloud computing. Follow her on Twitter at @RaejeanneS.

    RAEJEANNE SKILLERN
    Intel

    The Open Compute Summit just keeps getting bigger and better. By the numbers, the two-day event held in Santa Clara in mid-January this year drew three times the crowd of the 2012 gathering – amounting to more than 1,500 attendees! I could barely get a hotel room in the area due to the large number of people coming in for this event.

    The summit is a meeting place for the people and organizations that support the Open Compute Project, an initiative announced by Facebook in April 2011 to openly share data center designs across the industry.  And with the growth of this summit, it was clear that end users and vendors alike are getting involved and sharing ideas to make this a reality.

    At the event, Intel (a founding member of the Open Compute Project) announced our collaboration with Facebook to define next-generation rack technologies and enable them through Open Compute. As part of this collaboration, our two companies unveiled a mechanical prototype, built by Quanta Computer, that includes Intel’s new photonic rack architecture. This prototype showed the cost, design, and reliability improvements possible in a disaggregated rack environment using Intel processors and SoCs, distributed switching with Intel switch silicon, and interconnects based on Intel silicon photonics technologies (the green cables in the photo below).

    This rack prototype was unveiled at Open Compute Summit. Intel’s photonic rack architecture, and the underlying Intel silicon photonics technologies, will be used for interconnecting the various computing resources within the rack. (Photo by Intel.)

    That’s the big picture – and the big news. Let’s now drill down into some of the all-important details that shed light on what this announcement means for the future of data center rack technologies.

    What is Rack Disaggregation and Why is It Important?

    Rack disaggregation refers to the separation of resources that currently exist in a rack – compute, storage, networking, and power distribution – into discrete modules. Traditionally, each server within a rack has its own group of resources. When disaggregated, resource types can be grouped together, distributed throughout the rack, and upgraded on their own cadence without being coupled to the others. This extends the lifespan of each resource and enables IT managers to replace individual resources instead of the entire system. The increased serviceability and flexibility drives improved total cost for infrastructure investments as well as higher levels of resiliency. There are also thermal efficiency opportunities, since components can be placed more optimally within a rack.
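
    As a conceptual sketch only (the types and counts are made up, and this is not an Intel or OCP specification), the idea can be modeled as rack-level resource pools that refresh independently:

        # Conceptual model: resources live in rack-level pools rather than
        # being bound to individual servers. Illustrative, not a spec.
        from dataclasses import dataclass, field

        @dataclass
        class ResourcePool:
            kind: str        # "compute", "storage", or "network"
            generation: int  # refreshed on its own cadence
            units: int

        @dataclass
        class DisaggregatedRack:
            pools: list = field(default_factory=list)

            def upgrade(self, kind: str, new_generation: int) -> None:
                # Replace one resource type without touching the others.
                for pool in self.pools:
                    if pool.kind == kind:
                        pool.generation = new_generation

        rack = DisaggregatedRack(pools=[
            ResourcePool("compute", generation=1, units=40),
            ResourcePool("storage", generation=1, units=200),
            ResourcePool("network", generation=1, units=4),
        ])
        rack.upgrade("compute", new_generation=2)  # compute refreshes; storage and network stay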

    Intel’s photonic rack architecture, and the underlying Intel silicon photonics technologies, will be used for interconnecting the various computing resources within the rack. We expect these innovations to be a key enabler of rack disaggregation.

    Why Design a New Connector?

    Today’s optical interconnects typically use an optical connector called MTP. The MTP connector was designed in the mid-1980s for telecommunications and was not optimized for data communications applications. At the time, it reflected the state of the art in materials, manufacturing techniques, and know-how. However, it includes many parts, is expensive, and is prone to contamination from dust.

    The industry has seen significant changes over the last 25 years in manufacturing and materials science. Building on these advances, Intel teamed up with Corning, a leader in optical fiber and cables, to design a totally new connector that uses state-of-the-art manufacturing techniques; adds a telescoping lens feature to make dust contamination much less likely; packs up to 64 fibers into a smaller form factor; and has fewer parts – all at lower cost.

    What Specific Innovations Were Unveiled?

    The mechanical prototype includes not only Intel silicon photonics technology, but also distributed input/output (I/O) using Intel Ethernet switch silicon, and supports Intel Xeon processors and next-generation system-on-chip Intel Atom processors code-named “Avoton.”

    These innovations are also aligned with Open Compute projects underway. The Avoton SoC/memory module was designed in concert with the writing of the CPU/memory “group hug” module specification that Facebook proposed to the OCP board work group at the summit. The existing OCP Windmill board specification (which supports 2S Xeon processors) will be modified so that power and signal delivery to the board interfaces with the OCP Open Rack v1.0 specification (for power delivery through 12V bus bars) and with a tray-level mid-plane board that holds the switch mezzanine module (for networking). Intel will also contribute a design for enabling a photonic receptacle to the Open Compute Project (OCP) and will work with Facebook, Corning, and others over time to standardize the design.

    What About Other Innovations?

    Intel has already delivered several innovations to the Open Compute Project and its working groups to enable future designs based on Intel Architecture. These innovations span board, system, rack, and storage technologies.

    Here’s an example of how Open Compute Project investments are driving new technologies and products available on Intel Architecture.

    Motherboards, storage, racks and management technologies are all running on Intel architecture, with multiple vendors.

    In particular, Intel has been working with the OCP community to finalize the Decathlete board specification for a general-purpose, large-memory-footprint, dual-CPU motherboard for enterprise adoption. We expect that in 2013 several end users will be purchasing products based on Decathlete from OEMs (Quanta and ZT Systems today). Intel also supported Wiwynn’s design efforts using the current Intel SoC roadmap to enable Knox Cold Storage (Centerton today, Avoton in the future).

    Want to Dive Even Deeper?

    To learn more about silicon photonics, see Intel’s video, “How Silicon Photonics Works,” and to hear more about silicon photonics’ potential impact on the data center, see Data Center Knowledge’s story and video with Jeff Demain of Intel Labs, “Silicon Photonics: The Data Center at Light Speed.” For a look at innovations driven by all the contributors to the Open Compute Project, visit the Specs & Designs section of the Open Compute Project website.

  • The Taxonomy of Exascalar

    Winston Saunders has worked at Intel for nearly two decades and currently leads server and data center efficiency initiatives. Winston is a graduate of UC Berkeley and the University of Washington. You can find him online at “Winston on Energy” on Twitter.

    WINSTON SAUNDERS
    Intel

    This is the third post in a series on Exascalar. See Part II.

    The Exascalar plot of the recent Green500 data shows a triangular shape (as shown in Part I), reproduced below with a triangle added for emphasis. The shape is so persistent that I decided to spend a little time thinking about what it means, and I thought I’d share some insights. Of course, this being social media, comments and corrections are always welcome.

    Examining the Elements

    To describe something, it’s best to give it a name, so I decided to name each element of the triangle. The upper vertex, for instance, has the highest performance and very high (perhaps even the highest) efficiency. This corner represents “Super Computing Leadership.” The vertex at the lower right, as we have seen previously, is where new architectures with higher efficiency tend to appear. This was the case with the BlueGene architecture in June 2011, as it was with the Xeon Phi architecture in November 2012. This corner is appropriately named the “Technology Doorway.”

    The corner to the left I call the “Corner of Inefficiency.” Why? As I noted in my previous blog post, the systems here produce no more work output than systems in the “Technology Doorway,” yet at this time they may consume up to one hundred times more energy, representing a huge opportunity for cost-of-ownership optimization.
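
    A minimal back-of-envelope calculation shows how stark that gap is. The performance and efficiency figures below are hypothetical, not actual Green500 entries; the arithmetic is simply power = performance / efficiency.

        # Power required at equal performance but 100x different efficiency.
        # Numbers are illustrative, not from the Green500 list.
        perf_mflops = 50_000_000   # ~50 TFlops, a hypothetical list-floor system

        doorway_eff = 2500.0       # MFlops/W, hypothetical new architecture
        inefficient_eff = 25.0     # MFlops/W, hypothetical older system

        power_doorway_kw = perf_mflops / doorway_eff / 1000          # 20 kW
        power_inefficient_kw = perf_mflops / inefficient_eff / 1000  # 2,000 kW
        print(f"{power_inefficient_kw / power_doorway_kw:.0f}x more energy")  # 100x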

    Each leg of the triangle can also be associated with a specific attribute. The base of the triangle is the lower performance cut-off of the Top500 list and is governed by performance and the population of supercomputers. The hypotenuse is constrained by system power; the associated economics push against a “Wall of Power,” and we expect this edge of the triangle to be largely immobile. The right leg of the triangle is by far the most interesting, as systems along this edge are pushing the forefront of technology and efficiency innovation. This leg will push to the right as time advances.

    An encouraging evolution would be to see the triangle, which indicates systems are primarily retired for performance reasons, turn into more of a trapezoid, with an equally pronounced cut-off at the “Corner of Inefficiency.” Given the large energy budgets required for multi-megawatt systems, this seems like an opportunity.

    Future releases of Exascalar will either validate this viewpoint or provide greater insight into the development of computing technology. Until that time, as always, comments are encouraged and appreciated.

    Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.