February 15, 2008

Virtualization and Hosting

Egenera has numerous hosting clients that use the Processing Area Network (PAN) architecture (and PAN Manager software specifically) to improve competitive advantage for their businesses. One of our largest customers leverages Egenera as part of the foundation for the fastest growing segment of their business - ascending to the top spot in Gartner's Magic Quadrant.

When I ask system administrators about their biggest challenges in deploying virtualization within their hosting infrastructure, they tell me it all comes down to politics and misconceptions. "Many application owners want their apps hosted on dedicated boxes" is a common response. They want dedicated servers that they can point to and say, "That server is mine. It will never go down because no other application can touch it. Performance will be good because there is no hypervisor overhead. It’s mine, all mine!"

The problem with this philosophy is that it is bad for business and even worse for the application owner. Let me clarify:

  1. Dedicated hardware leads to server sprawl and underutilized CPU and memory. Multiply this dedicated hardware by 100 or 1000 and now you have a big mess. In the long run this mess (multiple cables, switches, disks, HBAs, KVMs, tape drives, etc.) drags the overall availability of dedicated hardware below that of shared hardware. Shared hardware is consolidated, so the physical mess is replaced with easy-to-manage software.
  2. Dedicated hardware costs money. Lots of money. Not just the capital expense of the box itself but the power/cooling and operating expense associated with managing all the complexity around it. IDC says for every $1 spent on hardware it costs $0.50 to power and cool it yearly.
  3. Dedicated hardware cannot grow capacity as the application requires it. If your application is on a 2-way dual-core 8 GB machine, it may take months of downtime to get an extra 4 cores and 8 GB more memory.
  4. Dedicated hardware is ineffective for disaster recovery. If your application is tied to a specific box and cannot easily move to another server that is different (CPU, memory, driver, chip vendor, network and SAN cabling all vary), verifiable DR is nearly impossible. Basically you are doubling physical complexity and doubling the management headache. Every time system administrators need to conduct a change control it has to happen twice. 70% of DR plans today require 100+ steps to accomplish a single server replication. This is because of physical complexity tied to the dedicated box.
  5. Dedicated hardware is not necessarily faster than shared pools of resources within a single box. For example the Xen hypervisor provides close to bare metal performance. Most apps are not CPU bound but IO and memory bound. Certain hypervisors out there play well with this type of profile.
  6. Dedicated hardware is not necessarily better for high availability. If your application is tied to a specific box, model, make and speed and if that box fails and you do not have complicated clustering in place, you may be down for days.
  7. Dedicated hardware is not necessarily better for security. A little lonely box just wouldn't get the same kind of attention that the big honker in the corner gets. The issue could be in the partitioning though. But a bare metal hypervisor resolves that by isolating one server from another. If one crashes the others are unaffected.
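
The cost argument in point 2 can be put into rough numbers. The sketch below is only an illustration: it takes the IDC ratio quoted above ($0.50 per year to power and cool every $1 of hardware) and applies it to a fleet of servers. The function names, the per-server price and the consolidation ratio are hypothetical inputs, not Egenera figures, and management and licensing costs are left out.

```python
# IDC ratio quoted above: $0.50 per year to power and cool every $1 of hardware.
POWER_COOLING_RATIO = 0.50


def yearly_power_cooling(hardware_spend: float) -> float:
    """Yearly power and cooling cost implied by the IDC ratio."""
    return hardware_spend * POWER_COOLING_RATIO


def cost_over_years(hardware_spend: float, years: int) -> float:
    """Capital expense plus power and cooling over the given number of years."""
    return hardware_spend + years * yearly_power_cooling(hardware_spend)


def consolidation_savings(servers: int, price_per_server: float,
                          consolidation_ratio: int, years: int) -> float:
    """Savings from running the same workloads on fewer boxes.

    Assumes (hypothetically) that a shared box costs the same as a
    dedicated one and that `consolidation_ratio` workloads fit on each.
    """
    dedicated = cost_over_years(servers * price_per_server, years)
    shared_boxes = -(-servers // consolidation_ratio)  # ceiling division
    shared = cost_over_years(shared_boxes * price_per_server, years)
    return dedicated - shared
```

For example, 1000 dedicated servers at a hypothetical $5,000 each, consolidated 10:1 and run for three years, give `consolidation_savings(1000, 5000.0, 10, 3)` of 11,250,000 under these assumptions.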

The bottom line is that the burden falls on system administrators to educate the business on how virtualization can decrease expenses, provide a competitive advantage and turn data centers from cost centers into profit centers. There is no silver bullet for this challenge.

What I suggest system administrators do is deploy an architecture that can allow applications to go from physical to virtual (VMs that are bare metal partitions and NOT hypervisors) and back quickly and with little downtime. Do not tie applications to a specific box or box type. Remove the state from the equation. If you can do this, then you can persuade your application owners to first try deployment on bare metal partitions (we call them vBlades). They get verifiable DR, better management, better asset utilization monitoring, and capacity on demand in minutes out of the box. If the application owner decides that they really really need a physical box later (and the monitoring data that you have in hand proves it), you can move them over to that setup quickly. It all comes down to deploying that next wave of virtualization ("think Virtualization 2.0") infrastructure and taking the time to educate your end clients!

    October 10, 2007

    Virtualization 1.0-2.0+

    So, what is "Virtualization 2.0"? And what's "Virtualization 1.0" for that matter?

    Starting with where the industry is moving...Virtualization 2.0 starts with removing the layers of complexity by virtualizing both above the operating system and below. That means going beyond server virtualization (Virtualization 1.0) and combining that with network and storage virtualization to transform data center infrastructure into flexible, changeable assets.

    Virtual Machine (VM) technology (using hypervisors from VMware, XenSource and others) is still a focus. No one will deny that it's important technology for consolidating boxes and upping utilization. But for virtualization to move into a more strategic and powerful role in your organization, you should be coupling VMs with robust management software that replaces specialized VM instruction sets with point-and-click tools that manage both virtual and physical assets. This way, you can run bare metal mission critical servers and VMs on the same platform. You can do this only if your platform allows for out-of-the-box high availability and verifiable disaster recovery, and solves power and cooling issues for all server types.

    The "holy grail" of Virtualization 2.0 and beyond is when you can go physical to virtual and virtual to physical dynamically.

    Egenera is the "Data Center Virtualization Company" - you've heard us say this a lot. Data Center Virtualization is Virtualization 2.0+. We've been going outside the "box" for more than seven years now.

    August 22, 2007

    Virtualization - assess for success

    "How many VMs or Egenera vBlades can I put on one of your bare metal blades?"

    This is a question I am asked repeatedly. Unfortunately, there is no absolute answer as not all applications are created and hosted equally. Some of our customers run 40 vBlades on a single Processing Blade, while others run just five. Both are equally happy with performance and seeing great return on their investment.

    However, here are a few things to consider and some tips to help you size and architect your VMs:

    1. Track the CPU and memory utilization of all your servers - make sure to log it by time of day and type of day. For example, applications involved with equity trading will have high utilization when the market is in session.
    2. Determine the uptime requirement for the application. Does it require N+1 hardware high availability? Are you OK grouping together mixed application VMs where some require little uptime and others require a lot?
    3. Figure out if your application requires chargeback. Is it SAN hosted? Are you using VLANs for your applications? (SANs and VLANs are key factors in processor virtualization as they facilitate easy VM moves from bare metal to bare metal.)
    4. Are there security policies in place that require physical separation of servers via physical firewalls or switches?
    5. Can applications that are IPv6 compliant live on the same machines that are IPv4?
    6. Can Linux operating systems coexist with Windows operating systems?
    7. Do any of your applications require access to physical devices (like USB) on the metal?
    8. Consider moving from Dual-Core to Quad-Core x86 blades. Your power needs will remain the same but you’ll be able to increase the number of vBlades you can host safely by 35%. Not to mention Intel and AMD are embedding virtualization triggers into most of their chips going forward.
    9. Use assessment tools like Platespin's PowerRecon. This is one example of a good tool that can help you navigate through the sizing effort.
    10. Inject tools into your architecture that allow for easy migration of VMs from one bare metal server to another. The tools allow for re-sizing of VMs (CPU and memory) based on load. The tools also enable virtual to physical migration. For example, Egenera PAN Manager allows you to run your "server" on a vBlade and then host it on a physical "pBlade" if needed later. Being able to go P2V and V2P is critical, as your Windows file server today may not need a dedicated bare metal blade but when you grow 5x (say because of an acquisition), it may very well need one.
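
    Tip 1 (log utilization by time of day) is what makes the sizing question answerable, so here is a minimal sketch of how such a log could be used. It assumes a hypothetical log of (server, hour, CPU cores used) samples and a hypothetical 25% headroom; none of these names or numbers come from PAN Manager or PowerRecon. The idea is to add up the candidates' load hour by hour and size for the busiest hour, not for the sum of every server's individual peak.

```python
from collections import defaultdict


def peak_combined_load(samples):
    """Find the busiest hour across all candidate servers.

    `samples` is an iterable of (server, hour, cores_used) tuples.
    Each server contributes its highest sample for a given hour.
    Returns (peak_hour, combined_cores), or (None, 0.0) for no samples.
    """
    per_server_hour = defaultdict(float)
    for server, hour, cores in samples:
        key = (server, hour)
        per_server_hour[key] = max(per_server_hour[key], cores)

    per_hour = defaultdict(float)
    for (_server, hour), cores in per_server_hour.items():
        per_hour[hour] += cores

    if not per_hour:
        return None, 0.0
    peak_hour = max(per_hour, key=per_hour.get)
    return peak_hour, per_hour[peak_hour]


def fits_on_host(samples, host_cores, headroom=0.25):
    """True if the combined peak fits on the host with spare headroom."""
    _hour, peak = peak_combined_load(samples)
    return peak <= host_cores * (1 - headroom)
```

    A trading application that peaks while the market is in session and a backup job that peaks at night can share a blade that neither could share with a second copy of itself, which is why the per-hour view matters.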

    The bottom line here is to think about the plan first, engage subject matter experts and invest wisely. Take the time to assess your architecture as it’s very easy to get caught up in the great excitement that is virtualization.

    Now that VMware has gone public and XenSource will be merging with Citrix, the hypervisor market is becoming larger than ever. Platespin has done a nice job developing and marketing their tool to take advantage of this trend as has Egenera with PAN Manager software. Of course at Egenera, we're extending the value of the hypervisor through data center virtualization. The hypervisor is just one small piece of the puzzle.

    July 27, 2007

    Tying it all together

    Virtualization, Disaster Recovery (DR) and Green IT (power and cooling)! These three topics are headlining the tech sections of newspapers around the globe, and businesses are smartly marketing their products around this tidal wave of buzz. Everywhere you look, it's the same themes.

    Take for example, CNBC's Mad Money, one of my favorite shows. Jim Cramer just recently spent an entire broadcast on virtualization and VMware's IPO slated for next month. GE just announced a "green" credit card. And there's still press on the most widely covered instance of DR failure resulting from hurricane Katrina. New Orleans is still reeling from the loss of its many data centers and resident insurance records housed within them. Unfortunately in some cases, not having had a solid DR plan in place outside of the state, some data may never be recovered. I know I keep bringing DR into my discussions here, but it's something I think about a lot and something our customers are concerned with.

    Important to note, these three topics are all interconnected. IT organizations need to be looking at issues across their enterprises holistically, lest they waste time, money and productivity on each "silo." Ultimately, the biggest loss will be to the competition as different initiatives are farmed out as separate projects.

    I'd love to hear from readers how you're approaching these issues. Have you found one approach to address all three? Have you made changes in your current infrastructure to focus on one over another? What are your future plans or where have you already realized success?

    As always, we welcome any success stories and commentary that our readers have on how they're tying the pieces together.

    June 28, 2007

    Dog Days Again...

    It’s the dog days of summer again and here at Egenera we continue to help our customers best manage whatever comes their way.

    Do you have a sound DR strategy for your data center to deal with brownouts (for example)?  It's worth looking at again. Last year the top of the Empire State Bldg had to turn the lights off as power was at a premium with the high temperatures.  Check out what just happened in NYC this week.

    Have you gotten any further going "green"?  Quick reminder: 38% of the nation's power is consumed by our data centers.

    We're not done talking about it.

    We hope you're not either.

    June 15, 2007

    Moving beyond legacy

    So why have IBM, HP and Sun dominated the enterprise server market for so long? No it's not because of the many millions of dollars they put towards marketing and advertising (although they do of course). It's that they continue to leverage the massive install base they've amassed - customers they can "lock in" because they have so much already invested.

    I've talked about this before. In many large enterprises, the result is dedicated silos of computing with little sharing for applications. For example, edge server applications (web, file, print) have their own operating systems and hardware, while mission critical back-end server applications (Oracle, ETL, email, CRM) do as well. All these applications need CPU and memory. Why can't they share pools of resources yet still have separation?

    So in my opinion, the bottom line is that the "Big 3" are driven primarily by leveraging the legacy...but they have customers doing things to their data centers that waste huge amounts of money; create massive complexity; and can't scale without throwing more hardware (and services) at the issue. For example, HP sells the following server lines and has for years:
    - DL380/DL580s for file and print apps
    - p/c-class blades for blade apps / virtual apps
    - Itanium servers for OpenVMS apps
    - Superdomes for heavy HP-UX apps

    IBM is doing the same thing (x series, Bladecenter, p-Series, z Mainframes). And many clients are OK with fragmenting their server environment because again, they've had this set-up for years. They are OK locking into operating systems like HP-UX or AIX, which tie you to a specific vendor.

    How can the customer get the best value when their applications are locked on different types of hardware? They have no bargaining power because the proprietary OSs prevent competing hardware from being introduced. If you have all different hardware and OSs, think about how that affects disaster recovery and data center management. Not good!

    The solution:

    1. Move towards migrating applications to the most common OSs supported by vendors today (Linux, Windows, Solaris x86 are the most popular). This will give you more leverage with the hardware vendors.
    2. Look for single source platform solutions that integrate hardware and software that allow for sharing CPU and memory across different application pools. It's OK to run a VM farm on the same Frame as your mission critical Oracle RAC. You CAN get bare metal performance for your high-end applications on one Frame. You CAN get N+1 high availability, 5 9s reliability, and share CPU and memory without risk.
    3. Think like a "client" not just a "customer." Ask for one solution that can solve it all and allows you to make changes as your application set evolves or your business goals change. Do not settle for a piecemeal approach when you don't have to anymore.

    May 09, 2007

    Consolidate 10 data centers to 2

    Are you consolidating and centralizing your data centers from "many to few"? Well, you're not alone. Take a look at some headlines from the last year:

    Why consolidate data centers at all? We all know that any change to any data center - from a facilities level to start with - means months - if not years - of planning. But, the bottom line is scaling down actually helps large enterprises reduce IT total cost of ownership in regards to energy consumption (power), real estate, WAN and SAN Infrastructure, hardware, software licenses, maintenance services, and staff headcount.

    Let me remind you that an estimated 38 percent of the U.S.’s power supply is being absorbed by data centers (Source: Univ.NH study). As corporate regulatory policies are enforced based on Sarbanes-Oxley Compliance or Government mandates, reducing the number of centers to manage is key. Add disaster recovery requirements for each data center on top of this and you have an ever growing problem to solve. The bottom line is "Why have 10 data centers when you can have 2...while providing the same or better service levels."

    How to get started...well there are many opinions on this. At Egenera of course, we're of one mind on where to start. But you knew that already.

    Server virtualization is obviously a key piece of the puzzle but we've discussed how it's not a silver bullet in itself. Hypervisors are important enablers (for us too!) but VMware or Xen alone cannot solve your problems. Customers need to look farther ahead than that.

    Related blog entries on "Data Center Virtualization" and "Darkened Data Centers" are quite clear on what I believe are two key pieces to the puzzle.

    It's finally happening. Take a look.

    April 06, 2007

    Going Dark (data center dark that is)

    A "darkened" or "dark" data center has been designed to eliminate the need for human intervention so that system admins and managers can deal with mission critical development and process improvement projects vs. spending all their time fighting fires. These "lights out" data center initiatives have allowed managers to achieve the following:

    • Lower TCO - By introducing tools that allow for remote management of IT platforms, IT managers can centralize systems administrators into primary locations. By having SAs work out of a few locations managing multiple remote sites, IT teams will reduce the amount of travel expenses and drive up the productivity of their staff. IT staff will be able to deliver "more with the same" and maximize their working potential. IT managers will also have better control of their asset management enabling them to reduce waste, reduce server sprawl and make smarter buying decisions.
    • Faster Time to Server Recovery - By having the tools in place to remotely troubleshoot computer systems and applications, subject matter experts are empowered with a much broader toolset. No longer are experts dependent on hands and eyes at a keyboard (physically) in front of a server or a "crash" cart hundreds of miles away. With the introduction of better remote management tools, these experts can be introduced more frequently and at a higher volume during high severity fire fighting.
    • Faster Time to Server Deployment - Many organizations blame the long deployment time of a new server on the coordination of activities between multiple parties. For example, to setup a new server, the facilities team has to provision power, space, and cooling; the network team has to run fiber and allocate ports on switches and routers; the SAN team has to do the same on their side; and the OS provisioning team has to build the server image. Every time a new server is added this process needs to be recreated over and over again. If these steps could be consolidated to only occur once and then all new server add ons could be done remotely, one can reduce time-to-deployment from a matter of weeks to hours.
    • Reduction in Error and Downtime - Most IT downtime is the result of human error or change controls gone bad. By introducing a management software layer with unbreakable processes to control these changes, downtimes can be reduced. Imagine changing 100+ network cables from one switch to another during a change control. Instead of having to unplug and plug each one at a time, one can run a few scriptable software commands to get the job done. This is very achievable with the right tools in place.
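
    The cable-move example in the last bullet can be sketched in a few lines. This is an illustration only, assuming connectivity is recorded as a hypothetical mapping of server name to switch name; it is not PAN Manager's command set. The point is that the whole change becomes one validated, repeatable operation instead of 100+ manual re-pluggings.

```python
def move_connections(port_map, old_switch, new_switch):
    """Return a new mapping with every server on old_switch moved to new_switch.

    `port_map` maps server name -> switch name (hypothetical model).
    The input is left untouched so the change can be reviewed or rolled back.
    """
    if old_switch == new_switch:
        raise ValueError("old and new switch are the same")
    if old_switch not in port_map.values():
        raise ValueError(f"no servers are attached to {old_switch!r}")
    return {
        server: (new_switch if switch == old_switch else switch)
        for server, switch in port_map.items()
    }


def changed_servers(before, after):
    """List the servers whose switch assignment differs, for the change log."""
    return sorted(s for s in before if before[s] != after.get(s))
```

    Because the function returns a new mapping, the before and after states can be compared and logged before anything is committed, which is the kind of controlled process the bullet describes.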

    If you've bought in, there are a number of ways to start darkening your data center.

    1. Remove physical components and replace with software. We talk about this a lot at Egenera. The fewer hands-based robotic functions required in your data center, the better. If these tasks can be replaced by software commands, those commands can be initiated from anywhere.
    2. Invest in compute hardware that is modularly field replaceable. Screwdrivers should not be needed to fix any component of the server. If a blade fails just pull the blade out and insert a new blade - no re-cabling or re-configuration needed if a new part needs to be swapped in.
    3. Introduce virtualization tools like hypervisors to reduce the number of physical servers you need, such as VMware, Xen, SWSoft etc.
    4. Introduce SAN remote management/provisioning tools to replace local disks needing tape back-up. Replace with offsite disk-to-disk mirroring.
    5. Introduce management software that allows system administrators to better monitor and manage the software commands mentioned in step 1 as well as manage the virtualization complexity mentioned in step 3. Egenera PAN Manager software is a good option. PAN Manager and a few other tools (like Bladelogic and OpsWare) have the ability to up your remote system admin span to 80 blades. For example, Egenera has a very large military client that uses PAN Manager to manage 100+ BladeFrame systems at 100+ data centers from one NOC location by one NOC team.

    Today, in all practicality, most data centers are not 100% dark but many are quickly starting to turn the lights off!

    March 15, 2007

    Going Data Center Green in 2007

    In 2005, "green" data centers were only for the most bleeding edge IT shops that had their current energy requirements under control and could invest in forward thinking initiatives. Well for the rest of us, our number just came up...it's 2007 and it's time to go green!

    We can all agree: server sprawl is a very real issue; energy requirements are skyrocketing; there's a continual move towards faster processors, memory and disk drives; global warming is playing a major factor and it's only getting hotter (remember last August when the Empire State Building actually had to shut off the lights to save power?).  Interesting fact: did you know that IDC says that for every $1 spent on hardware, $0.50 is spent on energy to power and cool it?  With all that's changing in technology, how does anyone actually accurately budget for power and cooling anymore?

    So what does a green data center really mean for you in a practical sense? First a definition: a green data center is constructed to run as economically as possible. All computer, electrical, cooling and lighting systems, as well as building materials and facilities must be rated for maximum efficiency.

    Knowing what we know about complexity and change in an enterprise data center, how does a company actually go green without going blue in the face?

    • Make "going green" an IT priority or expect to play catch-up to your competitors
    • Start investing in technologies that reduce the amount of power you need. Consider how solar power can augment your peak demands during critical spikes in utilization
    • Start investing in technologies that reduce the number of watts you need for individual server racks. There are some very cool (and usable) inert gas chiller technologies out there (including the one from Egenera and Liebert we call CoolFrame) that can reduce the amount of cooling power needed for a server rack by 80%
    • Reduce the number of idle or poorly utilized servers by investing in virtualization, utility computing technologies, "scaling horizontal" applications and databases, and hypervisors

    Some quick ideas for you to chew on. What are you doing to get green this year?

    March 01, 2007

    Where to spend your virtualization money

    In 2007 and 2008, many analysts are forecasting tremendous growth for the data center virtualization market. The past 20 years have been pretty interesting. Legacy big iron architectures moved to client/server architectures, which then moved to open x86 architectures running Linux, and finally to "blades." This ecosystem change has caused an explosion of whitebox and server sprawl across the data center. More servers = more complexity and more complexity = harder to manage. The latest wave of technology in our IT jungle is virtualization. Instead of buying more servers, system admins are carving up their x86 boxes into virtual machines. Sounds good right? Well, maybe.

    As more and more IT regulatory mandates are enforced by your organization, managing what's virtual is compounded astronomically. Your VMs will eventually move into production (if they aren't already). Long gone are the days where VMs are only seen in test and dev. Try unplugging a VM from your network if there is a security breach. Try making sure it's backed-up. Try checking to see if it's been patched. Where is the machine anyway? What else will go down? Who else is on that machine? You get the idea. Not to mention that you're definitely not going to run all your apps on VMs. Most of your mission critical, heavy transaction applications and databases will stay on bare metal.  So what to do?
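
    The questions above (Where is the machine? Who else is on it? Is it patched and backed up?) are really inventory queries. As a minimal sketch, assuming a hypothetical inventory record per VM (the field names are made up for illustration and do not reflect any vendor's tool), they could be answered like this:

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    host: str        # physical machine the VM currently runs on
    patched: bool
    backed_up: bool


def host_of(inventory, vm_name):
    """Where is the machine?"""
    for vm in inventory:
        if vm.name == vm_name:
            return vm.host
    raise KeyError(f"unknown VM: {vm_name}")


def neighbors(inventory, vm_name):
    """Who else is on that machine (and goes down with it)?"""
    host = host_of(inventory, vm_name)
    return sorted(vm.name for vm in inventory
                  if vm.host == host and vm.name != vm_name)


def needs_attention(inventory):
    """Which VMs are unpatched or not backed up?"""
    return sorted(vm.name for vm in inventory
                  if not (vm.patched and vm.backed_up))
```

    The hard part in practice is keeping such an inventory accurate as VMs move between hosts, which is where the management tooling discussed below comes in.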

    Here are some simple recommendations to consider when embarking on your virtualization hunt in 2007:

    1. Virtualize from the OS up. When people think virtualization, they think VMware. But VMware is only one option in hypervisors. There are others...Virtuozzo, Xen. My clients absolutely love SWsoft's Virtuozzo for example. They run it on our Egenera systems for Windows VMs and have been very happy. Keep an eye out for this hypervisor in 2007.
    2. Virtualize from the OS down. Why just reduce the number of servers you have? True virtualization includes replacing physical components with software. Reduce the number of NICs, HBAs, local disks, cables, routers, switches, etc. Less I/O, not just fewer servers. One word of caution here: purchasing separate platforms for VM environments and mission-critical non-VM environments just adds to the complexity. Try to invest in a platform that is flexible and scalable enough to do them both at the same time.
    3. Manage your hypervisors more effectively. Just because you bought ESX and Dell doesn't mean you have solved your complexity problem. Many find it's just the opposite - particularly on the management side. There are solid solutions on the management side - of course, Egenera PAN Manager, also Opsware and BladeLogic. All of these work well with hypervisors. I should mention that Egenera PAN Manager comes with an out-of-the-box hardware platform layer that also solves #2.
    4. Utilize tools to migrate and right-size. Consider physical-to-virtual (P2V) tools like Platespin that can help migrate your physical servers to virtual servers...make these moves repeatable and scalable. Being able to right-size your VMs is critical.
    5. Disaster recovery shouldn't be an afterthought. Make sure your solution includes a site-to-site replication strategy. VMs will be used for production applications in 2007. Tie your VM architecture to your SAN architecture. More about my thoughts on the right DR approach can be found in a previous entry.

    Happy Hunting!

    Alan Chhabra

    • Director, Public Sector, Egenera
