No RAC, No RISC, No Problem
New platform options for mission-critical applications
By: Andy Bailey
Jul. 23, 2010 10:15 AM
Having fended off challenges from Linux for several years, the RISC-Unix platform is now under siege on another front - x86 servers. Long dismissed as workgroup and departmental servers, or as platforms for low-level enterprise applications, x86 servers are making serious inroads into corporate data centers.

Earlier this year, analyst firms Gartner and IDC issued studies showing that sales of and revenue from RISC-Unix servers were continuing their slide against x86 boxes. The Gartner study showed a 28.5 percent decline in the number of RISC-Unix units shipped and a 26.9 percent drop in revenue from them. The IDC study, released in May of this year, cited a "perfect storm" of circumstances, including the recession, the purchase of Sun by Oracle, and purchase decisions delayed in anticipation of hardware upgrades from other vendors, all leading to the lowest level of spending on RISC servers IDC has ever recorded.

Also encouraging the move away from RISC-Unix for many applications is the combination of a gathering virtualization groundswell in the data center and the rapidly improving performance of x86 platforms. Of all the factors driving x86's popularity, virtualization may be the one that ultimately establishes it as the predominant data center server platform.

Commodity x86 technology has been steadily eroding the RISC stranglehold on the data center, starting with bottom-end, non-critical workload environments. A notable point of resistance to this trend, however, has been Tier One workloads perceived as mission-critical and running on RISC-Unix-based platforms - things like Oracle RAC, Sybase Adaptive Server and IBM DB2 pureScale. Legacy Oracle databases running on these systems are a prime example. Business today runs on data, making Oracle RAC running on RISC-Unix, and its attendant applications, a mission-critical underpinning of the business.

RAC is a known, if not always admired, quantity among database administrators.
It addresses vital enterprise-level IT issues, including the need to maintain predictable performance even as databases grow. The RAC architecture - multiple instances of a database running on separate nodes, yet treated as a single entity - allows IT to bring more systems online as needed. Load balancing among the nodes can help maintain performance at acceptable levels as workloads scale. Finally, the redundancy of the nodes helps ensure that a problem on a single machine doesn't stop the overall functioning of the database.

This added availability assurance is a key point even for IT professionals who are lukewarm about RAC. Ensuring that mission-critical, revenue-generating workloads don't go down is a major responsibility for database administrators, and RAC has shown commendable resilience over the years. Reliability alone is reason enough for most DBAs to stick with RAC. Combined with RAC's scalability, that reliability makes it hard to convince most DBAs to try an enterprise computing platform other than RAC running on RISC-Unix.

Nevertheless, there are serious downsides to standing pat with RAC. The first and foremost is cost. RAC is expensive to license and maintain, and because of Oracle's per-processor pricing, every expansion of a RAC implementation carries added licensing costs. It doesn't take much digging in user groups and in the blogosphere to find DBAs gritting their teeth over RAC's licensing expenses. DBAs have been able to justify the cost, however, because they have critical applications to support and RAC's performance merited the expense.

The times are changing. Today's economics are forcing many companies to rethink existing IT assumptions and look more closely at the total costs of their systems. This type of scrutiny helped propel the first round of x86 platform virtualization consolidation. Companies realized they could get a lot more work out of underutilized platforms at a relatively low cost.
For a high-end RISC machine running a Tier One application, the cost of the system hardware is just the starting point of the expense tally. When clustering is introduced, application license costs typically increase in exchange for the perceived benefit the Tier One ISV is offering the customer with this more complex but ostensibly higher-performance solution. In the case of Oracle RAC, the complexity of keeping that environment optimally tuned and maintained pushes administrative costs even higher. And as the pool of professionals with the requisite RISC-Unix administration skills ages and dwindles, the competition and price for those skills increase, also driving up the overall cost of remaining with existing, decades-old solutions.

If the expense, complexity, and skills issues - as well as a continued tight economy - prompt a reconsideration, what alternatives are worth considering? Clearly, DBAs responsible for Tier One applications cannot compromise on performance or scale. Any proposed replacement must be capable of delivering the availability and peace of mind that RAC on RISC-Unix provides IT, backed by the mainframe-class service they're used to.

Let's start with the perception of performance at the hardware level. As mentioned above, the rapid evolution of the x86 architecture over the past five to seven years has closed what was once a significant performance gap between these systems and premium RISC machines. The move to 64-bit, multicore CPUs as a standard, capable of supporting tens or hundreds of gigabytes of RAM, along with faster bus architectures, has dramatically boosted the performance of commodity x86 systems. One has only to look at the proliferation of clustered x86 supercomputers running Linux to get a sense that the performance question is resolved: this year's four fastest supercomputers are all clustered x86 systems. [1]

But what of availability?
Beyond system horsepower and scale, Tier One, mission-critical applications also need continuous and unfailing availability. Downtime on these systems is costly and, when unplanned, can have a substantial business impact. This need for very high availability is often the trump card when IT contemplates moving key infrastructure support systems such as RAC off of RISC-Unix. These systems come from a heritage of rock-solid reliability and a time when the competitive alternative was multi-million dollar mainframes or truly fault-tolerant Big Iron systems. Now, fortunately, there are less costly x86-based alternatives that deliver the requisite availability and reliability, particularly when virtualized and clustered.

A traditional hardware clustering approach is a well-understood method of improving x86 systems' reliability and availability. Clustering has its own drawbacks, though. Migrating applications to a clustered environment is complex, as is the day-to-day administration needed to maintain adequate availability levels. If overall cost reduction is the main priority, clustering probably isn't the best approach.

More recently, virtualization has been employed to go beyond simple improvements in utilization and system consolidation. The development of key features, such as the ability to migrate running virtual machines between hardware systems with no service interruption, has enabled data centers to become more agile in managing planned downtime and system maintenance. These capabilities can also be harnessed by advanced virtualization management systems to enable cluster-like failover and recovery between systems, improving availability. Nice, but again, RISC-Unix veterans set a high bar to meet and have a high level of anxiety to assuage. Cluster-like availability is less than most of them will settle for.
Westpac Institutional Bank, part of Australia's oldest financial institution, found virtualization was only part of the answer when it recently launched a major application upgrade program. Westpac designed a new front-end processing system to run its Westpac Integrated Banking Service (WIBS), along with a dozen other critical applications, in a virtual production infrastructure. More than 100 of Westpac's banking clients rely on WIBS to complete payment transactions. These institutions send or receive some 300,000 workflows each month containing AUD$40 billion in payments - an average of $1.262 million per minute each day. With most transfers taking place between 8 a.m. and 5 p.m., each minute of processing during those hours potentially represented three to five times that dollar value in transactions that would not complete if WIBS were down.

System resiliency and availability were primary operational concerns. Clients interacted directly with the file transfer application and, through virtualization, the platform would support many critical applications on less hardware. In addition, the system needed to integrate with Westpac's back-end banking systems.

A virtualized platform could deliver the fundamental efficiency Westpac hoped for, but availability remained an issue. Virtualized environments, like clusters, assume that failures will occur, and deal with them by switching over to backup systems with as little downtime as possible. Nevertheless, downtime is part of their architecture. Virtualization alone couldn't provide the availability assurance Westpac needed for a critical, client-facing application.

After exploring platform options, including traditional clustering, Westpac found peace of mind in a "belts and suspenders" approach: deploying virtualized environments on fully fault-tolerant, Intel processor-based hardware.

Since its early days, truly fault-tolerant hardware has quietly evolved away from highly proprietary architectures and astronomical price tags.
Today's fault-tolerant systems are based on industry-standard components and run standard operating systems, such as Windows Server 2008 and Red Hat Enterprise Linux. They also support leading bare-metal, enterprise-grade virtualization platforms like VMware vSphere, unmodified, right out of the box.

The fault-tolerant hardware solved both of Westpac's fundamental problems. It eliminated clustering's management and maintenance complexity while providing levels of availability and fault avoidance that clustering cannot. Virtualization added agility and workload management capabilities. By combining virtualization and fault tolerance, Westpac found a best-of-all-worlds approach to deploying a mission-critical Tier One application.

All of which brings us back to performance. The simple, rock-solid availability of a standards-based fault-tolerant platform and the agile management of virtualization get the nod from IT. But what about the scale and performance issues that worry database administrators? As noted above, part of the original appeal of Oracle RAC on Unix was its known performance characteristics at scale and under load. Early virtualization solutions were notoriously resource-intensive, limiting application performance, especially under load. That would be a non-starter for any organization considering moving a demanding, mission-critical database system like Oracle RAC off of legacy Unix platforms.

Fortunately, virtualization technologies, like fault-tolerant platforms, have evolved - and quickly. Early adopters of virtualization watched as their hypervisors ate up 40-50 percent of system resources, leaving demanding applications unable to perform at a high level. Today, market-leading hypervisors impose nearly negligible overhead, far below that of the products in use just three or four years ago.
Indeed, benchmarking and real-world experience have shown that, on unsaturated database systems, the overhead of leading bare-metal hypervisors is imperceptible for real-world use cases. Even on fully saturated systems, the best modern hypervisors will impose at most a 10 percent burden on the system. With the advances previously cited in the realm of x86 systems - multicore CPUs, support for large amounts of RAM, faster bus systems - even this level of resource use is easily outpaced by performance improvements in the platform, leaving applications with the horsepower they need. The fact is that a virtualized fault-tolerant server of 2010 can outclass and outperform even the high-end RISC-Unix platforms driving today's mission-critical applications.

ParAccel, a start-up on the leading edge of analytic database technology, recently helped prove this point when it decided to run an industry-standard query performance benchmark, called TPC-H, in an x86 virtualized environment. The performance was record-setting. Using 40 physical host machines, each with eight cores, ParAccel spread its ParAccel Analytic Database across 80 virtual, clustered nodes and broke the world records for both performance and price/performance. Those 40 physical servers, by the way, represent 37 percent fewer machines than were used in what is now the second-best performance result.

Clearly, virtualization is no longer the impediment to application performance it once may have been. Advances in hardware resource utilization have effectively eliminated concerns organizations may have had about deploying demanding applications in virtual environments.

Oracle RAC on the RISC-Unix platform became the standard for high-demand, critical enterprise applications because it delivered the right combination of performance, stability and scalability. Those qualities came at a price - a high one.
As databases grow and businesses rely on applications to generate more revenue, expanding on the RISC-Unix platform grows exponentially more expensive. Without credible alternatives, IT and database administrators were locked into RAC on RISC-Unix.

The improvements in x86 hardware and virtualization software over the last few years have changed the game. Together, they provide the performance and versatility that companies need to expand their application and database infrastructures at a reasonable cost. Adding fault-tolerant hardware provides the final ingredient - continuous application availability.

Reference
1. Top500 TOP10 June 2010, http://www.top500.org/