Building Reliable, Resilient and Recoverable Systems (HP)

The three Rs (reliability, resiliency and recoverability) led to the creation of a list of “Best Practices” for hardware, software and application design, thus demonstrating that Integrity servers and SQL Server are “Always On”.

The case study set out to demonstrate three key points:
1) how the Integrity servers can be used to maximum advantage in this environment;
2) how out-of-the-box functionality from SQL Server can be used effectively;
3) how intelligent application design can utilize underlying technologies to survive multiple points-of-failure and recover gracefully from these failures.

Two general types of tests were executed:
1) system-level tests and unit/component tests for the hardware;
2) functionality tests and failure tests for Microsoft SQL Server.

The tests also included business goals, constraints and metrics, so that success or failure could be judged against the typical IT requirements of a data warehouse solution.

While performance and cost were not included in these test goals, it is understood that these two metrics are critical to the success of a data warehouse or business intelligence solution. They have been intentionally excluded in this particular context, but were always a factor when considering alternatives to specific “Always On” challenges.

The design of the test environment was crucial to the success of the tests. Multiple Integrity servers were employed to provide fallback and failover in cases where the primary machine was offline. Database mirroring technology was implemented to provide hot-standby database capabilities.

The SAN contained redundant hardware components at nearly every level. And the application was able to utilize different data sources with little or no manual effort.
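The idea of an application that switches data sources on its own can be illustrated with a minimal Python sketch. It is not the code used in the case study: the server names, the `run_with_failover` helper and the simulated `make_insert` operation are all hypothetical, and a real application would call a database driver instead of the stand-in function.

```python
from typing import Callable, Sequence, Set, TypeVar

T = TypeVar("T")


class AllSourcesFailed(Exception):
    """Raised when no data source could complete the operation."""


def run_with_failover(
    sources: Sequence[str],
    operation: Callable[[str], T],
    retries_per_source: int = 2,
) -> T:
    """Try the operation on each source in order (primary first, then standby)."""
    errors = []
    for source in sources:
        for attempt in range(1, retries_per_source + 1):
            try:
                return operation(source)
            except ConnectionError as exc:
                # Record the failure, then retry or move on to the next source.
                errors.append(f"{source} (attempt {attempt}): {exc}")
    raise AllSourcesFailed("; ".join(errors))


def make_insert(available: Set[str]) -> Callable[[str], str]:
    """Build a simulated insert; only servers in `available` are reachable."""

    def insert_batch(source: str) -> str:
        if source not in available:
            raise ConnectionError("server unreachable")
        return f"batch committed on {source}"

    return insert_batch


if __name__ == "__main__":
    # Hypothetical names: the primary is down, so the mirror takes the work.
    insert = make_insert({"mirror-db"})
    print(run_with_failover(["primary-db", "mirror-db"], insert))
```

With the primary marked unreachable, the call falls through to the standby without any operator action, which is the behavior the paragraph above describes at a high level.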

The first series of tests checked all aspects of the hardware: cables, controllers, spindles, processors, PCI cards, machine/network/power failures. In all cases (except one: data center power failure), the application was able to continue without interruption or resume operations with little manual intervention.

The second series of tests centered on SQL Server. These exercises were designed to show how the Integrity server could be utilized to its utmost potential in conjunction with the SQL Server capabilities chosen for this case study. Once again, the strengths of the HP Integrity line (large memory, n-way CPUs, 64-bit architecture) shone through, as allocating resources to the tests improved throughput and response time. For the failure tests, SQL Server provided the necessary resilience for the application by using the mirroring technology, with a second Integrity server as the backup machine.

While the case study did not test every aspect of a typical data warehouse application, inserting massive amounts of data is a crucial piece of the solution. The tests showed that the Integrity servers can greatly assist that effort by providing the hardware capacity and room for growth needed by data-intensive applications such as a data warehouse. Microsoft SQL Server’s out-of-the-box features were a perfect complement to the Integrity server for the goal of these tests: Always On.

