McMURRY UNIVERSITY
Course
CSC 3330
Professor
Mr. Louis Voit
Semester
Spring 2007
Group Members
Russell M. Cozart
Janne Herfurth
Seminar Outline...
A disaster recovery plan (DRP) - sometimes referred to as a business continuity plan (BCP) or business process contingency plan (BPCP) - describes how an organization is to deal with potential disasters. Just as a disaster is an event that makes the continuation of normal functions impossible, a disaster recovery plan consists of the precautions taken so that the effects of a disaster will be minimized and the organization will be able to either maintain or quickly resume mission-critical functions. Typically, disaster recovery planning involves an analysis of business processes and continuity needs; it may also include a significant focus on disaster prevention.

Fundamental Seminar Focus Areas
- Sample DR/BC Plan
- PowerPoint Presentation
- 7 Tiers of DR Preparation
- Overview of Concepts

Data Failure Examples - Reasons to be Prepared
- September 11th
- Hurricane Katrina

Flash Video Diagrams
- Storage
- Networking
- Archiving
- Data Protection
- Security

Outside Links
- ITtoolkit.com - DRP

Class Input
- Class Input and Comments
 
Disaster Recovery Preparation & Business Continuity Planning
+ Disaster recovery is becoming an increasingly important aspect of enterprise computing. As devices, systems, and networks become ever more complex, there are simply more things that can go wrong. As a consequence, recovery plans have also become more complex. According to Jon William Toigo (the author of Disaster Recovery Planning), fifteen or twenty years ago, if there was a threat to systems from a fire, a disaster recovery plan might consist of powering down the mainframe and other computers before the sprinkler system came on, disassembling components, and subsequently drying circuit boards in the parking lot with a hair dryer. Current enterprise systems tend to be too large and complicated for such simple, hands-on approaches, however, and interruption of service or loss of data can have serious financial impact, whether directly or through loss of customer confidence.
+ Appropriate plans vary from one enterprise to another, depending on variables such as the type of business, the processes involved, and the level of security needed. Disaster recovery planning may be developed within an organization or purchased as a software application or a service. It is not unusual for an enterprise to spend 25% of its information technology budget on disaster recovery.
+ Nevertheless, the consensus within the DR industry is that most enterprises are still ill-prepared for a disaster. According to the Disaster Recovery site, "Despite the number of very public disasters since 9/11, still only about 50 percent of companies report having a disaster recovery plan. Of those that do, nearly half have never tested their plan, which is tantamount to not having one at all."
Disaster recovery increasingly comes down to data protection. Next to personnel, data is your most irreplaceable asset. The efficacy of a DR Plan is expressed in terms of the metric "Time to Data."
The lines once drawn between high availability and disaster recovery and between DRP and information security are fading. It all comes down to providing for continuous access to a valid copy of data required for business decision-making.

Overview: Organizations need disaster prevention and recovery capabilities to aid them in preventing avoidable disasters and to help them cope with disaster events that cannot be avoided. However, traditional disaster recovery plans are no longer sufficient to ensure recovery within an acceptable timeframe.

With the advent of n-tier client/server computing, networked storage, and business process deconstruction, effective DR planning can no longer be performed after the fact. It must become part of the processes by which applications are designed and systems and networks are built.

A DR capability is not a bolt-on created by a planner or team of planners who are given the job and told to "play the hand of cards that you've been dealt." That kind of planning is fraught with huge internal costs as planners must replicate servers and storage on a one-for-one basis at their recovery site. Optimally, planning should provide for the creation of minimal equipment configurations that temporarily enable critical operations to be sustained on consolidated platforms.
So, too, the DR planner must change. Everyone in the organization must become a DR planner, sharing responsibility for identifying disaster potentials so they can be addressed with suitable recovery capabilities. Moreover, DR planners need to become more IT savvy, able to talk the talk of the application architects, server and network administrators and storage managers with whom they must work to safeguard the technology framework that supports the business.

In the 21st Century, disaster recovery planning is no longer a secretary-friendly task.

These days, the buzz heard around contingency planning conferences is that traditional disaster recovery planning methodologies - those invented by the so-called "Deans of Disaster" in the 1970s and 1980s - are obsolete.

Since 1995, it has become increasingly fashionable for DR planners and consultants to recast themselves as "Business Continuity Planners." BCP practitioners decry the "limited focus" of traditional DR planning on the IT infrastructure, the mainframe and data center. They urge that the focus of planning should be expanded to encompass "business processes" - the totality of both IT infrastructure supports and employees who perform manual and automated tasks. The old guard had it wrong, they argue: the business process, and not the system, is the appropriate focus for contingency planning.

Acronyms and organizing principles aside, however, the differences between BCP and traditional DRP remain comparatively minor. The focus of BCP, on holistic business processes, may differ somewhat from older DRP views, but the techniques of traditional disaster recovery planning persist.

For example, neither discipline has articulated any disaster recovery methods that are appropriate for distributed client/server applications. Leveraging older mainframe-centric recovery methods, both BCP advocates and DRP traditionalists have opted to approach the problem of client/server DR planning through the application of one-for-one redundancy in software, middleware and host hardware.

From the standpoint of both cost and efficiency, the "replacement-through-redundancy" approach to client/server system recovery is nothing short of a bust. Yet both practitioners and vendors continue to approach client/server recovery from this perspective, making it an Achilles' heel both for actual recovery and for contingency planning budgets.

Some Background: DR planning originally focused on the recovery of mainframe operations. Beginning in the early 1960s, mainframes provided the predominant platform for mission-critical business information processing services and, by all accounts, will continue to do so for some time to come. Some analysts contend that 70 to 80 percent of critical applications continue to reside on the mainframe. Thus, many planners and vendors are content to continue to utilize time-tested mainframe replacement strategies as the centerpiece of their contingency plans - and rightly so.

However, in a growing number of business environments, distributed computing platforms - sometimes called open systems platforms - are proliferating. Driven by a number of factors, companies are rolling out distributed Enterprise Resource Planning (ERP), Manufacturing Resource Planning (MRP) and Customer Relationship Management (CRM) applications (just to name three) on these distributed platforms to meet mission-critical information needs. These and other client/server applications account for a large part of the growing percentage of critical apps that do not reside on mainframe hosts and yet require comprehensive backup and recovery strategies.

Some mainframe backup service providers - "hot site" vendors - argue that there is little difference between the requirements for backing up client/server apps and the requirements for backing up complex mainframe-hosted applications. Traditional DR planning methodology provides the steps involved:

1. Identify the application and its host requirements.

2. Size host platform resources (including communications requirements) to fit.

3. Subscribe to the necessary replacement resources at the hot site.

In short, define a recovery platform based on application requirements. Then, provide those requirements on a suitable backup platform in a recovery setting.

In many cases, vendors argue, the recovery platform required will have a "smaller footprint" in terms of resources and capabilities than the actual production platform. Since every application used by a company in normal operations is not equally critical, often a smaller host system (sometimes called a minimum equipment configuration) can be used to operate those few applications that are deemed critical.
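The three steps and the minimum equipment configuration idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `App` record, the resource fields, and every application name and figure in the inventory are hypothetical and do not come from any vendor's sizing method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    """One application and its host requirements (step 1). All values are hypothetical."""
    name: str
    cpu_cores: int
    memory_gb: int
    storage_gb: int
    critical: bool


def size_platform(apps):
    """Step 2: total the host resources needed to run the given applications."""
    return {
        "cpu_cores": sum(a.cpu_cores for a in apps),
        "memory_gb": sum(a.memory_gb for a in apps),
        "storage_gb": sum(a.storage_gb for a in apps),
    }


def minimum_equipment_configuration(apps):
    """Size a recovery platform for the critical applications only."""
    return size_platform([a for a in apps if a.critical])


# Illustrative inventory; names and figures are invented for the example.
inventory = [
    App("order-entry", cpu_cores=8, memory_gb=32, storage_gb=500, critical=True),
    App("payroll", cpu_cores=4, memory_gb=16, storage_gb=200, critical=True),
    App("intranet-wiki", cpu_cores=2, memory_gb=8, storage_gb=100, critical=False),
    App("reporting", cpu_cores=6, memory_gb=24, storage_gb=800, critical=False),
]

production = size_platform(inventory)                   # 1-for-1 replacement footprint
recovery = minimum_equipment_configuration(inventory)   # what step 3 would subscribe to

print("production footprint:", production)
print("recovery footprint:  ", recovery)
```

With this invented inventory the recovery footprint is noticeably smaller than the production footprint, which is the "smaller footprint" argument in miniature. The sketch sums resources naively and ignores dependencies between applications, which is exactly where complex applications tend to force the platform back toward 1-for-1 replacement.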

Vendors qualify the above with a simple assertion: if the critical applications involved are very complex, it may be necessary to provide a 1-for-1 replacement of production platform resources and capabilities on the recovery platform. Extending this reasoning, if client/server applications are no more difficult to back up than complex mainframe-based apps, then planners should expect to recover client/server applications on distributed platforms that are identical to their production system configurations.

While vendors admit that 1-for-1 platform replacement strategies are inherently more expensive than minimum equipment configuration approaches, planners are often advised that there are no alternatives. In the case of multi-tier client/server applications, the cost for a 1-for-1 platform replacement can be quite daunting.

Devil in the Details: While not all client/server applications require 1-for-1 element replacement in order to be recovered at an alternate site following an interruption, this is frequently the case - especially for "homegrown" apps. One reason has to do with the manner in which application functionality is expanded over time.

The rollout of client/server applications is often an iterative process. A basic set of functionality is provided initially, with additional functionality being added at various intervals over time. A protracted application implementation cycle opens the door to a number of factors that can limit recovery options. For one, different middleware products may be used as new functionality is added to the application. According to integrators who have been engaged to "web-enable" multi-tier client/server applications for customers who are interested in capitalizing on the Internet, it is commonplace to encounter n-tier client/server applications whose components are held together by a kludge of middleware products from different vendors. These different products approach the problem of intra-application messaging from very different viewpoints.

Some middleware products identify the components of a client/server application by a filename on a particular server. Others may use addressing schemes based on MAC addresses, server IP addresses, or other mechanisms hard-coded within hardware platform components. In the final analysis, it is a miracle that such platforms perform at all efficiently in a normal production environment. Replicating the requirements for proper performance in a recovery environment can be a time-consuming nightmare.

Vendors of commercial off-the-shelf client/server applications, including leading ERP software vendors, claim to have the solution. Purchasing a leading ERP app and using vendor-recommended (or, in some cases, provided) middleware components can expedite rollouts and reduce the kludge factor. According to integrators (and a few inside sources at leading client/server application suite vendor shops), this is not true.

In the competitive world of software, vendors are constantly jockeying for position in the market by enhancing their products with new functionality. Routinely, in an effort to keep up with an opponent's product, a vendor will simply purchase technology from a third-party software company (or they will just buy the company). Rarely is time sufficient to thoroughly integrate the acquired technology with the existing product. Patches and middleware fixes are commonly used by vendors themselves to get the products to market quickly.

Thus, whether homegrown or commercial-off-the-shelf, n-tier client/server applications are typically difficult, costly and time-consuming to deploy and just as difficult, costly and time-consuming to recover. The devil, as they say, is in the details.

Angels in the Architecture: Despite the criticisms that may be lodged against client/server from a DR perspective, this has not curbed the appetite of modern corporations to invest in this model for mission-critical application delivery. Client/server isn't going away, so DRP/BCP practitioners need to deal with it.

Like IBM account executives of yesteryear, modern contingency planners need to view the challenge of client/server as an opportunity to excel. The key to expanding the options for recovering client/server applications in the wake of disaster is to become more proactive.

There are numerous documented cases of distributed systems providing speedier recovery from disaster than did centralized systems confronting the same disaster event. My personal favorite is Tokio Marine, a property and casualty insurance company that deployed its key applications on a distributed platform and recovered operations within four hours of the 1995 Kobe earthquake. Meanwhile, a competitor's mainframe shop down the street required over a week to accomplish a recovery. The reason for the different outcomes: Tokio Marine had designed its systems for resource replication within the system itself. When servers in one part of its network were taken out by the earthquake, replicated elements on other servers could be brought online rapidly.
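The general idea of replication within the system can be illustrated with a small Python sketch. It is not a description of Tokio Marine's actual design: the element names, server names, and the round-robin placement rule are all assumptions made for the example.

```python
def place_replicas(elements, servers, copies=2):
    """Assign each application element to `copies` distinct servers, round-robin."""
    if copies > len(servers):
        raise ValueError("need at least as many servers as copies")
    placement = {}
    for i, element in enumerate(elements):
        placement[element] = [servers[(i + k) % len(servers)] for k in range(copies)]
    return placement


def surviving_hosts(placement, failed):
    """For each element, list the replica hosts that are still up."""
    return {
        element: [host for host in hosts if host not in failed]
        for element, hosts in placement.items()
    }


# Hypothetical application elements and servers at three separate sites.
elements = ["claims", "policies", "billing"]
servers = ["site-a-1", "site-b-1", "site-c-1"]

placement = place_replicas(elements, servers, copies=2)

# Simulate losing every server at one site.
after_outage = surviving_hosts(placement, failed={"site-a-1"})
unrecoverable = [e for e, hosts in after_outage.items() if not hosts]

print(after_outage)
print("elements with no surviving replica:", unrecoverable)
```

Because every element lives on two different servers, losing any single server leaves at least one copy of each element available to be brought online elsewhere.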

The Tokio Marine case and many others demonstrate that client/server systems can be made resilient and recoverable if disaster recovery considerations are kept clearly in mind when architects first set designs to paper. Too often, DRP/BCP practitioners lack even a rudimentary understanding of modern client/server technology and tend to approach planning with a mindset that they must not interfere with application development processes, but rather play whatever hand they are dealt.

Given an understanding of client/server technology, a good relationship with application architects, and a bit of management backing, DRP/BCP practitioners can influence the development of applications so that their availability and recoverability are enhanced. A few approaches may include:

+ Encouraging the design of applications for element portability: This means helping designers to make a priority of design techniques that enable server elements to be operated on consolidated platforms in an emergency.
+ Encouraging the standardization of application components: Suggesting that the same middleware product, for example, be used throughout the life of the application rollout can simplify the recovery process in some cases.
+ Encouraging the use of platform-agnostic messaging products: Contingency planners should help designers to prioritize recoverability as a criterion for product selection. If a messaging product requires a machine ID or other hard-coded identifier to assure the proper delivery of client-to-server or server-to-server application tasks or messages, its use could be a threat to recoverability. Opt instead for middleware products that use databases of server locations assembled at the time that the server is started and refreshed periodically afterwards. This will enable server components to be more readily restored on different hardware platforms.
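The last approach, locating server components through a database assembled at startup and refreshed periodically rather than through hard-coded identifiers, might look roughly like the following Python sketch. The `ServerRegistry` class, its method names, the time-to-live value, and the addresses are all hypothetical; real middleware products implement this idea in their own ways.

```python
import time


class ServerRegistry:
    """Maps logical component names to current network locations.

    Entries are registered when a server starts and must be refreshed
    periodically; stale entries are treated as unavailable.
    """

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (address, time of last refresh)

    def register(self, name, address):
        """Called by a server at startup and again on each periodic refresh."""
        self._entries[name] = (address, self._clock())

    def lookup(self, name):
        """Return the current address for a component, or None if unknown or stale."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, refreshed_at = entry
        if self._clock() - refreshed_at > self._ttl:
            return None
        return address


registry = ServerRegistry(ttl_seconds=30.0)

# Normal operation: the component announces itself from the production host.
registry.register("order-service", "10.0.1.15:7000")
before = registry.lookup("order-service")

# Recovery: the same component is restored on different hardware and
# simply registers its new location under the same logical name.
registry.register("order-service", "192.168.50.4:7000")
after = registry.lookup("order-service")

print(before, "->", after)
```

Clients that ask the registry for "order-service" need no change when the component moves, which is what makes restoration on different hardware easier than with addressing tied to a machine ID.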

The list could go on, but the point is that to build alternatives to one-for-one replacement in client/server system recovery, contingency planners need to concern themselves with the architecture of applications themselves. Most system architects are willing to concede that enhanced capabilities for application recovery will not be added to these apps unless someone specifically asks for them.

Conclusion: While important, the Business Continuity Planning "extensions" to traditional Disaster Recovery Planning have not contributed significantly to the methodology for systems recovery. With the advent of the Internet and the appearance of the ubiquitous web browser, nearly every application within an organization is being converted (or will be shortly) into a client/server application. This trend will facilitate recovery in other areas - for example, web-enabled applications open new doors for end user recovery strategies in which end users work from home during an emergency - but it challenges older methods for systems recovery.

Instead of continuing debate over what should be "the appropriate focus" of contingency planning (business systems versus processes) and whether a certification in one discipline is preferable to certification in the other, perhaps contingency planners need to get back to solving a fundamental problem: How will we recover mission-critical applications within the shortest possible timeframe following an unplanned interruption?
