Beyond redundancy : how geographic redundancy can improve service availability and reliability of computer-based systems /
Eric Bauer, Randee Adams, Dan Eustace.
1 PDF (xxvi, 304 pages) : illustrations.
Figures xv -- Tables xix -- Equations xxi -- Preface and Acknowledgments xxiii -- Audience xxiv -- Organization xxiv -- Acknowledgments xxvi -- PART 1 BASICS 1 -- 1 SERVICE, RISK, AND BUSINESS CONTINUITY 3 -- 1.1 Service Criticality and Availability Expectations 3 -- 1.2 The Eight-Ingredient Model 4 -- 1.3 Catastrophic Failures and Geographic Redundancy 7 -- 1.4 Geographically Separated Recovery Site 11 -- 1.5 Managing Risk 12 -- 1.6 Business Continuity Planning 14 -- 1.7 Disaster Recovery Planning 15 -- 1.8 Human Factors 17 -- 1.9 Recovery Objectives 17 -- 1.10 Disaster Recovery Strategies 18 -- 2 SERVICE AVAILABILITY AND SERVICE RELIABILITY 20 -- 2.1 Availability and Reliability 20 -- 2.2 Measuring Service Availability 25 -- 2.3 Measuring Service Reliability 33 -- PART 2 MODELING AND ANALYSIS OF REDUNDANCY 35 -- 3 UNDERSTANDING REDUNDANCY 37 -- 3.1 Types of Redundancy 37 -- 3.2 Modeling Availability of Internal Redundancy 44 -- 3.3 Evaluating High-Availability Mechanisms 52 -- 4 OVERVIEW OF EXTERNAL REDUNDANCY 59 -- 4.1 Generic External Redundancy Model 59 -- 4.2 Technical Distinctions between Georedundancy and Co-Located Redundancy 74 -- 4.3 Manual Graceful Switchover and Switchback 75 -- 5 EXTERNAL REDUNDANCY STRATEGY OPTIONS 77 -- 5.1 Redundancy Strategies 77 -- 5.2 Data Recovery Strategies 79 -- 5.3 External Recovery Strategies 80 -- 5.4 Manually Controlled Recovery 81 -- 5.5 System-Driven Recovery 83 -- 5.6 Client-Initiated Recovery 85 -- 6 MODELING SERVICE AVAILABILITY WITH EXTERNAL SYSTEM REDUNDANCY 98 -- 6.1 The Simplistic Answer 98 -- 6.2 Framing Service Availability of Standalone Systems 99 -- 6.3 Generic Markov Availability Model of Georedundant Recovery 103 -- 6.4 Solving the Generic Georedundancy Model 115 -- 6.5 Practical Modeling of Georedundancy 121 -- 6.6 Estimating Availability Benefit for Planned Activities 130 -- 6.7 Estimating Availability Benefit for Disasters 131 -- 7 UNDERSTANDING RECOVERY TIMING PARAMETERS 133 -- 7.1 Detecting Implicit 
Failures 134 -- 7.2 Understanding and Optimizing RTO 141 -- 8 CASE STUDY OF CLIENT-INITIATED RECOVERY 147 -- 8.1 Overview of DNS 147 -- 8.2 Mapping DNS onto Practical Client-Initiated Recovery Model 148 -- 8.3 Estimating Input Parameters 154 -- 8.4 Predicted Results 165 -- 8.5 Discussion of Predicted Results 172 -- 9 SOLUTION AND CLUSTER RECOVERY 174 -- 9.1 Understanding Solutions 174 -- 9.2 Estimating Solution Availability 177 -- 9.3 Cluster versus Element Recovery 179 -- 9.4 Element Failure and Cluster Recovery Case Study 182 -- 9.5 Comparing Element and Cluster Recovery 186 -- 9.6 Modeling Cluster Recovery 187 -- PART 3 RECOMMENDATIONS 201 -- 10 GEOREDUNDANCY STRATEGY 203 -- 10.1 Why Support Multiple Sites? 203 -- 10.2 Recovery Realms 204 -- 10.3 Recovery Strategies 206 -- 10.4 Limp-Along Architectures 207 -- 10.5 Site Redundancy Options 208 -- 10.6 Virtualization, Cloud Computing, and Standby Sites 216 -- 10.7 Recommended Design Methodology 217 -- 11 MAXIMIZING SERVICE AVAILABILITY VIA GEOREDUNDANCY 219 -- 11.1 Theoretically Optimal External Redundancy 219 -- 11.2 Practically Optimal Recovery Strategies 220 -- 11.3 Other Considerations 228 -- 12 GEOREDUNDANCY REQUIREMENTS 230 -- 12.1 Internal Redundancy Requirements 230 -- 12.2 External Redundancy Requirements 233 -- 12.3 Manually Controlled Redundancy Requirements 235 -- 12.4 Automatic External Recovery Requirements 237 -- 12.5 Operational Requirements 242 -- 13 GEOREDUNDANCY TESTING 243 -- 13.1 Georedundancy Testing Strategy 243 -- 13.2 Test Cases for External Redundancy 246 -- 13.3 Verifying Georedundancy Requirements 247 -- 13.4 Summary 254 -- 14 SOLUTION GEOREDUNDANCY CASE STUDY 256 -- 14.1 The Hypothetical Solution 256 -- 14.2 Standalone Solution Analysis 259 -- 14.3 Georedundant Solution Analysis 263 -- 14.4 Availability of the Georedundant Solution 269 -- 14.5 Requirements of Hypothetical Solution 269 -- 14.6 Testing of Hypothetical Solution 277 -- Summary 285 -- Appendix: Markov Modeling of Service
Availability 292 -- Acronyms 296 -- References 298 -- About the Authors 300 -- Index 302.
Restricted to subscribers or individual electronic text purchasers.
"While geographic redundancy can obviously be a huge benefit for disaster recovery, it is far less obvious what benefit is feasible and likely for more typical non-catastrophic hardware, software, and human failures. Georedundancy and Service Availability provides both a theoretical and practical treatment of the feasible and likely benefits of geographic redundancy for both service availability and service reliability. The text provides network/system planners, IS/IT operations folks, system architects, system engineers, developers, testers, and other industry practitioners with a general discussion about the capital expense/operating expense tradeoff that frames system redundancy and georedundancy"