Fault-tolerant systems
Article By:
Meyer, John F. Computing Research Laboratory, Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, Michigan.
Last reviewed: January 2020
DOI: https://doi.org/10.1036/1097-8542.251650
Systems, predominantly computing systems, communication systems, and other computer-based systems, that tolerate undesired changes in their internal structure, internal state, or external environment. Such changes, generally referred to as faults, may occur at any time during the lifetime of a system, from its specification through its use. Faults can be classified in a variety of ways according to when, where, why, and how they occur.
For example, in the taxonomy of faults reported by A. Avizienis and colleagues in January 2004, elementary fault classes are defined according to eight basic dichotomies. Specifically, faults that occur during specification, design, implementation, or maintenance of a system are development faults, while those that occur during use (by users of services delivered by the system) are operational faults. Other basic distinctions include where a fault occurs (internal or external, relative to some designated boundary between the system and its use environment), its phenomenological cause (natural or human-made), the objective of its cause (malicious or nonmalicious), the intent of its cause (deliberate or nondeliberate), and how long it persists (permanent or transient).
Membership in a given elementary fault class may exclude membership in another; for example, a development fault cannot be external, because the use environment does not yet exist during development. Operational faults, on the other hand, can be either internal or external and either natural or human-made. Examples include physical component failures (internal, natural), temperature and radiation stress (external, natural), mistakes by human operators who are integral parts of the system (internal, human-made), and malicious denial-of-service attacks (external, human-made).
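The classification described above can be sketched as a small data structure. The following Python fragment is only an illustration under stated assumptions: the names Phase, Boundary, Cause, Fault, and describe are hypothetical and are not taken from the taxonomy itself, and only three of the eight dichotomies are modeled. It encodes the one exclusion rule mentioned in the text (a development fault cannot be external) and the four operational examples.

```python
from dataclasses import dataclass
from enum import Enum


# Hypothetical names for three of the dichotomies named in the text.
class Phase(Enum):
    DEVELOPMENT = "development"
    OPERATIONAL = "operational"


class Boundary(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Cause(Enum):
    NATURAL = "natural"
    HUMAN_MADE = "human-made"


@dataclass(frozen=True)
class Fault:
    name: str
    phase: Phase
    boundary: Boundary
    cause: Cause

    def __post_init__(self):
        # Exclusion stated in the text: the use environment does not exist
        # during development, so a development fault cannot be external.
        if self.phase is Phase.DEVELOPMENT and self.boundary is Boundary.EXTERNAL:
            raise ValueError(f"{self.name}: a development fault cannot be external")


def describe(fault: Fault) -> str:
    """Return a one-line summary of the classes a fault belongs to."""
    return (f"{fault.name}: {fault.phase.value}, "
            f"{fault.boundary.value}, {fault.cause.value}")


# The four operational examples given in the text.
OPERATIONAL_EXAMPLES = [
    Fault("physical component failure", Phase.OPERATIONAL, Boundary.INTERNAL, Cause.NATURAL),
    Fault("temperature and radiation stress", Phase.OPERATIONAL, Boundary.EXTERNAL, Cause.NATURAL),
    Fault("operator mistake", Phase.OPERATIONAL, Boundary.INTERNAL, Cause.HUMAN_MADE),
    Fault("denial-of-service attack", Phase.OPERATIONAL, Boundary.EXTERNAL, Cause.HUMAN_MADE),
]

if __name__ == "__main__":
    for fault in OPERATIONAL_EXAMPLES:
        print(describe(fault))
```

Treating each dichotomy as its own enumeration keeps the elementary classes independent, while the check in the constructor is one possible place to record combinations that the taxonomy rules out.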