A fault-tolerant mechanism for distributed/parallel system based on task replication techniques
Abstract:
In this work we propose a fault-tolerant mechanism for parallel programs based on task replication. We use a sequential discrete-event simulator of a distributed system subject to failures to compare our semi-active and passive approaches. The performance has been studied as a function of the program size, the processor degradation, and the system size. In addition, we compare our fault-tolerant mechanism with similar approaches. In this case, the criteria studied are the recovery time, the throughput, and the response time.
Publication year:
2002
Keywords:
- Reliability of parallel/distributed systems
- Task replication
- Fault tolerance
Source:

Document type:
Article
Status:
Restricted access
Knowledge areas:
- Computer science
Dewey subject areas:
- Computer science