A fault-tolerant mechanism for distributed/parallel system based on task replication techniques


Abstract:

In this work we propose a fault-tolerant mechanism for parallel programs based on task replication. We use a sequential discrete-event simulator of a distributed system subject to failures to compare our semi-active and passive approaches. The performance has been studied as a function of the program size, the processor degradation, and the system size. In addition, we compare our fault-tolerant mechanism with similar approaches. In this case, the criteria studied are the recovery time, the throughput, and the response time.

Publication year:

2002

Keywords:

  • Reliability of parallel/distributed systems
  • Task replication
  • Fault tolerance

Source:

Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Computer science

Dewey subject areas:

  • Computer science