Deviation-Tolerant Computation in Concurrent Failure-Prone Hardware

Phillip Stanley-Marbell and Diana Marculescu, TU/e ES reports, 2008.

Abstract

In many applications of computing systems, particularly those that process data samples from real-world signals, it is possible to trade off accuracy of computation in the presence of hardware faults for performance or energy efficiency. Such trade-offs may be even more pronounced in platforms that employ multiple processing elements, as one may then also consider trade-offs between the speed of communication between processing elements (and hence the computation throughput) and the possibility of faults in such communication (and hence possible errors in a computation’s result).

We present analyses of the relation between faults occurring in compute hardware or in communicated program state (in a multiprocessor system) and the resulting deviations in the values of source-level program variables. These relations depend on the distributions of values taken on by program variables of different data types in the absence of faults, and we present detailed characterizations of these distributions for a large collection of programs. We show how the analytic derivations, in conjunction with the empirical characterizations, can enable the implementation of deviation-tolerant transformations in programs. The work is presented in the context of a hardware platform we have designed and implemented, containing 24 processing elements, that exhibits trade-offs between occurrences of faults in hardware, performance, and energy efficiency.
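
As a rough illustration of how a hardware fault can surface as a deviation in a program variable, the sketch below flips a single bit in an unsigned integer and reports the resulting change in value. It is not taken from the report: the single-bit-flip fault model, the 32-bit word width, and the helper names `flip_bit`, `deviation`, and `relative_deviation` are all assumptions made for this example.

```python
def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Return `value` with one bit inverted, treated as an unsigned word.

    Assumed fault model (hypothetical, for illustration only): exactly one
    bit of a `width`-bit unsigned integer is inverted.
    """
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def deviation(value: int, bit: int, width: int = 32) -> int:
    """Signed difference between the faulty and the fault-free value."""
    mask = (1 << width) - 1
    return flip_bit(value, bit, width) - (value & mask)


def relative_deviation(value: int, bit: int, width: int = 32) -> float:
    """Absolute deviation divided by the fault-free value (value must be > 0)."""
    if value <= 0:
        raise ValueError("relative deviation needs a positive value")
    return abs(deviation(value, bit, width)) / value
```

Under this model the absolute deviation for a flip at bit position k is always 2^k, while its sign depends on whether the bit was set. The relative deviation depends on the fault-free value itself: a flip in a high-order bit of a small value is proportionally far larger than the same flip in a large value. That is one simple way to see why the distribution of values a variable takes on matters.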

Cite as

P. Stanley-Marbell and D. Marculescu. “Deviation-Tolerant Computation in Concurrent Failure-Prone Hardware”. Technische Universiteit Eindhoven, TU/e ES Tech Report, Number ESR-2008-01, 2008.

BibTeX

@article{stanley2008deviation,
  title={Deviation-Tolerant Computation in Concurrent Failure-Prone Hardware},
  author={Stanley-Marbell, Phillip and Marculescu, Diana},
  journal={ES reports},
  volume={2008},
  year={2008},
  publisher={Technische Universiteit Eindhoven}
}