TY - JOUR
T1 - A scalable built-in self-recovery (BISR) VLSI architecture and design methodology for 2D-mesh based on-chip networks
AU - Chen, Kun Chih
AU - Lin, Shu Yen
AU - Shen, Wen Chung
AU - Wu, An Yeu
N1 - Funding Information: This work was supported by the National Science Council of Taiwan under Grants NSC 98-2220-E-002-034 and NSC 97-2221-E-002-239-MY3.
PY - 2011/6
Y1 - 2011/6
N2 - On-chip networks (OCNs) have been proposed to solve complex on-chip communication problems. In the very-deep-submicron era, OCNs are also affected by on-chip faults as technology nodes shrink. Much prior research has focused on fault detection and diagnosis in OCN systems, but these approaches do not consider recovery of a faulty OCN system. This paper proposes a scalable built-in self-recovery (BISR) design methodology and a corresponding Surrounding Test Ring (STR) architecture for 2D-mesh-based OCNs to extend prior diagnosis work. The BISR design methodology consists of STR architecture generation, faulty system recovery, and system correctness maintenance. For an n×n mesh, the STR architecture contains one controller and 4n test modules connected in a ring surrounding the OCN. These test modules generate test patterns for fault diagnosis during warm-up time, and the faulty system is recovered according to the diagnosis results. Finally, this paper proposes a fault-tolerant routing algorithm, Through-Path Fault-Tolerant (TP-FT) routing, to maintain the correctness of the faulty system. In our experiments, the proposed approach reduces unreachable packets by 68.33% to 79.31% and latency by 4.86% to 23.6% compared with the traditional approach, at an area overhead of 8.48% to 13.3%.
AB - On-chip networks (OCNs) have been proposed to solve complex on-chip communication problems. In the very-deep-submicron era, OCNs are also affected by on-chip faults as technology nodes shrink. Much prior research has focused on fault detection and diagnosis in OCN systems, but these approaches do not consider recovery of a faulty OCN system. This paper proposes a scalable built-in self-recovery (BISR) design methodology and a corresponding Surrounding Test Ring (STR) architecture for 2D-mesh-based OCNs to extend prior diagnosis work. The BISR design methodology consists of STR architecture generation, faulty system recovery, and system correctness maintenance. For an n×n mesh, the STR architecture contains one controller and 4n test modules connected in a ring surrounding the OCN. These test modules generate test patterns for fault diagnosis during warm-up time, and the faulty system is recovered according to the diagnosis results. Finally, this paper proposes a fault-tolerant routing algorithm, Through-Path Fault-Tolerant (TP-FT) routing, to maintain the correctness of the faulty system. In our experiments, the proposed approach reduces unreachable packets by 68.33% to 79.31% and latency by 4.86% to 23.6% compared with the traditional approach, at an area overhead of 8.48% to 13.3%.
KW - Built-in self-recovery (BISR)
KW - Design-for-testability (DfT)
KW - On-chip networks (OCN)
KW - Surrounding test ring (STR)
KW - Through-path fault-tolerant (TP-FT) routing
UR - http://www.scopus.com/inward/record.url?scp=80051667168&partnerID=8YFLogxK
U2 - 10.1007/s10617-011-9074-6
DO - 10.1007/s10617-011-9074-6
M3 - Article
AN - SCOPUS:80051667168
SN - 0929-5585
VL - 15
SP - 111
EP - 132
JO - Design Automation for Embedded Systems
JF - Design Automation for Embedded Systems
IS - 2
ER -