Researchers of the CRISP consortium, a team of four companies and two universities in the Netherlands, Germany, and Finland, demonstrate a self-testing and self-repairing chip at the DATE2011 conference in Grenoble. The self-repairing chip anticipates a future in which further miniaturization makes on-chip components and connections more fragile.
The CRISP consortium (CRISP: Cutting edge Reconfigurable ICs for Stream Processing) developed new concepts for run-time resource management to attain the goal of self-repair: while in operation, the chip tests cores and connections, and a resource manager dynamically assigns the chip’s tasks to fault-free parts.
In itself, the downsizing of chip technology is good news, as it allows, for example, our mobile phones to become ever more powerful. The downside of extreme downscaling is that processes are starting to run into physical limitations, which result in lower production yields and earlier breakdown of functional chips. “Because of the rapidly growing transistor density on chips, it has become a real challenge to ensure high system dependability”, says Hans Kerkhoff, Associate Professor, CTIT, University of Twente.
To address the question of how to make future miniaturized chips more reliable rather than less robust, the CRISP consortium researched how chips can test and repair themselves. Its approach combines a test for faulty components and connections on the chip with a run-time resource manager, which assigns tasks and communication channels to known-good components and pathways. This allows many-core chips with some faulty cores to pass production test, since they will still function at the full 100%, without any compromise to reliability.
How is it possible that the chips can function at 100% with faulty components? “The solution is not to make non-degradable chips, it’s to make architectures that can degrade while they keep functioning, which we call graceful degradation. With the right dependability infrastructure many-cores can be a solution”, says Hans Kerkhoff. The chips have many cores, and each core performs subtasks of a more complex application: satellite navigation, for instance, comprises many digital signal processing tasks. A run-time resource manager dynamically determines which core performs which task. Because it makes no difference which core does what, cores can swap tasks and take over the work of failing cores. In this way the chip repairs itself and extends its own lifetime. Bart Vermeulen, Senior Principal Scientist at NXP, states: “Combining testing for faulty components and a run-time resource manager forms the heart of a flexible reconfigurable chip, that can handle changing tasks and failing components during its entire fit life.” The resource manager continuously determines the chip’s optimum Quality of Service on fault-free components.
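The remapping idea described above can be illustrated with a small sketch. It is not the CRISP implementation: the class `ResourceManager`, its methods, the task names, and the "lowest-numbered free core" placement policy are all assumptions made for illustration. The sketch only shows how tasks can be moved off a core once a self-test marks it as faulty.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy run-time resource manager (hypothetical, not the CRISP design)."""
    num_cores: int
    faulty: set = field(default_factory=set)         # cores that failed a self-test
    assignment: dict = field(default_factory=dict)   # task name -> core index

    def free_cores(self):
        """Cores that are fault-free and not running a task, in index order."""
        busy = set(self.assignment.values())
        return [c for c in range(self.num_cores)
                if c not in self.faulty and c not in busy]

    def assign(self, task):
        """Place a task on the lowest-numbered free, fault-free core."""
        free = self.free_cores()
        if not free:
            raise RuntimeError(f"no fault-free core available for {task!r}")
        self.assignment[task] = free[0]
        return free[0]

    def report_fault(self, core):
        """Record a failed self-test and move any task off the faulty core."""
        self.faulty.add(core)
        displaced = [t for t, c in self.assignment.items() if c == core]
        for task in displaced:
            del self.assignment[task]
            self.assign(task)


# Assumed example: three signal-processing subtasks on a four-core chip.
manager = ResourceManager(num_cores=4)
for name in ("correlate", "filter", "decode"):
    manager.assign(name)
manager.report_fault(1)  # the core running "filter" fails its self-test
print(manager.assignment)
```

In this sketch the application keeps running on the three remaining good cores. A real resource manager would also have to reroute communication channels and weigh Quality of Service, which the sketch leaves out.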
The resource manager works throughout the chip’s lifetime to keep it up and running. Its primary function is to dynamically assign new tasks to free resources. This makes it possible to benefit fully from the large processing power of many-core chips, and it creates the much-desired flexibility to adapt to new tasks and standards during the functional life of the chip.
The CRISP project is co-funded by the European Union under the Seventh Framework Programme (FP7).