B4: Hardware Monitoring System and Design Optimisation for Invasive Architectures

Principal Investigators:

Prof. Schmitt-Landsiedel, Prof. Schlichtmann

Scientific Researchers:

Qingqing Chen, Elisabeth Glocker, Shushanik Karapetyan, Daniel Müller-Gritschneder, Christoph Werner, Martin Wirnshofer

Abstract

Subproject B4 is dedicated to the assessment of operating conditions of the invasive computing hardware, the communication of this information and optimisation of the required monitoring resources. To measure these operating conditions, monitor circuits are designed and their number and placement in the invasive architecture are optimised. A parameterised model for this status information, including power consumption, temperature and maximum possible performance of computing blocks as well as their ageing-related degradation status, is provided for system simulation, optimisation and emulation on the FPGA-demonstrator.
In the first funding phase, Project B4 has developed concepts for monitoring invasive computing systems (both RISC and TCPA tiles). Specifically, concepts for monitoring power, temperature and ageing have been investigated. Communication interfaces between the monitors and higher levels of invasive computing systems have been explored. A control loop concept has been developed. For the essential monitoring concepts, a method has been developed to emulate them on an FPGA. The major challenge for FPGA emulation was that most monitors contain analogue circuits. With the achieved FPGA emulation, our concepts can be evaluated in the context of an entire invasive computing system even without an ASIC hardware implementation.

Synopsis

Integrated circuits today and even more in the future are subject to significant variations—between different manufactured components (resulting from fluctuations in the manufacturing process) as well as over space (e.g., "hot spots" due to heavy local switching activity) and time (short-term resulting from fluctuations in the operating conditions such as supply voltage and temperature; long-term resulting from degradation effects due to ageing). Therefore, different processing elements even on the same invasive IC can exhibit significantly different processing capabilities and susceptibility to degradation resulting from processing loads. This also results in differing risk of IC failures.
Resource-aware programming as one of the most essential points of innovation of invasive computing shall enable an application to make its decisions for execution based on actual physical hardware properties. In order to allow invasive algorithms to exploit the state of the invasive hardware for optimal distribution of the load, this subproject will provide means to measure and communicate the specific status of a processing element. This requires new ways of hardware design optimisation specifically in view of the new capabilities of invasion, including the design of dedicated monitor circuits. This project considers optimisation strategies and design of corresponding circuits and interfaces including the demonstration by simulation, emulation on the FPGA hardware prototype platform and later on by implementation on ASIC hardware prototypes.
This comprises classification of potential monitor types and interfacing systems, circuit design and analysis. It also includes algorithmic analysis and optimisation with respect to the complete invasive system, to calibrate monitors and to optimise their number, their performance regarding accuracy and speed, and their placement. Interfacing and information propagation have to be optimised to ensure the best possible utilisation of each processor block based on its individual capabilities. This will also potentially reduce manufacturing costs of invasive architecture ICs, as processor blocks can be utilised according to their individual capabilities, rather than having to discard processors that do not meet predefined performance requirements.
The activities in the first funding phase will also lay the groundwork to enable optimised invasive architecture implementations in ASICs in the second and third funding phase, by optimising power consumption and reducing susceptibility to manufacturing variations and age-dependent degradation.

Research goal in first funding phase

With the introduced resource awareness of an invasive computing system, applications have the ability to explore the system and make decisions for execution (e.g. number and selection of invaded cores) based on the current state of the hardware platform, including physical hardware properties. Realising an invasive multi-tile architecture requires a closed-loop control system between applications, run-time support system (OctoPOS), agent system and the underlying hardware including the monitoring system. The monitoring system provides the monitoring data needed to control the physical hardware conditions and to use knowledge about hardware health during resource allocation in the invade phases. This becomes even more important (especially with thousands or more processors integrated on a single chip) when considering the significantly different processing capabilities and susceptibility to degradation of modern integrated circuits as compared to older and more robust processes. The research goal of Project B4 for Phase I has therefore been to measure the specific status of hardware elements, preprocess these data and implement the overall monitoring system. To effectively monitor the invasive hardware, different parameters, such as temperature evolution, power consumption, and maximum and age-dependent performance capability, have to be monitored. A foundation for this was developed in Phase I. The monitoring of the latter two parameters (maximum and age-dependent performance capability) will be implemented in Phase II. This information is communicated with different levels of detail to other system components: higher hardware layers, run-time support system (OctoPOS), agent system and applications. The system is then able to act on the monitor information, e.g. to choose appropriate processing elements during the invade phase or to react if a critical status is detected.
In turn, these actions may influence the status of hardware elements and with that the measured monitoring data. Project B4 has considered optimisation strategies and the design of corresponding interfaces, including the demonstration by simulation and emulation on the FPGA hardware prototyping platform.

Methods

Hardware-monitoring concept and models:

In-situ delay monitoring: In [Wirnshofer, ISIC 2011] and [Wirnshofer, DDECS 2011], we have demonstrated the monitoring of the maximum possible performance in terms of speed and frequency by in-situ delay monitoring. Before Phase I, published performance/speed monitors were mostly critical path replicas [A. Drake, et al., "A distributed critical-path timing monitor for a 65nm high-performance microprocessor", ISSCC 2007]. In [Wirnshofer, DDECS 2012] we have demonstrated the use of in-situ delay monitors in adaptive voltage scaling (AVS) and have evaluated the performance improvement and power saving potential. In-situ delay monitors are enhanced flip-flops that observe the timing of the circuit. Critical, but not yet erroneous signal transitions are detected as pre-errors. The pre-error rate is used as an indicator of whether the remaining timing slack of the circuit is sufficient. With these in-situ delay monitors, all kinds of variation and ageing effects are detected inside the real circuit and thus reliable performance information is provided. When this monitor type is used in an online AVS technique, the supply voltage can be regulated during normal circuit operation, without the need for test intervals. In [Aryan, ARS 2012], different designs to implement in-situ delay monitors have been presented and the reliability of the timing information as well as the power overhead have been carefully analysed.
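The regulation idea described above can be illustrated with a minimal sketch. It assumes a simple threshold controller; the thresholds, voltage step and voltage limits are hypothetical values chosen for illustration and are not taken from the cited publications.

```python
def next_supply_voltage(v_now, pre_errors, cycles,
                        low_rate=1e-4, high_rate=1e-2,
                        step=0.01, v_min=0.7, v_max=1.2):
    """Return the supply voltage (in volts) for the next observation window.

    All default parameters are hypothetical illustration values.
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    # Fraction of clock cycles in which a late, but still correct, transition was seen.
    rate = pre_errors / cycles
    if rate > high_rate:
        # Timing slack is nearly used up: raise the voltage to speed the circuit up.
        v_next = v_now + step
    elif rate < low_rate:
        # Plenty of slack is left: lower the voltage to save power.
        v_next = v_now - step
    else:
        # Pre-error rate is inside the target band: keep the voltage.
        v_next = v_now
    # Clamp to the allowed supply range.
    return round(min(v_max, max(v_min, v_next)), 3)
```

Because the decision uses only pre-errors, which are not yet real timing errors, such a controller could run during normal operation, in line with the online AVS scheme described above.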
Ageing monitoring: Before Phase I, monitors that determine more advanced system features (e.g. ageing status) were just attracting initial research efforts in the research community.
In [Lorenz, Tech. Report, 2011], we have demonstrated an innovative approach to periodically monitor the ageing of ICs during operation. The basic concept is to identify all paths that potentially might become critical during the lifetime of an IC. As different paths can age at different rates, the critical path can change during the life of an IC. Ageing depends on operating and environmental conditions and therefore cannot be determined exactly before an IC is actually being used. But it is possible to identify a range within which the delay of a path will always be, regardless of where specifically it resides within the manufacturing window and what operating conditions (temperature, supply voltage, or switching activity) it will experience. It turns out that if this window is considered, for many circuits the number of paths that can potentially become critical is reduced significantly, often by one or two orders of magnitude. Therefore, it appears to be an option to test these paths periodically during the operation of an IC to detect any ageing that might endanger correct computation. This approach can be considered as a supplement or an alternative to the methods discussed above.
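The path-pruning idea can be sketched as follows, under the simplifying assumption that each path is described only by a lower and an upper delay bound and that correlations between paths are ignored. The path names and delay values are hypothetical.

```python
def potentially_critical_paths(delay_windows):
    """Return the paths that could become the slowest path at some point.

    delay_windows maps a path name to (min_delay, max_delay), the range the
    path delay can take over all manufacturing and operating conditions.
    """
    # The critical path delay can never be smaller than the largest lower bound.
    threshold = max(low for low, _high in delay_windows.values())
    # A path whose upper bound stays below that threshold can never be critical.
    return sorted(name for name, (_low, high) in delay_windows.items()
                  if high >= threshold)


# Hypothetical delay windows in nanoseconds.
windows = {
    "p1": (0.90, 1.20),
    "p2": (0.70, 0.95),
    "p3": (0.40, 0.60),
    "p4": (0.85, 1.00),
}
candidates = potentially_critical_paths(windows)  # p3 is pruned
```

Only the remaining candidates would then need to be tested periodically during operation.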
The research presented in [Knoth, PATMOS 2011], [Knoth, DATE 2012], [Chen, PATMOS 2011], [Li, TCAD 2013], [Li, TCAD 2012] and [Chen, IET CDS 2012] addresses related topics. This work will become especially useful for a future ASIC design of the invasive multi-tile architectures.
In [Knoth, PATMOS 2011], SWAT, a highly optimised statistical timing analyser for digital circuits has been presented that combines the accuracy of a transistor-level analysis with the performance of a gate-level analysis. SWAT is based upon a CSM (current source model) for logic cells which considers transistor ageing and process variation and employs waveform truncation and dedicated solvers to significantly improve analysis performance without noticeable loss of accuracy. Parameter variations and ageing can be handled by Monte Carlo simulations and by a special sensitivity propagation mode, which expresses arrival times as a function of local and global parameter variations. This will allow very fast, yet accurate analysis of an ASIC design, considering variations and ageing to ensure very robust InvasIC ASIC design. In [Knoth, DATE 2012], the emphasis is put on power analysis instead of timing analysis.
In [Chen, PATMOS 2011], a flip-flop timing model has been presented that allows interdependency of different computation stages to be analysed via a static timing analysis at gate level. This is done by breaking the timing boundaries by explicitly building the functional relationship between clock-to-q delay and timing parameters at the flip-flop data input. Ageing effects HCI (hot carrier injection) and NBTI (negative bias temperature instability) are also considered in the modelling to pave the way for precise and realistic ageing analysis. Application of this approach in system emulation and later on also ASIC design will improve design performance even further.
[Li, TCAD 2013] has investigated the challenges in hierarchical timing analysis considering process variations. With abstract statistical timing models containing interfacing constraints, this flow can reduce the complexity of design and verification of large SoC systems effectively. For each of the three basic circuit types (combinational, flip-flop-based and latch-controlled) methods to extract statistical timing models are proposed to prune the unnecessary timing information from the underlying modules. With additional methods for the reconstruction of correlation between modules and for system-level verification, the complete framework is several times faster than analysing the flattened circuit directly, therefore providing an efficient flow for statistical timing verification of invasive multi-tile architectures.
[Li, TCAD 2012] has evaluated the statistical timing performance of circuits with level-sensitive latches, which are widely used in high-performance designs, such as CPUs. Circuits of this type, however, impose more complexity in timing analysis due to latch transparency. With reduced iterations and graph transformations, the proposed method extracts setup-time constraints at latches and across sequential loops very efficiently, more than ten times faster than other state-of-the-art methods, while still maintaining a good accuracy in the computed minimum clock period in a parametric form. The proposed method contributes a fast tool for statistical timing evaluation in the optimisation iterations of invasive computing systems, in which the aforementioned latch circuits always serve as the source of flexibility and robustness.
[Chen, IET CDS 2012] has introduced a modelling framework for the timing behaviour of a flip-flop by building a nonlinear functional relationship between the clock-to-q delay and the data/clock alignment. The proposed framework makes it possible to carry out static timing analyses at gate level taking into consideration the interdependency of different computation stages. An iterative timing analysis method has been developed to find out whether a circuit can work at a given clock frequency and to determine the minimal acceptable clock period of the circuit. The new method will be able to further improve the performance and the yield of the ASIC design for invasive multi-tile architectures, especially when process variations and ageing are considered.
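To illustrate why such an analysis is iterative, the following sketch models a ring of flip-flop stages in which the clock-to-q delay of each flip-flop depends on the setup slack at its data input. The exponential clock-to-q model, its constants and the stage delays are hypothetical assumptions; this is not the model or algorithm of the cited paper, only a toy instance of the same kind of fixed-point problem.

```python
import math


def clock_to_q(slack, nominal=0.10, k=0.02, tau=0.05):
    """Hypothetical model: clock-to-q delay (ns) grows as setup slack shrinks."""
    return nominal + k * math.exp(-slack / tau)


def works_at(period, stage_delays, min_slack=0.0, max_iter=100, tol=1e-9):
    """Fixed-point check whether a ring of stages meets timing at this period."""
    n = len(stage_delays)
    c2q = [0.10] * n  # start from the nominal clock-to-q delay
    for _ in range(max_iter):
        updated = []
        for i in range(n):
            # Data launched by the previous flip-flop travels through stage i.
            arrival = c2q[i - 1] + stage_delays[i]
            slack = period - arrival
            if slack < min_slack:
                return False  # setup violation
            updated.append(clock_to_q(slack))
        if max(abs(a - b) for a, b in zip(updated, c2q)) < tol:
            return True  # clock-to-q delays have converged
        c2q = updated
    return False  # treat non-convergence as a failure


def min_clock_period(stage_delays, lo=0.0, hi=10.0, resolution=1e-4):
    """Bisection for the smallest period (ns) at which the ring still works."""
    if not works_at(hi, stage_delays):
        raise ValueError("circuit does not work even at the upper bound")
    while hi - lo > resolution:
        mid = (lo + hi) / 2
        if works_at(mid, stage_delays):
            hi = mid
        else:
            lo = mid
    return hi


t_min = min_clock_period([0.8, 0.5])  # hypothetical stage delays in ns
```

With these numbers a fixed clock-to-q of 0.10 ns would suggest a period of 0.90 ns, but the ring fails at exactly that period because the reduced slack in one stage lengthens the clock-to-q delay feeding the next one.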

Implementation and emulation for FPGA demonstrator platform:

Since it has been decided that there will be no full InvasIC ASIC implementation, our focus for the first funding phase has changed somewhat: Instead of preparing the hardware demonstrator (ASIC implementation) as originally proposed, we have worked on the modelling, implementation and emulation of the monitors on the FPGA demonstrator platform, to enable an FPGA emulation of an invasive multi-tile architecture in close cooperation with the whole CRC, especially together with Project B2 and Project B3.
Before the start of Phase I, monitoring of parameters such as power consumption, temperature and performance (in terms of speed or maximum operating frequency) was already state of the art in modern high-performance microprocessors, [Duarte et al., "Temperature sensor design in a high volume manufacturing 65nm CMOS digital process", CICC 2007] and [Tschanz et al. "Adaptive frequency and biasing techniques for tolerance to dynamic temperature-voltage variations and aging", ISSCC 2007]. These monitor data were used, e.g. for power limitation by frequency or supply voltage control or for complete shutdown of processing elements to prevent damage [Rotem et al. "Temperature measurement in the Intel Core™ Duo processor", 2006]. But they were not used to control and optimise the complete system. So, no sophisticated interaction between the physical parameter level and the run-time support system or application layer was present in existing processors.
In the invasive computing architecture, hardware monitors, which are a necessary part of the resource management feedback control loop, have been included. Consequently those hardware monitors must also be included in the prototyping system. Hardware monitors such as processor core load, communication link load (e.g. AHB bus load and iNoC load) and memory access (e.g. cache miss rate) monitors that are fully digital circuits are easy to implement using the digital logic resources of an FPGA. However, other hardware monitors that are usually realised as analogue circuits are difficult to implement in the prototyping system, since our FPGA demonstrator platform, the Synopsys CHIPit system, is based on digital FPGA technology without any reconfigurable analogue circuit resources. Therefore, for FPGA prototyping, we have taken a real-time emulation approach for such analogue monitors including power monitors, temperature monitors and subsequently will take this approach also for ageing monitors in Phase II.
The figure below shows the structure of the implemented circuit of our real-time emulation approach for power and temperature monitoring.

Layout of the emulated real-time monitoring system for power and temperature monitoring.

Power monitoring and emulation on FPGA: Power monitors for processor cores of the RISC compute tiles, i.e. the LEON3 cores, have been emulated using a run-time instruction-energy look-up approach: An instruction-energy look-up table (LUT), containing pre-characterised average energy consumption values for each kind of processor instruction, is consulted whenever a processor core executes a new instruction. For a predefined time period (in accordance with the monitoring frequency), the energy values per instruction are accumulated, and the accumulated value is divided by the period length at the end of the period to produce the power value for that period. Power monitor emulation for tightly-coupled processor arrays (TCPAs) has taken a different approach from that for LEON3 cores, since TCPA processing elements (PEs) are based on a VLIW architecture supporting instruction-level parallelism (ILP). Therefore, a simple energy LUT construction and fast instruction type determination at run time are not feasible, and an event-counter-based energy model has been applied instead: Pre-characterised energy consumption values for subprocessor modules (e.g. ALU, register file and instruction decoder) are summed up and accumulated based on the event counter status. As for LEON3 cores, the accumulated values are divided by time to produce power values for a predefined time period. The emulated real-time power consumption information as well as the accumulated energy data can be communicated to higher system levels for observation purposes and for the evaluation of power and energy management strategies. In addition, the power values are used for the emulation of temperature monitors.
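Both power models reduce to accumulating pre-characterised energy values over one monitoring period and dividing by the period length. The sketch below shows this in software; the instruction classes, module names and energy values are hypothetical placeholders, not the characterised values used on the demonstrator.

```python
# Hypothetical average energy per instruction class, in nanojoules.
INSTRUCTION_ENERGY_NJ = {"alu": 0.20, "load": 0.35, "store": 0.30, "branch": 0.25}

# Hypothetical energy per event for subprocessor modules, in nanojoules.
MODULE_ENERGY_NJ = {"alu": 0.15, "register_file": 0.05, "decoder": 0.02}


def lut_power_mw(instructions, period_us):
    """Instruction-energy look-up: one LUT access per executed instruction."""
    energy_nj = sum(INSTRUCTION_ENERGY_NJ[kind] for kind in instructions)
    return energy_nj / period_us  # nJ per microsecond equals mW


def event_counter_power_mw(event_counts, period_us):
    """Event-counter model: module energy weighted by the counted events."""
    energy_nj = sum(MODULE_ENERGY_NJ[module] * count
                    for module, count in event_counts.items())
    return energy_nj / period_us  # nJ per microsecond equals mW


# Four instructions executed by a RISC core within a 1 us monitoring period.
risc_power = lut_power_mw(["alu", "alu", "load", "branch"], period_us=1.0)

# Event counter readings of one processing element over a 2 us period.
pe_power = event_counter_power_mw(
    {"alu": 10, "register_file": 20, "decoder": 10}, period_us=2.0)
```

The event-counter variant avoids classifying each VLIW instruction word at run time, since only per-module activity counts are needed.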
Temperature monitoring and emulation on FPGA: For the real-time emulation of the temperature monitor for RISC tiles, an approach based on a power-temperature look-up table has been used. The LUT contains the resulting steady-state temperature for all possible power consumption values (for a predefined time period) received from the power monitor. Those temperature values are pre-characterised based on a thermal RC model: In this approach, the input power leads to a temperature difference because of thermal resistances (modelling steady-state behaviour) and thermal capacitances (modelling transient behaviour) that both describe the processor architecture environment. The temperature values for the LUT are obtained for every core (as the maximum steady-state temperature of the cores) taking all possible average power values and all possible placements for "active" cores into account and considering not only the core's own activity leading to a specific temperature value, but also the influence of neighbour core activities on this temperature. These results are mapped to LUT entries and are used to obtain the resulting steady-state temperature for every core for the predefined time period (in accordance with the monitoring frequency). For TCPA tiles, the same approach has been applied, but the different architecture has made it necessary to use another thermal RC model and different power consumption values. To the best of our knowledge, our approach has been the first one that deals with real-time FPGA emulation of such a power and temperature monitoring system. We presented our approach for temperature and power monitoring and emulation on FPGA for RISC tiles in [Glocker, RACING 2014] and [Glocker, Workshop Analogschaltungen 2014].
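A power-temperature LUT of this kind can be sketched for the steady-state case, where only the thermal resistances matter and the temperature rise is the resistance matrix applied to the core powers. The two-core tile, the resistance values, the ambient temperature and the quantised power levels below are hypothetical assumptions.

```python
from itertools import product

AMBIENT_C = 45.0  # hypothetical ambient temperature in degrees Celsius

# Hypothetical thermal resistances in K/W: R_KW[i][j] is the heating of
# core i per watt dissipated in core j (diagonal: self-heating).
R_KW = [[2.0, 0.5],
        [0.5, 2.0]]

POWER_LEVELS_W = (0.0, 1.0, 2.0)  # quantised average power per core


def steady_state_temps(powers_w):
    """Steady-state core temperatures: T = T_ambient + R * P."""
    return [AMBIENT_C + sum(r * p for r, p in zip(row, powers_w))
            for row in R_KW]


def build_tile_lut():
    """Map each combination of core power levels to the tile's peak temperature."""
    return {powers: max(steady_state_temps(powers))
            for powers in product(POWER_LEVELS_W, repeat=len(R_KW))}


def quantise(power_w):
    """Snap a measured power value to the nearest LUT power level."""
    return min(POWER_LEVELS_W, key=lambda level: abs(level - power_w))


def lookup_tile_temperature(lut, measured_powers_w):
    """Run-time step: one table access instead of solving the thermal model."""
    return lut[tuple(quantise(p) for p in measured_powers_w)]


lut = build_tile_lut()
tile_temp = lookup_tile_temperature(lut, [1.9, 0.1])
```

All thermal-model evaluation happens offline when the table is built, so the emulated monitor only performs a look-up per monitoring period.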
In-situ delay monitors on FPGA: In-situ delay monitors are novel hardware monitors which can be used to monitor and predict the reliability of the monitored circuit. In principle, they can be implemented in an FPGA. However, since ageing phenomena in an FPGA do not take effect within a reasonable test time under normal operating conditions (i.e. temperature and supply voltage), accelerated ageing would have to be applied to the prototyping system, which is by no means an easy task for a CHIPit system. A simple solution is again emulation. For this, in another project and in cooperation with industry, we have developed proprietary models for ageing as a function of time and operating conditions. We intend to employ these in the second funding phase and thus have realistic data for an integrated circuit solution available.

Integration and optimisation of individual monitor types on FPGA demonstrator platform:

Integration and optimisation of power and temperature emulation in RISC tile: Digital hardware monitors such as processor core load monitors and emulated analogue monitors including power and temperature monitors have been integrated with CiC for RISC tiles in cooperation with Project B3. The number and placement of the power and temperature monitors within the overall system has been optimised such that every core of a tile has one power monitor and each tile has one temperature monitor covering all cores of a tile (giving a maximum temperature for the complete core). The time period at which the monitors operate is predefined according to the monitoring frequency.
Integration and optimisation of power and temperature emulation in TCPA tile: For power and temperature monitoring, the monitoring system will not cover every PE present on a tile, but rather cover PE regions, to keep the size of the monitoring system as small as possible while still retrieving useful and sufficiently precise results. In [Glocker, ARS 2014] we presented the approach for temperature monitor modelling and emulation for TCPAs.

Optimisation of overall monitoring system on FPGA demonstrator platform: Feedback control loop of the monitoring system:

Before Phase I, the optimisation of a monitoring system in terms of circuit types, required resolution, speed and monitoring frequency as well as the number and placement of monitors had not been the subject of systematic research efforts. Also, we were not aware of techniques that allow the calibration of a generic monitor to a specific design and use case.
Use of monitoring data for resource allocation: We have studied possible improvements that can be made if monitoring data are used during resource allocation to achieve different control targets. Taking temperature monitoring data as an example, different task allocation techniques and application characteristics as well as different physical conditions such as package types, material parameters and cooling all result in different temperature scenarios. Also, reasonably priced processor packaging no longer covers the worst-case temperature hot spot scenario, which would occur without intelligent power and temperature monitoring and control as proposed here. So, hot spot temperatures must be avoided, e.g. by intelligent task allocation. In [Glocker, ARS 2013], we have modelled different scenarios in a multicore system and evaluated the temperature distribution of cores. In a multicore system, neighbouring cores influence each other's temperature, so an intelligent placement of active cores in a non-full-usage scenario can further decrease the temperature. We also evaluated different temperature limiting measures: The best choice is either an intelligent core choice (resulting from intelligent resource allocation) combined with lower usage rates, or a lowering of the power consumption, e.g. by implementing supply voltage or frequency scaling. Since temperature should be regulated during run time, a combined implementation of different concepts, with a temperature limiting measure chosen for the individual situation during run time, appears to be the best solution.
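The benefit of choosing active cores with thermal coupling in mind can be illustrated with a small sketch. It assumes a single row of identical cores, a steady-state model with hypothetical coupling resistances, and an exhaustive search that is only practical for small core counts; it is not the evaluation method of [Glocker, ARS 2013].

```python
from itertools import combinations


def coupling(i, j):
    """Hypothetical thermal resistance (K/W) between cores i and j in a row."""
    distance = abs(i - j)
    # Self-heating dominates, direct neighbours couple noticeably,
    # cores further apart couple only weakly.
    return {0: 2.0, 1: 0.6}.get(distance, 0.1)


def peak_temperature(active, n_cores, power_w=1.0, ambient_c=45.0):
    """Hottest steady-state core temperature for a set of active cores."""
    return max(ambient_c + sum(coupling(i, j) * power_w for j in active)
               for i in range(n_cores))


def coolest_placement(n_cores, n_active):
    """Try every placement of active cores and keep the coolest one."""
    return min(combinations(range(n_cores), n_active),
               key=lambda active: peak_temperature(active, n_cores))


best = coolest_placement(n_cores=4, n_active=2)
```

In this toy setting the search avoids placing the two active cores next to each other, which lowers the peak temperature compared with an adjacent placement at the same total power.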
Communication of monitor data/feedback control loop of the monitoring system: Instead of communicating monitor data of every monitor type through the whole invasive computing system, the monitoring data is "bundled": For using monitoring data for resource allocation and monitoring of the current hardware health, the monitoring data has to be given to the agent system—included in the run-time support system—that handles inter-tile resource allocation. The feedback control loop is shown in the figure below for a sample RISC compute tile.

Feedback control loop for a RISC tile

In InvadeX10, several application classes (e.g. high performance, communication intense, high reliability) have been defined in cooperation with Project A1, Project B2, Project B3, Project C1, Project D1 and Project D3, so that an application can express to which class it belongs. This is important for realising an inter-tile resource allocation that fulfils the application needs. For every application class, monitor data of different monitor types are bundled, abstracted and weighted to fit the needs of that class. For RISC compute tiles, the tile-local resource allocation is done in the CiC. For TCPA compute tiles, the tile-local resource allocation is done by a Configuration & Communication Processor. Monitor data is also abstracted and weighted for tile-local resource allocation.
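Class-specific bundling and weighting of monitor data can be pictured as a weighted score per tile. The sketch below is a hypothetical illustration: the monitor names, the weights, the normalisation of monitor values to the range 0 to 1 and the selection rule are assumptions and do not describe the actual agent system or InvadeX10 interface.

```python
# Hypothetical weights per application class; monitor values are assumed to
# be normalised to [0, 1], where a lower value means a less stressed tile.
CLASS_WEIGHTS = {
    "high_performance": {"load": 0.5, "temperature": 0.3, "link_load": 0.2},
    "communication_intense": {"load": 0.2, "temperature": 0.2, "link_load": 0.6},
}


def tile_score(monitor_data, app_class):
    """Bundle the monitor values of one tile into a single weighted score."""
    weights = CLASS_WEIGHTS[app_class]
    return sum(weight * monitor_data[name] for name, weight in weights.items())


def pick_tile(tiles, app_class):
    """Choose the tile with the lowest class-specific score."""
    return min(tiles, key=lambda tile_id: tile_score(tiles[tile_id], app_class))


tiles = {
    "tile0": {"load": 0.2, "temperature": 0.5, "link_load": 0.9},
    "tile1": {"load": 0.7, "temperature": 0.5, "link_load": 0.1},
}
compute_choice = pick_tile(tiles, "high_performance")
comm_choice = pick_tile(tiles, "communication_intense")
```

The same monitor readings lead to different choices per class: the lightly loaded tile suits the compute-bound class, while the tile with idle links suits the communication-intense class.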

Publications

[1] Nidhi Anantharajaiah, Tamim Asfour, Michael Bader, Lars Bauer, Jürgen Becker, Simon Bischof, Marcel Brand, Hans-Joachim Bungartz, Christian Eichler, Khalil Esper, Joachim Falk, Nael Fasfous, Felix Freiling, Andreas Fried, Michael Gerndt, Michael Glaß, Jeferson Gonzalez, Frank Hannig, Christian Heidorn, Jörg Henkel, Andreas Herkersdorf, Benedict Herzog, Jophin John, Timo Hönig, Felix Hundhausen, Heba Khdr, Tobias Langer, Oliver Lenke, Fabian Lesniak, Alexander Lindermayr, Alexandra Listl, Sebastian Maier, Nicole Megow, Marcel Mettler, Daniel Müller-Gritschneder, Hassan Nassar, Fabian Paus, Alexander Pöppl, Behnaz Pourmohseni, Jonas Rabenstein, Phillip Raffeck, Martin Rapp, Santiago Narváez Rivas, Mark Sagi, Franziska Schirrmacher, Ulf Schlichtmann, Florian Schmaus, Wolfgang Schröder-Preikschat, Tobias Schwarzer, Mohammed Bakr Sikal, Bertrand Simon, Gregor Snelting, Jan Spieck, Akshay Srivatsa, Walter Stechele, Jürgen Teich, Furkan Turan, Isaías A. Comprés Ureña, Ingrid Verbauwhede, Dominik Walter, Thomas Wild, Stefan Wildermann, Mario Wille, Michael Witterauf, and Li Zhang. Invasive Computing. FAU University Press, August 16, 2022. [ DOI ]
[2] Marcel Mettler, Martin Rapp, Heba Khdr, Daniel Mueller-Gritschneder, Jörg Henkel, and Ulf Schlichtmann. An FPGA-based approach to evaluate thermal and resource management strategies of many-core processors. ACM Trans. Archit. Code Optim., 19(3), May 2022.
[3] Alexandra Listl, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. Application-aware aging analysis and mitigation for SRAM design-for-reliability. Microelectronics Reliability, 134:114548, 2022. [ DOI | http ]
[4] Grace Li Zhang, Bing Li, Ying Zhu, Tianchen Wang, Yiyu Shi, Xunzhao Yin, Cheng Zhuo, Huaxi Gu, Tsung-Yi Ho, and Ulf Schlichtmann. Robustness of neuromorphic computing with RRAM-based crossbars and optical neural networks. In ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, Tokyo, Japan, January 18-21, 2021, pages 853–858. ACM, 2021. [ DOI | http ]
[5] Marcel Mettler, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. A Distributed Hardware Monitoring System for Runtime Verification on Multi-tile MPSoCs. ACM Transactions on Architecture and Code Optimization (TACO), December 2020.
[6] Alexandre Truppel, Tsun-Ming Tseng, and Ulf Schlichtmann. PSION 2: Optimizing Physical Layout of Wavelength-Routed ONoCs for Laser Power Reduction. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 2020.
[7] Alexandre Truppel, Tsun-Ming Tseng, Davide Bertozzi, José Carlos Alves, and Ulf Schlichtmann. PSION+: Combining logical topology and physical layout optimization for Wavelength-Routed ONoCs. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020.
[8] Marcel Mettler, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. Runtime monitoring of inter- and intra-thread requirements on embedded MPSoCs. In Proceedings of the 33rd International Conference on VLSI Design and 19th International Conference on Embedded Systems (VLSID), January 2020. [ DOI ]
[9] Grace Li Zhang, Michaela Brunner, Bing Li, Georg Sigl, and Ulf Schlichtmann. Timing resilience for efficient and secure circuits. In Proceedings of the 25th Asia and South Pacific Design Automation Conference (ASP-DAC), January 2020. [ DOI ]
[10] Alexandra Listl, Daniel Mueller-Gritschneder, and Ulf Schlichtmann. MAGIC: A Wear-leveling Circuitry to Mitigate Aging Effects in Sense Amplifiers of SRAMs. In 2019 IEEE 17th International New Circuits and Systems Conference (NEWCAS), July 2019.
[11] Ulf Schlichtmann and Li Zhang. Machine learning approaches for efficient design space exploration of application-specific NoCs. Invited Talk at Xidian University, China, June 22, 2019.
[12] Alexandra Listl, Daniel Mueller-Gritschneder, Ulf Schlichtmann, and Sani Nassif. SRAM design exploration with integrated application-aware aging analysis. In Design, Automation, and Test in Europe (DATE), pages 1249–1252, March 2019.
[13] Daniel Mueller-Gritschneder. Advanced Virtual Prototyping and Communication Synthesis for Integrated System Design at Electronic System Level. Habilitation, Technical University of Munich, 2019.
[14] Alexandra Listl, Daniel Mueller-Gritschneder, Fabian Kluge, and Ulf Schlichtmann. Emulation of an ASIC power, temperature and aging monitor system for FPGA prototyping. In International On-Line Testing Symposium (IOLTS), July 2018.
[15] Grace Li Zhang, Bing Li, Jinglan Liu, Yiyu Shi, and Ulf Schlichtmann. Design-phase buffer allocation for post-silicon clock binning by iterative learning. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, volume 37, 2018.
[16] Grace Li Zhang, Bing Li, Yiyu Shi, Jiang Hu, and Ulf Schlichtmann. Effitest2: Efficient delay test and prediction for post-silicon clock skew configuration under process variations. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018.
[17] Li Zhang. Advanced Timing for High-Performance Design and Security of Digital Circuits. Dissertation, Technical University of Munich, 2018.
[18] E. Glocker, Q. Chen, U. Schlichtmann, and D. Schmitt-Landsiedel. Emulation of an ASIC power and temperature monitoring system (eTPMon) for FPGA prototyping. Microprocessors and Microsystems, 50:90–101, May 2017. [ DOI ]
[19] Shushanik Karapetyan and Ulf Schlichtmann. 20nm FinFET-based SRAM cell: Impact of variability and design choices on performance characteristics. In Int. Conf. Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2017.
[20] Elisabeth Glocker. Thermisches Verhalten und emuliertes online Temperatur-Monitorsystem für das FPGA-Prototyping von Multiprozessor-Architekturen. Dissertation, Technical University of Munich, 2017.
[21] Jinglan Liu, Yukun Ding, Jianlei Yang, Ulf Schlichtmann, and Yiyu Shi. Generative adversarial network based scalable on-chip noise sensor placement. In 30th IEEE International System-on-Chip Conference, SOCC 2017, Munich, Germany, September 5-8, 2017, pages 239–242, 2017. [ DOI ]
[22] Santiago Pagani, Lars Bauer, Qingqing Chen, Elisabeth Glocker, Frank Hannig, Andreas Herkersdorf, Heba Khdr, Anuj Pathania, Ulf Schlichtmann, Doris Schmitt-Landsiedel, Mark Sagi, Éricles Sousa, Philipp Wagner, Volker Wenzel, Thomas Wild, and Jörg Henkel. Dark silicon management: An integrated and coordinated cross-layer approach. it – Information Technology, 58(6):297–307, September 16, 2016. [ DOI ]
[23] U. Schlichtmann. The next frontier in IC design: Determining (and optimizing) robustness and resilience of integrated circuits and systems. In 2016 China Semiconductor Technology International Conference (CSTIC), pages 1–4, March 2016. [ DOI ]
[24] Grace Li Zhang, Bing Li, and Ulf Schlichtmann. Effitest: Efficient delay test and statistical prediction for configuring post-silicon tunable buffers. In Proceedings of the 53rd Annual Design Automation Conference (DAC), pages 60:1–60:6. ACM, 2016. [ DOI ]
[25] Ulf Schlichtmann, Masanori Hashimoto, Iris Hui-Ru Jiang, and Bing Li. Reliability, adaptability and flexibility in timing: Buy a life insurance for your circuits. In IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC), pages 705–711. IEEE/ACM Press, January 2016. [ DOI ]
[26] Bing Li and U. Schlichtmann. Statistical timing analysis and criticality computation for circuits with post-silicon clock tuning elements. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(11):1784–1797, November 2015. [ DOI ]
[27] Éricles R. Sousa, Frank Hannig, Jürgen Teich, Qingqing Chen, and Ulf Schlichtmann. Runtime adaptation of application execution under thermal and power constraints in massively parallel processor arrays. In Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems (SCOPES), pages 121–124. ACM, June 2015. [ DOI ]
[28] E. Glocker, Q. Chen, A.M. Zaidi, U. Schlichtmann, and D. Schmitt-Landsiedel. Emulation of an ASIC power and temperature monitor system for FPGA prototyping. In 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), pages 1–8, June 2015. [ DOI ]
[29] Elisabeth Glocker, Qingqing Chen, Asheque M. Zaidi, Ulf Schlichtmann, and Doris Schmitt-Landsiedel. Emulated ASIC Power and Temperature Monitor System for FPGA Prototyping of an Invasive MPSoC Computing Architecture. In Proceedings of the First Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014), pages 14–15, May 2014. [ arXiv ]
[30] Nasim Pour Aryan, A. Listl, L. Heiss, C. Yilmaz, G. Georgakos, and D. Schmitt-Landsiedel. From an analytic NBTI device model to reliability assessment of complex digital circuits. In International On-Line Testing Symposium (IOLTS), pages 19–24, 2014.
[31] Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Monitoring of aging in integrated circuits by identifying possible critical paths. Microelectronics Reliability, 54:1075–1082, 2014. [ DOI ]
[32] Veit B. Kleeberger, Martin Barke, Christoph Werner, Doris Schmitt-Landsiedel, and Ulf Schlichtmann. A compact model for NBTI degradation and recovery under use-profile variations and its application to aging analysis of digital integrated circuits. Microelectronics Reliability, 54(6–7):1083–1089, 2014. [ DOI ]
[33] E. Glocker, S. Boppu, Q. Chen, U. Schlichtmann, J. Teich, and D. Schmitt-Landsiedel. Temperature modeling and emulation of an ASIC temperature monitor system for Tightly-Coupled Processor Arrays (TCPAs). Advances in Radio Science, 12:103–109, 2014. [ DOI ]
[34] Elisabeth Glocker, Qingqing Chen, Asheque M. Zaidi, Ulf Schlichtmann, and Doris Schmitt-Landsiedel. Emulierung eines ASIC-Leistungsverbrauchs- und Temperaturmonitorsystems für FPGA-Prototyping eines ressourcengewahren Computersystems. In 16. Workshop Analogschaltungen, Wien, Österreich, 2014.
[35] Elisabeth Glocker, Srinivas Boppu, Qingqing Chen, Ulf Schlichtmann, Jürgen Teich, and Doris Schmitt-Landsiedel. Temperature modeling and emulation of an ASIC temperature monitor system for Tightly-Coupled Processor Arrays (TCPAs) on FPGA. In Kleinheubacher Tagung 2013, September 2013.
[36] Martin Barke, Veit B. Kleeberger, Christoph Werner, Doris Schmitt-Landsiedel, and Ulf Schlichtmann. Analysis of Aging Mitigation Techniques for Digital Circuits Considering Recovery Effects. In edaWorkshop, May 2013.
[37] Bing Li, Ning Chen, Yang Xu, and Ulf Schlichtmann. On timing model extraction and hierarchical statistical timing analysis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 32(3):367–380, March 2013.
[38] Martin Wirnshofer. Variation-Aware Adaptive Voltage Scaling for Digital CMOS Circuits. Dissertation, Technical University of Munich, 2013.
[39] Martin Wirnshofer. Variation-Aware Adaptive Voltage Scaling for Digital CMOS Circuits, volume 41. Springer Series in Advanced Microelectronics, 2013.
[40] Elisabeth Glocker and Doris Schmitt-Landsiedel. Modeling of Temperature Scenarios in a Multicore Processor System. Advances in Radio Science (ARS), 11:219–225, 2013. [ DOI ]
[41] Martin Wirnshofer, Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, and Georg Georgakos. On-line supply voltage scaling based on in situ delay monitoring to adapt for PVTA variations. Journal of Circuits, Systems and Computers, 21(08), December 2012. [ DOI ]
[42] Bing Li, Ning Chen, and Ulf Schlichtmann. Statistical timing analysis for latch-controlled circuits with reduced iterations and graph transformations. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pages 1670–1683, November 2012.
[43] N. Chen, B. Li, and U. Schlichtmann. Iterative timing analysis based on nonlinear and interdependent flipflop modelling. IET Circuits, Devices & Systems, 6(5):330–337, September 2012. [ DOI ]
[44] Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Efficiently analyzing the impact of aging effects on large integrated circuits. Microelectronics Reliability, volume 52, pages 1546–1552, August 2012. [ DOI ]
[45] Martin Wirnshofer, Leonhard Heiss, A. N. Kakade, Nasim Pour Aryan, Georg Georgakos, and Doris Schmitt-Landsiedel. Adaptive voltage scaling by in-situ delay monitoring for an image processing circuit. In IEEE 15th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), pages 205–208, April 2012. [ DOI ]
[46] Sani R. Nassif, Veit B. Kleeberger, and Ulf Schlichtmann. Goldilocks failures: not too soft, not too hard. In IEEE International Reliability Physics Symposium (IRPS), April 2012.
[47] Christoph Knoth, Hela Jedda, and Ulf Schlichtmann. Current source modeling for power and timing analysis at different supply voltages. In Proceedings of Design, Automation and Test in Europe Conference (DATE), pages 923–928, March 2012. [ DOI ]
[48] Dominik Lorenz. Aging Analysis of Digital Integrated Circuits. Dissertation, Technical University of Munich, 2012.
[49] Christoph Knoth. Accurate Waveform-based Timing Analysis with Systematic Current Source Models. Dissertation, Technical University of Munich, 2012.
[50] Shailesh More. Aging Degradation and Countermeasures in Deep-submicrometer Analog and Mixed Signal Integrated Circuits. Dissertation, Technical University of Munich, 2012.
[51] Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, Georg Georgakos, and Martin Wirnshofer. Comparison of in-situ delay monitors for use in adaptive voltage scaling. Advances in Radio Science (ARS), 10:215–220, 2012.
[52] Elisabeth Glocker and Doris Schmitt-Landsiedel. Modeling of Temperature Scenarios in a Multicore Processor System. In Kleinheubacher Tagung 2012, 2012.
[53] Martin Wirnshofer, Leonhard Heiss, Georg Georgakos, and Doris Schmitt-Landsiedel. An energy-efficient supply voltage scheme using in-situ pre-error detection for on-the-fly adaptation to PVT variations. In International Symposium on Integrated Circuits (ISIC), pages 94–97, December 2011. [ DOI ]
[54] Dominik Lorenz, Martin Barke, and Ulf Schlichtmann. Finding possible critical paths for on-line monitoring of aging in integrated circuits. Technical report, Technische Universität München, December 2011.
[55] Christoph Knoth, Carsten Uphoff, Sebastian Kiesel, and Ulf Schlichtmann. SWAT: Simulator for waveform-accurate timing including parameter variations and transistor aging. In International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), volume 6951 of Lecture Notes in Computer Science (LNCS), pages 193–203, September 2011.
[56] Ning Chen, Bing Li, and Ulf Schlichtmann. Timing modeling of flipflops considering aging effects. In International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), volume 6951 of Lecture Notes in Computer Science (LNCS), pages 63–72, September 2011.
[57] Veit B. Kleeberger and Ulf Schlichtmann. Reliability Analysis of Digital Circuits Considering Intrinsic Noise. In Asia Symposium on Quality Electronic Design (ASQED), July 2011.
[58] Martin Wirnshofer, Leonhard Heiss, Georg Georgakos, and Doris Schmitt-Landsiedel. A variation-aware adaptive voltage scaling technique based on in-situ delay monitoring. In IEEE 14th International Symposium on Design and Diagnostics of Electronic Circuits & Systems, pages 261–266, 2011.
[59] Jürgen Teich, Jörg Henkel, Andreas Herkersdorf, Doris Schmitt-Landsiedel, Wolfgang Schröder-Preikschat, and Gregor Snelting. Invasive computing: An overview. In Michael Hübner and Jürgen Becker, editors, Multiprocessor System-on-Chip – Hardware Design and Tool Integration, pages 241–268. Springer, Berlin, Heidelberg, 2011. [ DOI ]
[60] Nasim Pour Aryan, Leonhard Heiss, Doris Schmitt-Landsiedel, Georg Georgakos, and Martin Wirnshofer. Comparison of in-situ delay monitors for use in adaptive voltage scaling. In Kleinheubacher Tagung 2011, 2011.
[61] Jürgen Teich. Invasive algorithms and architectures. it – Information Technology, 50(5):300–310, 2008.