B1: Adaptive Application-Specific Invasive Microarchitecture

Principal Investigators:

Prof. Henkel, Prof. Becker, Dr. Bauer

Scientific Researchers:

Artjom Grudnitsky, Carsten Tradowsky

Abstract

Subproject B1 investigates mechanisms that provide adaptivity at the Instruction Set Architecture (ISA) and microarchitecture (μArch) levels using a run-time reconfigurable fabric. We will research concepts and methods that allow invading that fabric and the μArch within an invasive core (i-Core). The goals are to advance the concepts of state-of-the-art reconfigurable processors towards invasion and to exploit their benefits in this project. The focus is to i) investigate run-time adaptivity at the μArch level (e.g., changing the cache size), ii) provide i-let-specific accelerators on demand (adaptive ISA), and iii) dynamically accelerate the basic invasion commands (e.g., 'invade') and the run-time system (e.g., agents from subproject C1).

Synopsis

Subproject B1 investigates novel and adaptive μArch concepts for heterogeneous application-specific processors and Instruction Set Extensions (ISEs).

(Reconfigurable) application-specific processors have proved to be efficient and have shown good performance for applications (or threads of an application) that benefit from fine-grained parallelism. If the run-time situation (e.g., which tasks require which performance at which time) is not known beforehand, a high degree of run-time adaptation is required to exploit the full potential of this fine-grained parallelism. Depending on the i-lets that execute on an i-Core at a particular point in time, the μArch and the ISE need to be adapted (note that assigning workload to an i-Core is performed by the agents of the invasive Run-Time Support System (iRTSS) in subproject C1). For instance, a μArch/ISE configuration that is beneficial for one i-let might not be beneficial for another. Therefore, when several i-lets share an i-Core, its internal μArch and ISE configuration has to be a compromise that reflects the priorities of these i-lets.
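
The priority-based compromise described above can be illustrated with a minimal sketch. It is not the selection method used in B1; it assumes, purely for illustration, that each i-let has a numeric priority and that an estimated speedup per candidate configuration is known. The names `ILet`, `pick_configuration`, and the configuration labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ILet:
    name: str
    priority: float  # assumed scale: higher means more important


def pick_configuration(ilets, speedup):
    """Return the configuration with the highest priority-weighted speedup.

    `speedup` maps a configuration name to a dict that maps an i-let name
    to its estimated speedup under that configuration (assumed to be known).
    """
    def score(config):
        return sum(i.priority * speedup[config][i.name] for i in ilets)

    return max(speedup, key=score)


# Hypothetical numbers: no configuration is best for both i-lets.
speedup = {
    "big_cache": {"video": 2.0, "crypto": 1.0},
    "crypto_ise": {"video": 1.0, "crypto": 3.0},
    "balanced": {"video": 1.8, "crypto": 2.0},
}

ilets = [ILet("video", 3.0), ILet("crypto", 1.0)]
chosen = pick_configuration(ilets, speedup)
print(chosen)
```

With these assumed values the mixed configuration wins, because it serves the high-priority i-let almost as well as its preferred configuration while still helping the other one. Swapping the two priorities shifts the choice towards the crypto-specific configuration.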

The highly adaptive nature of the invasive computing architecture intensifies the potential advantages that come along with an adaptive μArch and ISE. The envisioned adaptations of the ISA comprise flexible i-let-specific accelerators that support run-time "performance per area" adaptations, depending on the number of i-lets that invade an i-Core at a particular time (and thus the amount of reconfigurable fabric that is available per i-let).
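
As a rough illustration of such a "performance per area" adaptation, the sketch below picks, for each i-let, the fastest accelerator variant that still fits into its share of the fabric. The equal split of the fabric among i-lets, the unit "containers", and the variant table are assumptions made for this example only and are not taken from the project.

```python
def fabric_share(total_containers, num_ilets):
    """Fabric available per i-let, assuming an equal split (an assumption)."""
    if num_ilets < 1:
        raise ValueError("at least one i-let is required")
    return total_containers // num_ilets


def best_variant(variants, share):
    """Return the (area, speedup) variant with the highest speedup that fits.

    Returns None if no variant fits into `share`.
    """
    fitting = [v for v in variants if v[0] <= share]
    return max(fitting, key=lambda v: v[1], default=None)


# Hypothetical accelerator variants as (area in containers, speedup).
# The area-0 entry models plain software execution without an accelerator.
VARIANTS = [(0, 1.0), (2, 2.5), (4, 4.0), (8, 6.0)]

for n in (1, 2, 3, 5):
    share = fabric_share(8, n)
    print(n, "i-let(s):", share, "containers each ->", best_variant(VARIANTS, share))
```

The more i-lets invade the i-Core, the smaller each share becomes, so each i-let falls back to a smaller and slower variant of its accelerator.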

The envisioned μArch adaptations are ISA-independent and comprise adaptive pipeline length, branch prediction, cache, etc. Both ISA and μArch adaptations target i-let-specific optimisations. For example, a particular application might benefit from a certain hardware accelerator (part of the ISA) and a certain branch prediction scheme (part of the μArch). An i-let might also benefit from different ISA and μArch implementations in different phases of its execution. Here, adaptation is used to move the i-let execution to a more efficient working point.

Additionally, the ISA and μArch adaptations are not limited to i-lets, but they also support the basic invasion commands ('invade', 'infect', 'retreat'), and the iRTSS/agents from subproject C1. Executing the basic invasion commands and the agents is a fundamental component of all invasive architectures. Therefore, accelerating these parts may increase the general system performance and efficiency, independent of any particular i-let.

Depending on the priority of

  • the i-lets,
  • the basic invasion commands (e.g., depending on how often they are needed at a certain time) and
  • the iRTSS

the μArch and ISE are adapted accordingly.

Publications

[1] Tanja Harbaum, Christoph Schade, Marvin Damschen, Carsten Tradowsky, Lars Bauer, Jörg Henkel, and Jürgen Becker. Auto-SI: An adaptive reconfigurable processor with run-time loop detection and acceleration. In 30th IEEE International System-on-Chip Conference (SOCC), pages 224–229, September 2017.
[2] Jörg Henkel. The triangle of power density, circuit degradation and reliability. Invited Keynote Speech, 30th IEEE International System-On-Chip Conference (SoCC 2017), Munich, Germany, September 7, 2017.
[3] Manuel Mohr and Carsten Tradowsky. Pegasus: Efficient data transfers for PGAS languages on non-cache-coherent many-cores. In Design, Automation and Test in Europe Conference Exhibition (DATE), pages 1781–1786, March 30, 2017.
[4] Artjom Grudnitsky, Lars Bauer, and Jörg Henkel. Efficient partial online-synthesis of special instructions for reconfigurable processors. IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 25(2):594–607, February 2017. [ DOI ]
[5] Marvin Damschen, Lars Bauer, and Jörg Henkel. Timing analysis of tasks on runtime reconfigurable processors. IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 25(1):294–307, January 2017. [ DOI ]
[6] Manuel Mohr and Carsten Tradowsky. Pegasus: Efficient data transfers for PGAS languages on non-cache-coherent many-cores. In Proceedings of Design, Automation and Test in Europe Conference Exhibition (DATE), pages 1781–1786. IEEE, 2017. [ DOI ]
[7] Alexander Pöppl, Marvin Damschen, Florian Schmaus, Andreas Fried, Manuel Mohr, Matthias Blankertz, Lars Bauer, Jörg Henkel, Wolfgang Schröder-Preikschat, and Michael Bader. Shallow water waves on a deep technology stack: Accelerating a finite volume tsunami model using reconfigurable hardware in invasive computing. In Euro-Par 2017: Proceedings of the 10th Workshop on UnConventional High Performance Computing (UCHPC 2017), Lecture Notes in Computer Science (LNCS). Springer, 2017.
[8] Carsten Tradowsky. Methoden zur applikationsspezifischen Effizienzsteigerung adaptiver Prozessorplattformen. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), December 20, 2016.
[9] Jürgen Teich. Invasive computing – editorial. it – Information Technology, 58(6):263–265, November 24, 2016. [ DOI ]
[10] Stefan Wildermann, Michael Bader, Lars Bauer, Marvin Damschen, Dirk Gabriel, Michael Gerndt, Michael Glaß, Jörg Henkel, Johny Paul, Alexander Pöppl, Sascha Roloff, Tobias Schwarzer, Gregor Snelting, Walter Stechele, Jürgen Teich, Andreas Weichslgartner, and Andreas Zwinkau. Invasive computing for timing-predictable stream processing on MPSoCs. it – Information Technology, 58(6):267–280, September 30, 2016. [ DOI ]
[11] Fazal Hameed, Lars Bauer, and Jörg Henkel. Architecting on-chip DRAM cache for simultaneous miss rate and latency reduction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 35(4):651–664, April 2016.
[12] Carsten Tradowsky, Enrique Cordero, Christoph Orsinger, Malte Vesper, and Jürgen Becker. A Dynamic Cache Architecture for Efficient Memory Resource Allocation in Many-Core Systems. Springer International Publishing, Cham, 2016. [ DOI ]
[13] Carsten Tradowsky, Enrique Cordero, Christoph Orsinger, Malte Vesper, and Jürgen Becker. Adaptive Cache Structures. Springer International Publishing, Cham, 2016. [ DOI ]
[14] Carsten Tradowsky, Tanja Harbaum, Leonard Masing, and Jürgen Becker. A novel ADL-based approach to design adaptive application-specific processors. In Best of IEEE Computer Society Annual Symposium on VLSI (ISVLSI). 2016.
[15] Artjom Grudnitsky. A Reconfigurable Processor for Heterogeneous Multi-Core Architectures. Dissertation, Chair for Embedded Systems (CES), Department of Computer Science, Karlsruhe Institute of Technology (KIT), Germany, December 21, 2015.
[16] Johny Paul, Walter Stechele, Benjamin Oechslein, Christoph Erhardt, Jens Schedel, Daniel Lohmann, Wolfgang Schröder-Preikschat, Manfred Kröhnert, Tamim Asfour, Éricles R. Sousa, Vahid Lari, Frank Hannig, Jürgen Teich, Artjom Grudnitsky, Lars Bauer, and Jörg Henkel. Resource-awareness on heterogeneous MPSoCs for image processing. Journal of Systems Architecture, 61(10):668–680, November 6, 2015. [ DOI ]
[17] Lars Bauer, Artjom Grudnitsky, Marvin Damschen, Srinivas Rao Kerekare, and Jörg Henkel. Floating point acceleration for stream processing applications in dynamically reconfigurable processors. In IEEE Symposium on Embedded Systems for Real-time Multimedia (ESTIMedia), October 2015. Invited Paper for the Special Session “Dynamics and Predictability in Stream Processing – A Contradiction?”. [ DOI ]
[18] C. Diniz, M. Shafique, S. Bampi, and J. Henkel. A reconfigurable hardware architecture for fractional pixel interpolation in high efficiency video coding. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 34(2), February 2015.
[19] Fazal Hameed. DRAM aware Last-Level-Cache policies for Multi-core Systems. Dissertation, Chair for Embedded Systems (CES), Department of Computer Science, Karlsruhe Institute of Technology (KIT), Germany, February 6, 2015.
[20] Peter Figuli, Carsten Tradowsky, Jose Martinez, Harry Sidiropoulos, Kostas Siozios, Holger Stenschke, Dimitrios Soudris, and Jürgen Becker. A novel concept for adaptive signal processing on reconfigurable hardware. In Applied Reconfigurable Computing, volume 9040 of Lecture Notes in Computer Science, pages 311–320. Springer International Publishing, 2015.
[21] Artjom Grudnitsky, Lars Bauer, and Jörg Henkel. COREFAB: Concurrent reconfigurable fabric utilization in heterogeneous multi-core systems. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), October 2014. [ DOI ]
[22] Martin Haaß, Lars Bauer, and Jörg Henkel. Automatic custom instruction identification in memory streaming algorithms. In International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES), October 2014. [ DOI ]
[23] Jörg Henkel, Lars Bauer, Artjom Grudnitsky, and Hongyan Zhang. Adaptive embedded computing with i-Core. In ACM SIGBED Review – Special Issue on the 6th Workshop on Adaptive and Reconfigurable Embedded Systems, volume 11, pages 20–21, October 2014. Extended Abstract for Keynote Talk. [ DOI ]
[24] Fazal Hameed, Lars Bauer, and Jörg Henkel. Reducing latency in an SRAM/DRAM cache hierarchy via a novel tag-cache architecture. In IEEE/ACM Design Automation Conference (DAC), June 2014. [ DOI ]
[25] Jörg Henkel. Adaptive embedded computing with i-Core. Keynote Talk, 6th Workshop on Adaptive and Reconfigurable Embedded Systems, CPSWeek (APRES), April 14, 2014.
[26] Carsten Tradowsky, Martin Schreiber, Malte Vesper, Ivan Domladovec, Maximilian Braun, Hans-Joachim Bungartz, and Jürgen Becker. Towards dynamic cache and bandwidth invasion. In Reconfigurable Computing: Architectures, Tools, and Applications, volume 8405 of Lecture Notes in Computer Science, pages 97–107. Springer International Publishing, April 2014. [ DOI ]
[27] Artjom Grudnitsky, Lars Bauer, and Jörg Henkel. MORP: Makespan optimization for processors with an embedded reconfigurable fabric. In Proceedings of the 22nd ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA), pages 127–136, February 2014. [ DOI ]
[28] C. Tradowsky, T. Gädeke, T. Bruckschlögl, W. Stork, K.-D. Müller-Glaser, and J. Becker. Smartlocore: A concept for an adaptive power-aware localization processor. In Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on, pages 478–481, February 2014. [ DOI ]
[29] Muhammad Shafique, Lars Bauer, and Jörg Henkel. Adaptive energy management for dynamically reconfigurable processors. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 33(1):50–63, January 2014. [ DOI ]
[30] Timo Stripf. Softwareframework für Prozessoren mit variablen Befehlssatzarchitekturen. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), December 11, 2013.
[31] Peter Figuli, Carsten Tradowsky, Nadine Gaertner, and Jürgen Becker. Visa: A highly efficient slot architecture enabling multi-objective ASIP cores. In International Symposium on System on Chip (SoC), pages 1–8, October 2013. [ DOI ]
[32] Manuel Mohr, Artjom Grudnitsky, Tobias Modschiedler, Lars Bauer, Sebastian Hack, and Jörg Henkel. Hardware acceleration for programs in SSA form. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), Montreal, Canada, October 2013. [ DOI ]
[33] Fazal Hameed, Lars Bauer, and Jörg Henkel. Simultaneously optimizing DRAM cache hit latency and miss rate via novel set mapping policies. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), September 2013. [ DOI ]
[34] Fazal Hameed, Lars Bauer, and Jörg Henkel. Reducing inter-core cache contention with an adaptive bank mapping policy in DRAM cache. In International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), September 2013. [ DOI ]
[35] Carsten Tradowsky, Tanja Harbaum, Shaver Deyerle, and Jürgen Becker. Limbic: An adaptable architecture description language model for developing an application-specific image processor. In IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pages 34–39, August 2013. [ DOI ]
[36] Lars Braun. Methoden zur Erstellung eines laufzeitadaptiven und zweidimensional rekonfigurierbaren Systems. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), February 19, 2013.
[37] Carsten Tradowsky, Enrique Cordero, Thorsten Deuser, Michael Hübner, and Jürgen Becker. Determination of on-chip temperature gradients on reconfigurable hardware. In Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig), pages 1–8, December 2012. [ DOI ]
[38] Michael Hübner, Diana Göhringer, Carsten Tradowsky, Jörg Henkel, and Jürgen Becker. Adaptive processor architecture. In International Conference on Embedded Computer Systems (SAMOS), pages 244–251, July 2012. Invited paper. [ DOI ]
[39] Carsten Tradowsky, Florian Thoma, Michael Hübner, and Jürgen Becker. Lisparc: Using an architecture description language approach for modelling an adaptive processor microarchitecture. In 7th IEEE International Symposium on Industrial Embedded Systems (SIES'12), pages 279–282, June 2012. Best Work-in-Progress (WiP) Paper Award. [ DOI ]
[40] Jörg Henkel. i-Core: Adaptive computing for multi-core architectures. Embedded System Design from MultiMedia to Cloud, Hong Kong, Invited Talk, May 18, 2012.
[41] Lars Bauer, Artjom Grudnitsky, Muhammad Shafique, and Jörg Henkel. PATS: a performance aware task scheduler for runtime reconfigurable processors. In 20th Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 208–215, May 2012. [ DOI ]
[42] Carsten Tradowsky, Florian Thoma, Michael Hübner, and Jürgen Becker. On dynamic run-time processor pipeline reconfiguration. In IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), pages 419–424, May 2012. [ DOI ]
[43] Artjom Grudnitsky, Lars Bauer, and Jörg Henkel. Partial online-synthesis for mixed-grained reconfigurable architectures. In Proceedings of Design, Automation and Test in Europe Conference (DATE), pages 1555–1560, March 2012. [ DOI ]
[44] Peter Figuli, Michael Hübner, Romuald Girardey, F. Bapp, Thomas Bruckschlögl, Florian Thoma, Jörg Henkel, and Jürgen Becker. A heterogeneous SoC architecture with embedded virtual FPGA cores and runtime core fusion. In NASA/ESA 6th Conference on Adaptive Hardware and Systems (AHS), pages 96–103, 2012. [ DOI ]
[45] Jörg Henkel, Andreas Herkersdorf, Lars Bauer, Thomas Wild, Michael Hübner, Ravi Kumar Pujari, Artjom Grudnitsky, Jan Heisswolf, Aurang Zaib, Benjamin Vogel, Vahid Lari, and Sebastian Kobbe. Invasive manycore architectures. In Proceedings of the 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 193–200, January 2012. [ DOI ]
[46] Alexander Klimm. Computing Architectures for Security Applications on Reconfigurable Hardware in Embedded Systems. Dissertation, Institut für Technik der Informationsverarbeitung (ITIV), Fakultät für Elektrotechnik und Informationstechnik, Karlsruher Institut für Technologie (KIT), December 22, 2011.
[47] M. Hübner, C. Tradowsky, D. Göhringer, L. Braun, F. Thoma, J. Henkel, and J. Becker. Dynamic processor reconfiguration. In Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig), pages 123–128, November 2011. [ DOI ]
[48] Jörg Henkel, Lars Bauer, Michael Hübner, and Artjom Grudnitsky. i-Core: A run-time adaptive processor for embedded multi-core systems. In Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA), July 2011. Invited paper.
[49] Lars Bauer, Muhammad Shafique, and Jörg Henkel. Concepts, architectures, and run-time systems for efficient and adaptive reconfigurable processors. In NASA/ESA 6th Conference on Adaptive Hardware and Systems (AHS), pages 80–87, June 2011. Invited paper; Received the MaXentric Technologies AHS Best Paper Award. [ DOI ]
[50] Michael Hübner, Peter Figuli, Romuald Girardey, Dimitrios Soudris, Kostas Siozios, and Jürgen Becker. A heterogeneous multicore system on chip with run-time reconfigurable virtual FPGA architecture. In Proceedings of the International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2011.
[51] Jürgen Teich, Jörg Henkel, Andreas Herkersdorf, Doris Schmitt-Landsiedel, Wolfgang Schröder-Preikschat, and Gregor Snelting. Invasive computing: An overview. In Michael Hübner and Jürgen Becker, editors, Multiprocessor System-on-Chip – Hardware Design and Tool Integration, pages 241–268. Springer, Berlin, Heidelberg, 2011. [ DOI ]
[52] Jürgen Teich. Invasive algorithms and architectures. it – Information Technology, 50(5):300–310, 2008.
[53] Diana Göhringer, Jonathan Obie, Michael Hübner, and Jürgen Becker. Impact of task distribution, processor configurations and dynamic clock frequency scaling on the power consumption of FPGA-based multiprocessors. In Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip (ReCoSoC), pages 13–20. KIT Scientific Publishing.
[54] Michael Hübner, Diana Göhringer, J. Noguera, and Jürgen Becker. Fast dynamic and partial reconfiguration data path with low hardware overhead on Xilinx FPGAs. In Proceedings of the International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[55] Carsten Tradowsky, Peter Figuli, Erik Seidenspinner, Felix Held, and Jürgen Becker. A new approach to model-based development for audio signal processing. In 134th International AES Convention.
[56] Michael Hübner and Jürgen Becker, editors. Multiprocessor System-on-Chip: Hardware Design and Tool Integration. Springer.