Topic identifier: HORIZON-EUROHPC-JU-2024-DARE-SGA-04-01

Specific Grant Agreement for the development of European Processor and Accelerators based on RISC-V

Type of action: HORIZON JU Research and Innovation Actions
Number of stages: Single stage
Opening date: 29 May 2024
Closing date: 29 August 2024 17:00
Budget: €120 000 000
Call: Specific Grant Agreement (SGA) for Developing large-scale European High Performance Computing (HPC) technologies based on RISC-V
Call identifier: HORIZON-EUROHPC-JU-2024-DARE-SGA-04
Description:

Expected Outcome:

  • European capabilities in designing, developing, and producing IP related to high-end processors and accelerators based on RISC-V.
  • A family of energy efficient high-end processors and accelerators for HPC based on RISC-V hardware and chiplet solutions, testbeds, and at least one prototype/pilot integrating these processors/accelerators.
  • A vertically integrated software stack, including key elements such as programming models and runtimes (e.g. languages, compilers, programming environments, communication), libraries (e.g. mathematical, data analytics, AI frameworks), tools (e.g. debuggers, performance, system monitoring), operating system components (e.g. schedulers, workflows, software management, firmware, drivers, security), and other elements (e.g. for networking, software deployment, system-level composability and modularity of software, etc.).
  • A small set of critical HPC applications ported and optimised for the new RISC-V based environment, based on a co-design approach.
  • Interface specifications for the software and hardware stack, with a clear definition of the standardization and licensing schemes for the developed Intellectual Property (IP), and with a mechanism to guarantee that this IP remains in the EU.
  • An agile product roadmap with a critical timeline, milestones and all the necessary activities that would be needed to guide the beneficiaries towards building and deploying post-exascale systems in Europe, using predominantly European technology.

Scope:

The DARE consortium is invited to submit a Research and Innovation Action (RIA) proposal for the first phase of the research activities and roadmap defined in the Framework Partnership Agreement (FPA).

  1. The proposal for the first phase of DARE will cover the design and development of European processors, accelerators and related technologies for extreme-scale, high-performance big-data, and emerging applications, in accordance with the research roadmap defined in the FPA. The proposal should leverage software/hardware co-design to achieve the next levels of performance and efficiency in RISC-V based HPC. The proposed work should target performance levels, supported by appropriate KPIs, that are competitive with non-EU solutions by the end of the DARE initiative.
  2. The aim of this SGA is to design and deliver energy efficient high-end tape-outs of a general-purpose processor and of two accelerators, an Artificial Intelligence (AI) Accelerator and a Vectorial Accelerator, for HPC based on RISC-V silicon and chiplet solutions with advanced memory interfaces.
  3. The proposed action should cover the design, testing and development of the high-end processors and their integration in a pilot system in view of their roll-out, uptake and use in world-class competitive supercomputers.
  4. The proposed action should also develop a functional RISC-V software stack, including key elements such as programming models, runtimes, libraries, tools, and operating system components.

The different lines of activity under consideration must be aligned, interact between themselves, and ensure reinforced cooperation and integration that result in continuous enhancements.

In particular, the proposal should cover the following points:

Hardware development Technical Areas:

  1. General-Purpose CPU: Design and development of a high-end general-purpose CPU based on RISC-V. The design should represent an evolution of already existing European RISC-V designs. The target of the design should be to provide scalable and customisable high-performance RISC-V multi-core and multi-cluster CPU implementations delivering feature- and cost-competitive power-performance-area metrics. The CPU should deliver high performance over a wide range of HPC applications featuring combinations of both parallel and sequential code. Special attention should be given to the optimisation of the memory system bandwidth at all levels. The proposed work must target KPIs comparable to non-EU solutions and be feature- and price-competitive and energy efficient. A detailed comparison with other solutions, including monolithic CPUs, chiplet-based CPUs, and closed-source proprietary CPU IP from non-EU providers, should be presented.
  2. Artificial Intelligence (AI) Accelerator: Design and development of a high-end RISC-V based accelerator designed for the efficient processing of AI workloads and applications. The design should be an evolution of existing European AI accelerator designs. Examples of applications that should be covered are AI-driven approximations of computationally expensive simulations (trained on existing data from full-scale HPC simulations), large transformer-based language models, massive neural networks, etc. A key challenge is to balance computational performance with energy efficiency. The proposed work must target KPIs comparable to non-EU solutions and be competitive on price/performance and energy efficiency.
  3. Vectorial Accelerator: Design and development of a high-end RISC-V based vectorial accelerator. The design should be an evolution of existing European vectorial accelerator designs. Capabilities should include high floating-point density, long vector and matrix architecture and wide data path. The applications targeted should include current and future HPC workloads requiring operations using 64-bit double precision floating-point support and other data types. The proposed work must target KPIs comparable to non-EU solutions and be competitive on price/performance and energy efficiency.

All software and hardware development technical areas should be industrially/commercially driven and use chiplet-based approaches providing mix-and-match customisation capabilities to address varying high-end computing workload requirements. They should target the realization of initial tape-outs of at least 7nm within the timeframe of the first RIA. The node selection should be based on a thorough cost/benefit analysis and the corresponding industrial and market perspectives. Moreover, the consortium should indicate the advantages and disadvantages of using the target fabrication processes, and assess the availability of relevant IP, design tools and licenses, as well as its own resources and capabilities. The required EDA tools and IP should be described in detail, and the timeline for obtaining the licenses and their cost should be detailed. EDA training requirements, and the availability and experience of relevant engineering resources, etc. should be taken into account.

RTL-freeze should be targeted for month 18. At this point, before moving to tape-out, the EuroHPC JU will assess the KPI[1] achievements/projections, including a competitive assessment with regard to non-EU solutions worldwide, for each hardware development activity and decide whether a particular technological development should be continued or halted. A single mask set for all chiplets should be considered to reduce tape-out costs. A detailed plan to synchronise the chiplets resulting from the hardware developments should be provided, and a private shuttle with a single mask set should preferably be created.

Applications and Software Technical Area:

  • Develop an optimised HPC software stack for the hardware development technical areas. The software stack should support single nodes as well as large configurations.
  • Develop a hardware-software co-design simulation framework to facilitate native hardware support to application requirements.
  • Port at least 3 realistic applications to the new hardware platforms. The selection of applications should be justified in detail with respect to coverage of projected future HPC workloads.

Pilots Technical Area:

  • Build / Upgrade Software Development Vehicles to support the Applications and Software technical area until actual silicon from the project is available.
  • Once the project’s silicon is available, integrate the results from the hardware development technical areas in testbeds and at least one prototype/pilot in pre-operational environments in supercomputing centres for user testing and validation.
  • Pilots with non-EU RISC-V off-the-shelf components are explicitly out of the scope of this initiative.

Management and Coordination: The proposal should implement a professional industrial project management approach. It should include an industry technical coordination group, consisting of the key industrial partners in the SGA, for closely overseeing technical progress in all the industrial activities related to the development of the proposed project’s hardware solutions, tightly coordinating these activities and assisting the coordinator with the strategic decisions and orientations of the proposed project, including the R&I roadmap to implement the activities. The industry technical coordination group should maintain an up-to-date risk register with clear mitigation actions and escalation procedures.

In particular:

  • The proposal should give a full product roadmap of how the HPC hardware developed through DARE will be competitive with current and future hardware from competitors worldwide. This roadmap should be updated dynamically as necessary. The roadmap should include a description of all the activities that will be needed to build and deploy post-exascale systems in Europe based on the technology developed in the project.
  • The proposal should demonstrate the capacity and industrial commitment of the partners for carrying out and sustaining the technical development and maintenance as well as effective marketing and business development. It should include convincing plans for industrial exploitation of the targeted technology developments and long-term market perspectives.
  • The role of each partner in the proposed project should be described in detail. The number of partners should be limited to those necessary to achieve the goals of the SGA. The partners should describe how soon after signing the SGA they would be able to allocate resources to the project, how many additional resources would need to be recruited, and the estimated timeline of the onboarding process. The potential for long-term cooperation among partners should be described.
  • The proposal should include a preliminary analysis of barriers to market entry and appropriate mitigation procedures. Additionally, it should describe the potential impact of these barriers on the project.
  • The proposal should include an end-user advisory board, consisting of a representative set of private and public end users, to provide the user requirements and additional guidance to the proposed project on its co-design activities related to the targeted processor and accelerator technology.
  • The proposal should provide for appropriate progress control mechanisms, by establishing meaningful common milestones and KPIs, to monitor the progress of the different work streams towards the goals of the overall initiative, and to continuously monitor the current state of the art and compare it with the state of the RISC-V General Purpose Processor (GPP), Vector Accelerator, and AI Accelerator. In particular, the proposal should foresee an intermediate major milestone at month 18 (before tape-out) for a critical assessment of the project’s progress against the objectives and time-plan. The proposal should plan monthly monitoring meetings between the JU and the project’s management team.
  • The proposal should describe in detail the mechanisms to guarantee that all IP generated in the initiative will stay in the EU. An IP management plan should be submitted, clearly setting out how key IP would remain in the EU and not be shared with non-European entities.
  • The proposal should give a detailed description of preceding work in European projects by the partners, in particular the baseline of the technology developed in those prior projects, how the outputs from those projects will impact upon the proposal, and the willingness to license that technology to the FPA partners under reasonable terms and conditions.
  • The synergies with the ETP4HPC Strategic Research Agenda and the HiPEAC Vision should be described.
  • The proposal should provide a plan on how the consortium will establish interaction with the relevant stakeholders and RISC-V projects of the Chips JU to coordinate work on horizontal issues common to both communities and exploit synergies where relevant.

[1] At the beginning of the action, the EuroHPC JU and the Consortium will define the KPIs and acceptance criteria in each technical area according to industrial standards.