OPENCHIP & SOFTWARE TECHNOLOGIES
• A start-up that became operational in April 2024.
• Around 200 employees across 5 countries.
• 20 patents submitted.
• RISC-V International board member.
• HPC-VEC & AI accelerators.
• Fabless silicon design house.
Clients in 15 countries. More than 1,500 high-tech projects.
GTD works across some of the most demanding industries, providing software, systems, and services for safety-, mission-, and business-critical applications. We provide our clients with secure, reliable technology around the world. Out in space, our software orbits the Earth 24/7, 365 days a year. Closer to home, our software keeps aircraft flying high, makes vehicles safer, powers smart meter networks, and does much more. For over 34 years, we’ve been transforming the way the world uses technology.
[Figure labels: Public participation; 54%; 46%; 138M€]
• BSC’s unique feature: there is a supercomputer in the basement of the chapel.
• BSC is one of eight centers designated as part of EuroHPC, a European supercomputer consortium.
• MareNostrum 5 (314 petaflops) is now operational.
• Within EuroHPC, LUMI (Finland) is the fastest with 375 petaflops, followed by BSC (314 petaflops) and Leonardo (Italy).
Designing RISC-V-based Accelerators for next generation Computers (DRAC) project
• Joint engineering with BSC and Universitat Politècnica de Catalunya (UPC)
• https://drac.bsc.es/en/home
Digital Autonomy with RISC-V in Europe (DARE)
• 3-year initiative | 38 organizations | €24M | Backed by EuroHPC
• Prototyping next-generation HPC and AI systems
• Leveraging industry-standard chiplets and cutting-edge EU semiconductor technologies
• Goal: maximum performance and energy efficiency
• https://www.bsc.es/es/unete/oportunidades-de-excelencia-profesional/dare
Interest (IPCEI) by EU Commission
• IPCEI is an important framework for strengthening the EU's technological sovereignty and building strategic value chains.
➢ A cycle that enhances companies' technological strengths while generating value through inter-company collaboration.
• Over €91 billion in total, combining public funds and private investment.
• Openchip is one of 56 selected companies. https://competition-policy.ec.europa.eu/state-aid/ipcei/approved-ipceis_en
• Openchip is a fabless semiconductor company developing accelerator chips for HPC (High-Performance Computing) and AI/ML/DL applications.
• Established in 2021 by GTD and the Barcelona Supercomputing Center (BSC), it began operations in 2024.
• Based on the RISC-V architecture and the latest silicon technologies (below 5 nm).
• The IPCEI and Openchip initiative is aligned with the EU's Green Deal and digital transformation strategy and supports the development of sustainable technologies.
22, 2025, with Kalray
Development of HPC/AI accelerators equipped with DPUs (Data Processing Units)
• Kalray’s key technologies
o Products based on the MPPA® DPU (Massively Parallel Processor Array Data Processing Unit) are available, offering enhanced computational performance and reduced power consumption for next-generation AI data centers.
• Purpose of Data Processing Units (DPUs)
o Semiconductors specialized in data movement and processing within a data center (DC).
o By allowing CPUs and AI accelerators to focus on compute-intensive tasks, DPUs improve the overall performance of the data center.
• DPU key features
o Optimization of data transfer for AI model training and inference
o Offloading of network and security processing
o Optimization of data transfer with storage
o Reduced power consumption through optimization of CPUs and AI accelerators
• The collaboration combines Openchip's HPC/AI computational engine with Kalray's DPU technology, aiming to develop HPC/AI accelerators that deliver high computational speed at low power consumption.
[Figure: Vector Systems, microprocessors, Accelerators (SIMT); systems shown: CDC STAR-100, CRAY-1, NEC SX-2, NEC SX-3, Earth Simulator; years marked: 1974, 1984; Top 500]
✓ Vector systems were the general-purpose supercomputers from the 1970s to the beginning of the 2000s.
✓ Dennard scaling and Moore’s law pushed general-purpose microprocessors to the top for ~15-20 years.
✓ Short-vector SIMD: the return of vectors in general-purpose processors, continuing until now.
✓ End of Moore’s law: shift to accelerators, SIMD (Xeon Phi) and SIMT (GPUs), with increasingly higher power consumption.
✓ SIMT can be mapped to SIMD (vector).
Slide courtesy of Dr. Erich Focht
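The last point, that SIMT can be mapped to SIMD (vector), can be illustrated with a small sketch. It is not taken from the slides: the kernel, the function names `kernel_thread`, `run_simt` and `run_vector`, and the use of Python lists to stand in for vector registers are all assumptions made for illustration. The idea shown is that a per-thread branch becomes a mask, both sides are computed for all lanes, and a merge selects the result per lane.

```python
# Hypothetical per-thread kernel, written SIMT-style: one scalar thread per element.
def kernel_thread(x):
    if x < 0:
        return -x * 2
    return x + 1


def run_simt(xs):
    # Conceptually, one independent thread per element, each with its own control flow.
    return [kernel_thread(x) for x in xs]


def run_vector(xs):
    # One instruction stream for all lanes; lists stand in for vector registers.
    mask = [x < 0 for x in xs]        # vector compare produces a mask
    then_v = [-x * 2 for x in xs]     # "then" side, computed for every lane
    else_v = [x + 1 for x in xs]      # "else" side, computed for every lane
    # Masked merge: pick the "then" value where the mask is set.
    return [t if m else e for m, t, e in zip(mask, then_v, else_v)]


data = [3, -1, 0, -7, 5]
print(run_simt(data))
print(run_vector(data))
```

Both functions return the same values for the same input. The vector version trades per-thread control for redundant work on masked-off lanes, which is the usual cost of this mapping.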
units
– Numeric precision reduction
– Structural (fake) sparsity
– Blackwell: block-scaled numeric formats
End of Moore’s Law vs. AI Marketing Machine
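To make the idea behind block-scaled numeric formats concrete, here is a minimal sketch of block scaling in general. It does not reproduce the actual Blackwell formats: the 8-bit integer elements, the power-of-two shared scale, and the names `quantize_block` and `dequantize_block` are assumptions chosen for readability. What it shows is that a whole block of values shares one scale factor, so each element needs only a few bits.

```python
import math


def quantize_block(values, bits=8):
    """Quantize one block to signed integers sharing a single power-of-two scale.

    Returns (shared_exponent, integers). Simplified illustration only.
    """
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8-bit elements
    amax = max(abs(v) for v in values)
    if amax == 0:
        return 0, [0] * len(values)
    # Smallest power-of-two scale such that amax / scale fits in qmax.
    exp = math.ceil(math.log2(amax / qmax))
    scale = 2.0 ** exp
    return exp, [round(v / scale) for v in values]


def dequantize_block(exp, ints):
    """Reconstruct approximate values from the shared exponent and integers."""
    scale = 2.0 ** exp
    return [q * scale for q in ints]


block = [0.5, -1.0, 0.25, 2.0]
exp, q = quantize_block(block)
print(exp, q, dequantize_block(exp, q))
```

Because the scale is set by the largest magnitude in the block, small values that share a block with a large one lose precision: in this sketch, `quantize_block([1000.0, 0.001])` maps the second element to zero.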
level parallelism (DLP)
– One instruction keeps the pipeline(s) busy for a long time
– Hide latency of memory access through long vectors, temporal execution
– Less sensitive to occasional latency increase (like NoC congestion)
– Parallelism explicit, in the ISA!
From Vector To SIMD And Back
[Figure: CRAY-1 style vector pipeline (long vector); NEC SX-4 style parallel vector pipeline (long vector); AVX2 style SIMD pipeline (short vector)]
✓ SIMD & short vectors
– Cache/prefetch/OoO to hide latency
– More difficult to keep pipelines full
– Moderate DLP handling
– Sensitive to latency increase
– Good with short vectors
Slide courtesy of Dr. Erich Focht
same ISA!
✓ Data parallelism → loop vectorization
✓ Code ported to RISC-V long vector runs on short-vector cores, too.
✓ Investment is protected!
RISC-V Vector Implementations:
o Long vectors: Vitruvius, Ara, (Hwacha)
o Short vectors: Spatz, Saturn, commercial
Vector is the better ISA
RISC-V opted for a VECTOR ISA
• Not a SIMD ISA!
• Variable vector length
o Vector length register like SX Aurora
• Vector register size not prescribed
o VLEN = vector register size in bits; it is not fixed!
o Left to the implementation
o Must be a power of 2
o ELEN ≥ 8: maximum element size in bits
o ELEN ≤ VLEN ≤ 65536
Code can be VLEN agnostic! It runs on any implementation of RISC-V Vector.
https://github.com/riscv/riscv-v-spec
NOTE: ARM SVE (Scalable Vector Extension) allows implementing 128-2048 bit SIMD units.
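The VLEN-agnostic claim can be sketched with a strip-mined loop. The following is a plain Python model, not real RISC-V code: the function names `vsetvl` and `vec_add` are illustrative, LMUL is assumed to be 1, and the model always grants min(AVL, VLMAX) elements per strip, which is one permitted behaviour (real hardware may pick a different split for the last strips). The same loop is run with different VLEN values without changing the code.

```python
def vsetvl(avl, vlen_bits, sew_bits=32):
    """Simplified model of vsetvl with LMUL=1: grant min(AVL, VLMAX) elements."""
    vlmax = vlen_bits // sew_bits      # elements per vector register
    return min(avl, vlmax)


def vec_add(a, b, vlen_bits):
    """Strip-mined element-wise add; returns (result, number of strips)."""
    n = len(a)
    out = [0] * n
    i = 0
    strips = 0
    while i < n:
        vl = vsetvl(n - i, vlen_bits)  # ask for the remaining elements
        # One "vector instruction" processes vl elements at once.
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
        strips += 1
    return out, strips


a = list(range(100))
b = [2 * x for x in a]
for vlen in (128, 512, 16384):         # short, medium and long vector registers
    result, strips = vec_add(a, b, vlen)
    print(vlen, strips, result == [3 * x for x in a])
```

The loop body never mentions the register width. Only the number of strips changes with VLEN, which is why code written this way runs on both short-vector and long-vector implementations.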
use model
• More power efficient:
o Single control for a certain volume of computation
GPU: A huge number of single threads
• Generally inflexible use model
• Less power efficient:
o Control required for each computation thread
Processing Vector and GPGPU with RISC-V