Pelemay Backend: A memory-saving, fault-tolerant and distributed collection of Nx compilers and backends for embedded systems
Susumu Yamazaki (ZACKY)
September 07, 2023
Transcript
Pelemay Backend: A memory-saving, fault-tolerant and distributed collection of Nx compilers and backends for embedded systems
Susumu Yamazaki (ZACKY)
This work was partially supported by the Asahi Kohsan Group Research Support Program of the Kitakyushu Foundation for the Advancement of Industry Science and Technology.
©︎ 2023 Susumu Yamazaki

About Susumu Yamazaki (ZACKY)
• These slides are available on my Speaker Deck: https://speakerdeck.com/zacky1972
• From Japan 🇯🇵.
• An organizer of ElixirConf JP.
• Associate Professor at the University of Kitakyushu.
• My childhood hobby was writing science fiction stories!
• I wanted to write longer stories, like Perry Rhodan, but my strength was in writing shorter ones…

Background
• The de facto standard machine learning frameworks for Elixir are Nx, Axon, and their ecosystem.
• Sean Moriarity's talk at this ElixirConf showed a positioning strategy for MLOps aimed at distributed and parallel computing with multiple GPUs for LLMs!
• We were greatly inspired by his talk.
• However, the current focus of Nx, Axon, and their ecosystem, especially EXLA, is unsuitable for most embedded systems, because those systems lack GPUs.
• So, since 2022, we have been developing Pelemay Backend, a lightweight Nx backend specialized for embedded systems.

Our Positioning Strategy
[Figure: positioning strategy diagram; not reproduced in the transcript]

Lessons Learned from Pelemay Backend 1st ed.
• We developed Pelemay Backend 1st ed. in 2022.
• It demonstrates that OpenBLAS can be used as an Nx backend.
• BLAS stands for Basic Linear Algebra Subprograms, which have been developed and refined since the FORTRAN era.
• OpenBLAS is an open-source, BLAS-compatible library. It uses SIMD or vector instructions on most ISAs, including ARM and RISC-V, which makes it faster than an implementation written in plain C.
• We implemented a partial builder that compiles only the necessary modules of OpenBLAS, and a prototype backend that uses it.

Lessons Learned from Pelemay Backend 2nd ed.
• Next, we have been developing Pelemay Backend 2nd ed. since 2023.
• One of its concepts is a component-based design for maintainability, based on aspect-oriented programming (AOP).
• That is, we will develop a backend generator that decorates a specified base backend with functions that run before and after a set of functions in that backend.
• The set can be specified in the style of AspectJ, an AOP language, and by the groupings used in the Nx HexDocs, for example Aggregates, Backend, Conversion, and so on.
• The other concept is memory saving. We found that converting an ONNX model of ResNet to Axon and loading it requires 9 GB of memory. That is too much to run on an embedded system.
• However, Sean's talk shows a roadmap toward memory-saving processing for LLMs, so we will wait for that to be realized.
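
The backend generator idea can be illustrated with a minimal sketch. It is written in Python rather than Elixir, and it is not the Pelemay Backend implementation: `BASE_BACKEND`, `GROUPS`, `generate_backend`, and the operation names are all hypothetical stand-ins. The sketch only shows the general AOP pattern of wrapping every function in a selected group with before and after advice while leaving the other functions untouched.

```python
from typing import Any, Callable, Dict, List, Tuple


def total(xs: List[float]) -> float:
    """Toy aggregate operation."""
    return sum(xs)


def as_float(xs: List[int]) -> List[float]:
    """Toy conversion operation."""
    return [float(x) for x in xs]


# Hypothetical base backend: operation name -> implementation.
BASE_BACKEND: Dict[str, Callable[..., Any]] = {"sum": total, "as_float": as_float}

# Hypothetical grouping, loosely modelled on documentation sections.
GROUPS: Dict[str, List[str]] = {"Aggregates": ["sum"], "Conversion": ["as_float"]}


def generate_backend(base, groups, pointcut, before, after):
    """Return a new backend in which every function belonging to a group
    listed in `pointcut` is wrapped with `before` and `after` advice."""
    selected = {name for group in pointcut for name in groups[group]}

    def wrap(name, fn):
        def wrapped(*args):
            before(name, args)          # advice run before the operation
            result = fn(*args)
            after(name, result)         # advice run after the operation
            return result
        return wrapped

    return {
        name: (wrap(name, fn) if name in selected else fn)
        for name, fn in base.items()
    }


# Usage: trace only the "Aggregates" group.
log: List[Tuple] = []
traced = generate_backend(
    BASE_BACKEND,
    GROUPS,
    pointcut=["Aggregates"],
    before=lambda name, args: log.append(("before", name)),
    after=lambda name, result: log.append(("after", name, result)),
)
result = traced["sum"]([1, 2, 3])
```

Because the generator returns a new table instead of modifying the base backend, several decorated variants can be derived from the same base without interfering with each other.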

What Pelemay Backend focuses on
• Thus, for now, we'll focus on implementing the component-based architecture with OpenBLAS.
• One module handles only matrix-matrix multiplication.
• One module handles only addition of vectors or matrices combined with scalar multiplication.
• One module handles only scalar multiplication.
• One module handles only dividing large vectors or matrices into smaller pieces.
• Other, infrequent operations are delegated to the default backend.
• Many such simple modules collaborate to carry out a given numerical function.
• This makes the architecture simpler to maintain than a monolith.
• This approach accumulates shorter stories toward a longer and longer story!
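
The division of labour described above can be sketched as a small dispatcher. This is an illustration in plain Python under stated assumptions, not the project's code: the component names `scal`, `axpy`, and `matmul` merely echo BLAS naming conventions (the slides do not say which routines each module calls), and `COMPONENTS`, `dispatch`, `default_backend`, and `chunked_axpy` are hypothetical.

```python
from typing import Callable, Dict, List

Vector = List[float]
Matrix = List[List[float]]


def scal(alpha: float, x: Vector) -> Vector:
    """Scalar multiplication only: alpha * x."""
    return [alpha * v for v in x]


def axpy(alpha: float, x: Vector, y: Vector) -> Vector:
    """Addition combined with scalar multiplication: alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix-matrix multiplication only."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def split(x: Vector, size: int) -> List[Vector]:
    """Divide a large vector into pieces of at most `size` elements."""
    return [x[i:i + size] for i in range(0, len(x), size)]


def default_backend(op: str, *args):
    """Stand-in for the default backend that handles infrequent operations."""
    if op == "negate":
        return [-v for v in args[0]]
    raise NotImplementedError(op)


# Each small component is registered under the one operation it handles.
COMPONENTS: Dict[str, Callable] = {
    "scal": scal, "axpy": axpy, "matmul": matmul, "split": split,
}


def dispatch(op: str, *args):
    """Route an operation to its component, or delegate to the default backend."""
    component = COMPONENTS.get(op)
    if component is not None:
        return component(*args)
    return default_backend(op, *args)


def chunked_axpy(alpha: float, x: Vector, y: Vector, size: int) -> Vector:
    """Components collaborating: split the inputs, then run axpy per piece."""
    out: Vector = []
    for xc, yc in zip(dispatch("split", x, size), dispatch("split", y, size)):
        out.extend(dispatch("axpy", alpha, xc, yc))
    return out
```

Each component stays small enough to test and replace on its own, and `chunked_axpy` shows how a larger operation can be assembled from them, which is the sense in which short stories accumulate into a longer one.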

Getting the source of Pelemay Backend
• https://github.com/zeam-vm/pelemay_backend
• Look forward to our future progress on these accumulated stories!
• Thank you!