Google's Databases, Past and Present, as Seen Through Spanner
Szu-Kai Hsu (brucehsu)
November 07, 2012
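Fall 2012, Internet Databases @ Graduate Institute of Computer Science and Information Engineering, National Chung Cheng University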
Transcript
Google's Databases, Past and Present, as Seen Through Spanner. Szu-Kai Hsu (brucehsu)
Spanner is a scalable, multi-version, globally distributed, and synchronously replicated database.
BigTable
Handling really BIG DATA
key-value: { "CCU": "123", "NCTU": "113", "NTU": "112" }; the names are the keys and the numbers are the values
distributed
Lacks transactions; think of our first project.
CAP: Consistency, Availability, Partition tolerance. Highlighted: Consistency.
Megastore
"NoSQL datastores are highly scalable, but their limited API and loose consistency models complicate application development."
In Megastore, the data model is declared in a strongly typed schema:
CREATE TABLE User {
  required int64 user_id;
  required string name;
} PRIMARY KEY(user_id), ENTITY GROUP ROOT;
Based on BigTable
Primary keys: (user_id) and (user_id, nyan_id)
Local and global indexes are introduced:
Local index: finds the corresponding data within an entity group
Global index: finds the corresponding data in external groups
For a local index, the key is (user_id, born, nyan_id):
CREATE LOCAL INDEX NyanByBorn ON Nyan(user_id, born);
Consistency is achieved via the Paxos algorithm, with at least 2 replicas and 1 witness.
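To illustrate why two replicas plus one witness is a workable minimum, here is a small sketch of majority voting. It is not Megastore's actual implementation; the function name `has_quorum` and the node names are hypothetical. It only shows that, with three voters, a write can still gather a majority when any single voter is unavailable.

```python
def has_quorum(acks, voters):
    """Return True when a strict majority of voters acknowledged a write."""
    return len(set(acks) & set(voters)) > len(voters) // 2


# Hypothetical node names: two full replicas and one witness.
voters = ["replica_a", "replica_b", "witness"]

print(has_quorum(["replica_a", "witness"], voters))  # one replica is down
print(has_quorum(["replica_a"], voters))             # only one voter responds
```

The witness counts toward the majority even though it stores only logs, which is why it is cheaper than adding a third full replica.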
A replica consists of a replication server, which writes, and a coordinator, which oversees.
The witness's replication server only writes logs.
Average latency: 100-400 ms. Poor write throughput.
Spanner, finally.
"We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions."
The data model is almost identical to Megastore's.
The basic unit is defined as a directory:
Same key prefix, therefore adjacent
Fine-grained mapping
Interleaved rows gain performance
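The "same key prefix, therefore adjacent" idea can be illustrated with sorted tuple keys. This is a toy sketch, not Spanner's storage format: the rows reuse the User and Nyan names from the Megastore slides, the values are made up, and the helper `directory` is hypothetical.

```python
from bisect import bisect_left

# Toy rows keyed by tuples. A User row has key (user_id,), and its
# Nyan child rows have keys (user_id, nyan_id). Values are made up.
rows = {
    (2,): "user 2",
    (1, 20): "nyan 20 of user 1",
    (1,): "user 1",
    (2, 5): "nyan 5 of user 2",
    (1, 10): "nyan 10 of user 1",
}

# Sorting by key interleaves each parent row with its child rows.
ordered = sorted(rows)


def directory(prefix):
    """Return all keys that start with prefix as one contiguous slice."""
    start = bisect_left(ordered, prefix)
    end = start
    while end < len(ordered) and ordered[end][:len(prefix)] == prefix:
        end += 1
    return ordered[start:end]


print(ordered)
print(directory((1,)))
```

Because a parent and its children sit next to each other in key order, reading them together is a single range scan rather than several scattered lookups.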
Two-phase commit for distributed transactions, between a coordinator and participants:
Phase 1: Vote
Phase 2: Commit
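The vote and commit phases can be sketched as follows. This is a minimal in-memory illustration, not Spanner's implementation; the `Participant` class and `two_phase_commit` function are hypothetical names, and failures, timeouts, and logging are left out.

```python
class Participant:
    """A toy participant that votes and then applies the decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes or no on the proposed transaction.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit):
        # Phase 2: apply the coordinator's decision.
        self.state = "committed" if commit else "aborted"


def two_phase_commit(participants):
    """Act as coordinator: collect votes, then broadcast the decision."""
    votes = [p.prepare() for p in participants]  # phase 1: vote
    decision = all(votes)                        # commit only if unanimous
    for p in participants:                       # phase 2: commit or abort
        p.finish(decision)
    return decision


good = [Participant("a"), Participant("b")]
bad = [Participant("a"), Participant("b", can_commit=False)]
print(two_phase_commit(good), [p.state for p in good])
print(two_phase_commit(bad), [p.state for p in bad])
```

Note that a prepared participant has to hold its locks until the decision arrives, so a coordinator that disappears between the two phases leaves everyone waiting.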
Locking remains a big issue, especially when someone goes down, causing a deadlock, literally.
Paxos is here to the rescue, again. Paxos makes sure ALL logs are copied to every replica.
The real innovation lies in time: the TrueTime API uses atomic clocks and GPS to determine the order of transactions.
NewSQL is the new NoSQL, and Spanner is the best example so far.
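As a rough sketch of how time can order transactions: in the Spanner paper, `TT.now()` returns an interval that is guaranteed to contain the true time, and a transaction waits until its commit timestamp has definitely passed before making its writes visible. The code below is only a simulation under stated assumptions. The local monotonic clock stands in for the GPS and atomic-clock time sources, `EPSILON` is an invented fixed uncertainty, and `tt_now`, `tt_after`, and `commit` are hypothetical names.

```python
import time
from dataclasses import dataclass

EPSILON = 0.005  # assumed clock uncertainty in seconds (illustrative only)


@dataclass
class TTInterval:
    earliest: float
    latest: float


def tt_now():
    """Return an interval assumed to contain the true current time."""
    t = time.monotonic()  # stand-in for a GPS / atomic-clock time source
    return TTInterval(t - EPSILON, t + EPSILON)


def tt_after(t):
    """Return True once t has definitely passed."""
    return tt_now().earliest > t


def commit():
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    ts = tt_now().latest
    while not tt_after(ts):  # commit wait, roughly 2 * EPSILON
        time.sleep(EPSILON / 10)
    return ts


first = commit()
second = commit()
print(first < second)
```

The smaller the uncertainty, the shorter the wait, which is why the accuracy of the clock sources matters for write latency.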