$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
120
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
140
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
130
Ghost in the State Machine
cypher
2
320
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
250
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
240
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
実はマルチモーダルだった。ブラウザの組み込みAI🧠でWebの未来を感じてみよう #jsfes #gemini
n0bisuke2
3
1.2k
まだ間に合う!Claude Code元年をふりかえる
nogu66
5
850
Claude Codeの「Compacting Conversation」を体感50%減! CLAUDE.md + 8 Skills で挑むコンテキスト管理術
kmurahama
0
290
Socio-Technical Evolution: Growing an Architecture and Its Organization for Fast Flow
cer
PRO
0
360
React Native New Architecture 移行実践報告
taminif
1
160
MAP, Jigsaw, Code Golf 振り返り会 by 関東Kaggler会|Jigsaw 15th Solution
hasibirok0
0
250
チームをチームにするEM
hitode909
0
340
FluorTracer / RayTracingCamp11
kugimasa
0
240
LLMで複雑な検索条件アセットから脱却する!! 生成的検索インタフェースの設計論
po3rin
3
820
Tinkerbellから学ぶ、Podで DHCPをリッスンする手法
tomokon
0
130
堅牢なフロントエンドテスト基盤を構築するために行った取り組み
shogo4131
8
2.4k
Rubyで鍛える仕組み化プロヂュース力
muryoimpl
0
140
Featured
See All Featured
A Modern Web Designer's Workflow
chriscoyier
698
190k
Writing Fast Ruby
sferik
630
62k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
21
1.3k
Building an army of robots
kneath
306
46k
Balancing Empowerment & Direction
lara
5
800
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
Build your cross-platform service in a week with App Engine
jlugia
234
18k
Practical Orchestrator
shlominoach
190
11k
The Illustrated Children's Guide to Kubernetes
chrisshort
51
51k
Large-scale JavaScript Application Architecture
addyosmani
515
110k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is