In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
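To make code points and planes concrete, here is a minimal Python sketch. The helper name `describe` is made up for illustration; it relies only on the standard `unicodedata` module and assumes that a plane is simply the code point divided by 0x10000.

```python
import unicodedata


def describe(char: str) -> str:
    """Return 'U+XXXX NAME (plane N)' for a single character (hypothetical helper)."""
    cp = ord(char)                              # the code point as an integer
    name = unicodedata.name(char, "<unnamed>")  # official Unicode character name
    plane = cp >> 16                            # each plane covers 0x10000 code points
    return f"U+{cp:04X} {name} (plane {plane})"


print(describe("A"))           # U+0041 LATIN CAPITAL LETTER A (plane 0)
print(describe("€"))           # U+20AC EURO SIGN (plane 0)
print(describe("\U0001F600"))  # a code point outside the Basic Multilingual Plane
```

Plane 0 is the Basic Multilingual Plane; anything with a code point above U+FFFF lives in one of the supplementary planes.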
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
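The two spellings of é can be compared directly. The following is a small sketch using Python's standard `unicodedata.normalize`; the variable names are illustrative only.

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 (combining acute accent)
precomposed = "\u00e9"   # U+00E9, the single precomposed code point

# Both render as the same grapheme, but they are different code point sequences.
print(decomposed == precomposed)          # False
print(len(decomposed), len(precomposed))  # 2 1

# Normalizing to a common form (NFC composes, NFD decomposes) makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed)  # True
print(nfd == decomposed)   # True
```

Normalizing both sides to the same form before comparing or hashing user-supplied strings is a common way to avoid this class of mismatch.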
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Encoding comparison

Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18

Source: http://perlgeek.de/en/article/encodings-and-unicode
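A comparison like this can be reproduced with a few lines of Python. This is a sketch under stated assumptions: `encode_hex` is a hypothetical helper, the codec names are Python's standard aliases, and `utf-16-be` is used so that no byte order mark is prepended.

```python
CHARS = ["A", "Ä", "€", "\uC218"]
ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]


def encode_hex(char: str, encoding: str) -> str:
    """Return the encoded bytes as hex, or '-' if the encoding cannot represent char."""
    try:
        return " ".join(f"0x{b:02x}" for b in char.encode(encoding))
    except UnicodeEncodeError:
        return "-"


for char in CHARS:
    cells = [encode_hex(char, enc) for enc in ENCODINGS]
    # One row per character: code point first, then one cell per encoding.
    print(f"U+{ord(char):04X}", *cells, sep=" | ")
```

The same code point yields different byte sequences depending on the encoding, and some encodings cannot represent it at all.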
Remember: just because someone claims it's UTF-8 doesn't mean it is.
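One way to act on this advice is to check the bytes rather than trust the label. This is a minimal sketch; `is_valid_utf8` is a hypothetical helper built on Python's strict UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


really_utf8 = "Ä".encode("utf-8")      # b'\xc3\x84'
mislabelled = "Ä".encode("latin-1")    # b'\xc4', a lead byte with no continuation byte

print(is_valid_utf8(really_utf8))   # True
print(is_valid_utf8(mislabelled))   # False
```

A successful decode only shows that the bytes are well-formed UTF-8. It does not prove that UTF-8 was the encoding the author intended; pure ASCII input, for example, passes the check regardless of the label.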