Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
87
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
120
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
98
Ghost in the State Machine
cypher
2
290
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
220
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8k
How to Become a Better Developer
cypher
1
220
A Very Short Overview of Vagrant
cypher
0
7.7k
Other Decks in Programming
See All in Programming
[Fin-JAWS 第38回 ~re:Invent 2024 金融re:Cap~]FaultInjectionServiceアップデート@pre:Invent2024
shintaro_fukatsu
0
410
SRE、開発、QAが協業して挑んだリリースプロセス改革@SRE Kaigi 2025
nealle
3
4.2k
DROBEの生成AI活用事例 with AWS
ippey
0
130
動作確認やテストで漏れがちな観点3選
starfish719
6
1k
一休.com のログイン体験を支える技術 〜Web Components x Vue.js 活用事例と最適化について〜
atsumim
0
320
How mixi2 Uses TiDB for SNS Scalability and Performance
kanmo
35
14k
Conform を推す - Advocating for Conform
mizoguchicoji
3
690
Amazon Bedrock Multi Agentsを試してきた
tm2
1
280
Honoのおもしろいミドルウェアをみてみよう
yusukebe
1
200
Software Architecture
hschwentner
6
2.1k
2,500万ユーザーを支えるSREチームの6年間のスクラムのカイゼン
honmarkhunt
6
5.2k
2024年のkintone API振り返りと2025年 / kintone API look back in 2024
tasshi
0
220
Featured
See All Featured
Side Projects
sachag
452
42k
The Cost Of JavaScript in 2023
addyosmani
47
7.3k
The Pragmatic Product Professional
lauravandoore
32
6.4k
Become a Pro
speakerdeck
PRO
26
5.1k
Optimising Largest Contentful Paint
csswizardry
34
3.1k
GitHub's CSS Performance
jonrohan
1030
460k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
27
1.5k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
45
2.3k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Art, The Web, and Tiny UX
lynnandtonic
298
20k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Site-Speed That Sticks
csswizardry
3
370
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is