Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
80
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
120
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
92
Ghost in the State Machine
cypher
2
280
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
210
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
7.9k
How to Become a Better Developer
cypher
1
220
A Very Short Overview of Vagrant
cypher
0
7.7k
Other Decks in Programming
See All in Programming
subpath importsで始めるモック生活
10tera
0
320
よくできたテンプレート言語として TypeScript + JSX を利用する試み / Using TypeScript + JSX outside of Web Frontend #TSKaigiKansai
izumin5210
6
1.8k
聞き手から登壇者へ: RubyKaigi2024 LTでの初挑戦が 教えてくれた、可能性の星
mikik0
1
130
ヤプリ新卒SREの オンボーディング
masaki12
0
130
Flutterを言い訳にしない!アプリの使い心地改善テクニック5選🔥
kno3a87
1
210
見せてあげますよ、「本物のLaravel批判」ってやつを。
77web
7
7.8k
Arm移行タイムアタック
qnighy
0
340
Micro Frontends Unmasked Opportunities, Challenges, Alternatives
manfredsteyer
PRO
0
110
TypeScriptでライブラリとの依存を限定的にする方法
tutinoko
3
700
色々なIaCツールを実際に触って比較してみる
iriikeita
0
330
Jakarta EE meets AI
ivargrimstad
0
710
Pinia Colada が実現するスマートな非同期処理
naokihaba
4
230
Featured
See All Featured
Documentation Writing (for coders)
carmenintech
65
4.4k
How To Stay Up To Date on Web Technology
chriscoyier
788
250k
Agile that works and the tools we love
rasmusluckow
327
21k
Mobile First: as difficult as doing things right
swwweet
222
8.9k
Embracing the Ebb and Flow
colly
84
4.5k
Large-scale JavaScript Application Architecture
addyosmani
510
110k
Testing 201, or: Great Expectations
jmmastey
38
7.1k
Being A Developer After 40
akosma
87
590k
A Philosophy of Restraint
colly
203
16k
Unsuck your backbone
ammeep
668
57k
Build your cross-platform service in a week with App Engine
jlugia
229
18k
It's Worth the Effort
3n
183
27k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is