Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
83
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
120
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
93
Ghost in the State Machine
cypher
2
280
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
210
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
7.9k
How to Become a Better Developer
cypher
1
220
A Very Short Overview of Vagrant
cypher
0
7.7k
Other Decks in Programming
See All in Programming
EC2からECSへ 念願のコンテナ移行と巨大レガシーPHPアプリケーションの再構築
sumiyae
1
100
わたしの星のままで一番星になる ~ 出産を機にSIerからEC事業会社に転職した話 ~
kimura_m_29
0
180
命名をリントする
chiroruxx
1
420
GitHubで育つ コラボレーション文化 : ニフティでのインナーソース挑戦事例 - 2024-12-16 GitHub Universe 2024 Recap in ZOZO
niftycorp
PRO
0
100
快速入門可觀測性
blueswen
0
400
htmxって知っていますか?次世代のHTML
hiro_ghap1
0
350
Haze - Real time background blurring
chrisbanes
1
520
数十万行のプロジェクトを Scala 2から3に完全移行した
xuwei_k
0
310
Kaigi on Railsに初参加したら、その日にLT登壇が決定した件について
tama50505
0
100
Mermaid x AST x 生成AI = コードとドキュメントの完全同期への道
shibuyamizuho
0
170
情報漏洩させないための設計
kubotak
3
480
php-conference-japan-2024
tasuku43
0
350
Featured
See All Featured
Building Applications with DynamoDB
mza
91
6.1k
Designing Experiences People Love
moore
138
23k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
127
18k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
28
2.1k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
665
120k
Become a Pro
speakerdeck
PRO
26
5k
Writing Fast Ruby
sferik
628
61k
Keith and Marios Guide to Fast Websites
keithpitt
410
22k
Side Projects
sachag
452
42k
Bootstrapping a Software Product
garrettdimon
PRO
305
110k
YesSQL, Process and Tooling at Scale
rocio
169
14k
Producing Creativity
orderedlist
PRO
342
39k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is