In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
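As a rough illustration of why the legacy 8-bit encodings clash, the sketch below decodes the same byte under two of them. The variable names are made up for this example; the codec names are Python's standard ones.

```python
# One byte, two meanings: 0xA4 is the generic currency sign in Latin-1
# but the euro sign in ISO-8859-15 (Latin-9).
byte = b"\xa4"
as_latin1 = byte.decode("latin-1")
as_latin9 = byte.decode("iso8859-15")

# Windows-1252 put the euro somewhere else entirely.
euro_cp1252 = "€".encode("cp1252")

# Latin-1 has no euro at all, so encoding fails.
try:
    "€".encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print(repr(as_latin1), repr(as_latin9), euro_cp1252, latin1_has_euro)
```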
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
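A minimal sketch of how a code point maps to its plane, assuming the usual layout of 17 planes with 65,536 code points each. The helper name `plane` is hypothetical.

```python
def plane(code_point: int) -> int:
    """Return the Unicode plane number (0..16) for a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Each plane spans 0x10000 code points, so the plane is the high bits.
    return code_point >> 16

# Plane 0 is the Basic Multilingual Plane.
print(plane(ord("A")), plane(0x1F600), plane(0x10FFFF))
```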
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
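One way to look up a code point and its official name is Python's standard `unicodedata` module. This is only a sketch; the variable names are illustrative.

```python
import unicodedata

cp = ord("A")                      # character -> code point (an integer)
label = f"U+{cp:04X}"              # conventional U+XXXX notation
name = unicodedata.name("A")       # official Unicode character name
back = chr(0x0041)                 # code point -> character

print(label, name, back)
```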
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
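The two spellings of é can be compared with Python's `unicodedata.normalize`. This is a sketch; reading the last slide line as the standalone acute accent (U+00B4) versus the combining one (U+0301) is an assumption about what the slide showed.

```python
import unicodedata

decomposed = "e\u0301"    # e + COMBINING ACUTE ACCENT
precomposed = "\u00e9"    # LATIN SMALL LETTER E WITH ACUTE

# They render the same but are different code point sequences.
same_raw = decomposed == precomposed

# Normalizing both to one form (NFC composes, NFD decomposes) makes them equal.
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

# Assumed reading of the slide: the spacing accent is not the combining accent.
accents_equal = "\u00b4" == "\u0301"

print(same_raw, same_nfc, same_nfd, accents_equal)
```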
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
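To make the difference between the three encodings concrete, the sketch below encodes one character inside the BMP and one outside it. The sample characters and the helper `sizes` are choices made for this example, and big-endian variants are used so no byte order mark is added.

```python
def sizes(ch: str) -> dict:
    """Byte length of a single character in each Unicode encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

bmp = sizes("A")              # U+0041, inside the BMP
astral = sizes("\U0001F600")  # U+1F600, outside the BMP

# Outside the BMP, UTF-16 needs a surrogate pair (two 16-bit units).
surrogates = "\U0001F600".encode("utf-16-be").hex()

print(bmp, astral, surrogates)
```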
What does it look like?
Encoding comparison
Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18
Source: http://perlgeek.de/en/article/encodings-and-unicode
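A comparison like this can be regenerated with a few lines of Python. This is a sketch: the helper `encoded_hex` is hypothetical, and big-endian UTF-16 is assumed for the last column.

```python
chars = ["A", "Ä", "€", "\uc218"]
encodings = ["ascii", "latin-1", "iso8859-15", "utf-8", "utf-16-be"]

def encoded_hex(ch: str, encoding: str) -> str:
    """Hex bytes of ch in the given encoding, or '-' if it cannot be encoded."""
    try:
        return " ".join(f"0x{b:02x}" for b in ch.encode(encoding))
    except UnicodeEncodeError:
        return "-"

for ch in chars:
    row = [f"U+{ord(ch):04X}", ch] + [encoded_hex(ch, e) for e in encodings]
    print("  ".join(row))
```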
Remember: just because someone claims it's UTF-8 doesn't mean it is.
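One cheap check is to decode the bytes strictly and see whether that fails. The sketch below assumes Python's default strict error handling; the function name `is_valid_utf8` is made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True

good = is_valid_utf8("é".encode("utf-8"))    # really UTF-8
bad = is_valid_utf8("é".encode("latin-1"))   # Latin-1 bytes labelled as UTF-8
print(good, bad)
```

Passing this check only shows the bytes are well-formed UTF-8, not that UTF-8 is what the producer intended; pure ASCII input, for instance, passes under almost any label.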