In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
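The euro sign shows why single-byte encodings kept multiplying: the same byte means different characters depending on which table is assumed. A minimal sketch in Python (chosen here for illustration, not taken from the talk), using only standard library codec names:

```python
# Byte 0xA4 is the generic currency sign in Latin-1 but the euro sign in ISO-8859-15.
raw = b"\xa4"

as_latin1 = raw.decode("latin-1")       # ISO/IEC 8859-1
as_latin9 = raw.decode("iso-8859-15")   # ISO/IEC 8859-15

# Windows-1252 placed the euro sign at a different byte altogether.
euro_cp1252 = "\u20ac".encode("cp1252")

print(repr(as_latin1), repr(as_latin9), euro_cp1252.hex())
```

The bytes never change; only the interpretation does, which is why text without a declared encoding is ambiguous.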
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
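Code points, their official names, and the plane they live in can be inspected from most languages. A small Python sketch, assuming the standard `unicodedata` module; the helper name `describe` is made up for this example:

```python
import unicodedata

def describe(char):
    """Return (code point label, official name, plane number) for one character."""
    cp = ord(char)
    # Each plane holds 0x10000 code points, so the plane is everything above the low 16 bits.
    return f"U+{cp:04X}", unicodedata.name(char), cp >> 16

print(describe("A"))            # plane 0 is the Basic Multilingual Plane
print(describe("\u2708"))       # the airplane symbol also sits in plane 0
print(describe("\U0001F600"))   # most emoji live in plane 1
```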
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
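The two spellings of é render identically but compare as different strings unless they are normalized first. A minimal Python sketch using the standard `unicodedata.normalize` function:

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE

same_raw = decomposed == precomposed
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(same_raw, same_nfc, same_nfd)
print(len(decomposed), len(precomposed))  # code point counts differ for one grapheme
```

Normalizing both sides to the same form (NFC or NFD) before comparing or hashing avoids this class of mismatch.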
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
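The three encoding forms trade size differently: UTF-32 always uses four bytes, UTF-16 uses two bytes inside the Basic Multilingual Plane and four (a surrogate pair) outside it, and UTF-8 uses one to four. A short Python sketch to make that visible; `encoded_sizes` is a hypothetical helper name:

```python
def encoded_sizes(char):
    """Return the number of bytes one character needs in each encoding form."""
    # The "-be" variants are used so that no byte order mark is added to the count.
    return {enc: len(char.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

for char in ["A", "\u20ac", "\U0001F600"]:  # A, euro sign, an emoji outside the BMP
    print(f"U+{ord(char):04X}", encoded_sizes(char))
```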
What does it look like?
Encoding comparison
Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18
Source: http://perlgeek.de/en/article/encodings-and-unicode
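The byte values in such a comparison can be checked by encoding each character directly. A minimal Python sketch; `hex_bytes` is a hypothetical helper, and big-endian UTF-16 without a byte order mark is assumed for the UTF-16 column:

```python
def hex_bytes(char, encoding):
    """Encode one character and return its bytes as '0x..' strings, or '-' if unencodable."""
    try:
        data = char.encode(encoding)
    except UnicodeEncodeError:
        return "-"
    return " ".join(f"0x{b:02x}" for b in data)

encodings = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]
for char in ["A", "\u00c4", "\u20ac", "\uc218"]:
    print(f"U+{ord(char):04X}", *(hex_bytes(char, enc) for enc in encodings), sep="  ")
```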
Remember: Just because someone claims it’s UTF-8, doesn’t mean it is
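One defensive habit is to decode strictly and treat a failure as a signal that the label was wrong. A minimal Python sketch; the function name and the Latin-1 fallback are assumptions made for this example, not a recommendation from the talk:

```python
def decode_claimed_utf8(data, fallback="latin-1"):
    """Try strict UTF-8 first; on failure fall back to an assumed default encoding."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to some character, so this never raises,
        # but the result is only a guess and may be mojibake.
        return data.decode(fallback), fallback

valid = decode_claimed_utf8("\u00c4".encode("utf-8"))
mislabeled = decode_claimed_utf8("\u00c4".encode("latin-1"))  # a lone 0xC4 is not valid UTF-8
print(valid, mislabeled)
```

Strict decoding only catches input that is not well-formed UTF-8; bytes that happen to be valid UTF-8 but were produced in another encoding will still decode without error.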