In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
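A minimal sketch of why the euro sign was a problem for the single-byte encodings: the same byte decodes to different characters depending on which table is assumed. The byte 0xA4 is picked here for illustration, and the codec names are the ones Python's standard library uses.

```python
# The same byte value means different things in different legacy encodings.
byte = b"\xa4"

as_latin1 = byte.decode("latin-1")      # ISO/IEC 8859-1: generic currency sign
as_latin9 = byte.decode("iso-8859-15")  # ISO/IEC 8859-15: euro sign

print(f"latin-1:     U+{ord(as_latin1):04X}")
print(f"iso-8859-15: U+{ord(as_latin9):04X}")

# Windows-1252 places the euro sign at yet another byte value.
euro_cp1252 = "\u20ac".encode("cp1252")
print("cp1252 euro byte:", euro_cp1252.hex())
```

Without out-of-band knowledge of the encoding, nothing in the byte itself says which reading is the intended one.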
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
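As a rough illustration of code points and planes, the sketch below looks up the number, official name, and plane of a few characters using Python's `unicodedata` module. The helper `describe` is a hypothetical name introduced for this example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and plane of one character."""
    cp = ord(ch)
    plane = cp >> 16  # each plane holds 0x10000 code points; plane 0 is the BMP
    return f"U+{cp:04X} {unicodedata.name(ch)} (plane {plane})"


for ch in ["A", "a", "\u20ac", "\U0001F600"]:
    print(describe(ch))
```

Characters in plane 0 fit in 16 bits; anything beyond that lies outside the Basic Multilingual Plane.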
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
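A small sketch of the composite-character issue: the two spellings of é compare unequal until they are normalized to the same form. The last two lines assume the slide contrasts the spacing acute accent (U+00B4) with the combining one (U+0301); the transcript does not say which pair was meant.

```python
import unicodedata

decomposed = "e\u0301"  # U+0065 followed by U+0301 (combining acute accent)
precomposed = "\u00e9"  # U+00E9 (single precomposed code point)

# Both render as the same grapheme, but the strings differ.
print(decomposed == precomposed)
print(len(decomposed), len(precomposed))

# Normalizing both sides to one form makes the comparison meaningful.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", precomposed)  # decompose
print(nfc == precomposed, nfd == decomposed)

# Assumption: two look-alike accents that are different code points.
print(unicodedata.name("\u00b4"))
print(unicodedata.name("\u0301"))
```

Normalizing before comparing, hashing, or storing avoids treating visually identical strings as distinct.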
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
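To make the difference between the three encoding forms concrete, this sketch counts the bytes each one needs for a few sample characters. The sample characters and the big-endian, BOM-less codec variants are choices made for this illustration.

```python
FORMS = ("utf-8", "utf-16-be", "utf-32-be")


def byte_lengths(ch: str) -> dict:
    """Number of bytes one character occupies in each encoding form."""
    return {form: len(ch.encode(form)) for form in FORMS}


for ch in ["A", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", byte_lengths(ch))

# Outside the BMP, UTF-16 needs a surrogate pair: two 16-bit units.
# UCS-2 has no surrogates and cannot represent such a character at all.
surrogates = "\U0001F600".encode("utf-16-be").hex()
print(surrogates)
```

UTF-32 is fixed width, UTF-8 varies from one to four bytes, and UTF-16 uses either two or four.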
What does it look like?
Encoding comparison
Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18
Source: http://perlgeek.de/en/article/encodings-and-unicode
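A comparison like this can be reproduced with a few lines of Python. This is a sketch, not the method used to build the slide; `encode_or_dash` is a hypothetical helper, and big-endian UTF-16 without a byte order mark is assumed for the UTF-16 column.

```python
ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]


def encode_or_dash(ch: str, encoding: str) -> str:
    """Hex bytes of ch in the given encoding, or '-' if it cannot be encoded."""
    try:
        return " ".join(f"0x{b:02x}" for b in ch.encode(encoding))
    except UnicodeEncodeError:
        return "-"


# A, A with diaeresis, euro sign, Hangul syllable U+C218
for ch in ["A", "\u00c4", "\u20ac", "\uc218"]:
    cells = [encode_or_dash(ch, enc) for enc in ENCODINGS]
    print(f"U+{ord(ch):04X}", *cells, sep=" | ")
```

A dash marks a character that the encoding simply has no byte sequence for.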
Remember: Just because someone claims it’s UTF-8, doesn’t mean it is
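One hedged way to act on that advice is to decode strictly and treat a failure as evidence that the label is wrong. The function name `is_valid_utf8` and the sample bytes are made up for this sketch.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as UTF-8 under Python's strict error handling."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True


mislabelled = "\u00c4".encode("latin-1")  # b"\xc4", a Latin-1 byte labelled UTF-8
genuine = "\u00c4".encode("utf-8")        # b"\xc3\x84"

print(is_valid_utf8(mislabelled))
print(is_valid_utf8(genuine))

# The reverse mistake fails silently: UTF-8 read as Latin-1 yields mojibake.
mojibake = genuine.decode("latin-1")
print(len(mojibake))
```

A successful decode does not prove the data was meant as UTF-8 (pure ASCII passes too), but a failed one proves the claim false.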