In the beginning was TXT
Markus Wein
October 02, 2014
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Transcript
In the beginning was TXT
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
Shift-JIS
This sucks
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN CAPITAL LETTER A)
Source: http://codepoints.net/U+0041
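A code point is just a number, and its plane is that number divided by 0x10000. A minimal Python sketch of looking both up (the helper name `describe` is made up for illustration; the character names come from the Unicode database bundled with the Python build in use):

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (plane N)' for a single character."""
    cp = ord(ch)
    # Each plane holds 0x10000 code points; plane 0 is the Basic Multilingual Plane.
    plane = cp >> 16
    # Fall back to a placeholder if the bundled database has no name for ch.
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{cp:04X} {name} (plane {plane})"


print(describe("A"))           # U+0041 LATIN CAPITAL LETTER A (plane 0)
print(describe("\u20ac"))      # the euro sign, still inside the BMP
print(describe("\U0001F600"))  # an emoji code point outside the BMP (plane 1)
```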
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
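The two spellings of é can be compared with a short Python sketch. It assumes the slide's last line contrasts the combining accent U+0301 with the spacing accent U+00B4, which the transcript does not confirm:

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # U+00E9, a single code point

# Same grapheme on screen, different code point sequences.
print(decomposed == precomposed)          # False
print(len(decomposed), len(precomposed))  # 2 1

# Normalization maps both spellings to one canonical form.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True

# Assumption: the two accents being contrasted are these two code points.
print(unicodedata.name("\u00b4"))  # ACUTE ACCENT
print(unicodedata.name("\u0301"))  # COMBINING ACUTE ACCENT
```

Normalizing both sides to the same form (NFC or NFD) before comparing or hashing strings avoids the mismatch.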
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Encoding comparison

Codepoint  Char  ASCII  Latin-1  ISO-8859-15  UTF-8           UTF-16
U+0041     A     0x41   0x41     0x41         0x41            0x00 0x41
U+00C4     Ä     -      0xc4     0xc4         0xc3 0x84       0x00 0xc4
U+20AC     €     -      -        0xa4         0xe2 0x82 0xac  0x20 0xac
U+C218     수    -      -        -            0xec 0x88 0x98  0xc2 0x18

Source: http://perlgeek.de/en/article/encodings-and-unicode
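The byte values in the comparison can be regenerated with Python's built-in codecs. This is a sketch, not part of the talk: `hex_bytes` is a hypothetical helper, and `utf-16-be` (big-endian, no byte order mark) is chosen because it matches the byte order shown in the UTF-16 column.

```python
# Codec names as Python spells them, in the same order as the columns above.
ENCODINGS = ["ascii", "latin-1", "iso-8859-15", "utf-8", "utf-16-be"]


def hex_bytes(ch: str, encoding: str) -> str:
    """Encode one character; return its bytes as hex, or '-' if unrepresentable."""
    try:
        return " ".join(f"0x{b:02x}" for b in ch.encode(encoding))
    except UnicodeEncodeError:
        return "-"


for ch in ["A", "\u00c4", "\u20ac", "\uc218"]:
    row = [hex_bytes(ch, enc) for enc in ENCODINGS]
    # Print only the code point, so the output stays ASCII on any terminal.
    print(f"U+{ord(ch):04X}", *row, sep=" | ")
```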
Remember: just because someone claims it's UTF-8 doesn't mean it is.
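One way to act on that advice is to decode strictly and treat a failure as evidence that the label is wrong. A minimal sketch follows: the function name `decode_claimed_utf8` and the Latin-1 fallback are assumptions made for illustration, and the fallback is a guess, not real encoding detection.

```python
from typing import Tuple


def decode_claimed_utf8(data: bytes, fallback: str = "latin-1") -> Tuple[str, str]:
    """Decode bytes labelled as UTF-8; return (text, encoding actually used)."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails,
        # but the resulting text is only right if the guess is right.
        return data.decode(fallback), fallback


print(decode_claimed_utf8(b"\xc3\x84")[1])  # utf-8: a valid two-byte sequence
print(decode_claimed_utf8(b"\xc4")[1])      # latin-1: a lone 0xc4 is not valid UTF-8
```

Valid UTF-8 is not proof either: many Latin-1 byte strings happen to decode cleanly as UTF-8, so a successful decode only shows the claim is plausible.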