Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
In the beginning was TXT
Search
Markus Wein
October 02, 2014
Programming
0
110
In the beginning was TXT
A very short overview of the history of encodings, given at Vienna.rb on 2014-10-02
Markus Wein
October 02, 2014
Tweet
Share
More Decks by Markus Wein
See All by Markus Wein
Command Line Productivity
cypher
1
130
A crash intro to deliberate practice
cypher
0
110
Keeping Your PostgreSQL Data Save
cypher
0
120
Ghost in the State Machine
cypher
2
310
n Things You Didn't Know About PostgreSQL (Rubyslava & PyVo 2014 Edition)
cypher
1
240
How to Become a Better Developer
cypher
2
1.8k
An Introduction to Rust
cypher
1
8.2k
How to Become a Better Developer
cypher
1
230
A Very Short Overview of Vagrant
cypher
0
7.9k
Other Decks in Programming
See All in Programming
AIエージェント時代における TypeScriptスキーマ駆動開発の新たな役割
bicstone
4
1.5k
いま中途半端なSwift 6対応をするより、Default ActorやApproachable Concurrencyを有効にしてからでいいんじゃない?
yimajo
2
340
iOSアプリの信頼性を向上させる取り組み/ios-app-improve-reliability
shino8rayu9
0
150
私はどうやって技術力を上げたのか
yusukebe
43
17k
詳しくない分野でのVibe Codingで困ったことと学び/vibe-coding-in-unfamiliar-area
shibayu36
3
4.5k
大規模アプリのDIフレームワーク刷新戦略 ~過去最大規模の並行開発を止めずにアプリ全体に導入するまで~
mot_techtalk
0
390
Domain-centric? Why Hexagonal, Onion, and Clean Architecture Are Answers to the Wrong Question
olivergierke
0
110
Reduxモダナイズ 〜コードのモダン化を通して、将来のライブラリ移行に備える〜
pvcresin
2
690
CSC509 Lecture 01
javiergs
PRO
1
430
Conquering Massive Traffic Spikes in Ruby Applications with Pitchfork
riseshia
0
150
Web技術を最大限活用してRAW画像を現像する / Developing RAW Images on the Web
ssssota
2
1.2k
dynamic!
moro
9
6.7k
Featured
See All Featured
Bash Introduction
62gerente
615
210k
Product Roadmaps are Hard
iamctodd
PRO
54
11k
Code Reviewing Like a Champion
maltzj
525
40k
The Invisible Side of Design
smashingmag
301
51k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Practical Orchestrator
shlominoach
190
11k
Into the Great Unknown - MozCon
thekraken
40
2.1k
Fireside Chat
paigeccino
40
3.7k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
114
20k
Site-Speed That Sticks
csswizardry
11
880
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.7k
Transcript
In the beginning was TXT
!
EBCDIC
Source: http://en.wikipedia.org/wiki/EBCDIC
ASCII
"#$%&
None
ä, ö, or å and Ø?
Latin-1 ISO/IEC 8859-1
Latin-*
Windows code pages
Then came the €
(
None
Shift-JIS
This sucks
Unicode!
Unicode!
✈️ (planes!)
Basic Multilingual Plane
Code Points
U+0041 (LATIN SMALL LETTER A)
Source: http://codepoints.net/U+0041
Grapheme
a a a a a a a
Composite characters
U+0065 U+0301 or U+00E9
e+´ => é é
´ != ´
Unicode… is not an encoding
UTF-32
UCS-2/UTF-16
UTF-8
Source: http://en.wikipedia.org/wiki/File:UnicodeGrow2b.png
What does it look like?
Codepoint Char ASCII Latin-1 ISO-8859-15 UTF-8 UTF-16 U+0041 A 0x41
0x41 0x41 0x41 0x00 0x41 U+00C4 Ä - 0xc4 0xc4 0xc3 0x84 0x00 0xc4 U+20AC € - - 0xa4 0xe3 0x82 0xac 0x20 0xac U+C218 ࣻ - - - 0xec 0x88 0x98 0xc2 0x18 Encoding comparison Source: http://perlgeek.de/en/article/encodings-and-unicode
Remember: Just because someone claims it’s UTF-8, doesn’t mean it
is