This presentation explores common mistakes made by programmers when dealing with Unicode support and character encodings on the Web. For each mistake, I explain how to fix/prevent it, but also how it could possibly be exploited.
'"foo\u2028"' (contains the raw, unescaped
// Unicode symbol)
var escaped = jsesc(data, { 'json': true });
// → '"foo\\u2028"' (contains an escape sequence
// for the Unicode symbol → safer)
JSON.parse(serialized) == JSON.parse(escaped);
// → true (both strings unserialize to the same value)
https://mths.be/jsesc
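For cases where pulling in jsesc is not an option, the same idea can be sketched by hand. This is a minimal illustration, not jsesc's implementation: `escapeLineSeparators` is a hypothetical helper name, and it only handles U+2028 and U+2029, the two symbols that JSON allows unescaped in strings but that older JavaScript engines treat as line terminators in source code.

```javascript
// Hypothetical helper (not part of jsesc): turn raw U+2028 / U+2029
// symbols in JSON-formatted text into their escape sequences.
function escapeLineSeparators(json) {
  return json
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

var data = 'foo\u2028';
var serialized = JSON.stringify(data);          // contains the raw symbol
var escaped = escapeLineSeparators(serialized); // contains an escape sequence

// Both forms still unserialize to the same value.
var sameValue = JSON.parse(serialized) === JSON.parse(escaped);
```

Because the escape sequence is plain ASCII, the escaped output can be embedded in a script without depending on how the surrounding parser treats the raw symbol.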
// lone surrogate
var data = JSON.stringify(string); // the same string as JSON-formatted data
storeInDatabaseAsUtf8(data); // → error/crash
sendOverWebSocketConnection(data); // → error/crash/DoS
'"foo\uD800"' (contains the raw, unescaped
// Unicode symbol)
var escaped = jsesc(data, { 'json': true });
// → '"foo\\uD800"' (contains an escape sequence
// for the Unicode symbol → safer)
JSON.parse(serialized) == JSON.parse(escaped);
// → true (both strings unserialize to the same value)
https://mths.be/jsesc
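An alternative to escaping is to sanitize the string before it reaches a UTF-8 store or a WebSocket connection. The sketch below is one possible approach, not something taken from the slides: `replaceLoneSurrogates` is a hypothetical helper that swaps every unpaired surrogate for U+FFFD (the replacement character) and leaves valid surrogate pairs alone. Note that this changes the data, unlike the escaping approach.

```javascript
// Hypothetical helper: replace each lone (unpaired) surrogate code unit
// with U+FFFD so the result can always be encoded as UTF-8.
function replaceLoneSurrogates(string) {
  var result = '';
  for (var i = 0; i < string.length; i++) {
    var unit = string.charCodeAt(i);
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    var isLow = unit >= 0xDC00 && unit <= 0xDFFF;
    if (isHigh) {
      var next = string.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xDC00 && next <= 0xDFFF) {
        // Valid high + low pair: copy both code units and skip the low one.
        result += string.charAt(i) + string.charAt(i + 1);
        i++;
        continue;
      }
      result += '\uFFFD'; // high surrogate without a following low surrogate
    } else if (isLow) {
      result += '\uFFFD'; // low surrogate without a preceding high surrogate
    } else {
      result += string.charAt(i);
    }
  }
  return result;
}

var cleaned = replaceLoneSurrogates('foo\uD800');
```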
whether data has been serialize()d, which allows remote attackers to execute arbitrary code by triggering erroneous PHP unserialize() operations.” https://mths.be/brq
only if it’s an array or an object, or if is_serialized($data) returns true (double serialization).
After retrieving data from the database, it gets unserialized only if is_serialized($data) returns true.
https://mths.be/brq
uses MySQL’s ✌utf8✌
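MySQL’s `utf8` character set stores at most three bytes per symbol, so it cannot hold astral symbols (code points above U+FFFF), which need four bytes in UTF-8. As a rough illustration of a pre-storage check, the sketch below flags strings that contain such symbols. `containsAstralSymbol` is a hypothetical helper name, and the check assumes the input has no lone surrogates.

```javascript
// Hypothetical helper: detect astral symbols, i.e. code points above
// U+FFFF, which JavaScript strings represent as a surrogate pair and
// which need four bytes in UTF-8.
function containsAstralSymbol(string) {
  return /[\uD800-\uDBFF][\uDC00-\uDFFF]/.test(string);
}

var hasAstral = containsAstralSymbol('foo\uD83D\uDCA9'); // 'foo' + U+1F4A9
var bmpOnly = containsAstralSymbol('foo\u00A9');         // 'foo' + U+00A9
```

A caller could reject or escape flagged input before writing it to a column that uses the three-byte character set.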
attackers to conduct PHP object injection attacks and execute arbitrary PHP code via the HTTP User-Agent header, as exploited in the wild in December 2015.” https://mths.be/bvg