Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
マルウェアを機械学習する前に
Search
Yuma Kurogome
February 13, 2016
Programming
3
1.6k
マルウェアを機械学習する前に
Kaggle - Malware Classification Challenge勉強会 connpass.com/event/25007/ 発表資料
Yuma Kurogome
February 13, 2016
Tweet
Share
More Decks by Yuma Kurogome
See All by Yuma Kurogome
The Art of De-obfuscation
ntddk
16
27k
死にゆくアンチウイルスへの祈り
ntddk
55
39k
Windows Subsystem for Linux Internals
ntddk
10
3k
なぜマルウェア解析は自動化できないのか
ntddk
6
4.2k
Linear Obfuscation to Drive angr Angry
ntddk
4
840
CAPTCHAとボットの共進化
ntddk
2
1.2k
Peeling Onions
ntddk
7
3.6k
仮想化技術を用いたマルウェア解析
ntddk
8
27k
An Introduction to Drawbridge(ja)
ntddk
11
3.4k
Other Decks in Programming
See All in Programming
RubyKaigiで得られる10の価値 〜Ruby話を聞くことだけが RubyKaigiじゃない〜
tomohiko9090
0
100
「兵法」から見る質とスピード
ickx
0
200
コンポーネントライブラリで実現する、アクセシビリティの正しい実装パターン
schktjm
1
670
JVM の仕組みを理解して PHP で実装してみよう
m3m0r7
PRO
1
250
ts-morph実践:型を利用するcodemodのテクニック
ypresto
1
540
從零到一:搭建你的第一個 Observability 平台
blueswen
0
220
Interface vs Types ~型推論が過多推論~
hirokiomote
1
230
Proxmoxをまとめて管理できるコンソール作ってみました
karugamo
1
410
當開發遇上包裝:AI 如何讓產品從想法變成商品
clonn
0
2.6k
Rails産でないDBを Railsに引っ越すHACK - Omotesando.rb #110
lnit
1
110
Building an Application with TDD, DDD and Hexagonal Architecture - Isn't it a bit too much?
mufrid
0
370
AI Coding Agent Enablement in TypeScript
yukukotani
17
7.2k
Featured
See All Featured
Practical Orchestrator
shlominoach
188
11k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.7k
Code Reviewing Like a Champion
maltzj
523
40k
How to Think Like a Performance Engineer
csswizardry
23
1.6k
Bash Introduction
62gerente
614
210k
Designing for humans not robots
tammielis
253
25k
Building Adaptive Systems
keathley
41
2.6k
Designing Experiences People Love
moore
142
24k
Designing for Performance
lara
608
69k
Building an army of robots
kneath
306
45k
Into the Great Unknown - MozCon
thekraken
39
1.8k
Testing 201, or: Great Expectations
jmmastey
42
7.5k
Transcript
@ntddk Kaggle - Malware Classification Challenge 2016.02.13 1
• http://ntddk.github.io/ • 2
3
4
Kaggle 5 https://www.kaggle.com/
6 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
7 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
8 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
9 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
10 http://blog.kaggle.com/
11 x η g a b c x …
12 x η g a b c x …
13 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
14 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
15 • • • •
16 https://www.av-test.org/en/statistics/malware/
17 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf
18 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf http://www.mcafee.com/jp/resources/reports/rp-threats-predictions-2016.pdf
19 • KERNEL32!VirtualAllocStub • KERNEL32!VirtualProtectStub • KERNEL32!OpenProcessStub • KERNEL32!OpenThreadStub •
…
20 CSEC: MWS: http://www.iwsec.org/mws/2015/about.html
21 https://www.kaggle.com/c/malware-classification/data 16
22 • https://virusshare.com/ • http://malware-traffic-analysis.net/
23 • • • •
24 • • • • API PE
25 https://github.com/corkami/
26 • • • • • •
27 #include <windows.h> typedef int (WINAPI *LPFNMESSAGEBOXW)(HWND, LPCWSTR, LPCWSTR, UINT);
int main() { HMODULE hmod = LoadLibrary(TEXT("user32.dll")); LPFNMESSAGEBOXW lpfnMessageBoxW = (LPFNMESSAGEBOXW)GetProcAddress(hmod, "MessageBoxW"); lpfnMessageBoxW(NULL, L"Hello, world!", L"Test", MB_OK); FreeLibrary(hmod); return 0; } •
28 { "category": "registry", "status": true, "return": "0x00000000", "timestamp": "2015-05-24
02:46:50,773", "thread_id": "3220", "repeated": 0, "api": "NtOpenKey", "arguments": [ { "name": "DesiredAccess", "value": "33554432" }, { "name": "KeyHandle", "value": "0x00000154" }, { "name": "ObjectAttributes", "value": "¥¥REGISTRY¥¥USER¥¥S-1-5-21-916742657-1382504153-4155998892-1001" } ], "id": 83 },
29 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
30 • AdaBoost, Gradient Boosting • Kaggle
DAF 31 Mohammad M. Masud, Latifur Khan, Bhavani Thuraisingham, A
scalable multi-level feature extraction technique to detect malicious executables, Information Systems Frontiers, Vol.10, Issue.1, pp.33-45, 2008. 16 DAF: Derived Assembly Features BFS: Binary N-gram Features