Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
マルウェアを機械学習する前に
Search
Yuma Kurogome
February 13, 2016
Programming
3
1.6k
マルウェアを機械学習する前に
Kaggle - Malware Classification Challenge勉強会 connpass.com/event/25007/ 発表資料
Yuma Kurogome
February 13, 2016
Tweet
Share
More Decks by Yuma Kurogome
See All by Yuma Kurogome
The Art of De-obfuscation
ntddk
16
27k
死にゆくアンチウイルスへの祈り
ntddk
55
38k
Windows Subsystem for Linux Internals
ntddk
10
2.9k
なぜマルウェア解析は自動化できないのか
ntddk
6
4.1k
Linear Obfuscation to Drive angr Angry
ntddk
4
810
CAPTCHAとボットの共進化
ntddk
2
1.1k
Peeling Onions
ntddk
7
3.5k
仮想化技術を用いたマルウェア解析
ntddk
8
27k
An Introduction to Drawbridge(ja)
ntddk
11
3.2k
Other Decks in Programming
See All in Programming
A New Era of Testing
mannodermaus
2
490
実践!難読化ガイド
mitchan
0
160
マイグレーションコード自作して File-Based Routing に自動移行!! ~250 ページの歴史的経緯を添えて~
cut0
1
260
Android開発以外のAndroid開発経験の活かしどころ
konifar
2
970
What is Parser
yui_knk
9
4.1k
『ドメイン駆動設計をはじめよう』中核の業務領域
masuda220
PRO
5
990
いつか使える ObjectSpace / Maybe useful ObjectSpace
euglena1215
2
130
Why Prism?
kddnewton
4
1.7k
Method Swizzlingを行うライブラリにおけるマルチモジュール設計
yoshikma
0
120
ドメイン駆動設計を実践するために必要なもの
bikisuke
4
330
メモリ最適化を究める!iOSアプリ開発における5つの重要なポイント
yhirakawa333
0
410
Our Websites Need a Lifestyle Change, Not a Diet
ryantownsend
0
140
Featured
See All Featured
Clear Off the Table
cherdarchuk
91
320k
A better future with KSS
kneath
235
17k
How To Stay Up To Date on Web Technology
chriscoyier
786
250k
Debugging Ruby Performance
tmm1
72
12k
Fantastic passwords and where to find them - at NoRuKo
philnash
48
2.8k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
23
1.7k
Imperfection Machines: The Place of Print at Facebook
scottboms
263
13k
GraphQLの誤解/rethinking-graphql
sonatard
65
9.8k
Building a Scalable Design System with Sketch
lauravandoore
458
32k
What the flash - Photography Introduction
edds
67
11k
Intergalactic Javascript Robots from Outer Space
tanoku
268
26k
5 minutes of I Can Smell Your CMS
philhawksworth
202
19k
Transcript
@ntddk Kaggle - Malware Classification Challenge 2016.02.13 1
• http://ntddk.github.io/ • 2
3
4
Kaggle 5 https://www.kaggle.com/
6 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
7 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
8 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
9 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
10 http://blog.kaggle.com/
11 x η g a b c x …
12 x η g a b c x …
13 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
14 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
15 • • • •
16 https://www.av-test.org/en/statistics/malware/
17 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf
18 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf http://www.mcafee.com/jp/resources/reports/rp-threats-predictions-2016.pdf
19 • KERNEL32!VirtualAllocStub • KERNEL32!VirtualProtectStub • KERNEL32!OpenProcessStub • KERNEL32!OpenThreadStub •
…
20 CSEC: MWS: http://www.iwsec.org/mws/2015/about.html
21 https://www.kaggle.com/c/malware-classification/data 16
22 • https://virusshare.com/ • http://malware-traffic-analysis.net/
23 • • • •
24 • • • • API PE
25 https://github.com/corkami/
26 • • • • • •
27 #include <windows.h> typedef int (WINAPI *LPFNMESSAGEBOXW)(HWND, LPCWSTR, LPCWSTR, UINT);
int main() { HMODULE hmod = LoadLibrary(TEXT("user32.dll")); LPFNMESSAGEBOXW lpfnMessageBoxW = (LPFNMESSAGEBOXW)GetProcAddress(hmod, "MessageBoxW"); lpfnMessageBoxW(NULL, L"Hello, world!", L"Test", MB_OK); FreeLibrary(hmod); return 0; } •
28 { "category": "registry", "status": true, "return": "0x00000000", "timestamp": "2015-05-24
02:46:50,773", "thread_id": "3220", "repeated": 0, "api": "NtOpenKey", "arguments": [ { "name": "DesiredAccess", "value": "33554432" }, { "name": "KeyHandle", "value": "0x00000154" }, { "name": "ObjectAttributes", "value": "¥¥REGISTRY¥¥USER¥¥S-1-5-21-916742657-1382504153-4155998892-1001" } ], "id": 83 },
29 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
30 • AdaBoost, Gradient Boosting • Kaggle
DAF 31 Mohammad M. Masud, Latifur Khan, Bhavani Thuraisingham, A
scalable multi-level feature extraction technique to detect malicious executables, Information Systems Frontiers, Vol.10, Issue.1, pp.33-45, 2008. 16 DAF: Derived Assembly Features BFS: Binary N-gram Features