Learning Systems
Kexin Pei, Yinzhi Cao, Junfeng Yang, Suman Jana

Liang Gong, Electrical Engineering & Computer Science, University of California, Berkeley.
Background:
• Deep learning systems are increasingly used:
  • Safety-critical: self-driving cars
  • Security-critical: malware detection

Problem:
• How to test DL systems to expose erroneous behaviors for corner cases?
• How to test traditional software to expose erroneous behaviors for corner cases?

Test Input → Software System → Test Output

• concolic execution
• random testing
• coverage-guided fuzz testing
• …
• How to test DL systems to expose erroneous behaviors for corner cases?

Test Input → DL System → Test Output

• concolic execution?
• random testing?
• coverage-guided fuzz testing?
• …
• How to test DL systems to expose erroneous behaviors for corner cases?

Test Input → DL System → Test Output

Their solution:
• coverage-guided & differential-guided testing
Coverage-guided & differential-guided fuzz testing

Research Questions:
• How to define coverage for a DL system?
• What is differential-guided?
• How to fuzz test input based on those metrics?
• How to get a test oracle?
How to define coverage for DL system?
• LOC coverage? ~100%
• Coverage based on the # of neurons processed? 100%
• Coverage based on the # of neurons activated? Interesting…
How to define coverage for DL system?
• Coverage based on the # of activated neurons
  Neurons often correspond to self-extracted features at different levels.

My Comment: activating neurons is analogous to triggering conditionals in programs.
How to define coverage for DL system?
• coverage based on the # of activated neurons
• activating neurons is analogous to triggering conditionals in programs

The definition is stated in terms of:
• the set of all neurons
• the set of all inputs
• the output of neuron n given input x
• the threshold for activation, t
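The coverage metric described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a neuron counts as covered when its output exceeds the threshold t for at least one test input, and the function name `neuron_coverage`, the dictionary layout, and the example values are all hypothetical.

```python
from typing import Dict, List


def neuron_coverage(activations: List[Dict[str, float]], threshold: float) -> float:
    """Fraction of neurons whose output exceeds `threshold` for at least one input.

    `activations[i][n]` is the output of neuron `n` on the i-th test input
    (an assumed layout chosen for this sketch).
    """
    if not activations:
        return 0.0
    # The set of all neurons seen across the test inputs.
    all_neurons = set().union(*(a.keys() for a in activations))
    if not all_neurons:
        return 0.0
    # A neuron is "activated" if any input pushes its output above the threshold.
    activated = {n for a in activations for n, out in a.items() if out > threshold}
    return len(activated) / len(all_neurons)


# Hypothetical outputs of three neurons on two test inputs.
outputs = [
    {"n1": 0.9, "n2": 0.0, "n3": 0.1},
    {"n1": 0.2, "n2": 0.0, "n3": 0.7},
]
coverage = neuron_coverage(outputs, threshold=0.5)  # n1 and n3 are covered, n2 is not
```

Here n2 never fires, which is the DL counterpart of a branch that no test input reaches.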
Think of it as feedback-directed fuzz testing.
• How to detect erroneous output?
• How to generate input based on feedback?
Think of it as feedback-directed fuzz testing.
• How to detect erroneous output? (oracle problem)
  Neural networks rarely crash…

Key Idea: differential testing
• Feed the same Test Input to several NNs and compare each Test Output.
• If different, one NN might be wrong.
• Max the diff!
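The differential oracle can be sketched as a simple disagreement check. This is an illustrative sketch under stated assumptions: each model is assumed to return a list of class probabilities for the same input, and the helper names `predicted_class` and `is_difference_inducing` are hypothetical.

```python
from typing import List, Sequence


def predicted_class(probs: Sequence[float]) -> int:
    """Index of the highest-probability class."""
    return max(range(len(probs)), key=lambda i: probs[i])


def is_difference_inducing(outputs: List[Sequence[float]]) -> bool:
    """True if the models disagree on the predicted class for one input.

    `outputs[k]` is the probability vector produced by the k-th model
    for the same test input.
    """
    labels = {predicted_class(p) for p in outputs}
    return len(labels) > 1


# Two hypothetical models that agree, then two that disagree.
agree = is_difference_inducing([[0.1, 0.9], [0.2, 0.8]])
disagree = is_difference_inducing([[0.1, 0.9], [0.7, 0.3]])
```

No labeled ground truth is needed: a disagreement only says that at least one model is wrong, not which one.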
Think of it as feedback-directed fuzz testing.

Now, given different DNNs, we want to generate the next input that is:
• Coverage-guided: Maximize the neuron activation
• Differential-guided: Maximize the diff of NN outputs

Research Question: How to guide the input generation based on those metrics?
• It is an optimization problem.
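One way to write the two goals as a single objective is sketched below. The symbols are assumptions introduced for illustration: F_k(x)[c] is the probability that the k-th DNN assigns to class c for input x, j is the DNN being pushed away from the others, out(n, x) is the output of neuron n, and λ1, λ2 are weighting hyperparameters.

```latex
\[
\mathrm{obj}(x) \;=\;
\underbrace{\sum_{k \neq j} F_k(x)[c] \;-\; \lambda_1\, F_j(x)[c]}_{\text{maximize the diff}}
\;+\;
\lambda_2 \underbrace{\sum_{n \,\in\, \text{inactive}} \mathrm{out}(n, x)}_{\text{maximize the coverage}}
\]
```

Raising the first term keeps the other DNNs on class c while DNN j drifts away from it; raising the second term pushes currently inactive neurons toward the activation threshold.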
NN Test Generation
• Fix NN parameters
• Adjust input
• Maximize coverage + diff

NN Training
• Fix input
• Adjust NN parameters
• Minimize diff between output and label

Very similar problem! So we can reuse gradient descent and backpropagation with a few modifications.
Modify the objective (loss function)

NN Test Generation
• Fix NN parameters
• Adjust input
• Maximize coverage + diff
• Loss based on: | y1 – y2 | + output of inactivated neurons
• Maximize the loss (gradient ascent)

NN Training
• Fix input
• Adjust NN parameters
• Minimize diff between output and label
• Loss based on: | y – ŷ | (output vs. label)
• Minimize the loss (gradient descent)
Modify the gradient (differentiation equation)

NN Test Generation
• Fix NN parameters
• Adjust input
• Maximize coverage + diff
• Differentiate w.r.t. input
• Add delta to input

NN Training
• Fix input
• Adjust NN parameters
• Minimize diff between output and label
• Differentiate w.r.t. weights
• Add delta to weights
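The test-generation column can be sketched as gradient ascent on the input with the model parameters held fixed. This is a toy illustration, not the authors' code: the two "networks" are assumed to be single sigmoid units with hand-written gradients, the objective is only the diff term (the coverage term is left out for brevity), and all names and numbers are hypothetical.

```python
import math
from typing import List, Tuple

Model = Tuple[List[float], float]  # (weights, bias) of a one-neuron toy "network"


def predict(w: List[float], b: float, x: List[float]) -> float:
    """Sigmoid output of the toy model for input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def grad_wrt_input(w: List[float], b: float, x: List[float]) -> List[float]:
    """Derivative of the model output with respect to the input (weights stay fixed)."""
    p = predict(w, b, x)
    return [p * (1.0 - p) * wi for wi in w]


def generate_test(seed: List[float], model_a: Model, model_b: Model,
                  step: float = 0.5, max_iters: int = 200) -> List[float]:
    """Gradient ascent on obj(x) = f_a(x) - f_b(x), adjusting only the input."""
    x = list(seed)
    for _ in range(max_iters):
        pa, pb = predict(*model_a, x), predict(*model_b, x)
        if (pa > 0.5) != (pb > 0.5):  # the two models now disagree: stop
            break
        ga, gb = grad_wrt_input(*model_a, x), grad_wrt_input(*model_b, x)
        # Add delta to the input (not to the weights).
        x = [xi + step * (gai - gbi) for xi, gai, gbi in zip(x, ga, gb)]
    return x


model_a: Model = ([2.0, -1.0], 0.0)
model_b: Model = ([-1.0, 2.0], 0.0)
seed = [-1.0, -1.0]  # both toy models predict the same class on the seed
generated = generate_test(seed, model_a, model_b)
```

The loop mirrors training with the roles swapped: the weights are constants, the input is the variable, and the update moves uphill instead of downhill.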
Objective function:
• Maximize the diff: let's diff this NN from the others.
• Maximize the cov: sum of outputs of inactive neurons.

grad = ∂(objective) / ∂(input)

[Figure: example network annotated with values and labels, as extracted: 0.1 C 0.8 C1 0.15 C2 0.1 C 0.75. On the final slide of the sequence the annotations read: 0.6 C 0.1 C1 0.05 C2 0.05 C 0.9.]
Dataset of benign and malicious PDF documents
• 5000 benign PDF files
• 12,205 malicious PDF files

Extract 135 static features as DNN input

DNN Model Variations:
• 1 input layer
• 2-4 fully connected layers
• 1 softmax output layer (benign or malicious)
grad = constraint(grad)

For the Android/PDF malware datasets:
• Only turning binary features from 0 to 1 is allowed, i.e. adding features (for example, adding permissions in the manifest files).
• Deleting features (1 to 0) is not allowed.
• This ensures no functionality changes due to insufficient permissions.
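The "add only" rule can be sketched as a filter on the gradient followed by a discrete update. This is a minimal sketch under stated assumptions: features are a 0/1 list, the helper names are hypothetical, and flipping a single feature per step is an illustrative choice rather than something stated on the slide.

```python
from typing import List


def constrain_gradient(grad: List[float], features: List[int]) -> List[float]:
    """Keep only gradient components that would turn an absent feature (0) into 1."""
    return [g if (g > 0 and f == 0) else 0.0 for g, f in zip(grad, features)]


def apply_step(features: List[int], grad: List[float]) -> List[int]:
    """Add the one absent feature with the largest positive constrained gradient."""
    if not features:
        return []
    constrained = constrain_gradient(grad, features)
    best = max(range(len(constrained)), key=lambda i: constrained[i])
    updated = list(features)
    if constrained[best] > 0:  # no allowed move means the input is unchanged
        updated[best] = 1
    return updated


# Hypothetical binary feature vector and raw gradient.
features = [1, 0, 0, 1]
grad = [0.9, -0.3, 0.4, -0.7]
constrained = constrain_gradient(grad, features)
updated = apply_step(features, grad)
```

Features that are already present are never removed, so the generated sample keeps every feature the original sample had.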
[Table: optimizing without NC vs. optimizing with NC (NC = neuron coverage). Columns: L1 distance between generated inputs; Neuron Coverage; # of difference-inducing inputs. Higher numbers are better.]