SCT Topic 10: Advanced Strategies
Stealthiness Strategies
Before launching the attack, the attacker should check whether AV (anti-virus) software or filters will detect it; otherwise, their actions will leave warnings or traces in the target system
- Encoding (e.g., msfencode; multiple encoders can also be used at once): by changing the encoding of the file, the signature checks used in malware analysis may not be triggered
- Transient malware, e.g., malware that runs only in memory
- Mimicry: imitate a legitimate application process to escape the anti-virus
- Packers: obfuscate the malicious code and run an unpacking routine at runtime
Note: Offline versions of AVs offer less functionality
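To see why re-encoding can defeat a signature check, here is a minimal toy sketch. It is not how any real AV engine or msfencode works: the "scanner" is a plain substring search, the signature is a harmless made-up marker string, and `scan` and `xor_encode` are hypothetical helper names introduced only for this illustration.

```python
# Toy illustration only: a naive byte-signature scanner and a harmless marker.
# SIGNATURE is a made-up string, not a real malware pattern.
SIGNATURE = b"HARMLESS-TEST-MARKER"


def scan(blob: bytes, signature: bytes = SIGNATURE) -> bool:
    """Return True if the raw byte signature occurs anywhere in the blob."""
    return signature in blob


def xor_encode(blob: bytes, key: int) -> bytes:
    """Single-byte XOR; applying it twice with the same key restores the input."""
    return bytes(b ^ key for b in blob)


original = b"header..." + SIGNATURE + b"...footer"
encoded = xor_encode(original, key=0x5A)

detected_original = scan(original)   # the signature is present in the plain bytes
detected_encoded = scan(encoded)     # same content, but a different byte pattern
round_trip_ok = xor_encode(encoded, key=0x5A) == original

print(detected_original, detected_encoded, round_trip_ok)
```

The content is unchanged (the round trip restores it), but the byte pattern the scanner looks for is gone. This is why purely signature-based checks are usually combined with behavioural or in-memory analysis.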
Persistence Strategies
Attackers also want to keep their malware running on the target system
- Payloads in Metasploit (e.g., reverse shell, meterpreter) for further interaction
- Scheduled tasks, launched every time users reboot their systems
- Backdoors, left in place for further interaction and exploitation
- In BeEF, a stored XSS remains valid whenever a victim visits the page
It is also possible to perform “privilege escalation” and “sandbox evasion”
Keep up to date
Keep track of what bad guys are doing
- Technical Reports
- MITRE CVEs - https://cve.mitre.org/about/documents.html
- Google Project Zero
- Exploit DB
- https://www.reddit.com/r/netsec/
Adversarial Machine Learning
- Definition of Machine Learning
- "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E" - Tom Mitchell, Machine Learning
Machine Learning for Security
- 5 Phases of ML for Security
- Data Collection
- Pre-processing and Feature Engineering
- Model Selection and Training
- Testing and Evaluation
- Evaluation against Time Evolution and Adversaries
Machine Learning Algorithms Categories
- Classification: given a labeled dataset, find a model that separates instances into classes
- Regression: given some points, try to generalise and predict real-valued numbers
- Clustering: given an unlabeled dataset, try to group similar elements
Performance metrics
I used to be confused by the metrics used in ML; I think the difficult point is how to define Positive and Negative.
Assume we are doing malware classification, so our goal is to determine whether a sample is malware or not.
If we decide a program is malware, we call it a Positive result; otherwise, it is a Negative result.
Thus:
- a True Positive means we think a program is malware, and it actually is malware
- a True Negative means we think a program is goodware, and it actually is goodware
- a False Positive means we think a program is malware, but it is actually goodware; this is also called a false alarm
- a False Negative means we think a program is goodware, but it is actually malware
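The four outcomes can be counted directly from a list of true labels and a list of predictions. The sketch below is a minimal illustration with made-up labels; `confusion_counts` is a hypothetical helper name, and "malware" is assumed to be the Positive class, as in the definitions above.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="malware"):
    """Count TP, FP, TN and FN, treating `positive` as the Positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # We said "malware": right (TP) or a false alarm (FP).
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # We said "goodware": a missed malware (FN) or right (TN).
            counts["FN" if truth == positive else "TN"] += 1
    return counts


# Made-up example labels, for illustration only.
y_true = ["malware", "goodware", "goodware", "malware", "goodware"]
y_pred = ["malware", "malware", "goodware", "goodware", "goodware"]

counts = confusion_counts(y_true, y_pred)
print(dict(counts))
```

For these five samples the counts are TP = 1, FP = 1, TN = 2 and FN = 1.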
- Precision and Recall
- Precision = TP / (TP + FP). It reflects the machine learning algorithm’s performance on “How many times are you right?”; TP + FP is the total number of samples labeled by the ML model as malware (positive)
- Recall = TP / (TP + FN). It reflects “How much of the malware did you find?”; TP + FN is all the actual malware within the dataset
- F1-Score is defined as the harmonic mean of Precision and Recall: F1-Score = 2 x (Precision x Recall) / (Precision + Recall)
- Accuracy = (TP + TN) / (TP + FP + TN + FN). It is the fraction of all decisions made by the ML model that are correct
- Is Accuracy a good metric in Security?
- Accuracy can be misleading when datasets are very imbalanced
- In a malware dataset, the ratio of goodware to malware could be something like 1000 : 1, so it is far easier to find benign programs than malware. The dataset can easily become 99% goodware, in which case a model that labels everything as goodware still gets a very high accuracy while finding no malware
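The formulas above translate directly into a few lines of code. The sketch below uses made-up counts (they are illustrative assumptions, not measurements) and returns 0.0 when a denominator is zero, which is one common convention rather than the only one. The second case reproduces the imbalance problem: 1000 goodware, 1 malware, and a classifier that labels everything as goodware.

```python
def precision(tp, fp):
    """TP / (TP + FP); 0.0 if nothing was labeled positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN); 0.0 if the dataset contains no positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Case 1: made-up counts on a moderately balanced test set.
p1 = precision(tp=8, fp=2)
r1 = recall(tp=8, fn=2)
f1_1 = f1_score(p1, r1)
acc1 = accuracy(tp=8, tn=88, fp=2, fn=2)

# Case 2: 1000 goodware, 1 malware, classifier always answers "goodware".
acc2 = accuracy(tp=0, tn=1000, fp=0, fn=1)
r2 = recall(tp=0, fn=1)

print(f"case 1: P={p1:.2f} R={r1:.2f} F1={f1_1:.2f} Acc={acc1:.2f}")
print(f"case 2: Acc={acc2:.4f} R={r2:.2f}")
```

In the second case accuracy is above 0.999 while recall is 0, which is why precision, recall and F1 are usually reported alongside (or instead of) accuracy in security settings.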
Evasion
An attacker may try to evade detection or poison training data
- Spam Filtering
- Features: presence/absence of words
- Attacks: bad word obfuscation / good word insertion
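Both attacks can be demonstrated against a deliberately simple filter. The sketch below is a toy under stated assumptions: the word lists, the scoring rule (bad words present minus good words present) and the threshold are all made up for illustration, and `spam_score` and `is_spam` are hypothetical names, not part of any real spam filter.

```python
import re

# Hypothetical word lists, for illustration only.
BAD_WORDS = {"viagra", "lottery", "winner"}
GOOD_WORDS = {"meeting", "agenda", "report"}


def spam_score(message):
    """Presence/absence features: each known word counts at most once."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    return len(words & BAD_WORDS) - len(words & GOOD_WORDS)


def is_spam(message, threshold=1):
    """Flag the message when the score reaches the (assumed) threshold."""
    return spam_score(message) >= threshold


original = "You are a lottery winner"
# Bad word obfuscation: the filter no longer recognises the words.
obfuscated = "You are a l0ttery w1nner"
# Good word insertion: benign words outweigh the bad ones.
padded = original + " meeting agenda report"

for text in (original, obfuscated, padded):
    print(is_spam(text), spam_score(text), repr(text))
```

The original message is flagged, while both modified versions pass, even though a human reader would still recognise all three as the same spam.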
Adversarial Machine Learning: Taxonomy
- Test-time Evasion
- Training-time Poisoning
- Inference Attacks
- Model Stealing
- Membership Inference
- Shadow Model Estimation