Single detections rarely indicate something of security interest on their own. Malware detected… what next? Suspicious process launched… what next? Unusual logins… what next?
O365 has 150 detections in the pipeline; ArcSight has 100 or so detections that come out of the box. With N detections, each surfacing its top k alerts, the analyst has N*k alerts to triage.
Hand-written rules might help with the volume of alerts, and rules might be more interpretable. The aim is to be interpretable but also have low noise.
Classification probably isn’t the right way to think about approaching ad hoc IR:
• Classification problems: map to an unordered set of classes
• Regression problems: map to a real value
• Ordinal regression problems: map to an ordered set of classes
Ordinal regression is a fairly obscure sub-branch of statistics, but it is what we want here. This formulation gives extra power:
• Relations between relevance levels are modeled
• Documents are good versus other documents for a query, given the collection; there is no absolute scale of goodness
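To make the "good versus other documents" idea concrete, here is a minimal Python sketch that turns ordinal labels within one collection into pairwise preferences, which is the kind of input many learning-to-rank methods consume. The level names, the `LEVELS` mapping, and the `pairwise_preferences` helper are hypothetical and chosen only for illustration; they do not come from the talk.

```python
from itertools import combinations

# Hypothetical ordered relevance levels (higher = more relevant).
LEVELS = {"benign": 0, "suspicious": 1, "compromised": 2}

def pairwise_preferences(items):
    """Turn (item_id, level_name) pairs from ONE collection into
    (better_id, worse_id) preference pairs. Items at the same level
    produce no pair, because only relative order is modeled."""
    prefs = []
    for (id_a, lvl_a), (id_b, lvl_b) in combinations(items, 2):
        a, b = LEVELS[lvl_a], LEVELS[lvl_b]
        if a > b:
            prefs.append((id_a, id_b))
        elif b > a:
            prefs.append((id_b, id_a))
    return prefs

# Illustrative labels only.
alerts = [("a1", "benign"), ("a2", "compromised"), ("a3", "suspicious")]
prefs = pairwise_preferences(alerts)
```

Note that the labels never act as an absolute score here: only the ordering between two items in the same collection matters.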
Windows Security Events Data
On average, an online service in O365 produces 30 billion sessions/day; 82 TB/day
Data: Sequences of Windows security event IDs from user
• Examples: User logs into machine, process start, credential
• 367 unique security event IDs
- We built separate models to detect
our goal of compromised
- The models independently assess if
the account is acting suspiciously
probability of logging
sequences of events
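One simple way to score sequences of event IDs is a smoothed bigram (first-order Markov) model; the sketch below is an assumption for illustration and not necessarily what the speakers built. The function names, the smoothing constant `alpha`, and the toy training sequences are hypothetical. The vocabulary size of 367 comes from the slide, and the event IDs used are ordinary Windows Security IDs (4624 logon, 4688 process creation, 4634 logoff, 4648 explicit-credential logon, 4672 special privileges).

```python
import math
from collections import Counter, defaultdict

VOCAB_SIZE = 367  # unique security event IDs, per the slide

def train_bigrams(sequences):
    """Count how often each event ID follows another across sessions."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return counts

def avg_log_prob(seq, counts, alpha=1.0):
    """Average add-alpha smoothed log-probability per transition.
    Lower values mean the sequence looks less like the training data."""
    total, n = 0.0, 0
    for prev, cur in zip(seq, seq[1:]):
        row = counts.get(prev, Counter())
        p = (row[cur] + alpha) / (sum(row.values()) + alpha * VOCAB_SIZE)
        total += math.log(p)
        n += 1
    return total / n if n else 0.0

# Toy training sessions (illustrative only).
model = train_bigrams([[4624, 4688, 4634], [4624, 4688, 4688, 4634]])
typical_score = avg_log_prob([4624, 4688, 4634], model)
unusual_score = avg_log_prob([4624, 4648, 4672], model)
```

A session made of transitions never seen in training gets a lower average log-probability than a familiar one, which gives each account a score that can be ranked rather than thresholded.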
Constantly changing environment…
…but you can account for it during training
and adding metadata
In the beginning, there will be false positives…
…but you will reduce your attack surface
No labelled data…
…but you can get away with a good red team
Combine alert streams
Make your alerts interpretable
Capture feedback and close the last mile
Check out ranking algorithms – they are
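The first takeaway above, combining alert streams, can be sketched as merging per-account scores from several detectors and surfacing only the top few. This is a minimal illustration under stated assumptions: the stream names, the accounts, the scores, and the choice of a plain mean (with a missing score treated as 0) are all hypothetical, and a learned ranker could replace the mean.

```python
from collections import defaultdict

def combine_streams(streams, top_k=3):
    """streams maps stream name -> {account: score in [0, 1]}.
    Returns the top_k (account, mean_score) pairs, highest first.
    An account absent from a stream contributes 0 for that stream."""
    totals = defaultdict(float)
    for scores in streams.values():
        for account, score in scores.items():
            totals[account] += score
    n = len(streams)
    ranked = sorted(totals.items(), key=lambda kv: kv[1] / n, reverse=True)
    return [(acct, round(total / n, 3)) for acct, total in ranked[:top_k]]

# Illustrative scores only.
streams = {
    "logon_model": {"alice": 0.9, "bob": 0.2},
    "process_model": {"alice": 0.7, "carol": 0.6},
}
top = combine_streams(streams, top_k=2)
```

Instead of N*k alerts across N detectors, the analyst sees a single list of k accounts, and the per-stream scores that fed each entry can be kept alongside it to make the alert interpretable.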