Fall 2010 Student Project Info
Contact: Nathaniel Boggs, boggs@cs.columbia.edu
Overall Project Goal
Use network-based anomaly detection to detect zero-day web attacks via multi-site collaboration with GMU.
Potential Student Projects
- Improve project code and self-monitoring in preparation for online deployment
- Develop ongoing statistical analysis
Project Overview
A huge obstacle in cyber security is the lack of data sharing. Without knowing in real time what new attacks exist, it is difficult to defend against the ever-evolving attacker community. The largest obstacle to sharing information is privacy: network traffic is highly sensitive. We propose that, by using privacy-preserving data structures, attack data can be shared without compromising the privacy of normal network traffic.
Our prototype uses alerts from different sites to identify common attackers and attacks and to improve the training of anomaly detectors. We also show how a collaborative approach that combines models from different networks or domains can further refine the sanitization process to thwart targeted training or mimicry attacks against a single site.
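The alert exchange described above can be pictured with a small example. This is a minimal sketch, assuming a Bloom filter as the privacy-preserving data structure (the text above does not name one). The class `BloomFilter`, the function `common_alerts`, the filter sizes, and the sample alert strings are all hypothetical and chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Hypothetical Bloom filter: stores set membership as bits, not raw content."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def common_alerts(local_alerts, remote_filter):
    """Return the local alerts that also appear in a remote site's filter."""
    return [alert for alert in local_alerts if alert in remote_filter]


# Site A encodes its alert content and shares only the filter.
site_a_filter = BloomFilter()
for alert in ["/login.php?user=' OR 1=1--", "/cgi-bin/test?cmd=id"]:
    site_a_filter.add(alert)

# Site B checks its own alerts against site A's filter.
site_b_alerts = ["/cgi-bin/test?cmd=id", "/search?q=holiday+photos"]
shared = common_alerts(site_b_alerts, site_a_filter)
```

In this sketch site B learns only which of its own alerts site A probably also saw; the rest of site A's alert content is not sent in the clear. A Bloom filter can return false positives, so a match is a hint to investigate rather than proof of a common attack.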
STAND
The efficacy of Anomaly Detection (AD) sensors depends heavily on the quality of the data used to train them. Artificial or contrived training data may not provide a realistic view of the deployment environment. Most realistic data sets are dirty; that is, they contain a number of attacks or anomalous events. The size of these high-quality training data sets makes manual removal or labeling of attack data infeasible. As a result, sensors trained on this data can miss attacks and their variations.
We propose extending the training phase of AD sensors (in a manner agnostic to the underlying AD algorithm) to include a sanitization phase. This phase generates multiple models conditioned on small slices of the training data. We use these "micro-models" to produce provisional labels for each training input, and we combine the micro-models in a voting scheme to determine which parts of the training data may represent attacks. Our results suggest that this phase automatically and significantly improves the quality of unlabeled training data by making it as "attack-free" and "regular" as possible in the absence of absolute ground truth.
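The sanitization phase described above can be sketched as follows. This is an illustrative toy, not the STAND implementation: the micro-model here is simply the set of inputs seen in one slice, the vote is an unweighted fraction, and the names `train_micro_model`, `sanitize`, `slice_size`, and `vote_threshold` are hypothetical. A real deployment would plug in an actual AD sensor in place of the set-membership model.

```python
def train_micro_model(data_slice):
    """Toy micro-model: remember every input seen in this slice."""
    return set(data_slice)


def sanitize(training, slice_size, vote_threshold):
    """Split training data into slices, build one micro-model per slice,
    and let the micro-models vote on every training input.

    An input is marked suspect when the fraction of micro-models that
    consider it anomalous (here: never saw it) exceeds vote_threshold.
    Returns (clean, suspect).
    """
    if not training:
        return [], []
    slices = [
        training[i:i + slice_size]
        for i in range(0, len(training), slice_size)
    ]
    models = [train_micro_model(s) for s in slices]

    clean, suspect = [], []
    for item in training:
        # Provisional label from each micro-model: 1 = anomalous, 0 = normal.
        votes = sum(1 for model in models if item not in model)
        score = votes / len(models)
        if score > vote_threshold:
            suspect.append(item)
        else:
            clean.append(item)
    return clean, suspect


# "x" stands in for an attack that appears in only one slice.
training = ["a", "b", "a", "b", "a", "b", "x", "a", "b"]
clean, suspect = sanitize(training, slice_size=3, vote_threshold=0.5)
```

The intuition in this sketch is that an attack confined to a short time window poisons only the micro-models built from that window, so the remaining micro-models outvote it. The sanitized `clean` set would then be used to train the final sensor.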
Current research sponsored by AFOSR MURI Contract: GMU 107151AA “MURI: Autonomic Recovery of Enterprise-wide Systems After Attack or Failure with Forward Correction”
Papers
- Nathaniel Boggs, Sharath Hiremagalore, Angelos Stavrou, Salvatore J. Stolfo, "Experimental Results of Cross-Site Exchange of Web Content Anomaly Detector Alerts", IEEE International Conference on Technologies for Homeland Security, November 2010. [PDF]
- Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, "Adaptive Anomaly Detection via Self-Calibration and Dynamic Updating", to appear in the Proceedings of the International Symposium on Recent Advances in Intrusion Detection, September 2009, Saint-Malo, Brittany, France.
- Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, Angelos D. Keromytis, "Casting out Demons: Sanitizing Training Data for Anomaly Sensors", in the Proceedings of the IEEE Symposium on Security & Privacy, May 2008, Oakland, CA. [PDF]
- Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, "Extended Abstract: Online Training and Sanitization of AD Systems", NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security, December 2007, Vancouver, B.C., Canada. [PDF]
- Gabriela F. Cretu, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis, "Data Sanitization: Improving the Forensic Utility of Anomaly Detection Systems", in the Proceedings of the Third Workshop on Hot Topics in System Dependability, June 2007, Edinburgh, UK. [PDF]
- Gabriela F. Cretu, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis, "STAND: Sanitization Tool for ANomaly Detection", Tech Report cucs-022-07, Department of Computer Science, Columbia University, May 2007. [PDF]