Fall 2010 Student Project info
Contact Nathaniel Boggs, boggs@cs.columbia.edu

Overall Project Goal
Use network-based anomaly detection to detect zero-day web attacks through multi-site collaboration with GMU

Potential Student Projects
  • Improve project code and self-monitoring in preparation for online deployment
  • Develop ongoing statistical analysis

Project Overview
A major obstacle in cyber security is the lack of data sharing. Without knowing in real time what new attacks exist, it is difficult to defend against the constantly evolving attacker community. The main barrier to sharing information is privacy, because network traffic is highly sensitive. We propose that, by using privacy-preserving data structures, attack data can be shared without compromising the privacy of normal network traffic.
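
As an illustration of the idea, the sketch below uses a Bloom filter, one well-known privacy-preserving structure. The text above does not say which structure the project uses, so the choice of a Bloom filter, the class and function names, and the parameter values are all assumptions made for this example. Each site inserts only hashes of its alert content, so a peer can test whether it saw the same content without either side exchanging raw traffic.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array that records hashed items, never the items themselves.

    Hypothetical helper for illustration; sizes are arbitrary, not tuned.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def common_alerts(local_alerts, remote_filter):
    """Return local alerts that the remote site (probably) also flagged."""
    return [alert for alert in local_alerts if alert in remote_filter]
```

A site would build a filter from its own alerts, send only the filter to its peer, and the peer would call `common_alerts` with its local alerts. Because Bloom filters admit false positives, a match is evidence of a shared attack rather than proof.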

Our prototype uses alerts from different sites to identify common attackers/attacks and also improve the training of anomaly detectors. We also show how a collaborative approach that combines models from different networks or domains can further refine the sanitization process to thwart targeted training or mimicry attacks against a single site.


The efficacy of Anomaly Detection (AD) sensors depends heavily on the quality of the data used to train them. Artificial or contrived training data may not provide a realistic view of the deployment environment. Most realistic data sets are dirty; that is, they contain a number of attacks or anomalous events. The size of these high-quality training data sets makes manual removal or labeling of attack data infeasible. As a result, sensors trained on this data can miss attacks and their variations.

We propose extending the training phase of AD sensors (in a manner agnostic to the underlying AD algorithm) to include a sanitization phase. This phase generates multiple models conditioned on small slices of the training data. We use these "micro-models" to produce provisional labels for each training input, and we combine the micro-models in a voting scheme to determine which parts of the training data may represent attacks. Our results suggest that this phase automatically and significantly improves the quality of unlabeled training data by making it as "attack-free" and "regular" as possible in the absence of absolute ground truth.
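
The sanitization phase described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the toy 3-gram detector stands in for whatever AD algorithm is actually used, and the names `train_micro_model`, `is_anomalous`, `sanitize`, `max_unseen`, and `vote_threshold` are hypothetical. The training data is cut into small slices, one micro-model is trained per slice, every micro-model casts a provisional vote on every input, and inputs flagged by more than a chosen fraction of micro-models are set aside as likely attacks.

```python
def ngrams(item, n=3):
    """Set of character n-grams in a string (the whole string if shorter than n)."""
    return {item[i:i + n] for i in range(max(len(item) - n + 1, 1))}


def train_micro_model(data_slice):
    """Toy micro-model (assumed stand-in detector): all n-grams seen in one slice."""
    model = set()
    for item in data_slice:
        model.update(ngrams(item))
    return model


def is_anomalous(model, item, max_unseen=0.5):
    """Provisional label: anomalous if most of the item's n-grams are new to the model."""
    grams = ngrams(item)
    unseen = sum(1 for g in grams if g not in model)
    return unseen / len(grams) > max_unseen


def sanitize(training_data, slice_size, vote_threshold=0.5):
    """Split training data into clean and suspect parts by micro-model voting."""
    slices = [
        training_data[i:i + slice_size]
        for i in range(0, len(training_data), slice_size)
    ]
    models = [train_micro_model(s) for s in slices]

    clean, suspect = [], []
    for item in training_data:
        # Each micro-model casts one vote; True counts as 1.
        votes = sum(is_anomalous(m, item) for m in models)
        if votes / len(models) > vote_threshold:
            suspect.append(item)
        else:
            clean.append(item)
    return clean, suspect
```

Only the `clean` list would then be used to train the final sensor. The intuition is that an attack appears in few slices, so most micro-models have never seen it and vote it out, while regular traffic recurs across slices and survives the vote.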

Current research sponsored by AFOSR MURI Contract: GMU 107151AA “MURI: Autonomic Recovery of Enterprise-wide Systems After Attack or Failure with Forward Correction”

  • Nathaniel Boggs, Sharath Hiremagalore, Angelos Stavrou, Salvatore J. Stolfo, "Experimental Results of Cross-Site Exchange of Web Content Anomaly Detector Alerts", IEEE International Conference on Technologies for Homeland Security, November 2010. [PDF]
  • Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, "Adaptive Anomaly Detection via Self-Calibration and Dynamic Updating", To appear in the Proceedings of the International Symposium On Recent Advances In Intrusion Detection, September 2009, Saint-Malo, Brittany, France.
  • Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, Angelos D. Keromytis, "Casting out Demons: Sanitizing Training Data for Anomaly Sensors", In the Proceedings of the IEEE Symposium on Security & Privacy, May 2008, Oakland, CA. [PDF]
  • Gabriela F. Cretu, Angelos Stavrou, Michael E. Locasto, Salvatore J. Stolfo, "Extended Abstract: Online Training and Sanitization of AD Systems", NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security, December 2007, Vancouver, B.C., Canada. [PDF]
  • Gabriela F. Cretu, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis, "Data Sanitization: Improving the Forensic Utility of Anomaly Detection Systems", In the Proceedings of the Third Workshop on Hot Topics in System Dependability, June 2007, Edinburgh, UK. [PDF]
  • Gabriela F. Cretu, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis, "STAND: Sanitization Tool for ANomaly Detection", Tech Report cucs-022-07, Department of Computer Science, Columbia University, May 2007. [PDF]