Study on Robustness of Anomaly Detection Algorithm in Big Data Environment
Abstract
As a critical component in big data value extraction, anomaly detection fundamentally depends on robustness to deliver reliable
outcomes in real-world scenarios where noise, concept drift, adversarial attacks, and system failures coexist. This study systematically investigates robustness from four dimensions: data, models, systems, and evaluation. It begins by formalizing robustness through quantifiable
metrics, followed by analyzing statistical challenges posed by high-dimensional sparse data, heavy-tailed distributions, and heterogeneous
data sources. The research then compares robustness mechanisms and failure patterns across four algorithm categories: statistical methods, machine learning, deep learning, and graph neural networks. An enhanced framework is proposed featuring adversarial training, integrated distillation, trusted execution environments, and causal intervention. A reproducible benchmarking suite is developed and validated through experiments on 120 million records from financial transaction logs and industrial sensor streams. Results demonstrate that the proposed framework
achieves a 4.7-fold increase in adversarial tolerance while reducing concept drift adaptation latency by 62%, with F1 scores showing no more
than 3% degradation. These findings validate both theoretical frameworks and practical implementations.
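To illustrate the kind of quantifiable robustness metric the abstract refers to, a minimal sketch follows; the function names, labels, and toy data are hypothetical and only demonstrate how an "F1 degradation under perturbation" figure (such as the reported 3% bound) could be computed:

```python
def f1_score(y_true, y_pred):
    """Binary F1 where anomalies are labeled 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def f1_degradation(y_true, pred_clean, pred_perturbed):
    """Relative F1 drop under perturbation; 0.03 would mean a 3% degradation."""
    f1_clean = f1_score(y_true, pred_clean)
    if f1_clean == 0.0:
        return 0.0
    return (f1_clean - f1_score(y_true, pred_perturbed)) / f1_clean

# Toy illustration: a detector that misses one anomaly after input noise.
y_true     = [1, 0, 1, 1, 0, 0, 1, 0]
pred_clean = [1, 0, 1, 1, 0, 0, 1, 0]  # perfect on clean data
pred_pert  = [1, 0, 1, 0, 0, 0, 1, 0]  # one missed anomaly under perturbation
drop = f1_degradation(y_true, pred_clean, pred_pert)
```

The same degradation ratio can be evaluated under different perturbation sources (noise injection, adversarial examples, drifted distributions) to compare the four robustness dimensions on a common scale.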
Keywords
Anomaly detection; Robustness; Big data; Adversarial attacks; Concept drift; Trusted computing
DOI: http://dx.doi.org/10.70711/aitr.v3i2.7854