Optimizing Distributed Log Tracing for Intelligent Cloud System Operations

Guan Yang

doi:10.70711/aitr.v3i3.8042

Abstract

This paper investigates optimization strategies for distributed log tracing in intelligent cloud system operations. As modern computing shifts toward large-scale distributed infrastructures, efficient log tracing has become a cornerstone of observability, fault detection, and automated decision-making. The study proposes an AI-driven optimization framework that combines adaptive sampling, semantic correlation modeling, and graph-based reasoning to improve trace accuracy and efficiency. The approach uses machine learning to adjust trace collection policies dynamically and to infer causal relationships among microservices. Experimental evaluation shows improved observability accuracy, reduced storage overhead, and faster anomaly detection compared with conventional tracing systems. The results highlight the potential of intelligent tracing for building self-optimizing and resilient cloud operation environments.
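
To make the idea of adaptive sampling concrete, the following is a minimal illustrative sketch, not the framework evaluated in the paper. It assumes a simple rule in which the sampling probability rises with a smoothed error rate; the class AdaptiveSampler, its parameters (base_rate, max_rate, alpha), and the linear interpolation are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AdaptiveSampler:
    """Hypothetical sampler whose rate tracks the recent error rate."""
    base_rate: float = 0.05   # assumed sampling probability when healthy
    max_rate: float = 1.0     # assumed ceiling: keep every trace
    alpha: float = 0.2        # assumed smoothing factor for the average
    error_ewma: float = 0.0   # exponentially weighted moving average of errors

    def observe(self, is_error: bool) -> None:
        """Fold one finished request into the smoothed error rate."""
        self.error_ewma = (
            (1 - self.alpha) * self.error_ewma + self.alpha * float(is_error)
        )

    def rate(self) -> float:
        """Interpolate linearly between base_rate and max_rate."""
        return self.base_rate + (self.max_rate - self.base_rate) * self.error_ewma

    def should_sample(self, rng: random.Random) -> bool:
        """Decide whether to record the next trace."""
        return rng.random() < self.rate()


sampler = AdaptiveSampler()
healthy_rate = sampler.rate()          # equals base_rate before any errors

for _ in range(50):                    # a burst of failing requests
    sampler.observe(True)
incident_rate = sampler.rate()         # approaches max_rate

for _ in range(50):                    # recovery: requests succeed again
    sampler.observe(False)
recovered_rate = sampler.rate()        # decays back toward base_rate
```

Under these assumptions, few traces are stored while the service is healthy and nearly all are kept during an error burst, which is one way a sampler could trade storage overhead against coverage of anomalies.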

Keywords

Distributed tracing; Cloud operations; Artificial intelligence; Observability; Optimization; Log analysis

DOI: http://dx.doi.org/10.70711/aitr.v3i3.8042
