Use this identifier to cite or link to this record: https://hdl.handle.net/1822/89625

Title: Horus: non-intrusive causal analysis of distributed systems logs
Author(s): Neves, Francisco
Machado, Nuno
Vilaca, Ricardo
Pereira, José
Date: 2021
Publisher: IEEE
Journal: International Conference on Dependable Systems and Networks
Citation: Neves, F., Machado, N., Vilaca, R., & Pereira, J. (2021, June). Horus: Non-Intrusive Causal Analysis of Distributed Systems Logs. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE. http://doi.org/10.1109/dsn48987.2021.00035
Abstract: Logs are still the primary resource for debugging distributed systems executions. Complexity and heterogeneity of modern distributed systems, however, make log analysis extremely challenging. First, due to the sheer amount of messages, in which the execution paths of distinct system components appear interleaved. Second, due to unsynchronized physical clocks, simply ordering the log messages by timestamp does not suffice to obtain a causal trace of the execution. To address these issues, we present Horus, a system that enables the refinement of distributed system logs in a causally-consistent and scalable fashion. Horus leverages kernel-level probing to capture events for tracking causality between application-level logs from multiple sources. The events are then encoded as a directed acyclic graph and stored in a graph database, thus allowing the use of rich query languages to reason about runtime behavior. Our case study with TrainTicket, a ticket booking application with 40+ microservices, shows that Horus surpasses current widely-adopted log analysis systems in pinpointing the root cause of anomalies in distributed executions. Also, we show that Horus builds a causally-consistent log of a distributed execution with much higher performance (up to 3 orders of magnitude) and scalability than prior state-of-the-art solutions. Finally, we show that Horus' approach to query causality is up to 30 times faster than graph database built-in traversal algorithms.
Type: Conference paper
URI: https://hdl.handle.net/1822/89625
ISBN: 978-1-6654-1194-3
e-ISBN: 978-1-6654-3572-7
DOI: 10.1109/DSN48987.2021.00035
ISSN: 1530-0889
e-ISSN: 2158-3927
Publisher version: https://ieeexplore.ieee.org/document/9505126
Peer reviewed: yes
Access: Restricted access (UMinho)
Appears in collections: HASLab - Artigos em atas de conferências internacionais (texto completo)

Files in this record:
File: Horus_Non-Intrusive_Causal_Analysis_of_Distributed_Systems_Logs.pdf (restricted access)
Size: 264.72 kB
Format: Adobe PDF
