URV Institutional Repository

TITLE:

A Seer knows best: Auto-tuned object storage shuffling for serverless analytics - imarina:9330486

URV author(s): Eizaguirre Suárez, Germán Telmo / Sanchez Artigas, Marc
Author(s) as listed in the article: Eizaguirre, GT; Sánchez-Artigas, M
Author email address: germantelmo.eizaguirre@urv.cat
marc.sanchez@urv.cat
Author identifier: 0000-0002-2865-9873
0000-0002-9700-7318
Journal publication year: 2024-01-01
Publication type: Journal Publications
Item reference in APA style: Eizaguirre, GT; Sánchez-Artigas, M (2024). A Seer knows best: Auto-tuned object storage shuffling for serverless analytics. Journal of Parallel and Distributed Computing, 183, 104763. DOI: 10.1016/j.jpdc.2023.104763
Article reference according to the original source: Journal Of Parallel And Distributed Computing. 183 104763-
Abstract: Serverless platforms offer high resource elasticity and pay-as-you-go billing, making them a compelling choice for data analytics. To craft a “pure” serverless solution, the common practice is to transfer intermediate data between serverless functions via serverless object storage (IBM COS; AWS S3). However, prior works have led to inconclusive results about the performance of object storage systems, since they have left a large margin for optimization. To verify that object storage has been underrated, we devise a novel shuffle manager for serverless data analytics called SEER. Specifically, SEER dynamically chooses between two shuffle algorithms to maximize performance. The algorithm choice is made online based on predictive models and, very importantly, without end users having to specify intermediate shuffle data sizes at job submission time. We integrate SEER with PyWren-IBM [31], a well-known serverless analytics framework, and evaluate it against both serverful (e.g., Spark) and serverless systems (e.g., Google BigQuery, Caerus [46] and SONIC [22]). Our results certify that our new shuffle manager can deliver performance improvements over them.
Article DOI: 10.1016/j.jpdc.2023.104763
Original source link: https://www.sciencedirect.com/science/article/pii/S0743731523001338
Deposited article version: info:eu-repo/semantics/publishedVersion
License of use: https://creativecommons.org/licenses/by/3.0/es/
Department: Enginyeria Informàtica i Matemàtiques
License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
Subject areas: Theoretical computer science
Software
Hardware and architecture
Computer science, theory & methods
Computer networks and communications
Ciência da computação
Artificial intelligence
Keywords: Shuffle
Serverless computing
Object storage
I/O optimization
Institution: Universitat Rovira i Virgili
Record creation date: 2026-05-09

Available files
File | Description | Format
DocumentPrincipal | DocumentPrincipal | application/pdf



© 2011 Universitat Rovira i Virgili