
MLLESS: Achieving cost efficiency in serverless machine learning training

  • Identifying data

    Identifier: imarina:9330487
    Authors:
    Sarroca, PG; Sánchez-Artigas, M
    Abstract:
    Function-as-a-Service (FaaS) has raised a growing interest in how to “tame” serverless computing to enable domain-specific use cases such as data-intensive applications and machine learning (ML), to name a few. Recently, several systems have been implemented for training ML models. Certainly, these research articles are significant steps in the correct direction. However, they do not completely answer the nagging question of when serverless ML training can be more cost-effective compared to traditional “serverful” computing. To help in this endeavor, we propose MLLESS, a FaaS-based ML training prototype built atop IBM Cloud Functions. To boost cost-efficiency, MLLESS implements two innovative optimizations tailored to the traits of serverless computing: on one hand, a significance filter, to make indirect communication more effective, and on the other hand, a scale-in auto-tuner, to reduce cost by benefiting from the FaaS sub-second billing model (often per 100 ms). Our results certify that MLLESS can be 15X faster than serverful ML systems [27] at a lower cost for sparse ML models that exhibit fast convergence such as sparse logistic regression and matrix factorization. Furthermore, our results show that MLLESS can easily scale out to increasingly large fleets of serverless workers.
  • Other:

    Author as listed in the article: Sarroca, PG; Sánchez-Artigas, M
    Department: Enginyeria Informàtica i Matemàtiques
    URV author(s): Sanchez Artigas, Marc
    Keywords: Serverless computing; Machine learning; Function-as-a-service
    Subject areas: Theoretical computer science; Software; Mathematics / probability and statistics; Interdisciplinary; Hardware and architecture; Engineering IV; Engineering III; Computer science, theory & methods; Computer networks and communications; Computer science; Artificial intelligence
    License of use: https://creativecommons.org/licenses/by/3.0/es/
    Author email address: marc.sanchez@urv.cat
    Author identifier: 0000-0002-9700-7318
    Record creation date: 2024-08-03
    Deposited article version: info:eu-repo/semantics/publishedVersion
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: Journal Of Parallel And Distributed Computing. 183
    Item reference in APA style: Sarroca, PG; Sánchez-Artigas, M (2024). MLLESS: Achieving cost efficiency in serverless machine learning training. Journal Of Parallel And Distributed Computing, 183. DOI: 10.1016/j.jpdc.2023.104764
    Institution: Universitat Rovira i Virgili
    Journal publication year: 2024
    Publication type: Journal Publications
  • Keywords:

    Artificial Intelligence,Computer Networks and Communications,Computer Science, Theory & Methods,Hardware and Architecture,Software,Theoretical Computer Science
    Serverless computing
    Machine learning
    Function-as-a-service
    Theoretical computer science
    Software
    Mathematics / probability and statistics
    Interdisciplinary
    Hardware and architecture
    Engineering IV
    Engineering III
    Computer science, theory & methods
    Computer networks and communications
    Computer science
    Artificial intelligence