
Transparent serverless execution of Python multiprocessing applications

  • Identifying data

    Identifier: imarina:9286991
    Authors:
    Arjona A; Finol G; López PG
    Abstract:
    Access transparency means that both local and remote resources are accessed using identical operations. With transparency, unmodified single-machine applications could run over disaggregated compute, storage, and memory resources. Hiding the complexity of distributed systems through transparency would have great benefits, such as scaling out local-parallel scientific applications over flexible disaggregated resources in the Cloud. This paper presents a performance evaluation in which we assess the feasibility of access transparency over state-of-the-art Cloud disaggregated resources for Python multiprocessing applications. We have interfaced the multiprocessing module with an implementation that transparently runs processes on serverless functions and uses an in-memory data store for shared state. To evaluate transparency, we run four unmodified applications in the Cloud: Uber Research's Evolution Strategies, Baselines-AI's Proximal Policy Optimization, Pandaral·lel's dataframe, and scikit-learn's hyperparameter tuning. We compare the execution time and scalability of the same application running over disaggregated resources using our library against the single-machine Python multiprocessing libraries in a large VM. For equal resources, applications that use message-passing abstractions efficiently achieve comparable results despite the significant overheads of remote communication. Other, shared-memory-intensive applications do not perform well due to high remote memory latency. The results show that Python's multiprocessing library design is an enabler of transparency: legacy applications using efficient disaggregated abstractions can transparently scale beyond VM-limited resources for increased parallelism without changing the underlying code or architecture.
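The idea of access transparency at the API level can be illustrated with the standard library alone. The sketch below is not the authors' library: it assumes only that a backend exposes a `Pool` compatible with `multiprocessing`, and uses `multiprocessing.dummy` (the thread-backed module shipped with Python) as a stand-in for an alternative backend. The names `square` and `run_app` are hypothetical and exist only for this illustration.

```python
import multiprocessing
import multiprocessing.dummy  # same Pool API as multiprocessing, backed by threads


def square(x):
    """Stand-in for one unit of an application's work."""
    return x * x


def run_app(mp, inputs):
    """Application code written once against the multiprocessing API.

    `mp` is any module exposing a compatible `Pool` (an assumption of this
    sketch); the body stays identical whichever backend is passed in.
    """
    with mp.Pool(4) as pool:
        # map() ships inputs to workers and collects the results in order
        return pool.map(square, inputs)


if __name__ == "__main__":
    data = list(range(8))
    local = run_app(multiprocessing, data)            # OS processes
    threaded = run_app(multiprocessing.dummy, data)   # threads, same calls
    print(local == threaded)
```

Here `pool.map` passes inputs and results between the caller and the workers, which is the message-passing style the abstract reports as achieving comparable results on remote resources. Code that instead reads and writes shared state on every step would, per the abstract, pay the remote memory latency on each access.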
  • Other:

    Author according to the article: Arjona A; Finol G; López PG
    Department: Computer Engineering and Mathematics
    URV author(s): Arjona Perez, Aitor
    Keywords: Transparency; Serverless; Parallel programming; Multiprocessing; FaaS; Access transparency
    Subject areas: Software; Public health; Medicine II; Medicine I; Mathematics / probability and statistics; Interdisciplinary; Hardware and architecture; Engineering IV; Engineering III; Engineering I; Communication and information; Computer science, theory & methods; Computer networks and communications; Applied social sciences I; Biological sciences II; Biological sciences I; Computer science
    License: https://creativecommons.org/licenses/by/3.0/es/
    Author's email address: aitor.arjona@urv.cat
    Author identifier: 0000-0001-5451-4865
    Record creation date: 2024-08-03
    Deposited article version: info:eu-repo/semantics/publishedVersion
    Link to original source: https://www.sciencedirect.com/science/article/pii/S0167739X22003612
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: Future Generation Computer Systems-The International Journal Of Escience. 140 436-449
    Item reference in APA style: Arjona A; Finol G; López PG (2023). Transparent serverless execution of Python multiprocessing applications. Future Generation Computer Systems-The International Journal Of Escience, 140, 436-449. DOI: 10.1016/j.future.2022.10.038
    Article DOI: 10.1016/j.future.2022.10.038
    Institution: Universitat Rovira i Virgili
    Journal publication year: 2023
    Publication type: Journal Publications
  • Keywords:

    Computer Networks and Communications,Computer Science, Theory & Methods,Hardware and Architecture,Software
    Transparency
    Serverless
    Parallel programming
    Multiprocessing
    FaaS
    Access transparency