Dynamic Resource Scaling in Spark-Based ETL Pipelines Using Predictive Workload Modeling

Authors

  • Sarvesh Kumar Gupta

DOI:

https://doi.org/10.64180/

Keywords:

Apache Spark, ETL Pipelines, Dynamic Resource Scaling, Predictive Workload Modeling, Cloud Computing, Big Data Analytics

Abstract

The increasing adoption of cloud-native data platforms has led to widespread use of Apache Spark for large-scale Extract, Transform, and Load (ETL) operations. Spark-based ETL pipelines process massive volumes of structured and unstructured data, enabling organizations to support real-time analytics, business intelligence, and data-driven decision-making. However, the performance of these pipelines depends heavily on efficient resource allocation. Traditional static provisioning and reactive scaling mechanisms often struggle to handle fluctuating workloads, resulting in resource underutilization, increased operational costs, execution delays, and reduced system efficiency. These challenges are more pronounced in cloud environments, where workload characteristics change over time.

Published

2023-10-10

How to Cite

Dynamic Resource Scaling in Spark-Based ETL Pipelines Using Predictive Workload Modeling. (2023). Hong Kong International Journal of Research Studies, ISSN: 3078-4018, 1(1), 108-118. https://doi.org/10.64180/
