The deployment of virtual machines (VMs) within the Infrastructure as a Service (IaaS) layer across public, private, or hybrid cloud infrastructures is prevalent in various organisational settings for hosting essential business services. However, achieving rapid elasticity, or autoscaling, and ensuring quality of service amidst fluctuating service demands and available computing resources present significant challenges. Unlike the Platform as a Service (PaaS) and Software as a Service (SaaS) layers, where cloud providers offer managed elasticity features, the VMs at the IaaS layer often lack such capabilities. This paper scrutinises the constraints surrounding the rapid elasticity of VMs within single and hybrid cloud environments at the IaaS layer. It provides a critical analysis of the existing research gaps, emphasising the necessity for the horizontal elasticity of VMs extended across hybrid clouds, coupled with predictive capabilities integrated into the elasticity mechanism. This paper’s focus is particularly beneficial in scenarios where workloads require VM provisioning from multiple clouds to eliminate vendor lock-in and enhance quality of service (QoS) assurances, especially in instances of platform failures. Through critical examination, several research challenges are identified, delineating the existing research gap and outlining future research directions. This paper contributes to the research challenges of VM elasticity in complex cloud environments and underscores the imperative for innovative solutions to address these challenges effectively.
In today’s IT infrastructures, a significant number of services are hosted on virtual machines (VMs), which are sourced from self-managed on-premise private clouds, public clouds, or a mixture of both. In all three cases, “rapid elasticity” [1], or autoscaling, is identified as one of the key characteristics of cloud computing and is used to provide on-demand scalable computing resources. This is achieved either by scaling the number of VMs (horizontal autoscaling) or by adjusting the computing resources provided to a VM from a resource pool (vertical autoscaling). However, finding an appropriate and timely balance between horizontal and vertical resource allocation while managing the associated costs, so that service level agreements (SLAs) are met, is a challenging task across all three cloud deployment models [2]. Furthermore, both horizontal and vertical autoscaling techniques fall into two categories, distinguished by how and when the autoscaling actions are triggered: reactive and proactive autoscaling [3]. Reactive autoscaling adjusts cloud resources in real time in response to workload changes, using thresholds or rules. Proactive autoscaling uses historical data to predict future demand and allocates resources in advance, so that they are ready before the workload requires them.
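The difference between the two categories can be illustrated with a minimal sketch. It is not taken from any of the reviewed works or from a cloud provider's API: the function names, the threshold values, the per-VM capacity figure, and the naive linear-trend forecast are all assumptions chosen only to make the contrast concrete.

```python
import math


def reactive_decision(cpu_utilisation, current_vms,
                      upper=0.8, lower=0.3, min_vms=1, max_vms=10):
    """Threshold rule: react to the utilisation observed right now.

    The thresholds and VM limits are illustrative defaults, not values
    recommended by any provider.
    """
    if cpu_utilisation > upper:
        return min(current_vms + 1, max_vms)   # scale out by one VM
    if cpu_utilisation < lower:
        return max(current_vms - 1, min_vms)   # scale in by one VM
    return current_vms                         # stay as-is


def proactive_decision(history, capacity_per_vm, min_vms=1, max_vms=10):
    """Forecast the next interval from past demand, then size the pool.

    The forecast is a naive linear extrapolation (last value plus the
    mean step between samples); it stands in for the ML-based
    predictors discussed in the literature and needs >= 2 samples.
    """
    steps = [b - a for a, b in zip(history, history[1:])]
    forecast = history[-1] + sum(steps) / len(steps)
    needed = math.ceil(max(forecast, 0) / capacity_per_vm)
    return max(min_vms, min(needed, max_vms))


# Reactive: 90% CPU on 3 VMs triggers a single scale-out step.
print(reactive_decision(0.9, 3))
# Proactive: demand (requests/s) is rising by 100 per interval,
# so the pool is sized for the predicted 400 requests/s.
print(proactive_decision([100, 200, 300], capacity_per_vm=100))
```

The reactive rule only ever responds after utilisation has crossed a threshold, whereas the proactive function sizes the pool for demand that has not yet arrived, which is why its quality depends entirely on the accuracy of the forecast.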
Motivation and Scope of the Paper
In cloud computing environments, VMs operate at the IaaS layer. IaaS is one of the three cloud service models, alongside PaaS and SaaS [4]. Currently, it is not possible to extend VM autoscaling across multiple cloud platforms, as can be seen from the comparisons provided in Section 6, “Commercial Autoscaling Approaches in Public Cloud Environments”. Accurately determining the number of VMs required to serve a particular workload, and then making proactive autoscaling decisions on that basis, is also a challenging task. While public cloud platforms such as AWS, Azure, and Google Cloud provide autoscaling capabilities, these offerings are mainly reactive, and their proactive autoscaling support is limited in ways that need to be addressed. For example, proactive scaling requires precise workload prediction using machine learning (ML)-based models, a capability that is not typically included in standard cloud services. Although some ML capabilities are available, such as time-series-based predictions, the lack of integrated advanced prediction technologies forces consumers to rely on in-house or third-party solutions for workload prediction and resource allocation. In addition, VM provisioning and boot times introduce inherent delays, and the resulting delays in scale-out actions degrade application performance during peak and sudden demand. To address these gaps, the authors believe that the autoscaling offerings in public cloud platforms should be equipped with the following capabilities:
- Precisely forecasting the future demand by analysing the historical trends with anomaly detection.
- Accurate autoscaling decision execution to address the VM provisioning or boot time delays (cold start).
- More flexibility in cloud native VM autoscaling solutions, where the VM autoscaling can be extended across hybrid clouds.
- Greater versatility in defining custom autoscaling policies using more user-defined and workload-specific autoscaling metrics.
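To make the cold-start capability in the list above concrete, the following sketch shows one way a scale-out decision could account for VM boot time: VMs started now are sized for the peak demand forecast over the interval during which new VMs would still be booting. This is an illustrative assumption rather than a method from the reviewed literature; the function names, the forecast values, and the per-VM capacity are hypothetical.

```python
import math


def lead_steps(boot_time_s, interval_s):
    """Number of forecast intervals a new VM needs before it can serve."""
    return max(1, math.ceil(boot_time_s / interval_s))


def vms_to_start_now(forecast, capacity_per_vm, running_vms,
                     boot_time_s, interval_s):
    """VMs to launch now so capacity is ready when demand arrives.

    `forecast` is a hypothetical list of predicted demand values, one
    per interval, starting with the next interval. Only the part of
    the forecast that falls inside the boot window is considered,
    because later demand can still be handled by a later decision.
    """
    window = forecast[:lead_steps(boot_time_s, interval_s)]
    needed = math.ceil(max(window) / capacity_per_vm)
    return max(0, needed - running_vms)


# Hypothetical forecast in requests/s for the next four 60 s intervals.
forecast = [250, 300, 520, 900]

# With a 120 s boot time, only the first two intervals matter and the
# three running VMs already cover the predicted 300 requests/s.
print(vms_to_start_now(forecast, 100, 3, boot_time_s=120, interval_s=60))
# With a 180 s boot time, the spike to 520 requests/s falls inside the
# boot window, so additional VMs must be started immediately.
print(vms_to_start_now(forecast, 100, 3, boot_time_s=180, interval_s=60))
```

The longer the boot time, the further ahead the forecast must be trusted, which is one reason prediction accuracy and provisioning delay are treated together in this review.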
These factors have motivated this research to explore the reactive and proactive autoscaling techniques currently available in the most popular public cloud platforms and identify the current gaps, especially when providing proactive (predictive) autoscaling for VMs and extending the VM autoscaling across hybrid cloud platforms. Therefore, this comprehensive systematic literature review focuses on the challenges of extending autoscaling across multiple cloud platforms (autoscaling in hybrid clouds) at the IaaS layer and reviews proactive autoscaling solutions. The main scope and objective of this review is to identify the gaps in horizontal autoscaling, specifically in proactive autoscaling scenarios, in both commercial implementations and current academic research. This is further broadened to investigate the potential of extending the horizontal autoscaling at the IaaS layer across hybrid cloud platforms, removing the limitation of IaaS horizontal autoscaling restricted to a single cloud platform. Although the primary focus of this work lies in the IaaS layer, there has been some research work selected from the PaaS layer as well, with the intention of re-examining the approaches employed in those works that enabled proactive and hybrid cloud autoscaling and adopting those for VM autoscaling at the IaaS layer. The main contributions of this paper are:
- Reviewing the current research work related to VM autoscaling to identify the research directions and gaps in proactive autoscaling and hybrid cloud autoscaling.
- Identifying how proactive autoscaling and workload classification are introduced for container autoscaling using ML-based technologies.
- Performing a comparison of the autoscaling offerings of three major public cloud platforms and one open-source cloud platform to identify the current gaps and issues.
- Defining future research directions to address the gaps identified in this review.
Structure of the Paper
To provide a better understanding of the contents discussed in this review, Figure 1 depicts an overall taxonomy of the concepts related to this research. Commonly used autoscaling techniques in cloud computing, in both the reactive and proactive categories, are briefly presented in Section 2. To present a holistic view, a review of the autoscaling issues at the PaaS layer has also been performed, with the aim of identifying the extent to which the approaches and techniques proposed at the PaaS layer are suitable for enhancing proactive autoscaling at the IaaS layer. In Section 3, a critical review of cloud autoscaling is presented, focusing on three aspects: reactive autoscaling, proactive autoscaling in a single-cloud infrastructure, and proactive autoscaling across multiple cloud infrastructures. In Section 4, the research methods are presented, followed by the classification of autoscaling techniques in Section 5. Commercial deployments of autoscaling are then discussed in Section 6, which covers three commercial public cloud platforms and one open-source cloud platform. Section 7 summarises the challenges identified during the review of autoscaling approaches in previous research works and commercial deployments. Finally, conclusions and future research directions are presented in Section 8. Table 1 below provides an overview of this paper’s structure.
Figure 1. Breakdown of taxonomy. Boxes filled in green indicate the scope of this research.
Table 1. An overview of this paper’s structure.
| Section | Section Heading | Brief Description |
|---|---|---|
| Section 2 | Rapid elasticity for VMs in Cloud Platforms and current challenges | Discusses the commonly used autoscaling techniques in cloud computing within both the reactive and proactive autoscaling categories. |
| Section 3 | Autoscaling in Clouds: Previous Work | Provides an overview of the selected papers, discussing the strengths and weaknesses of each work. |
| Section 4 | Methods and Materials | Outlines the methodology used for the systematic review. |
| Section 5 | Classification of Autoscaling Techniques | Offers a classification of commonly employed autoscaling techniques, along with a brief description of each method. |
| Section 6 | Commercial Autoscaling Approaches in Public Cloud Environments | Discusses the autoscaling approaches used by three widely used commercial public cloud providers and one open-source cloud platform. |
| Section 7 | What are the Challenges to Achieve Autoscaling at IaaS Layer? | Summarises the challenges from the review of commercial deployments (Section 6) and the autoscaling approaches proposed in the reviewed papers (Section 3). |
| Section 8 | Conclusions and future research directions | Presents the conclusions and offers future research directions. |