Machine Learning in Cloud Computing: History

Cloud computing is one of the emerging fields of the modern computing world. As the volume of job requests has grown, job schedulers have been refined incrementally. The introduction of machine learning into cloud scheduling has had a significant impact on cost reduction in terms of energy consumption and makespan.

  • cloud computing
  • scheduling
  • machine learning

1. Introduction

Cloud Computing (CC) has become one of the fastest-growing areas of the computing world, in which users connect to a cloud platform to consume its services. On any computation platform, user requests are treated as jobs and are processed within a CC data center. A CC data center is a physical facility that uses computational elements to process requests from users. Physical Machines (PMs) are the main handlers of the jobs and receive instructions from the data center. For example, Netflix is a cloud platform that delivers a large catalogue of movies to its associated and authorized users, and it relies on hardware resources to store data and process the computation of the jobs. A data center left running for an hour can consume as much energy as 25,000 households. Hence, energy consumption and carbon emissions are among the serious issues in CC architecture.

In general, a CC architecture is made up of three layers, namely Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The data center handles the jobs and assigns them at the IaaS layer through the SaaS layer. In its early days, CC architecture was designed only to speed up computation, but with the increasing load on and energy consumption of cloud data centers, energy efficiency has become a vital issue. Jobs arrive at the data center with deadline constraints, and several scheduling algorithms therefore base their operation on execution time [1]. For example, Minimum Execution Time (MET) selects the PM that offers the shortest expected execution time for a given job. When jobs are connected, i.e., when a job depends on the outcome of another job, the prerequisite job must be completed before the dependent job can execute. Based on such connected-job models, the CyberShake algorithm was introduced in early 2013 and has received updates over time [2]. CyberShake uses a dependency map based on Heterogeneous Earliest Finish Time (HEFT) to check the overall cost of execution, and it utilizes Statistical Machine Learning (S-ML) to evaluate the total cost of a job across different physical machines or data centers. Machine learning has spread into various sectors of computation, including cloud computing and big data analytics [3]. It works on the behavior pattern of the entire process and requires a large amount of data to learn and to reach a conclusion.
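To make the execution-time-driven heuristics mentioned above concrete, the following Python sketch shows a minimal MET-style assignment: each job is placed on the PM with the lowest estimated execution time for that job, without regard to the machine's current load (HEFT-style scheduling would additionally account for predecessor finish times in the dependency graph). The job lengths, PM speeds, and function names are illustrative assumptions, not the exact formulation used in the cited works.

from dataclasses import dataclass, field

@dataclass
class PhysicalMachine:
    name: str
    speed: float                          # relative processing speed (illustrative units)
    assigned: list = field(default_factory=list)

def estimated_execution_time(job_length: float, pm: PhysicalMachine) -> float:
    """Estimated runtime of a job of the given length on this PM."""
    return job_length / pm.speed

def met_schedule(jobs: dict, machines: list) -> dict:
    """Assign each job to the PM offering the minimum estimated execution time."""
    placement = {}
    for job_id, length in jobs.items():
        best_pm = min(machines, key=lambda pm: estimated_execution_time(length, pm))
        best_pm.assigned.append(job_id)
        placement[job_id] = best_pm.name
    return placement

if __name__ == "__main__":
    pms = [PhysicalMachine("pm-1", speed=2.0), PhysicalMachine("pm-2", speed=3.5)]
    jobs = {"job-a": 100.0, "job-b": 240.0, "job-c": 60.0}
    print(met_schedule(jobs, pms))        # every job lands on the fastest PM here

Note that, because MET ignores machine load, all jobs in this toy example gravitate to the single fastest PM, which is exactly the imbalance that more elaborate schedulers try to avoid.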

2. Machine Learning in Cloud Computing

The cloud computing industry has been reshaped by the emergence of multi-clouds as a successor to single-cloud CC architectures. Under this model, the performance of different cloud designs can be improved by having multiple cloud components pool their resources.
In multi-cloud computing, where resources are frequently underutilized due to a lack of effective resource allocation methodologies, efficient resource allocation is a crucial challenge. The current methods have difficulty optimizing both available and utilized cloud resources, which can result in costly performance inefficiencies for end users.
The major objectives have been outlined to this end.
  • To begin, a labeling architecture will be created so that the system may be trained to recognize labels of interest.
  • Second, a Q-learning-based pattern of behavior analysis is used to help make decisions at any stage of the work distribution process.
  • Finally, the suggested method will be investigated for quality-of-service (QoS) characteristics and compared with state-of-the-art algorithms to demonstrate its superior performance.
When applied to multi-cloud computing environments, the proposed learning pattern has the potential to greatly increase the efficiency with which computing resources are allocated. Compared to standard practices, the learning pattern can save time, cost, and effort by making use of both unused and underutilized cloud computing resources. The overarching goal of this effort is to present a more efficient and effective method for allocating multi-cloud computing resources that is cost-effective for end users and boosts overall performance.
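As a rough illustration of the Q-learning-based decision process described above, the following Python sketch maintains a tabular Q-function over (load-state, cloud) pairs and applies the standard Q-learning update after each simulated job placement. The state encoding, reward signal, cloud names, and parameter values are assumptions made purely for illustration; they are not the exact design of the proposed method.

import random
from collections import defaultdict

CLOUDS = ["cloud-A", "cloud-B", "cloud-C"]   # actions: which cloud receives the next job
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2        # learning rate, discount factor, exploration rate

q_table = defaultdict(float)                 # (state, action) -> estimated long-term value

def choose_cloud(state):
    """Epsilon-greedy selection among candidate clouds."""
    if random.random() < EPSILON:
        return random.choice(CLOUDS)
    return max(CLOUDS, key=lambda c: q_table[(state, c)])

def update(state, action, reward, next_state):
    """Standard Q-learning (Bellman) update of the action-value table."""
    best_next = max(q_table[(next_state, c)] for c in CLOUDS)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next - q_table[(state, action)])

def load_state(load):
    """Coarse 'label' of the environment: per-cloud utilisation rounded to one decimal."""
    return tuple(round(load[c], 1) for c in CLOUDS)

if __name__ == "__main__":
    load = {c: random.random() for c in CLOUDS}      # toy per-cloud utilisation in [0, 1]
    for _ in range(1000):                            # stream of incoming jobs
        state = load_state(load)
        cloud = choose_cloud(state)
        reward = 1.0 - load[cloud]                   # toy QoS/cost feedback: prefer idle clouds
        load[cloud] = min(1.0, load[cloud] + 0.05)   # the placed job adds load
        for c in CLOUDS:
            load[c] = max(0.0, load[c] - 0.02)       # load slowly drains everywhere
        update(state, cloud, reward, load_state(load))
    print({c: round(q_table[(load_state(load), c)], 3) for c in CLOUDS})

In a real deployment the reward would be derived from measured QoS, energy, or cost metrics rather than from this toy utilisation model, and the coarse load tuple would be replaced by the labels produced by the labeling architecture described earlier.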
In recent years, cloud computing’s popularity has skyrocketed due to its capacity to supply customers with versatile and inexpensive computing resources. Multi-cloud computing, in which many cloud providers are employed to host applications, increases the potential benefits of cloud computing. The benefits include increased scalability, dependability, and cost-effectiveness. However, there are also substantial obstacles to overcome when it comes to optimizing resource allocation in multi-cloud computing. There is a need for more advanced methodologies that can optimize resource allocation across several clouds in a way that is more energy-efficient, cost-effective, and environmentally friendly than traditional approaches to multi-cloud computing resource allocation.
While several researchers have developed models for allocating resources in multi-cloud environments, these approaches frequently have their own set of problems.
Some models, for instance, rely on a single, centralized decision-maker, which can introduce instability and prevent the system from scaling. Other models do not account for the heterogeneity of cloud resources, which can lead to inefficient use of those assets. The majority of the currently available models also fail to account for fluctuations in the workload, which can lead to subpar performance and unnecessary resource consumption. The contributions of several researchers in this field are as follows.
Hu et al. (2018) proposed a multi-objective scheduling algorithm for multi-cloud environments to reduce the workflow makespan and scheduling cost of data centers. Their approach used Particle Swarm Optimization to customize job scheduling based on the location and order of data transmission. Although their simulation study showed promising results, the algorithm did not consider the impact of renewable energy sources or the carbon footprint of data centers [1]. Xu and Buyya (2020) proposed a workload-shifting approach that addressed the CO2 emissions issue by moving jobs among multi-clouds located in different time zones. They also used renewable energy sources, such as solar energy, to reduce the consumption of non-renewable energy. Their approach reduced CO2 emissions by 40% while maintaining a near-average response time for user requests; however, it did not consider load balancing or resource utilization, which are important factors in multi-cloud environments [2]. Cai et al. (2021) proposed a distributed job-scheduling approach for multi-cloud environments that considered multiple objectives, including cost, energy consumption, time consumed, throughput, load balancing, and resource utilization. Their approach used intelligent algorithms based on these objectives, together with a sine function for model implementation. Although their simulation analysis demonstrated high scheduling efficiency with enhanced security, the approach did not consider the impact of renewable energy sources on energy consumption or CO2 emissions [3]. Renugadevi and Geetha (2021) developed a model for a geographically distributed multi-cloud environment that used solar energy as the main source of energy. Their model considered electricity prices, favorable location, CO2 emissions, and carbon taxes in energy management and resource allocation. The type of task was customized in response to the task deadline, which resulted in adaptive management of the multi-cloud model and the workload algorithm. However, the approach did not consider load balancing or resource utilization, which are critical factors for optimal performance in multi-cloud environments [4].

Gaurang Patel et al. (2015) presented a study of task-scheduling algorithms and a modification of the load-balanced min-min algorithm. The proposed algorithm was based on a survey of load-balancing algorithms for static meta-task scheduling in grid computing. It selects tasks based on maximum completion time and improves resource utilization and makespan [5]. Zhang et al. (2021) proposed a reinforcement-learning-based distributed deployment approach for tasks in multi-cloud environments. Their approach performed two steps: job offloading and task scheduling based on cloud-center decisions. Their simulation analysis showed the lowest latency in a heterogeneous environment compared with existing approaches, but the approach did not consider the impact of energy sources on energy consumption or CO2 emissions [6]. Sangeetha et al. (2022) used deep-learning neural networks to control routing in multi-cloud environments in order to manage space and task allocation. Their approach minimized the delay associated with data processing and storage across clouds, and their simulation analysis demonstrated improved performance in terms of delay and the costs associated with resource allocation in multi-cloud environments. However, the approach did not consider the impact of renewable or non-renewable energy sources on energy consumption or CO2 emissions [7]. Cao et al. (2022) identified the carbon footprint of data centers as a major source of intensive CO2 emissions and suggested that data centers increasingly switch to renewable energy to reduce the negative effects of CO2 emissions and improve energy circulation. They further suggested integrating Artificial Intelligence to address these challenges and to put such a framework in place in real-world scenarios; however, they did not propose a specific approach to integrate AI, nor did they address load balancing or resource utilization in multi-cloud environments [8].

Jun-Qing Li et al. (2021) proposed a hybrid greedy and simulated-annealing technique for scheduling the crane transportation process, with the objective of reducing completion time and energy consumption. It works in two steps: first the scheduling of jobs and then the assignment of machines [9]. Yu Du et al. (2022) developed a multi-objective optimization algorithm combining an estimation-of-distribution algorithm and a deep Q-network to solve the job-shop scheduling problem, which involved machine processing speed, idle time, setup time, and transportation between machines. The proposed model was validated on CPLEX, and the results showed that the method performed effectively for job-shop scheduling [10]. Ju Du (2022) presented a deep Q-network (DQN) model for multi-objective flexible job-shop scheduling with crane transportation and setup times. It included 12 states and 7 actions for the scheduling process, and the results showed that the method produced effective and efficient output; the DQN can also apply dispatching rules according to the situation [11]. Haider Ali et al. (2021) explored IoT and its applications in smart cities, discussed wireless sensors, and examined the multiprocessor system-on-chip (MPSoC) used in everyday IoT-based gadgets; the survey concluded with future directions [12]. Umair Ullah Tariq et al. (2021) proposed a novel energy-aware scheduler with constraints on Voltage Frequency Island (VFI)-based heterogeneous NoC-MPSoCs, deploying re-timing integrated with DVFS for real-time streaming applications. The authors proposed R-CTG, a task-level timing approach integrated with non-linear programming and a voltage-level approach. The comparison results showed that the proposed method performed better in terms of latency and energy consumption [13].
More recent efforts have aimed to optimize multi-cloud resource allocation using machine learning and deep reinforcement learning (DRL) approaches. However, there is still a large knowledge gap in the application of DRL to multi-cloud resource allocation, especially in the areas of energy efficiency, CO2 emissions, and cost optimization.

This entry is adapted from the peer-reviewed paper 10.3390/electronics12081810
