Models of Workload Generators: Comparison
Please note this is a comparison between Version 2 by Beatrix Zheng and Version 1 by Tomasz Rak.

Simulation is a powerful technique for planning and dimensioning web systems. However, a successful analysis using a simulation model usually requires variable load intensities. Furthermore, as client behavior is subject to frequent changes in modern web systems, such models need to be adapted as well. Based on observations of web systems, the researchers identified the need for tools that allow flexible definitions of web system load profiles.

  • simulation analysis
  • performance analysis
  • workload characterization

1. Introduction

Modeling next-generation networks requires designers to prepare a traffic generator tailored to the required request parameters. To deliver the required load, it is necessary to prepare a load generation system. The researchers' approach to implementing such a system is based on the Timed Colored Petri Nets (TCPN) formalism. The researchers propose a configurable TCPN-based generator that could be used for various real system models. Using simulation, the researchers show that the proposed request generator models have the expected functionality. The generator can be adapted to alternative design demands with minimal configuration changes.
Workload modeling problems have been addressed over the past years, resulting in models that generate workloads similar to those observed in the real world. Workload characterization plays a key role in many performance engineering studies. The produced workloads can also be used to drive simulations for system prototyping, testing, and benchmarking of computer systems and networks. Today there are many different areas of workload generator application, such as cloud computing infrastructures, web systems, networking structures, and video services. In the remainder of this work, the researchers use the term event generator as a generalization of the concepts of request generator and workload generator mentioned above. The event generator is considered to be a configurable source of a stream of discrete events, as in [1][2][3][4]. The approaches presented in [4][5][6] parameterize the system model using measurements gathered from real networks. They use workload models that allow the simulation of the network traffic created by real services in various possible scenarios. On the other hand, there are also works based on an analytical approach [2][3][7][8].
Input stream models generated from navigational patterns cause many problems in software engineering. Workload modeling is also challenging when applied in a highly dynamic environment. The derivation of such a model is non-trivial, which is confirmed by [1][2][3].
Many works focus on simulating models of different systems using a generated load. The researchers need an input stream, which can be obtained only from an actual system [5] or produced artificially [9][10], possibly based on the system log [11][12].
Workload scenarios may have three basic sources: benchmarks, real traces, and workload models. The objective is to get a workload model as close to the real workload as possible. The best option is to start with measured workload data. Obtaining such a stream is not an easy task. Therefore, in most cases, an analytical model is used. When preparing the system model, designers do not study the nature of the input stream. It is mostly general and it is not the representation of reality that the researchers expected. Some existing works have proposed workload prediction [13][14] or prediction of user behavior [15], but without adapting it to real needs.

2. Previous Models of Workload Generators

Engineers need varied, realistic workloads to study system environments. A workload model is an important way to specify and produce workloads for cloud systems [7], web systems [11], Big Data systems [16], network systems [17], streaming systems [18], etc. Therefore, a deeper understanding of user behavior, workload properties, and patterns is required. Cloud computing providers expect an understanding of the typical workload patterns of their services. There are many successful commercial streaming services, such as Netflix, Amazon Prime, and YouTube. Providers stream live videos to a highly variable audience. Network traffic generators play an important role in the design and development of networks and in the security field. Some authors prepared scalable workload generators for testing and benchmarking of high-volume data processing systems [5].
Every month, billions of users access web systems. A large number of users and the huge amount of data processed by these applications make modeling web systems a challenging task. Various models are proposed to capture the behavioral patterns of different user profiles.
Workload generators have become crucial tools, as they help system constructors plan the system design. A survey on workload generators for web systems [19] reviews work in this domain for different types of applications.
For instance, the authors of [20] characterized the behavior of a Google Maps client. Based on this characterization, they proposed a model of client actions as a simple workload generator. In this context, the authors of [11] proposed reconstructing users’ web behaviors from web server logs. Simulation models can also be used to produce system logs based on realistic scenarios [5].
Computers are no longer expected to operate in isolation, without interacting or cooperating with other computers. Distributed systems have to be designed to meet these new requirements, and their creation may be facilitated by modeling. Some of the models use variants of Petri nets and are applied in a generic context for stream processing of distributed web systems. Several generators have been modeled in Petri nets of different classes. Some publications on web system models describe the query generator in detail [21][22], whereas in others the generator appears only in a general form and is not described in detail due to its simplicity [23][24] (TCPN), [25] (QPN).
CPN (Colored Petri Nets) is a graphical language for modeling, simulating and validating concurrent and distributed systems [26]. It combines the strength of Petri nets (to model synchronization of concurrent processes) with programming languages (to define various data types and manipulate values). Usually, further extensions like TCPN (Timed CPN) or HTCPN (Hierarchical TCPN) are considered. Models developed in the CPN (also TCPN or HTCPN) formalism are often implemented in CPN Tools [26], a software tool that facilitates modeling, formal analysis and simulation of CPN/TCPN/HTCPN models. Furthermore, it can generate a reachability graph, so that some CPN behavioral properties may be verified, for example in [27][28]. The TCPN models presented in this research have also been developed using CPN Tools.
QPN (Queueing Petri Net) is another formalism, based on Queueing Nets and Petri Nets. Queueing Nets are suitable for modeling competition for equipment. To analyze any queueing system it is necessary to determine the arrival process, service distribution, service discipline, and waiting room (scheduling strategies). A QPN is a tuple [29] consisting of a CPN (a finite and non-empty set of places, a finite and non-empty set of transitions, a color function, the backward and forward incidence functions, and an initial marking), a set of timed and immediate queueing places, and a set of timed and immediate transitions. The primary type of QPN place is a queueing place composed of a queue and a depository for tokens that completed their service at the queue. QPN models are typically implemented in QPME (Queueing Petri net Modeling Environment) [30]. It is an open-source tool for stochastic modeling and analysis, consisting of two components: QPE (QPN Editor) and SimQPN (Simulator for QPNs). The QPN models presented in this research have also been developed using QPME.
The TCPN formalism is often used for modeling a web system or its part. In [21], several web service interaction models have been proposed. The TCPN models have been prepared, evaluated and simulated in CPN Tools, which is the same formalism and software tool as used in this research. One of the models presented in [21] is the model of the generator for web service applications. In fact, it generates a stream of discrete events, similar to the generators considered here. The model proposed in [21] uses the formula ()@+expTime(100) to determine the distribution of request arrivals under the exponential law with an intensity of 100 requests per unit of time. Thus, such a generator is an example of a timed stochastic generator, according to the classification introduced in Section 3 of this paper. In [22], the server side of web applications is considered and some models of web and database servers are presented. A separate model of a generator has not been applied there; instead, the server models contain a very simple generator consisting of a single place–transition pair. Such a trivial generator could be considered an untimed deterministic one, according to Section 3. Petri nets with different extensions have been successfully used to analyze different types of systems, with web applications being one of them [23]. Rak et al. [24] presented a programming tool based on CPN that supports modeling and performance evaluation of system architectures for distributed web system environments. The studies [23][24] have generally used CPN Tools as a Petri net modeling tool with good results. At the top level of the timed stochastic model description, [23] defines the arrival process of the queueing network (two places). The timer place and transition constitute a clock-like structure that produces requests at random, exponentially distributed intervals.
These tokens are accumulated in the form of a timed multiset in a place and then forwarded into the queueing-based model of the web system. Tokens generated by the arrival process are transferred in sequence through the models of the web system layers. Each token is equipped with an attached data value, the token color. A similar simple model was prepared using QPN. Client think time is modeled in [25] by a queueing place with the Infinite Server scheduling strategy. The number of clients is configured by the initial marking of this place. A token in the queueing place represents a client’s requests and generates a certain number of requests per second.
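The timed stochastic generator described above can be illustrated outside of any Petri net tool. The following is a minimal sketch, not the model from [21] or [23]: it assumes that the arrival process is fully characterized by a rate of 100 requests per unit of model time, and the names `exponential_arrivals`, `rate`, and `count` are hypothetical.

```python
import random


def exponential_arrivals(rate, count, seed=None):
    """Yield `count` arrival timestamps with exponential inter-arrival times.

    `rate` is the assumed mean number of requests per unit of model time.
    """
    rng = random.Random(seed)
    clock = 0.0
    for _ in range(count):
        # Each inter-arrival delay is exponentially distributed, mean 1/rate.
        clock += rng.expovariate(rate)
        yield clock


# Assumption: an intensity of 100 requests per time unit, as read from the text.
arrivals = list(exponential_arrivals(rate=100.0, count=10_000, seed=1))
mean_gap = arrivals[-1] / len(arrivals)
print(f"generated {len(arrivals)} requests, mean inter-arrival time {mean_gap:.4f}")
```

Each timestamp plays the role of a time stamp attached to a token; an untimed deterministic generator, by contrast, would emit tokens without any such delay.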
While there are many articles regarding workload [17][31], little is known about generators that produce the workload expected by designers. The researchers address this issue by designing a new request generator that identifies resources and users, so that it satisfies most system constructors. There are benefits to building specialized web system models in which the workload consists of particular groups of users with similar behavior patterns, as opposed to using a single class for all users. The proposed workload models can be used in the construction of performance models in many research domains.

2.1. Early CPN Models of Generators

Different formalisms may be used to model a generator of discrete events. Initially, Coloured Petri Nets (CPN) [26] were the researchers' first choice, to facilitate integration with existing CPN models of web systems. Such early models of CPN request generators have been thoroughly analyzed in [3]. The previous results are briefly summarized here to make this research self-contained.
The basic CPN model of the generator is shown in Figure 1. The places Res and Users model the available resources and users, respectively. When the transition Generator fires, one of the users and one of the resources are randomly chosen, and the resulting token with the appropriate marking is created in the place Stream.
Figure 1. Simple CPN model of the workload generator.
Unfortunately, such a model is too simple to be realistic. It assumes that each user can freely choose each resource, which is rarely the case. Let us consider a simple restriction: User1 may use both resources, whereas User2 is allowed to use only resource 2. Such a limitation may be imposed in different ways in the model, e.g., in the guard of the transition Generator or in the inscription of the arc connecting the Generator transition with the Stream place. The first case is shown in Figure 2a, whereas the second one is given in Figure 2b.
Figure 2. Restrictions in CPN models.
Both models prevent User2 from using resource 1. However, the simulation results for them are completely different and may appear a bit counter-intuitive at first glance. Let us assume that the transition Generator has been fired 100,000 times in each model. Example results are shown in Figure 3.
Figure 3. Simulation result for both models.
It can be seen that the distribution of the tokens in the Stream place, representing the mapping between users and resources in the generated stream, is different. In the first model, approximately half of the resulting tokens are related to the User1 and resource 1 case, twenty-five percent of the tokens represent the User1 and resource 2 mapping, and finally, twenty-five percent connect User2 with resource 2. The total number of tokens in the Stream place is 100,000 because each firing of the Generator transition produces one token.
In the second model, each mapping (User1–resource 1, User1–resource 2, User2–resource 2) has approximately equal representation in the resulting stream. However, the total number of tokens in the Stream place does not sum to 100,000. The transition Generator has been fired 100,000 times, but if the restricted binding (User2 and resource 1) is randomly chosen, the inscription on the arc does not produce any token.
A detailed analysis of the causes of this phenomenon can be found in [3], so only brief conclusions are presented here. Such behavior is related to the exact execution semantics defined by CPN. In the first model, the restriction in the transition guard is applied during the binding of the tokens to the variables, before the transition fires. Thus, in approximately half of the cases, User1 and resource 1 are chosen, whereas resource 2 (for any user) is chosen in the other half. In the second model, the binding for firing the transition is randomly chosen without any restriction, and after the transition fires, a token with the restricted binding (User2, resource 1) is dropped. Unfortunately, this means that modeling generators in the CPN formalism is not as trivial as it may seem, even for the basic case with two users and two resources, if precise control over the distribution of the mappings between users and resources is expected. It is necessary to ensure that the generated stream meets the designer's requirements, which would be troublesome for a more sophisticated case with numerous users, resources, and complex restrictions. Therefore, the researchers feel that more flexible models should be created to simplify future work. The restrictions can be applied to CPN models in different ways (Figure 2), and the resulting behavior differs (Figure 3) in a manner that may be counter-intuitive for an inexperienced engineer; one of the main reasons for the method proposed in the following sections is therefore to facilitate the creation of complex models.
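The difference between the two forms of restriction can be mimicked with a short simulation. The sketch below is only an approximation written for illustration: it does not reproduce the binding algorithm of CPN Tools. It assumes that in the guard variant a resource is picked first and a user is then picked among those permitted for it (an ordering chosen because it matches the split reported above), and that in the arc variant a forbidden pair simply yields no token. The function names are hypothetical.

```python
import random
from collections import Counter

USERS = ["User1", "User2"]
RESOURCES = [1, 2]


def allowed(user, resource):
    # Restriction from the text: User2 may not use resource 1.
    return not (user == "User2" and resource == 1)


def guard_model(firings, rng):
    """Restriction applied before firing: every firing yields one token."""
    stream = Counter()
    for _ in range(firings):
        resource = rng.choice(RESOURCES)  # assumed: resource is bound first
        user = rng.choice([u for u in USERS if allowed(u, resource)])
        stream[(user, resource)] += 1
    return stream


def arc_model(firings, rng):
    """Restriction applied after firing: forbidden pairs produce no token."""
    stream = Counter()
    for _ in range(firings):
        user, resource = rng.choice(USERS), rng.choice(RESOURCES)
        if allowed(user, resource):
            stream[(user, resource)] += 1
    return stream


rng = random.Random(42)
guard = guard_model(100_000, rng)
arc = arc_model(100_000, rng)
for name, stream in (("guard", guard), ("arc", arc)):
    print(name, dict(sorted(stream.items())), "total:", sum(stream.values()))
```

Under these assumptions, the guard variant keeps all 100,000 tokens with a skewed distribution, while the arc variant yields three roughly equal groups and loses about a quarter of the firings.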

2.2. Early QPN Models of Generators

In the next step, the researchers modeled a generator of requests based on Queueing Petri Nets (QPN) [32]. The study results show that appropriately adjusted queueing Petri net models can help produce the expected streams of tokens. The general mathematical generator model of QPN was defined in [2]. The second model (Figure 4a) allows one to freely join the user tokens with the resource tokens to obtain any distribution (populations with the expected distribution of the output stream). Transition modes and firing weights (Figure 4b) can be used to model the probability of choosing the appropriate binding of the transition. The main logic lies in the firing of transitions with weights and the handling of the token colors.
Figure 4. QPN model.
The generator model is represented by a set of places (RESOURCES and STREAM), a queueing place (USERS), and the immediate transition GENERATOR. Requests and resources are modeled by tokens of different colors. Place USERS generates tokens of requests as user1 and user2. RESOURCES generates tokens of resources such as resource1 and resource2. After acquiring a resource from the RESOURCES place, the requests are placed into the STREAM place. The STREAM place consists of user1resource1, user1resource2, user2resource1 and user2resource2 tokens. Place USERS has an infinite server queue with an initial population of tokens. Firing the GENERATOR transition creates a token in the STREAM place and assigns one of the resources to one of the users. STREAM is an ordinary place that contains the resulting stream of the mean token population. The restriction is modeled by setting the appropriate firing weight.
A detailed analysis of this approach can be found in [2], so only brief conclusions are presented here. The QPME tool [30] generates a report showing the predicted population for the individual model configuration. The mean token population is appropriately distributed among the tokens (user1resource1, user1resource2, user2resource1, user2resource2). Some examples can be found in [2]. The quantities of the generated tokens in the STREAM place for each mode are consistent with the appropriate firing weights. Therefore, various distributions of the tokens in the stream can easily be modeled by appropriate changes in the firing weights.
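The role of firing weights can be illustrated with a weighted random choice among transition modes. This is a hedged sketch rather than a QPME model: the weight values are hypothetical, the use of a zero weight to express the restriction is an assumption, and the code counts firings per mode instead of computing the mean token population that QPME reports.

```python
import random
from collections import Counter

# Hypothetical firing weights for the four modes of the GENERATOR transition.
FIRING_WEIGHTS = {
    "user1resource1": 2.0,
    "user1resource2": 1.0,
    "user2resource1": 0.0,  # assumed: weight 0 expresses the restriction
    "user2resource2": 1.0,
}


def fire_generator(firings, weights, seed=None):
    """Choose a mode for each firing with probability proportional to its weight."""
    rng = random.Random(seed)
    modes = list(weights)
    chosen = rng.choices(modes, weights=[weights[m] for m in modes], k=firings)
    return Counter(chosen)


stream = fire_generator(100_000, FIRING_WEIGHTS, seed=7)
total = sum(stream.values())
for mode in FIRING_WEIGHTS:
    print(mode, round(stream[mode] / total, 3))
```

Changing the distribution of the stream only requires editing the weight table, which reflects the configurability noted above.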

References

  1. Seshadri, K.; Pavana, C.; Sindhu, K.; Kollengode, C. Unsupervised Modeling of Workloads as an Enabler for Supervised Ensemble-based Prediction of Resource Demands on a Cloud; Verma, P., Charan, C., Fernando, X., Ganesan, S., Eds.; Advances in Data Computing, Communication and Security; Springer: Singapore, 2022; pp. 109–120.
  2. Rak, T.; Rzonca, D. Recommendations for Using QPN Formalism for Preparation of Incoming Request Stream Generator in Modeled System. Appl. Sci. 2021, 11, 11532.
  3. Rzonca, D.; Rzasa, W.; Samolej, S. Consequences of the Form of Restrictions in Coloured Petri Net Models for Behaviour of Arrival Stream Generator Used in Performance Evaluation; Gaj, P., Sawicki, M., Suchacka, G., Kwiecień, A., Eds.; Computer Networks; Springer International Publishing: Cham, Switzerland, 2018; pp. 300–310.
  4. Abad, C.L.; Yuan, M.; Cai, C.X.; Lu, Y.; Roberts, N.; Campbell, R.H. Generating request streams on Big Data using clustered renewal processes. Perform. Eval. 2013, 70, 704–719.
  5. Rak, T.; Żyła, R. Using Data Mining Techniques for Detecting Dependencies in the Outcoming Data of a Web-Based System. Appl. Sci. 2022, 12, 6115.
  6. Gonçalves, G.D.; Drago, I.; Vieira, A.B.; Couto da Silva, A.P.; Almeida, J.M.; Mellia, M. Workload models and performance evaluation of cloud storage services. Comput. Netw. 2016, 109, 183–199.
  7. St-Onge, C.; Benmakrelouf, S.; Kara, N.; Tout, H.; Edstrom, C.; Rabipour, R. Generic SDE and GA-Based Workload Modeling for Cloud Systems. J. Cloud Comput. 2021, 10, 6.
  8. Rak, T. Modeling Web Client and System Behavior. Information 2020, 11, 337.
  9. An, C.; Zhou, J.t.; Mou, Z. A Generic Arrival Process Model for Generating Hybrid Cloud Workload; Sun, Y., Lu, T., Xie, X., Gao, L., Fan, H., Eds.; Computer Supported Cooperative Work and Social Computing; Springer: Singapore, 2019; pp. 100–114.
  10. Sun, J.; Zhao, H.; Mu, S.; Li, Z. Purchasing Behavior Analysis Based on Customer’s Data Portrait Model. In Proceedings of the 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, WI, USA, 15–19 July 2019; Volume 1, pp. 352–357.
  11. Liu, S.; Wang, J.; Wang, H.; Wang, H.; Liu, Y. WRT: Constructing Users’ Web Request Trees from HTTP Header Logs. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–7.
  12. Magalhães, D.; Calheiros, R.N.; Buyya, R.; Gomes, D.G. Workload Modeling for Resource Usage Analysis and Simulation in Cloud Computing. Comput. Electr. Eng. 2015, 47, 69–81.
  13. Daradkeh, T.; Agarwal, A.; Zaman, M.; S, R.M. Analytical Modeling and Prediction of Cloud Workload. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal, QC, Canada, 14–23 June 2021; pp. 1–6.
  14. An, C.; Zhou, J.t. Resource Demand Forecasting Approach Based on Generic Cloud Workload Model. In Proceedings of the 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Guangzhou, China, 8–12 October 2018; pp. 554–563.
  15. Rizothanasis, G.; Carlsson, N.; Mahanti, A. Identifying User Actions from HTTP(S) Traffic. In Proceedings of the 2016 IEEE 41st Conference on Local Computer Networks (LCN), Dubai, United Arab Emirates, 7–10 November 2016; pp. 555–558.
  16. Ajwani, D.; Ali, S.; Katrinis, K.; Li, C.H.; Park, A.J.; Morrison, J.P.; Schenfeld, E. A Flexible Workload Generator for Simulating Stream Computing Systems. In Proceedings of the 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, Singapore, 25–27 July 2011; pp. 409–417.
  17. Bikmukhamedov, R.F.; Nadeev, A.F. Multi-Class Network Traffic Generators and Classifiers Based on Neural Networks. In Proceedings of the 2021 Systems of Signals Generating and Processing in the Field of on Board Communications, Moscow, Russia, 16–18 March 2021; pp. 1–7.
  18. Guarnieri, T.; Drago, I.; Cunha, Í.; Almeida, B.; Almeida, J.M.; Vieira, A.B. Modeling large-scale live video streaming client behavior. Multimed. Syst. 2021, 27, 1101–1124.
  19. Curiel, M.; Pont, A. Workload Generators for Web-Based Systems: Characteristics, Current Status, and Challenges. IEEE Commun. Surv. Tutorials 2018, 20, 1526–1546.
  20. Braga, V.G.; Correa, S.L.; Cardoso, K.V.; Viana, A.C. Data-Driven Characterization and Modeling of Web Map System Workload. IEEE Access 2021, 9, 26983–27002.
  21. Gozhyj, A.; Kalinina, I.; Gozhyj, V.; Vysotska, V. Web Service Interaction Modeling with Colored Petri Nets. In Proceedings of the 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Metz, France, 18–21 September 2019; Volume 1, pp. 319–323.
  22. Gaur, N.; Joshi, P.; Jain, V.; Srivastava, R. Coloured Petri Nets Model for Web Architectures of Web and Database Servers. Int. J. Comput. Inf. Eng. 2015, 9, 2066–2075.
  23. Rak, T.; Samolej, S. Distributed Internet Systems Modeling Using TCPNs. In Proceedings of the International Multiconference on Computer Science and Information Technology, Wisla, Poland, 20–22 October 2008; pp. 515–522.
  24. Samolej, S.; Rak, T. Simulation and Performance Analysis of Distributed Internet Systems Using TCPNs. Inform.-J. Comput. Inform. 2009, 33, 405–415.
  25. Rak, T. Response Time Analysis of Distributed Web Systems Using QPNs. Math. Probl. Eng. 2015, 2015, 490835.
  26. Jensen, K.; Kristensen, L.M. Coloured Petri Nets: Modelling and Validation of Concurrent Systems; Springer: Berlin/Heidelberg, Germany, 2009.
  27. Rezig, S.; Achour, Z.; Rezg, N.; Kammoun, M.A. Supervisory control based on minimal cuts and Petri net sub-controllers coordination. Int. J. Syst. Sci. 2016, 47, 3425–3435.
  28. Rezig, S.; Rezg, N.; Hajej, Z. Online Activation and Deactivation of a Petri Net Supervisor. Symmetry 2021, 13, 2218.
  29. Bause, F. Queueing Petri Nets-A formalism for the combined qualitative and quantitative analysis of systems. In Proceedings of the 5th International Workshop on Petri Nets and Performance Models, Toulouse, France, 19–22 October 1993; pp. 14–23.
  30. Kounev, S.; Lange, K.D.; von Kistowski, J. Systems Benchmarking: For Scientists and Engineers; Springer: Berlin/Heidelberg, Germany, 2020.
  31. Patil, A.G.; Surve, A.R.; Gupta, A.K.; Sharma, A.; Anmulwar, S. Survey of synthetic traffic generators. In Proceedings of the 2016 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 26–27 August 2016; Volume 1, pp. 1–3.
  32. Rak, T. Performance Analysis of Distributed Internet System Models using QPN Simulation. In Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS), Warsaw, Poland, 7–10 September 2014; Volume 2, pp. 769–774.