Supervised Log Anomaly with Probabilistic Polynomial Approximation

Version	Summary	Created by	Modification	Content Size	Created at	Operation
1		Devharsh Trivedi	--	1916	2023-10-20 23:34:44	\|
2	format correct	Catherine Yang	Meta information modification	1916	2023-10-23 03:30:55	\|

Audit and Security log collection and storage are essential for organizations worldwide to recognize security breaches and are required by law. Logs often contain sensitive information about an organization or its customers. Fully Homomorphic Encryption (FHE) allows calculations on encrypted data, thus very useful for privacy-preserving tasks such as log anomaly detection. While word-wise FHE schemes can perform additions and multiplications, complex functions such as Sigmoid need to be approximated. Probabilistic polynomial approximations using a Perceptron can achieve lower errors compared to deterministic approaches like Taylor and Chebyshev.

sigmoid function approximation private machine learning fully homomorphic encryption log anomaly detection supervised machine learning probabilistic polynomial approximation

1. Anomaly Detection

Information security tools like the Intrusion Detection System (IDS), Intrusion Prevention System (IPS), and Security Information and Event Management (SIEM) are designed to help organizations defend against cyberattacks. A Security Operations Center (SOC) uses these security tools to analyze logs collected from endpoints, such as computers, servers, and mobile devices. The logs collected from endpoints are typically unstructured textual data. The logs can contain information about system events, user activity, and security incidents. These data can be challenging to analyze manually. SIEM tools can help automate the analysis of these logs and identify potential threats. SIEM tools collect logs from various sources, known as Security Analytics Sources (SASs). An SAS can be a mobile or stationary host or an information and data security tool such as an IDS. SIEM tools use data to monitor for security threats in near-real time. If a threat is detected, the SIEM tool can generate an alert and take appropriate action, such as blocking traffic or isolating an infected system. The SOC uses this information to identify anomalies and potential threats. The SOC may generate an alert to notify the appropriate personnel (SOC analyst/administrator) if an anomaly is detected.

Enterprises frequently employ a third-party cloud vendor for the SOC. Third-party cloud services lessen the complexity and deliver flexibility for organizations. Nonetheless, Cloud Service Consumers (CSCs) must commission their data---and their customers' data---to Cloud Service Providers (CSPs), who are often incentivized to monetize these data. Meanwhile, ordinances such as the US Consumer Online Privacy Rights Act (COPRA)^[1], the US State of California Consumer Privacy Act (CCPA)^[2], and the EU General Data Protection Regulation (GDPR)^[3] strive to safeguard consumers' privacy. Non-compliant institutions are subjected to stringent fines and deteriorated reputations. This outcome is a trade-off between data utility and privacy.

Exporting log data to an SIEM deployed on a third-party CSP is perilous, as the CSP requires access to plaintext (unencrypted) log data for alert generation. Moreover, the CSP may have adequate incentives to accumulate user data. These data are stored in the CSP's servers and thus encounter diverse privacy and security threats like data leakage and the misuse of information^[4]^[5]^[6]^[7]^[8]^[9]. Thus, shielding these logs' privacy and confidentiality is crucial. Fully Homomorphic Encryption (FHE) permits CSCs to ensure privacy without sabotaging their ability to attain insights from their data.

Traditional cloud storage and computation approaches using contemporaneous cryptography mandate that customer data be decrypted before operating on them. Thus, security policies are deployed to avert unauthorized admission to decrypted data. CSCs must entrust the Access Control Policies (ACPs) incorporated by their CSPs for data privacy (Figure 1). With FHE, data privacy is accomplished by the CSC via cryptography, leveraging rigid mathematical proofs. Consequently, the CSP will not be admitted to unencrypted customer data for computation and storage without a valid Secret Key (SK).

Traditional cloud model (left) vs. FHE cloud model (right).

Figure 1. Traditional cloud model (left) vs. FHE cloud model (right).

FHE allows calculations to be performed on encrypted data without decrypting them first. The results of these computations are stored in an encrypted form. Still, when decrypted, they are equivalent to the results that would have been obtained if the computations had been performed on the unencrypted data. Plaintexts are unencrypted data, while ciphertexts are encrypted data. FHE can enable privacy-preserving storage and computation and process encrypted data in commercial cloud environments. It is a promising technology with a wide range of potential applications.

2. Related Work

For privacy-preserving log anomaly detection, one can use a hardware-based solution (e.g., a Trusted Execution Environment (TEE)) or a software-based approach (e.g., FHE)^[10]^[11]. SGX-Log^[12] and Custos^[13] achieved private log anomaly detection using a TEE with Intel Software Guard Extensions (SGX)^[14]. However, TEEs have limitations as to how much data can be stored. For example, Intel SGX has a limit of 128 MB. Hence, bit-wise FHE schemes like Fast Fully Homomorphic Encryption over the Torus (TFHE)^[15] or word-wise FHE schemes like Brakerski-Fan-Vercauteren (BFV)^[16]^[17] and Cheon-Kim-Kim-Song (CKKS)^[18] are better for more significant amounts of data. Concrete-ML from Zama^[19]^[20] used TFHE, which is efficient for smaller-scale arithmetic. Still, it is inefficient for more significant arithmetic operations (while amortized performance in CKKS can be improved with batching). For word-wise FHE schemes, one can employ BFV for integers and CKKS for approximate arithmetic. Hence, for Machine Learning (ML) tasks, CKKS is a better choice.

Boudguiga et al.^[21] used BFV for Support Vector Machines (SVM) with a linear kernel. They experimentally calculated the best scaling factor value to convert floats to integers for better accuracy, which is not required in CKKS. (SigML^[22] and SigML++^[23] used CKKS for Logistic Regression (LR) and SVM.) Boudguiga et al.^[21] examined the feasibility of using FHE to furnish a privacy-preserving log management architecture. They utilized SVM with a linear kernel to assess the FHE classification of Intrusion Detection System (IDS) alerts from the NSL-KDD dataset^[24]^[25]. In their scheme, they encrypted the input data from an SAS using the BFV scheme and performed FHE calculations on the encrypted data using the SIEM weights in plaintext. The encrypted results for each log entry were then sent back to the SAS for decryption. However, this approach could be vulnerable to inference attacks by malicious SAS, such as attribute inference, membership inference, and model inversion attacks. ("Aggregate'' configuration presented in SigML^[22] and SigML++^[23] helps prevent most of these attacks, as it only sends a total anomaly score (sum) per block instead of predictions or labels per input, thus minimizing the data inferred by the attacker.)

SigML^[22] compares three approximations (using deterministic methods such as Taylor^[26] and Chebyshev^[27]) of the sigmoid function: σ¹(x), σ³(x), and σ⁵(x). These approximations are used for a Logistic Regression (LR) and Support Vector Machine (SVM) model. The authors observed that the LR and SVM models trained from scikit-learn^[28] did not perform well with the sigmoid activation for the "Aggregate'' configuration. Therefore, they designed Sigmoid-LR (σ_LR) to improve performance. Sigmoid-LR uses a kernel A = X W + b to reduce the errors of σ(A) with the learning rate r_learn and the number of iterations r_iter. The inputs and labels are X, Y ∈ [0,1]. SigML++^[23] improves the results of SigML^[22] with LR and SVM models using a novel probabilistic ANN approximation. SigML++^[23] also evaluates third-order polynomials in the intervals [-10,10] and [-50,50].

3. Probabilistic Polynomial

While Artificial Neural Networks (ANNs) are known for their universal function approximation properties, they are often treated as black boxes and used to calculate the output value. SigML++^[23] (and Trivedi^[29]) proposes the use of a basic three-layer Perceptron (Figure 2) consisting of an input layer, a hidden layer, and an output layer, with both hidden and output layers having linear activations to generate the coefficients for an approximation polynomial of a given order. In this architecture, the input layer is dynamic, with the input nodes corresponding to the desired polynomial degrees. While having a variable number of hidden layers is possible, the authors fix it at a single layer with a single node to minimize the computation. They show coefficient calculations for a third-order polynomial (d=3); a univariate function f(x) = y; and an input x, actual output y, and predicted output y_out. The input layer weights are {w₁, w₂, ..., w_d} = {w₁, w₂, w₃} = {x, x², x³}, and the biases are {b₁, b₂, b₃} = b_h. Thus, the output of the hidden layer is y_h = w₁ x + w₂ x² + w₃ x³ + b_h.

Polynomial approximation using ANN.

Figure 2. Polynomial approximation using ANN.^[23]^[29]

The predicted output is calculated by
y_out= w_out⋅y_h+ b_out= w₁⋅w_out⋅x + w₂⋅w_out⋅x² + w₃⋅w_out⋅x³ + (b_h⋅w_out + b_out)

The layer weights {w₁ w_out, w₂ w_out, w₃ w_out} are the coefficients for the approximating polynomial of order three, and the constant term is b_h w_out + b_out.

Since the predicted output (y_out) is probabilistic, it must be fine-tuned with hyperparameter tuning, as incorrect results lead to erroneous (inefficient) approximations.

Trivedi^[29] provided third and seventh-degree polynomial approximations for univariate Sign(x) ∈ {-1,0,1} and Compare(a-b) ∈ {0,1/2,1} functions in the intervals [-1,1] and [-5,5]. Trivedi^[29] empirically showed that their novel ANN polynomials improved up to 15% accuracy (regarding losses) over Chebyshev's.

4. Sigmoid Approximation

Barring message expansion and noise growth, applying the sigmoid activation function is a substantial challenge in implementing ML with FHE. The sigmoid function is used in LR and SVM during classification, so the authors in SigML^[22] and SigML++^[23] determined to make it homomorphic. They further describe techniques to approximate this activation function with a polynomial for word-wise FHE and compare various polynomial approximations in terms of accuracy, precision, recall, F1-score, and the Σ-ratio of the predicted sum from sigmoid values to the sum of all actual binary labels for the test dataset. They denote M^d_i, where M is an approximation method like Taylor (T)^[26], Chebyshev (C)^[27], Remez (R)^[30], or ANN (A); d is the degree; and i is the interval [-i, i] of the polynomial. They approximate the class C[a,b] of continuous functions on the interval [a,b] by order-n polynomials in P_n using the L^∞-norm to measure the fit. This is called minimax polynomial approximation since the best (or minimax) approximation solves

\begin{equation}
p^*_n = \arg ~ \min_{p_n \in \mathcal{P}_n} ~ \max_{a \leq x \leq b} ~ |f(x) - p_n(x)|
\end{equation}

A minimax approximation is a technique to discover the polynomial p as shown in the above equation, i.e., the Remez algorithm^[30] is an iterative minimax approximation and outputs the following results^[31] for the interval [-5, 5] and order three:
\begin{equation}
\mathbf{R^3_5(x)} = 0.5 + 0.197x - 0.004x^3
\end{equation}

The Taylor series (around point 0) of degree three is given by
\begin{equation}
\mathbf{T^3(x)} = 0.5 + 0.25x - 0.0208333x^3
\end{equation}

The Chebyshev series of degree three for the interval [-10,10] is
\[0.5 + 0.139787x + (3.03084e-26)x^2 - 0.00100377x^3\]

Omit the term for x² to obtain
\begin{equation}
\mathbf{C^3_{10}(x)} = 0.5 + 0.139787x - 0.00100377x^3
\end{equation}

Similarly, obtain the Chebyshev series of degree three for the interval [-50,50]
\begin{equation}
\mathbf{C^3_{50}(x)} = 0.5 + 0.0293015x - (8.65914e-6)x^3
\end{equation}

Derive the ANN polynomials of degree three for [-10,10]
\begin{equation}
\mathbf{A^3_{10}(x)} = 0.49961343 + 0.12675145x - 0.00087002286x^3
\end{equation}
and for the interval [-50,50]
\begin{equation}
\mathbf{A^3_{50}(x)} = 0.49714848 + 0.026882438x - (7.728304e-06)x^3
\end{equation}

Table 1. Polynomial approximation losses for the intervals [-10, 10] and [-50, 50].

Interval	Method	MAE	MSLE	Huber	Hinge	Logcosh
[-10, 10]	C³₁₀	0.0793	0.0020	0.0039	0.5593	0.0039
[-10, 10]	A³₁₀	0.0691	0.0024	0.0031	0.5646	0.0031
[-50, 50]	C³₅₀	0.1363	0.0115	0.0138	0.5475	0.0136
[-50, 50]	A³₅₀	0.1255	0.0124	0.0132	0.5534	0.0131

SigML++^[23] compared the Chebyshev and ANN approximations for the sigmoid functions, as shown in Table 1. They calculated the Mean Absolute Error (MAE); Mean Squared Log Error (MSLE); and Huber, Hinge, and Logcosh losses^[32]^[33] for the Chebyshev polynomials described in C³₁₀, C³_50, and ANN polynomials from A³₁₀, A³₅₀.

ANN losses relative to Chebyshev losses for the intervals [-10, 10] and [-50, 50].

Figure 3. ANN losses relative to Chebyshev losses for the intervals [-10, 10] and [-50, 50].

A³₁₀ recorded an MAE loss of 0.0691 compared to 0.0793 for C³₁₀. The lower losses (closer to 0) reflect fewer errors, showing that a better approximation was achieved using the ANN approach. Comparing loss ratios of ANN over Chebyshev yields 0.0691/0.0793=0.8712, SigML++^[23] observed a ~14% improvement (Figure 3).

References

S.3195 - Consumer Online Privacy Rights Act . Congress.GOV. Retrieved 2023-10-20
TITLE 1.81.5. California Consumer Privacy Act of 2018 . leginfo.legislature.ca.gov. Retrieved 2023-10-20
Directive 95/46/EC (General Data Protection Regulation) . eur-lex.europa.eu. Retrieved 2023-10-20
Zakir Durumeric; Zane Ma; Drew Springall; Richard Barnes; Nick Sullivan; Elie Bursztein; Michael Bailey; J. Alex Halderman; Vern Paxson. The Security Impact of HTTPS Interception; Internet Society: Reston, VA, United States, 2017; pp. 1-14.
Kaspersky’s approach toward data processing . usa.kaspersky.com. Retrieved 2023-10-20
Israel hacked Kaspersky, then tipped the NSA that its tools had been breached . www.washingtonpost.com. Retrieved 2023-10-20
How Israel Caught Russian Hackers Scouring the World for U.S. Secrets . www.nytimes.com. Retrieved 2023-10-20
AVG can sell your browsing and search history to advertisers . www.wired.co.uk. Retrieved 2023-10-20
Is Your Antivirus Software Spying on You? . restoreprivacy.com. Retrieved 2023-10-20
Decoding 5 Crypto Acronyms: MPC, ZK, FHE, TEE And HSM . www.forbes.com. Retrieved 2023-10-20
MPC, FHE, DP, ZKP, TEE and where Partisia Blockchain fits in . medium.com. Retrieved 2023-10-20
Karande, V.M., Bauman, E., Lin, Z., & Khan, L. (2017). SGX-Log: Securing System Logs With SGX. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security.
Paccagnella, R., Datta, P., Hassan, W.U., Bates, A., Fletcher, C.W., Miller, A.K., & Tian, D.J. (2020). Custos: Practical Tamper-Evident Auditing of Operating Systems Using Trusted Execution. Network and Distributed System Security Symposium.
What Is Intel® SGX? . www.intel.com. Retrieved 2023-10-20
Ilaria Chillotti; Nicolas Gama; Mariya Georgieva; Malika Izabachène. Faster Fully Homomorphic Encryption: Bootstrapping in Less Than 0.1 Seconds; Springer Science and Business Media LLC: Dordrecht, GX, Netherlands, 2016; pp. 3-33.
Zvika Brakerski. Fully Homomorphic Encryption without Modulus Switching from Classical GapSVP; Springer Science and Business Media LLC: Dordrecht, GX, Netherlands, 2012; pp. 868-886.
Fan, J., & Vercauteren, F. (2012). Somewhat Practical Fully Homomorphic Encryption. IACR Cryptol. ePrint Arch., 2012, 144.
Cheon, J.H., Kim, A., Kim, M., & Song, Y. (2017). Homomorphic Encryption for Arithmetic of Approximate Numbers. International Conference on the Theory and Application of Cryptology and Information Security.
Fréry, J., Stoian, A., Bredehoft, R., Montero, L., Kherfallah, C., Chevallier-Mames, B., & Meyre, A. (2023). Privacy-Preserving Tree-Based Inference with Fully Homomorphic Encryption. ArXiv, abs/2303.01254.
What is Concrete ML? . docs.zama.ai. Retrieved 2023-10-20
Boudguiga, A., Stan, O., Sedjelmaci, H., & Carpov, S. (2020). Homomorphic Encryption at Work for Private Analysis of Security Logs. International Conference on Information Systems Security and Privacy.
Trivedi, D., Boudguiga, A., Triandopoulos, N. (2023). SigML: Supervised Log Anomaly with Fully Homomorphic Encryption. In: Dolev, S., Gudes, E., Paillier, P. (eds) Cyber Security, Cryptology, and Machine Learning. CSCML 2023. Lecture Notes in Computer Science, vol 13914. Springer, Cham. https://doi.org/10.1007/978-3-031-34671-2_26
Trivedi D, Boudguiga A, Kaaniche N, Triandopoulos N. SigML++: Supervised Log Anomaly with Probabilistic Polynomial Approximation. Cryptography. 2023; 7(4):52. https://doi.org/10.3390/cryptography7040052
M. Tavallaee, E. Bagheri, W. Lu, and A. Ghorbani, “A Detailed Analysis of the KDD CUP 99 Data Set,” Submitted to Second IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), 2009.
NSL-KDD dataset . www.unb.ca. Retrieved 2023-10-20
George Arfken; Jon Mathews; Mathematical Methods for Physicists. Am. J. Phys. 1972, 40, 642-642.
Frederick N. Fritsch; William T. Vetterling; Saul A. Teukolsky; William H. Press; Brian P. Flannery; Numerical Recipes Example Book (FORTRAN).. Math. Comput. 1988, 50, 348.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Louppe, G., Prettenhofer, P., Weiss, R., Weiss, R.J., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python. ArXiv, abs/1201.0490.
Trivedi, D. (2023). Brief Announcement: Efficient Probabilistic Approximations for Sign and Compare. In: Dolev, S., Schieber, B. (eds) Stabilization, Safety, and Security of Distributed Systems. SSS 2023. Lecture Notes in Computer Science, vol 14310. Springer, Cham. https://doi.org/10.1007/978-3-031-44274-2_21
Remez, E. Y. (1934). Sur le calcul effectif des polynomes d’approximation de Tschebyscheff. CR Acad. Sci. Paris, 199(2), 337-340.
Chen, H., Gilad-Bachrach, R., Han, K., Huang, Z., Jalali, A., Laine, K., & Lauter, K.E. (2018). Logistic regression over encrypted data from fully homomorphic encryption. BMC Medical Genomics, 11.
Module: tf.keras.losses . www.tensorflow.org. Retrieved 2023-10-20
sklearn.metrics: Metrics . scikit-learn.org. Retrieved 2023-10-20

©Text is available under the terms and conditions of the Creative Commons-Attribution ShareAlike (CC BY-SA) license; additional terms may apply. By using this site, you agree to the Terms and Conditions and Privacy Policy.

Upload a video for this entry

Information

Subjects: Computer Science, Interdisciplinary Applications

Contributors MDPI registered users' name will be linked to their SciProfiles pages. To register with us, please refer to https://encyclopedia.pub/register :

Devharsh Trivedi

Aymen Boudguiga

Nesrine Kaaniche

Nikos Triandopoulos

View Times: 202

Update Date: 23 Oct 2023

Table of Contents

Video Upload Options

Confirm