Multi-Label Classification Based on Associations: Comparison

Associative classification (AC) has been shown to outperform other single-label classification methods for over 20 years. AC combines association rule mining with the classification task in order to produce rules that are both more accurate and easier to understand.

  • associative classification
  • classification
  • machine learning
  • multi-label classification

1. Introduction

In data mining, classification is a common task. The goal is to correctly predict the class label of unseen instances using the rules or functions learned from a labeled set, or training set [1][2]. Many researchers [3][4][5][6][7][8] have been attracted to classification in recent decades, and have used a wide variety of learning approaches and strategies, including decision trees, neural networks, fuzzy logic, Bayesian and statistical approaches, rule-set induction, and more, to create highly accurate classifiers [9]. In classification, there are three major categories [10]. In the first two categories, each data point must match exactly one of the predefined classes. The third category [11], on the other hand, enables multiple class labels to be assigned to a single dataset instance. The first, referred to as “binary classification”, has just two class labels, while the second, referred to as “multi-class classification”, contains more than two [12][13]. The more general multi-label classification (MLC) scheme [11][14] is the third. This study focuses on a particular classification strategy that employs single-label classification (SLC) to handle the multi-label problem.
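The difference between these three categories can be illustrated with toy target representations; the label names below are illustrative assumptions, not taken from any dataset discussed in this entry.

```python
# Illustrative targets for the three classification categories:
# binary, multi-class, and multi-label.

binary_targets = ["spam", "not-spam", "spam"]      # one of exactly two labels
multiclass_targets = ["cat", "dog", "bird"]        # one of more than two labels
multilabel_targets = [                             # one or more labels at once
    {"beach", "sunset"},
    {"urban"},
    {"beach", "people", "sunset"},
]

# Unlike the two single-label cases, multi-label label sets are not
# mutually exclusive: one instance can carry several labels simultaneously.
assert any(len(labels) > 1 for labels in multilabel_targets)
```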
Associative classification (AC) is one of the primary approaches that has been actively used to address the classification problem [15]. AC is a rule-set induction approach that uses the Association Rule Mining (ARM) task to solve the classification problem [1]. In general, the AC approach has several distinguishing features over other learning approaches, such as the highly accurate rules produced by AC algorithms, the simplicity of representing the learned rules in the “IF-THEN” format, and its applicability to a wide range of real-life classification problems, e.g., medical diagnosis, e-mail phishing, fraud detection, and software defects [16]. Most AC-based methods have only been applied to binary and multi-class classification problems [17]. In contrast, only a few efforts have been presented to apply AC to the broader form of classification termed MLC [16].

2. MLC

MLC is a general classification type with distinguishable features over conventional single-label classification (binary and multi-class classification) [18][19][20]. First, in MLC, an instance can be associated with more than one class label simultaneously, whereas single-label classification requires each instance to be associated with only one class label [21]. Second, because more than one class label can apply to the same instance simultaneously, the labels in MLC are not mutually exclusive, as they are in single-label classification [21]. Finally, the complexity of SLC is very low compared with that of MLC [22]. MLC has recently attracted the interest of numerous researchers due to its applicability to a wide variety of contemporary domains, including video and image annotation [23][24][25], classifying songs based on the emotions they evoke [26], prediction of gene functionality [27][28][29], protein functionality detection [30][31], drug discovery [32], mining social networks [33][34][35], direct marketing [36], and Web mining [37]. Two main strategies are used to address the MLC problem. The first strategy converts the input multi-label dataset into a single-label dataset or several single-label datasets; the transformed dataset(s) are then used to train a single-label classification algorithm [22]. This strategy is referred to as the problem transformation method (PTM). According to the literature [15], very few AC-based algorithms have been utilized as a base classifier in this method. The second strategy [6] extends a single-label classification algorithm to handle datasets with multiple labels. This strategy is known as the algorithm adaptation method (AAM). Several single-label classification algorithms, including C4.5 [37], k-nearest neighbor (KNN) [38], back propagation [39], AdaBoost [40], and naive Bayes (NB) [41], have been modified to address the MLC problem.
Unfortunately, according to the literature [15], no AC-based algorithm has been modified to address the MLC issue.
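The problem transformation idea described above can be sketched with a binary-relevance-style split, one common PTM variant: the multi-label dataset becomes one binary single-label dataset per label, and any single-label learner (an AC classifier, for instance) could then be trained on each. The function name and data layout here are illustrative assumptions, not from the entry.

```python
def binary_relevance_transform(instances, label_space):
    """Split a multi-label dataset into one binary single-label
    dataset per label (a common problem transformation method).

    instances  : list of (features, set_of_labels) pairs
    label_space: iterable of all possible labels
    """
    datasets = {}
    for label in label_space:
        # Each instance becomes (features, 1/0) for this particular label.
        datasets[label] = [
            (features, 1 if label in labels else 0)
            for features, labels in instances
        ]
    return datasets

# Tiny hypothetical multi-label dataset: two instances, label space {a, b}.
data = [({"x": 1}, {"a", "b"}), ({"x": 2}, {"b"})]
per_label = binary_relevance_transform(data, ["a", "b"])
# per_label["a"] is a single-label (binary) dataset for label "a".
```

Each resulting dataset is an ordinary binary classification problem, which is exactly what the single-label base classifiers mentioned above expect.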

3. Utilizing AC in MLC

According to previous studies, relatively few efforts to solve the MLC problem have used AC. Multi-class multi-label associative classification (MMAC) is among the first methods [42] that attempted to use AC in MLC. MMAC turns the original multi-label dataset into a single-label one by replicating each instance associated with more than one class label a number of times equal to the number of class labels it is associated with, with or without weighting. Hence, the dataset becomes a single-label dataset, but with more instances than the original one. After that, MMAC applies any single-label classifier, such as CBA or msCBA, to the newly transformed dataset. MMAC then generates its rules by combining the outcomes of single-label rules with the same antecedent, ending with multi-label rules. Unfortunately, MMAC has only been tested on single-label datasets, and it may be too complicated if the original dataset has many labels as well as a large number of instances [43]. A novel multi-label method based on AC is presented in [44]. The multi-label classifier based on associative classification (MCAC) developed a novel rule discovery approach that creates multi-label rules from a single-label dataset without the need for learning. These multi-label rules reflect important information that most earlier AC algorithms often disregard. The correlative lazy associative classifier (CLAC) method, described in [45], is a hybrid algorithm that combines the principles of AC and lazy learning. CLAC generates classification association rules (CARs) that are ranked according to their support and confidence values. Each class predicted by CLAC is immediately added as a new feature to predict a different class. In comparison to the BoosTexter method, CLAC performed well on three textual datasets. The authors of [46] presented an AC-based method similar to the MMAC algorithm.
In contrast to MMAC, the suggested method has been evaluated on one multi-label dataset (Scene), and it emphasizes the importance of adopting AC in addressing the MLC problem.
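The MMAC-style transformation described above (copying each instance once per class label it carries) can be sketched as follows; the function name and the 1/|labels| weighting scheme are illustrative assumptions, not the exact formulation in [42].

```python
def replicate_by_label(instances, weighted=False):
    """Copy each multi-label instance once per class label it carries,
    producing a single-label dataset (an MMAC-style transformation).

    instances: list of (features, set_of_labels) pairs
    weighted : if True, attach weight 1/|labels| to each copy
    """
    single_label = []
    for features, labels in instances:
        w = 1.0 / len(labels) if weighted else 1.0
        for label in labels:
            # One single-label copy per label of the original instance.
            single_label.append((features, label, w))
    return single_label

# Hypothetical dataset: the first instance carries two labels.
data = [({"x": 1}, {"a", "b"}), ({"x": 2}, {"c"})]
flat = replicate_by_label(data)
assert len(flat) == 3  # the 2-label instance was copied twice
```

This also makes the drawback noted above concrete: the transformed dataset grows with both the number of labels per instance and the number of instances.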

4. CBA and msCBA Algorithms

CBA, introduced in [47], is one of the earliest algorithms that merges the ARM and classification tasks. Since then, several more techniques based on the combination of ARM and classification have been presented. The MMAC algorithm [42] and the multi-class associative classification (MAC) algorithm [48] are examples of algorithms that adhere to the AC methodology. CBA applies the Apriori method to a classification dataset through three key phases. First, all continuous attributes are discretized. Discretization is the step of converting any continuous variable or attribute into a discrete one, and it is compulsory for any AC-based classifier. Then, CARs are generated. CARs are rules with an arbitrary combination of items in the antecedent (the left-hand side) and a single class in the consequent (the right-hand side). CARs are chosen using two metrics (support and confidence). The objective of the final phase is to construct a classifier using the best CARs [49]. CBA was subsequently enhanced in [50] by removing two flaws of the original CBA algorithm. The first flaw is the use of a single minsup (minimum support) threshold value, which may result in an unbalanced class distribution. The modified version addresses this problem by using several minsup thresholds. The second flaw of the original CBA is the exponential growth in the number of rules it generates. This problem was fixed by combining CBA with a decision tree approach, as in C4.5, resulting in more precise rules. The modified version of CBA is referred to as CBA2 or msCBA, which is short for multiple support classification based on associations. Algorithm 1 illustrates the original CBA algorithm. Although msCBA demonstrated higher performance in single-label classification compared to other classifiers from different learning strategies [16], it is incapable of handling multi-label datasets. The msCBA method assumes that each input instance has a single class label associated with it.
Hence, it generates single-label rules with a single class label as the rule’s consequent. When extending the msCBA method to accommodate multi-label datasets, this assumption should therefore be discarded. In addition, the msCBA method captures the global relationships between features (attributes) and class labels, despite the fact that local dependencies and associations outperform global ones [51][52].
Algorithm 1 CBA algorithm.
1: F_1 = {large 1-ruleitems};
2: CAR_1 = genRules(F_1);
3: prCAR_1 = pruneRules(CAR_1);
4: for (k = 2; F_{k-1} ≠ ∅; k++) do
5:   C_k = candidateGen(F_{k-1});
6:   for each data case d ∈ D do
7:     C_d = ruleSubset(C_k, d);
8:     for each candidate c ∈ C_d do
9:       c.condsupCount++;
10:      if d.class = c.class then
11:        c.rulesupCount++;
12:      end if
13:    end for
14:  end for
15:  F_k = {c ∈ C_k | c.rulesupCount ≥ minsup};
16:  CAR_k = genRules(F_k);
17:  prCAR_k = pruneRules(CAR_k);
18: end for
19: CARs = ∪_k CAR_k;
20: prCARs = ∪_k prCAR_k;
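The support/confidence filtering at the heart of CAR generation can be sketched in simplified form. The brute-force enumeration of small antecedents below is only a toy illustration of how rules <itemset> → <class> are filtered by minsup and minconf; it is not the Apriori-style candidate generation and pruning of Algorithm 1, and all names and thresholds are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def mine_cars(rows, classes, minsup=0.2, minconf=0.6, max_len=2):
    """Toy CAR miner: enumerate small antecedent itemsets and keep
    rules <itemset> -> <class> that pass minsup and minconf.

    rows   : list of sets of (attribute, value) items
    classes: list of class labels, aligned with rows
    """
    n = len(rows)
    cars = []
    items = sorted({item for row in rows for item in row})
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            ante = set(antecedent)
            # Classes of the instances covered by this antecedent.
            covered = [c for row, c in zip(rows, classes) if ante <= row]
            if len(covered) / n < minsup:
                continue  # antecedent itself is not frequent enough
            for cls, cnt in Counter(covered).items():
                support = cnt / n            # fraction of rows matching rule
                confidence = cnt / len(covered)  # accuracy on covered rows
                if support >= minsup and confidence >= minconf:
                    cars.append((antecedent, cls, support, confidence))
    return cars

# Hypothetical three-instance dataset with one attribute.
rows = [{("outlook", "sunny")}, {("outlook", "sunny")}, {("outlook", "rain")}]
classes = ["play", "play", "stay"]
cars = mine_cars(rows, classes)
```

The classifier-building phase would then rank these CARs (e.g., by confidence, then support) and keep the best covering subset.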

 

References

  1. Hadi, W.; Al-Radaideh, Q.A.; Alhawari, S. Integrating associative rule-based classification with Naïve Bayes for text classification. Appl. Soft Comput. 2018, 69, 344–356.
  2. Zeng, C.; Zhou, W.; Li, T.; Shwartz, L.; Grabarnik, G.Y. Knowledge guided hierarchical multi-label classification over ticket data. IEEE Trans. Netw. Serv. Manag. 2017, 14, 246–260.
  3. Huang, J.; Li, G.; Wang, S.; Xue, Z.; Huang, Q. Multi-label classification by exploiting local positive and negative pairwise label correlation. Neurocomputing 2017, 257, 164–174.
  4. Mohana, G.; Chitra, S. Design and development of an efficient hierarchical approach for multi-label protein function prediction. Biomed. Res. Health Sci. Bio Converg. Technol. Ed. II 2017, 370–379. Available online: https://www.semanticscholar.org/paper/Design-and-development-of-an-efficient-hierarchical-MohanaPrabha-Chitra/a8b4c905f2d083801b2a7b06356eed9ad49be797 (accessed on 11 February 2023).
  5. Sousa, R.; Gama, J. Multi-label classification from high-speed data streams with adaptive model rules and random rules. Prog. Artif. Intell. 2018, 7, 177–187.
  6. Xu, S.; Yang, X.; Yu, H.; Yu, D.J.; Yang, J.; Tsang, E.C. Multi-label learning with label-specific feature reduction. Knowl.-Based Syst. 2016, 104, 52–61.
  7. Gamallo, P.; Almatarneh, S. Naive-Bayesian Classification for Bot Detection in Twitter. In Proceedings of the CLEF, Lugano, Switzerland, 9–12 September 2019.
  8. Almatarneh, S.; Gamallo, P.; ALshargabi, B.; Al-Khassawneh, Y.; Alzubi, R. Comparing traditional machine learning methods for COVID-19 fake news. In Proceedings of the 2021 22nd International Arab Conference on Information Technology (ACIT), Muscat, Oman, 21–23 December 2021; IEEE: New York, NY, USA, 2021; pp. 1–4.
  9. Lin, Q.; Man, Z.; Cao, Y.; Wang, H. Automated Classification of Whole-Body SPECT Bone Scan Images with VGG-Based Deep Networks. Int. Arab. J. Inf. Technol. 2023, 20, 1–8.
  10. Alazaidah, R.; Thabtah, F.; Al-Radaideh, Q. A multi-label classification approach based on correlations among labels. Int. J. Adv. Comput. Sci. Appl. 2015, 6, 52–59.
  11. Gibaja, E.; Ventura, S. A tutorial on multilabel learning. ACM Comput. Surv. 2015, 47, 1–38.
  12. Suri, J.S.; Bhagawati, M.; Paul, S.; Protogerou, A.D.; Sfikakis, P.P.; Kitas, G.D.; Khanna, N.N.; Ruzsa, Z.; Sharma, A.M.; Saxena, S.; et al. A powerful paradigm for cardiovascular risk stratification using multiclass, multi-label, and ensemble-based machine learning paradigms: A narrative review. Diagnostics 2022, 12, 722.
  13. Hegazy, H.I.; Tag Eldien, A.S.; Tantawy, M.M.; Fouda, M.M.; TagElDien, H.A. Real-time locational detection of stealthy false data injection attack in smart grid: Using multivariate-based multi-label classification approach. Energies 2022, 15, 5312.
  14. El-Hasnony, I.M.; Elzeki, O.M.; Alshehri, A.; Salem, H. Multi-label active learning-based machine learning model for heart disease prediction. Sensors 2022, 22, 1184.
  15. Abdelhamid, N.; Jabbar, A.A.; Thabtah, F. Associative classification common research challenges. In Proceedings of the 2016 45th International Conference on Parallel Processing Workshops (ICPPW), Philadelphia, PA, USA, 16–19 August 2016; IEEE: New York, NY, USA, 2016; pp. 432–437.
  16. Abdelhamid, N.; Thabtah, F. Associative classification approaches: Review and comparison. J. Inf. Knowl. Manag. 2014, 13, 1450027.
  17. Li, B.; Li, H.; Wu, M.; Li, P. Multi-label Classification based on Association Rules with Application to Scene Classification. In Proceedings of the 2008 The 9th International Conference for Young Computer Scientists, Hunan, China, 18–21 November 2008; pp. 36–41.
  18. Alazaidah, R.; Ahmad, F.K.; Mohsen, M.F.M. A comparative analysis between the three main approaches that are being used to. Int. J. Soft Comput. 2017, 12, 218–223.
  19. Massidda, L.; Marrocu, M.; Manca, S. Non-intrusive load disaggregation by convolutional neural network and multilabel classification. Appl. Sci. 2020, 10, 1454.
  20. Wu, X.; Gao, Y.; Jiao, D. Multi-label classification based on random forest algorithm for non-intrusive load monitoring system. Processes 2019, 7, 337.
  21. Alluwaici, M.; Junoh, A.K.; Alazaidah, R. New problem transformation method based on the local positive pairwise dependencies among labels. J. Inf. Knowl. Manag. 2020, 19, 2040017.
  22. Alluwaici, M.; Junoh, A.K.; Ahmad, F.K.; Mohsen, M.F.M.; Alazaidah, R. Open research directions for multi label learning. In Proceedings of the 2018 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang Island, Malaysia, 28–29 April 2018; pp. 125–128.
  23. Dimou, A.; Tsoumakas, G.; Mezaris, V.; Kompatsiaris, I.; Vlahavas, I. An empirical study of multi-label learning methods for video annotation. In Proceedings of the 2009 Seventh International Workshop on Content-Based Multimedia Indexing, Crete, Greece, 3–5 June 2009; IEEE: New York, NY, USA, 2009; pp. 19–24.
  24. Peters, S.; Denoyer, L.; Gallinari, P. Iterative annotation of multi-relational social networks. In Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, 9–11 August 2010; IEEE: New York, NY, USA, 2010; pp. 96–103.
  25. Wang, J.; Neskovic, P.; Cooper, L.N. Improving nearest neighbor rule with a simple adaptive distance measure. Pattern Recognit. Lett. 2007, 28, 207–213.
  26. Trohidis, K.; Tsoumakas, G.; Kalliris, G.; Vlahavas, I.P. Multi-label classification of music into emotions. In Proceedings of the ISMIR, Philadelphia, PA, USA, 14–18 September 2008; Volume 8, pp. 325–330.
  27. Barutcuoglu, Z.; Schapire, R.E.; Troyanskaya, O.G. Hierarchical multi-label prediction of gene function. Bioinformatics 2006, 22, 830–836.
  28. Elisseeff, A.; Weston, J. A kernel method for multi-labelled classification. In Advances in Neural Information Processing Systems 14 (NIPS 2001); Dietterich, T., Becker, S., Ghahramani, Z., Eds.; The MIT Press: Cambridge, MA, USA, 2001; Volume 14.
  29. Skabar, A.; Wollersheim, D.; Whitfort, T. Multi-label classification of gene function using MLPs. In Proceedings of the 2006 IEEE International Joint Conference on Neural Network Proceedings, Vancouver, BC, Canada, 16–21 July 2006; IEEE: New York, NY, USA, 2006; pp. 2234–2240.
  30. Chan, A.; Freitas, A.A. A new ant colony algorithm for multi-label classification with applications in bioinfomatics. In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, Seattle, WA, USA, 8–12 July 2006; pp. 27–34.
  31. Diplaris, S.; Tsoumakas, G.; Mitkas, P.A.; Vlahavas, I. Protein classification with multiple algorithms. In Proceedings of the Advances in Informatics: 10th Panhellenic Conference on Informatics, PCI 2005, Volas, Greece, 11–13 November 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 448–456.
  32. Kawai, Y.; Fujii, Y.; Akimoto, K.; Takahashi, M. Evaluation of Serum Protein Binding by Using in Vitro Pharmacological Activity for the Effective Pharmacokinetics Profiling in Drug Discovery. Chem. Pharm. Bull. 2010, 58, 1051–1056.
  33. Krohn-Grimberghe, A.; Drumond, L.; Freudenthaler, C.; Schmidt-Thieme, L. Multi-relational matrix factorization using bayesian personalized ranking for social network data. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, Seattle, WA, USA, 8–12 February 2012; pp. 173–182.
  34. Tang, L.; Liu, H. Community Detection and Mining in Social Media; Morgan & Claypool Publishers: San Rafael, CA, USA, 2010.
  35. Soonsiripanichkul, B.; Murata, T. Domination dependency analysis of sales marketing based on multi-label classification using label ordering and cycle chain classification. In Proceedings of the 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 10–14 July 2016; IEEE: New York, NY, USA, 2016; pp. 1048–1053.
  36. Nassar, O.A.; Al Saiyd, N.A. The integrating between web usage mining and data mining techniques. In Proceedings of the 2013 5th International Conference on Computer Science and Information Technology, Amman, Jordan, 27–28 March 2013; IEEE: New York, NY, USA, 2013; pp. 243–247.
  37. Quinlan, J.R. Combining instance-based and model-based learning. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, USA, 27–29 July 1993; pp. 236–243.
  38. Zhang, M.L.; Zhou, Z.H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007, 40, 2038–2048.
  39. Zhang, M.L.; Zhou, Z.H. Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans. Knowl. Data Eng. 2006, 18, 1338–1351.
  40. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
  41. Zhang, M.L.; Peña, J.M.; Robles, V. Feature selection for multi-label naive Bayes classification. Inf. Sci. 2009, 179, 3218–3229.
  42. Thabtah, F.A.; Cowling, P.; Peng, Y. MMAC: A new multi-class, multi-label associative classification approach. In Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM’04), Brighton, UK, 1–4 November 2004; IEEE: New York, NY, USA, 2004; pp. 217–224.
  43. Alazaidah, R.; Ahmad, F.K. Trending challenges in multi label classification. Int. J. Adv. Comput. Sci. Appl. 2016, 7, 127–131.
  44. Abdelhamid, N.; Ayesh, A.; Hadi, W. Multi-label rules algorithm based associative classification. Parallel Process. Lett. 2014, 24, 1450001.
  45. Veloso, A.; Meira, W.; Gonçalves, M.; Zaki, M. Multi-label lazy associative classification. In Proceedings of Knowledge Discovery in Databases: PKDD 2007, 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, 17–21 September 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 605–612.
  46. Li, X.; Qin, D.; Yu, C. ACCF: Associative classification based on closed frequent itemsets. In Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Shandong, China, 18–20 October 2008; IEEE: New York, NY, USA, 2008; Volume 2, pp. 380–384.
  47. Liu, B.; Hsu, W.; Ma, Y. Integrating classification and association rule mining. In Proceedings of the Kdd, New York, NY, USA, 27–31 August 1998; Volume 98, pp. 80–86.
  48. Abdelhamid, N.; Ayesh, A.; Thabtah, F.; Ahmadi, S.; Hadi, W. MAC: A multiclass associative classification algorithm. J. Inf. Knowl. Manag. 2012, 11, 1250011.
  49. Alazaidah, R.; Almaiah, M.A. Associative classification in multi-label classification: An investigative study. Jordanian J. Comput. Inf. Technol. 2021, 7. Available online: https://www.proquest.com/openview/9a1e4545ef6dd7deea31b808f011119c/1?pq-origsite=gscholar&cbl=5500744 (accessed on 11 February 2023).
  50. Liu, B.; Ma, Y.; Wong, C.K. Improving an association rule based classifier. In Proceedings of the Principles of Data Mining and Knowledge Discovery: 4th European Conference, PKDD 2000, Lyon, France, 13–16 September 2000; Springer: Berlin/Heidelberg, Germany, 2000; pp. 504–509.
  51. Huang, S.J.; Zhou, Z.H. Multi-label learning by exploiting label correlations locally. In Proceedings of the AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–26 July 2012; Volume 26, pp. 949–955.
  52. Alazaidah, R.; Ahmad, F.K.; Mohsin, M. Multi label ranking based on positive pairwise correlations among labels. Int. Arab J. Inf. Technol. 2020, 17, 440–449.