Graph Clustering Algorithms: History

Graph clustering has received considerable attention, and its applications are numerous, ranging from the detection of social communities to the clustering of computer networks. It is an NP-hard problem, and several algorithms have been proposed, each with specific objectives, along with various quality metrics for evaluating them. Producing clusters with a required density can be beneficial because it permits the effective deployment of resources.

  • edge degree
  • graph clustering
  • mean relative density deviation coefficient (MRDDC)
  • overlapping clustering
  • partitioning clustering
  • relative density

1. Introduction

The rapid increase in the volume of data and the popularity of artificial intelligence (AI) over the last few decades have led to the emergence of the popular fields known as data science and knowledge discovery. These fields provide techniques aimed at uncovering hidden structures and useful patterns within the available information. In its simplest terms, clustering is a technique that separates information into distinct groups in which intra-similarities are maximized while inter-similarities are minimized [1]. When information can be represented as a graph or a network, then graph clustering, a sub-field of clustering, has a direct application by analyzing the connectivity and information within nodes and edges to discover intrinsic knowledge hidden within the underlying structure [2].
In almost all domains of science, graph clustering serves a variety of objectives, such as extracting the shared interests of communities within a large social network [2][3]. In transportation networks, it is applied to optimize routes and traffic flow [4]. Bioinformatics is another field where the application of graph clustering is prolific; identifying functional modules of genes or proteins is one example [5]. Other application areas include image segmentation [6] and personalized recommendations in recommender systems [7][8].
Graph clustering is an NP-hard problem [9], and its methods can be classified into three distinct categories: partitioning, overlapping, and hierarchical clustering [2]. In addition, numerous graph clustering algorithms have been proposed according to their diverse objectives. A few well-known techniques include k-means [10], DBSCAN [11], Girvan–Newman [12], Louvain [13], and numerous others. These algorithms can be parameter-free (i.e., preset to defaults) or parameterizable (i.e., allowing users to specify some parametric values). These parameters span several facets, including the number of clusters, the minimum number of neighboring points, the resolution parameter, the regularization parameter, and convergence criteria [14].
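To make the distinction concrete, the following minimal sketch (in Python, assuming a recent NetworkX release that ships reference implementations of both methods; the benchmark graph and parameter values are purely illustrative) contrasts a parameter-free run of Girvan–Newman with a Louvain run whose resolution parameter is user-specified.

```python
# Minimal sketch (assumes Python 3 with NetworkX >= 2.8; graph and parameter
# values are illustrative only).
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()  # small benchmark graph

# Girvan-Newman: parameter-free; yields successive splits of the dendrogram.
gn_levels = community.girvan_newman(G)
first_split = next(gn_levels)            # tuple of node sets at the top split
print([sorted(c) for c in first_split])

# Louvain: the resolution parameter steers the typical cluster size.
fine = community.louvain_communities(G, resolution=1.5, seed=42)
coarse = community.louvain_communities(G, resolution=0.5, seed=42)
print(len(fine), "clusters vs.", len(coarse), "clusters")
```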
Complex algorithms, such as random-walk-based and spectral clustering, require some expertise in graph theory and linear algebra to select suitable parameter values. Tuning typically involves numerous iterations and experimentation, and assessment is intertwined with the adjustment procedure, which can render these algorithms computationally intensive, especially for large and intricate networks. Taking full advantage of parameter tuning therefore often comes at the cost of algorithmic complexity. Beyond computational cost, the process may also result in unbalanced clusters, particularly when parameters are inappropriately set. In the context of density and connectivity, unbalanced clustering can lead to inefficient resource utilization and increased maintenance costs [15].
As stated above, graph clustering has applications in almost every domain, and, like many other areas of science, graph clustering algorithms were introduced in response to the objectives of their applications. Nevertheless, several requirements in graph clustering have yet to be fulfilled. One of these is the capability to cluster a graph such that the resultant clusters have a comparable or identical density. This feature is useful in the deployment and administration of resources in a variety of applications, for example, when a large community is represented by a complex graph. Subcommunities can be viewed as clusters within the whole community; if they are of equal or comparable density, resources can be distributed more efficiently. A network that represents communication among individuals through LINE messaging is another good example. It would be beneficial if the network could be partitioned into smaller units of similar density, since this allows for an efficient deployment of monitoring resources, as each subcommunity is likely to require similar resources.

2. Graph Clustering Algorithms

This section provides an overview of widely used graph clustering algorithms, encompassing both partitioning and overlapping clustering approaches. In [16], an extensive exploration is conducted on the categorization, underlying principles, and parameterization techniques of partitioning clustering algorithms. This exploration provides valuable insights into the intricate characteristics of these algorithms. Overall, this section contributes to a fundamental understanding of the current state-of-the-art graph clustering algorithms.
In the realm of graph clustering, a multitude of state-of-the-art algorithms have emerged to address the identification of cohesive communities or subgraphs within complex networks or graphs. Among them, the Girvan–Newman (GN) algorithm [12] utilizes edge-betweenness to find clusters in complex networks. By creating a hierarchical dendrogram, the algorithm enables modularity analysis to identify the optimal level cut. The Louvain algorithm [13], known for its multilevel clustering approach, optimizes modularity through local node movements and network aggregation. However, the Louvain algorithm has limitations in identifying small communities. To address the issue of poorly connected clusters, the Leiden algorithm [17] enhances the Louvain method by employing fast local moves and random neighbor moves. This enhancement ensures the formation of internally connected clusters and leads to a stable partition. Fluid communities is another example of a clustering algorithm that partitions a graph into communities using virtual fluid and has found applications in various domains [18]. It also provides an adjustable parameter for cluster customization. The spectral method leverages the power of eigenvalues derived from special matrices to improve the performance of cluster determination. It outperforms conventional methods such as k-means and enables a more robust analysis of complex networks [19][20]. Users may specify the desired number of clusters (k), or the algorithm may arbitrarily determine this value. The expectation maximization (EM) algorithm [21] embraces a model-based approach by estimating essential parameters through probability distributions. Furthermore, it provides users with the ability to define the optimal number of clusters (k) and adjust additional parameters, including maximal iterations and convergence criteria.
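As a concrete illustration of the spectral method mentioned above, the following minimal sketch (in Python, assuming scikit-learn and NetworkX are available; the choice of k = 2 clusters and the benchmark graph are purely illustrative) clusters the nodes of a graph directly from its adjacency matrix.

```python
# Minimal sketch (assumes scikit-learn and NetworkX; k = 2 is illustrative).
import networkx as nx
from sklearn.cluster import SpectralClustering

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)                      # dense adjacency matrix

# affinity='precomputed' treats A as an affinity matrix, so the eigenvectors
# of the associated graph Laplacian drive the cluster assignment.
sc = SpectralClustering(n_clusters=2, affinity="precomputed",
                        assign_labels="kmeans", random_state=0)
labels = sc.fit_predict(A)
print(dict(zip(G.nodes(), labels)))           # node -> cluster id
```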
The fast greedy algorithm is a notable example of a scalable and efficient agglomerative hierarchical clustering method [22][23]. It operates without the need for user-specified parameters and takes advantage of optimized data structures and strategic optimization shortcuts. The InfoMap algorithm adopts information theory principles [24][25], employing random walks to analyze information flow and partition the network into modules; it quantifies clustering quality by minimizing the description length needed to describe that flow. Similar to InfoMap, the label propagation algorithm (LPA) is another approach that avoids parameter specification by iteratively assigning node labels based on neighbors to achieve clustering [26]. To address the shortcomings of its inherent randomness, extended versions of LPA, such as MemLPA and LPA-MNI, have been proposed to enhance performance [27][28]. The Walktrap algorithm adopts a hierarchical approach through random walks and the merging of neighboring clusters; it incorporates a walk-length parameter ‘t’ to facilitate practical applications in its attempt to optimize performance [29].
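To show how a parameter-free method such as LPA is invoked in practice, a brief sketch follows (in Python, assuming NetworkX; note that this is NetworkX's baseline label propagation routine, not the extended MemLPA or LPA-MNI variants cited above).

```python
# Minimal sketch (assumes NetworkX) of parameter-free label propagation:
# nodes repeatedly adopt the most frequent label among their neighbors.
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()
clusters = community.label_propagation_communities(G)   # no parameters needed
for i, c in enumerate(clusters):
    print(f"cluster {i}: {sorted(c)}")
```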
The spinglass approach [30][31], inspired by statistical mechanics, utilizes the ‘Potts model’ to connect nodes sharing the same spin state and employs advanced simulated annealing techniques for resource-efficient cluster identification. The leading eigenvector method capitalizes on modularity maximization by computing the leading eigenvector of the modularity matrix to partition the graph [32]. Although the algorithm itself is parameter-free, graph clustering via modularity maximization may necessitate the adjustment of parameters such as the resolution, the stopping criteria, and the method of computing the modularity matrix. The graph-based k-means algorithm is another parameter-free algorithm in which minimum spanning trees are utilized to estimate k and the centroid placement. Its demonstrated strength lies in handling noisy data [33].
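The sketch below (in Python, assuming the python-igraph package; the spins upper bound and the benchmark graph are illustrative choices) shows how the spinglass and leading-eigenvector methods are typically invoked.

```python
# Minimal sketch (assumes python-igraph; parameter values are illustrative).
import igraph as ig

g = ig.Graph.Famous("Zachary")            # Zachary karate club benchmark

# Spinglass: Potts-model based; 'spins' only bounds the number of clusters.
spinglass = g.community_spinglass(spins=10)

# Leading eigenvector: modularity maximization via the modularity matrix.
leading_eig = g.community_leading_eigenvector()

print(len(spinglass), "clusters, modularity", spinglass.modularity)
print(len(leading_eig), "clusters, modularity", leading_eig.modularity)
```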
Another noteworthy aspect of graph clustering includes Bayesian techniques. Bayesian nonparametric methods have emerged as influential tools, providing a probabilistic framework for modeling complex relationships within graphs. Unlike traditional clustering methods that require a predetermined cluster count or other parameters, Bayesian nonparametric techniques adaptively infer the number of clusters from the data. This flexibility is particularly advantageous in scenarios where the true number of clusters is unknown or varies, making these methods well suited for diverse real-world applications [34][35]. Furthermore, these techniques inherently capture uncertainty in the clustering process, providing a probabilistic assignment of nodes to clusters. Models such as the Dirichlet Process Mixture Model (DPMM) [36][37][38][39], Bayesian Community Detection (BCD) [40], the Infinite Relational Model (IRM) [39][40], and the Chinese Restaurant Process (CRP) fall under the Bayesian nonparametric umbrella and have found applications in diverse fields, including social network analysis, bioinformatics, and community detection in complex networks [41]. Despite the flexibility and applicability of these techniques in diverse real-world scenarios, the effective clustering of graphs remains a significant area of exploration and refinement from several perspectives. In particular, the inability to specify the required density of the resultant clusters is an avenue for research and improvement in this field.
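As a rough illustration of how a Bayesian nonparametric model infers the number of clusters from the data, the sketch below (in Python, assuming scikit-learn and NetworkX; the spectral node embedding and the upper bound of ten components are illustrative choices, not part of the cited methods) fits a Dirichlet-process mixture to node embeddings.

```python
# Minimal sketch (assumes scikit-learn and NetworkX) of Dirichlet-process
# mixture clustering: the effective number of clusters is inferred from the
# data, up to a stated upper bound, instead of being fixed in advance.
import networkx as nx
import numpy as np
from sklearn.manifold import spectral_embedding
from sklearn.mixture import BayesianGaussianMixture

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
X = spectral_embedding(A, n_components=4, random_state=0)  # node embeddings

dpmm = BayesianGaussianMixture(
    n_components=10,                                   # upper bound only
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)
labels = dpmm.predict(X)
print("effective number of clusters:", len(np.unique(labels)))
```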
In the context of real networks characterized by intricately overlapping community structures, traditional methods prove insufficient for accurately identifying overlapped communities [42]. The existing approaches for identifying overlapped clusters can be broadly categorized into two main types. The first type includes node-based clustering algorithms (node clustering) that directly partition the network’s nodes into distinct clusters by leveraging the available structural information. The second type comprises edge-based algorithms, referred to as ‘edge’ or ‘link-based’ clustering methods, in which each edge typically has a unique community membership while the nodes it connects may belong to multiple link communities. However, it is worth noting that the majority of well-established algorithms fall predominantly within the first category. One notable approach that stands out is the Clique Percolation Method (CPM), which draws inspiration from clique percolation theory [43][44]. It provides insights into the shared connections among nodes and enables the identification of multiple communities that overlap in terms of their node memberships (a brief usage sketch of CPM appears below). The Lancichinetti–Fortunato–Radicchi Measures (LFM) algorithm [45] detects overlapping and hierarchical community structures in complex networks by combining modularity and fitness optimization, offering valuable insights into the complex nature of community structures. However, the LFM algorithm has limitations in handling large-scale networks due to its computational complexity; it also relies on a predefined resolution parameter, which affects its scalability and adaptability in real-world applications. The greedy clique expansion (GCE) algorithm, proposed in [46], is an effective method for finding highly overlapping community structures in social networks by systematically expanding maximal cliques. GCE uncovers the intricate patterns of node overlaps and provides valuable insights into the complex relationships within communities. However, the algorithm’s scalability may be a limitation in large-scale network applications due to its reliance on a greedy expansion strategy. Nonetheless, GCE presents a valuable approach for capturing overlapping communities in social networks. The Overlapping Cluster Generator (OCG) method is a prominent approach for identifying overlapping regions in protein–protein interaction (PPI) networks [47]. It employs parameters such as the clique size, similarity threshold, and minimum cluster size to control the clustering process. By leveraging these parameters, OCG captures overlapping structures and identifies densely connected subgroups. However, OCG may be sensitive to parameter settings and hence requires tuning for optimal results. Another prominent algorithm, the improved Bacteria Foraging Optimization (IBFO) mechanism, combines intuitionistic fuzzy sets to identify overlapping modules in protein–protein interaction networks. This method effectively tackles the challenge of overlapping functional modules in PPI networks and automatically determines the number of clusters. However, the choice of parameter values, including the thresholds for membership degree and indeterminacy degree, as well as the maximum number of iterations of the IBFO algorithm, may significantly impact the resulting clustering outcomes [48]. The ETCC-SA (Edge-Triangle Clique Cover using Simulated Annealing) algorithm [49] is proposed for detecting overlapping communities in social networks.
It explores different feature assignments for vertices and iteratively adjusts them based on a scoring function. A cluster is formed as the algorithm gradually converges toward an approximate solution. As the algorithm uses heuristic approximations, it does not guarantee optimality and is sensitive to parameter choices such as the number of rounds and the mixing parameter. The Clique Overlap Newman–Girvan algorithm (CONGA) [50] is an extended version of the Girvan–Newman (GN) algorithm that specifically focuses on detecting overlapping clusters in networks. It enhances the GN algorithm by incorporating a mechanism to identify and represent overlapping communities. By considering the betweenness of edges and tracking overlapping vertices during the merging process, CONGA effectively detects and characterizes overlapping clusters in networks. Nevertheless, CONGA suffers from the resolution limit problem, making it unsuitable for detecting smaller clusters within larger ones in overlapping communities.
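For the Clique Percolation Method discussed above, a minimal usage sketch follows (in Python, assuming NetworkX, whose k-clique community routine implements clique percolation; the clique size k = 3 and the benchmark graph are illustrative choices). It also shows how overlapping memberships arise when communities share nodes.

```python
# Minimal sketch (assumes NetworkX) of clique percolation: communities are
# unions of adjacent k-cliques, so a single node may belong to several of them.
from collections import Counter

import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.karate_club_graph()
communities = list(k_clique_communities(G, 3))   # k = 3 is illustrative
for i, c in enumerate(communities):
    print(f"community {i}: {sorted(c)}")

# Nodes appearing in more than one community are the overlaps.
membership = Counter(n for c in communities for n in c)
print("overlapping nodes:", [n for n, cnt in membership.items() if cnt > 1])
```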
The Clique-Based Louvain Algorithm (CBLA) is an approach for detecting overlapping communities in graphs that combines concepts from the Clique Percolation Method (CPM) and the Louvain algorithm. It identifies cliques using CPM and then employs the Louvain algorithm to classify unclassified nodes. However, the Louvain algorithm has a resolution limit, posing difficulties in detecting communities at different scales and identifying smaller communities within larger ones [17][51]. Furthermore, the CBLA’s performance may be sensitive to parameter choices, including the clique size [52][53]. A normalized cut-based, scalable version of spectral clustering is proposed in [54] to solve the problem of finding overlapping communities in large, complex networks. By utilizing a hierarchical framework and incorporating node weights, it addresses the challenge of identifying nodes belonging to multiple communities. However, the selection of the tolerance parameter within the hierarchical framework may significantly impact the clustering results: the algorithm controls the level of overlap between communities by adjusting this threshold, where a higher value allows for a greater degree of overlap and a lower value results in less overlap, so the optimal value requires careful consideration to achieve accurate and meaningful clustering results. In [55], a novel approach called LOCD (Local Overlapping Community Detection) is introduced to identify overlapping nodes within community structures. LOCD identifies structural centers in a network based on their higher density and relative distance from nodes with higher densities, and expands communities around these centers using structural similarity. While LOCD effectively uncovers overlapping communities, the choice of cutoff distance and the use of structural similarity as a weighting measure can influence clustering results; therefore, careful adjustment of these parameters is crucial to ensure accurate and reliable clustering outcomes. Moreover, k-Neighborhood Attribute Structural (kNAS) [56] is another technique for detecting overlapping clusters in a graph by considering structural and attribute similarities among vertices. It groups objects into clusters based on their k-nearest neighbors and similar attributes, enabling the identification of overlapping communities; the key parameter is the distance value (k). However, kNAS strictly enforces both structural and attribute similarity and may therefore not provide a balanced trade-off between the two. Link communities (LC) is a link-based method that detects communities as groups of interrelated links using hierarchical clustering and partition density optimization. It combines link-based analysis, hierarchical clustering, and partition density to identify overlapping communities and hierarchical organization in networks. The algorithm’s results can be influenced by parameters such as the link weight threshold, community size threshold, resolution parameter, link strength measure, initialization strategy, convergence criteria, and network structure; experimenting with different settings is therefore crucial for achieving the desired outcomes [57]. The extended link clustering (ELC) method improves the existing link clustering (LC) method by using extended link similarity (ELS) to create a denser transform matrix and employing EQ (an extended measure of modularity quality) for optimal partitioning.
This approach generates more realistic communities by incorporating more link information and using EQ to determine the cut level in the hierarchical clustering dendrogram. ELC achieves higher EQ values, making it more realistic compared to the original LC method and the classical CPM method [58]. Network Decomposition for Overlapping Community Detection (NDOCD) is a novel approach for overlapping community detection that combines network decomposition, node clustering, and seed expansion. It determines overlapping communities by applying a seed expansion strategy based on joint strength and membership degree thresholds. The approach showed superior performance compared to traditional link clustering and node clustering algorithms; however, it is highly sensitive to the adjustment of parameters such as the joint strength and membership degree thresholds [59].
The study in [60] introduces the Scalable Integrated Model using Game Theory (SIMGT), a novel approach for detecting overlapping communities in complex networks. Inspired by social identity theory, it models community formation as a noncooperative game, utilizing second-order proximity and stochastic gradient ascent. SIMGT proves effective, surpassing BigClam in average F1 score across datasets and demonstrating scalability. Key parameters, including thresholds, proximity metrics (e.g., the Jaccard coefficient), and weighting strategies, require careful optimization, which can be problematic for users seeking optimal results.
The GREESE algorithm [61] is introduced for overlapping community detection, employing a coupled-seeds expansion strategy, a fitness function, and a merging phase that regulates overlapping rates. Outperforming seven state-of-the-art algorithms in accuracy and execution time on diverse networks, GREESE offers simplicity, effectiveness, and promising results. The algorithm’s parameters include the common neighbor similarity (C), the fitness function (F), the expansion and merging phase thresholds, as well as network parameters in LFR benchmarks and real-world networks. Parameter selection may impact algorithm performance, requiring tuning for different networks.
At present, several popular libraries and environments, such as Scikit-learn, NetworkX, igraph, and MATLAB, include implementations of many of the algorithms described above, such as fluid communities, Louvain, and the spectral method, and these can be used with little effort. To date, however, no graph clustering algorithm allows users to specify some form of density in the resulting clusters.
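As an example of this library support, the following minimal sketch (in Python, assuming NetworkX; the choice of k = 4 communities is illustrative) runs the fluid communities algorithm, which requires the number of communities as a user-supplied parameter.

```python
# Minimal sketch (assumes NetworkX) of fluid communities, which takes the
# desired number of communities k as an explicit parameter.
import networkx as nx
from networkx.algorithms.community import asyn_fluidc

G = nx.karate_club_graph()
communities = list(asyn_fluidc(G, k=4, seed=0))   # k = 4 is illustrative
for i, c in enumerate(communities):
    print(f"community {i}: size {len(c)}")
```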

This entry is adapted from the peer-reviewed paper 10.3390/app14010380

References

  1. Berahmand, K.; Haghani, S.; Rostami, M.; Li, Y. A new attributed graph clustering by using label propagation in complex networks. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 1869–1883.
  2. Schaeffer, E. Graph clustering. Comput. Sci. Rev. 2007, 1, 27–64.
  3. Huang, X.; Cheng, H.; Yu, J.X. Dense community detection in multi-valued attributed networks. Inf. Sci. 2015, 314, 77–99.
  4. Saeedmanesh, M.; Geroliminis, N. Dynamic clustering and propagation of congestion in heterogeneously congested urban traffic networks. Transp. Res. Procedia 2017, 23, 962–979.
  5. Thomas, J.; Seo, D.; Sael, L. Review on graph clustering and subgraph similarity-based analysis of neurological disorders. Int. J. Mol. Sci. 2016, 17, 862.
  6. Xia, K.; Gu, X.; Zhang, Y. Oriented grouping-constrained spectral clustering for medical imaging segmentation. Multimed. Syst. 2020, 26, 27–36.
  7. Rostami, M.; Oussalah, M.; Farrahi, V. A novel time-aware food recommender system based on deep learning and graph clustering. IEEE Access 2022, 10, 52508–52524.
  8. Shao, B.; Li, X.; Bian, G. A survey of research hotspots and frontier trends of recommendation systems from the perspective of knowledge graph. Exp. Syst. Appl. 2021, 165, 113764.
  9. Hong, S.W.; Miasnikof, P.; Kwon, R.; Lawryshyn, Y. Market graph clustering via QUBO and digital annealing. J. Risk Financ. Manag. 2021, 14, 34.
  10. MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June–18 July 1965; University of California Press: Berkeley, CA, USA, 1967; pp. 281–297.
  11. Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Kdd, Portland, OR, USA, 2–4 August 1996; Volume 96, pp. 226–231.
  12. Girvan, M.; Newman, M.E.J. Community structure in social and biological networks. Proc. Nat. Acad. Sci. USA 2002, 99, 7821–7826.
  13. Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 10, 10008.
  14. Kothari, R.; Pitts, D. On finding the number of clusters. Pattern Recognit. Lett. 1999, 20, 405–416.
  15. Sankar, S.; Ramasubbareddy, S.; Luhach, A.K.; Nayyar, A.; Qureshi, B. CT-RPL: Cluster tree-based routing protocol to maximize the lifetime of Internet of Things. Sensors 2020, 20, 5858.
  16. Tariq, R.; Lavangnananda, K.; Bouvry, P.; Mongkolnam, P. Partitioning Graph Clustering with Density. IEEE Access 2023, 11, 122273–122294.
  17. Traag, V.A.; Waltman, L.; Van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233.
  18. Parés, F.; Gasulla, D.G.; Vilalta, A.; Moreno, J.; Ayguadé, E.; Labarta, J.; Cortés, U.; Suzumura, T. Fluid communities: A competitive, scalable and diverse community detection algorithm. In International Conference on Complex Networks and Their Applications; Springer: Berlin/Heidelberg, Germany, 2017; pp. 229–240.
  19. Ng, A.; Jordan, M.; Weiss, Y. On spectral clustering: Analysis and an algorithm. Adv. Neural Inf. Process. Syst. 2001, 14, 1–8.
  20. Luxburg, V.U. A tutorial on spectral clustering. Statist. Comput. 2007, 17, 395–416.
  21. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. Ser. B 1977, 39, 1–22.
  22. Tandon, A.; Albeshri, A.; Thayananthan, V.; Alhalabi, W.; Fortunato, S. Fast consensus clustering in complex networks. Phys. Rev. E 2019, 99, 042301.
  23. Kuwil, F.H.; Shaar, F.; Topcu, A.E.; Murtagh, F. A new data clustering algorithm based on critical distance methodology. Exp. Syst. Appl. 2019, 129, 296–310.
  24. Rosvall, M.; Bergstrom, C.T. Maps of random walks on complex networks reveal community structure. Proc. Nat. Acad. Sci. USA 2008, 105, 1118–1123.
  25. Rosvall, M.; Axelsson, D.; Bergstrom, C.T. The map equation. Eur. Phys. J. Spec. Top. 2009, 178, 13–23.
  26. Raghavan, U.N.; Albert, R.; Kumara, S. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 2007, 76, 036106.
  27. Fiscarelli, A.M.; Brust, M.R.; Danoy, G.; Bouvry, P. Local memory boosts label propagation for community detection. Appl. Netw. Sci. 2019, 4, 95.
  28. Li, H.; Zhang, R.; Zhao, Z.; Liu, X. LPA-MNI: An improved label propagation algorithm based on modularity and node importance for community detection. Entropy 2021, 23, 497.
  29. Pons, P.; Latapy, M. Computing communities in large networks using random walks. In International Symposium on Computer and Information Sciences; Springer: Berlin/Heidelberg, Germany, 2005; pp. 284–293.
  30. Xie, W.B.; Lee, Y.L.; Wang, C.; Chen, D.B.; Zhou, T. Hierarchical clustering supported by reciprocal nearest neighbors. Inf. Sci. 2020, 527, 279–292.
  31. Rustamaji, H.C.; Suharini, Y.S.; Permana, A.A.; Kusuma, W.A.; Nurdiati, S.; Batubara, I.; Djatna, T. A network analysis to identify lung cancer comorbid diseases. Appl. Netw. Sci. 2022, 7, 30.
  32. Newman, M.E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 2006, 74, 036104.
  33. Galluccio, L.; Michel, O.; Comon, P.; Hero, A.O. Graph-based k-means clustering. Signal Process. 2012, 92, 1970–1984.
  34. Bourouis, S.; Alroobaea, R.; Rubaiee, S.; Andejany, M.; Bouguila, N. Nonparametric Bayesian Learning of Infinite Multivariate Generalized Normal Mixture Models and Its Applications. Appl. Sci. 2021, 11, 5798.
  35. Orbanz, P.; Teh, Y.W. Bayesian Nonparametric Models. In Encyclopedia of Machine Learning; Sammut, C., Webb, G.I., Eds.; Springer: Boston, MA, USA, 2011; pp. 81–89.
  36. Karras, C.; Karras, A.; Giotopoulos, K.C.; Avlonitis, M.; Sioutas, S. Consensus Big Data Clustering for Bayesian Mixture Models. Algorithms 2023, 16, 245.
  37. McAuliffe, J.D.; Blei, D.M.; Jordan, M.I. Nonparametric empirical Bayes for the Dirichlet process mixture model. Stat Comput. 2006, 16, 5–14.
  38. Li, Y.; Schofield, E.; Gonen, M. A Tutorial on Dirichlet Process Mixture Modeling. J. Math. Psychol. 2019, 91, 128–144.
  39. Andersen, K.W.; Madsen, K.H.; Siebner, H.R.; Schmidt, M.N.; Morup, M.; Hansen, L.K. Non-parametric Bayesian graph models reveal community structure in resting state fMRI. Neuroimage 2014, 100, 301–315.
  40. Palla, K.; Knowles, D.A.; Ghahramani, Z. Relational learning and network modelling using infinite latent attribute models. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 462–474.
  41. Blei, D.M.; Frazier, P.I. Distance-dependent Chinese restaurant processes. J. Mach. Learn. Res. 2011, 12, 2461–2488.
  42. Xie, J.; Kelley, S.; Szymanski, B.K. Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Comput. Surv. 2013, 45, 1–35.
  43. Palla, G.; Derényi, I.; Farkas, I.; Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 2005, 435, 814–818.
  44. Shen, H.; Cheng, X.; Cai, K.; Hu, M.B. Detect overlapping and hierarchical community structure in networks. Phys. A. Stat. Mech. Appl. 2009, 388, 1706–1712.
  45. Lancichinetti, A.; Fortunato, S.; Kertész, J. Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 2009, 11, 033015.
  46. Lee, C.; Reid, F.; McDaid, A.; Hurley, N. Detecting highly overlapping community structure by greedy clique expansion. arXiv 2010, arXiv:1002.1827.
  47. Becker, E.; Robisson, B.; Chapple, C.E.; Guénoche, A.; Brun, C. Multifunctional proteins revealed by overlapping clustering in protein interaction network. Bioinform. 2012, 28, 84–90.
  48. Lei, X.; Wang, F.; Wu, F.X.; Zhang, A.; Pedrycz, W. Protein complex identification through Markov clustering with firefly algorithm on dynamic protein–protein interaction networks. Inf. Sci. 2016, 329, 303–316.
  49. Li, P.; Dau, H.; Puleo, G.; Milenkovic, O. Motif clustering and overlapping clustering for social network analysis. In Proceedings of the IEEE INFOCOM 2017-IEEE Conference on Computer Communications, IEEE, Atlanta, GA, USA, 1–4 May 2017; pp. 1–9.
  50. Gregory, S. An algorithm to find overlapping community structure in networks. In Proceedings of the European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, 17–21 September 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 91–102.
  51. Fortunato, S. Community detection in graphs. Phys. Rep. 2010, 486, 75–174.
  52. Seda, M. The Maximum Clique Problem and Integer Programming Models, Their Modifications, Complexity, and Implementation. Symmetry 2023, 15, 1979.
  53. Gupta, S.K.; Singh, D.P. CBLA: A Clique Based Louvain Algorithm for Detecting Overlapping Community. Procedia Comput. Sci. 2023, 218, 2201–2209.
  54. Van Lierde, H.; Chow, T.W.; Chen, G. Scalable spectral clustering for overlapping community detection in large-scale networks. IEEE Trans. Knowl. Data Eng. 2019, 32, 754–767.
  55. Wang, X.; Liu, G.; Li, J. Overlapping community detection based on structural centrality in complex networks. IEEE Access 2017, 5, 25258–25269.
  56. Boobalan, M.P.; Lopez, D.; Gao, X.Z. Graph clustering using k-Neighbourhood Attribute Structural similarity. Appl. Soft Comput. 2016, 47, 216–223.
  57. Ahn, Y.Y.; Bagrow, J.P.; Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 2010, 466, 761–764.
  58. Huang, L.; Wang, G.; Wang, Y.; Blanzieri, E.; Su, C. Link clustering with extended link similarity and EQ evaluation division. PLoS ONE 2013, 8, e66005.
  59. Ding, Z.; Zhang, X.; Sun, D.; Luo, B. Overlapping community detection based on network decomposition. Sci. Rep. 2016, 6, 24115.
  60. Wang, Y.; Bu, Z.; Yang, H.; Li, H.J.; Cao, J. An effective and scalable overlapping community detection approach: Integrating social identity model and game theory. Appl. Math. Comput. 2021, 390, 125601.
  61. Asmi, K.; Lotfi, D.; Abarda, A. The greedy coupled-seeds expansion method for the overlapping community detection in social networks. Computing 2022, 104, 295–313.