Parallelization Strategies for Graph-Code-Based Similarity Search

Version	Summary	Created by	Modification	Content Size	Created at	Operation
1		Patrick Steinert	--	2002	2024-01-18 11:36:45	\|
2	format correct	Catherine Yang	Meta information modification	2002	2024-01-19 02:04:40	\|

This entry is adapted from the peer-reviewed paper 10.3390/bdcc7020070

The volume of multimedia assets in collections is growing exponentially, and the retrieval of information is becoming more complex. The indexing and retrieval of multimedia content is generally implemented by employing feature graphs. Feature graphs contain semantic information on multimedia assets. Machine learning can produce detailed semantic information on multimedia assets, reflected in a high volume of nodes and edges in the feature graphs. Graph Codes provide fast and effective multimedia indexing and retrieval, even in billion-scale use cases.

semantic multimedia feature graph Graph Code

1. Introduction

Whether in social networks, media, or medicine, many industries collect and process a growing volume of multimedia content objects (i.e., representations of real-world scenes, such as videos, images, textual descriptions, audio recordings, or combined objects). Statista ^[1] describes an increase in the volume of photos taken with a smartphone, from 660 billion in 2013 to 1.2 trillion in 2017. When comparing the volume of titles on video streaming services from fall 2021 ^[2] and summer 2022 ^[3], annual growth can be observed. Research data sets grow with similar rates, as the National Library of Medicine shows. Founded as Open-i in 2012 with 600,000 ^[4] assets, the collection grew to 1.2 million in 2022 ^[5]. Similar rates are shown in Figure 1; in one minute on the Internet ^[6], 695,000 Instagram ^[7] stories are shared, 500 hours of YouTube ^[8] content is uploaded, and 197 million emails are sent.

Figure 1. A Minute on the Internet in 2021 ^[6].

Mechanisms for the efficient indexing and fast retrieval of these multimedia content objects are essential to manage this large volume of information. Cloud computing ^[9] and big data technologies ^[10] enable the storage and processing of these amounts of multimedia content objects. Recent improvements in machine ^[11] learning, such as deep learning ^[12], enabled the automated extraction of features from the raw multimedia content object by employing object recognition ^[13], face identification, and further methods. The increasingly high resolution in multimedia content objects, such as 32-bit audio recording, 8K video recording, and 200-megapixel smartphone cameras, allows the extraction of features with a high level of detail (LOD) from the content of multimedia content objects. All this extracted feature information can be efficiently organized and indexed by graph-based technologies. Evaluations showed that they are fast and effective technologies for Multimedia Information Retrieval (MMIR) ^[14] and that Graph Codes can perform better as graph databases. According to ^[15], the acceptable response time for users is around one second. However, information retrieval in large multimedia collections and a high LOD still result in processing times above the margin of one second. Previous experiments show a potential speedup of the execution times of the Graph Code algorithm through the parallelization of Graph Code algorithms. One of the remaining research questions is how to efficiently parallelize Graph Code algorithms and whether this can lead to a speedup.

2. Information Retrieval

Information retrieval ^[16] aims to find information in large information collections. Multimedia Information Retrieval particularly targets collections with image, video, text, or audio assets (i.e., multimedia content objects). MMIR systems are designed to support these use cases. To search for information, the main component is a search engine. The search engine has an information database containing the list of multimedia assets in the collection and also an index of them. The index contains metadata about the assets. Semantic metadata connect the features and make them machine processable. Metadata can be supplied or generated by feature extraction. In order to organize features, graph-based methodologies and structures are frequently employed, given that feature information relies on information nodes and the connections that exist between these nodes ^[17].

3. Multimedia Features and Multimedia Feature Graphs

The Multimedia Feature Graph (MMFG) ^[18] is a weighted and directed graph ^[19], whose nodes and edges represent features of multimedia assets. The Generic Multimedia Annotation Framework (GMAF) ^[20] is an MMIR system for multimedia assets that uses MMFGs as an index and access method. GMAF provides a flexible plugin architecture that allows the processing of various multimedia asset types to extract features that are stored as metadata in the MMFG. The extracted features are contributed to the MMFG, which can be further processed. Extensions of MMFGs have led to semantic analysis, such as Semantic Multimedia Feature Graphs (SMMFGs) and Explainable SMMFGs (ESMMFGs) ^[21]. Despite these extensions, the graph-based structure of MMFGs remains and can lead to slow processing times. When the LOD of the assets increases, the number of elements in the MMFG also increases, which further increases the processing times. To address this, Graph Codes were introduced for faster indexing ^[22]. Therefore, it is important to experiment with an improved processing model to reduce processing times. This is outlined in the next subsection.

4. Graph Codes and Algorithms

Graph Codes ^[22]^[23] are a 2D transformation of a multimedia feature graph that is encoded using adjacency matrix operations ^[24]. GCs have been shown to be more efficient for similarity and recommendation searches than graph-traversal-based algorithms. Graph Codes represent the labels of feature graph nodes and edges in the form of vocabulary terms. Based on the adjacency matrix of such a feature graph, these are used as row and column indices. The elements of the matrix represent the relationships between the vocabulary terms. The type of the edge in the graph is encoded as the value of the matrix element. Figure 2 illustrates a simple example of a multimedia feature graph (see Figure 2a), a detailed section from the graph (Figure 2b), its corresponding GC in a table representation (Figure 2c), and the GC matrix

G C_{e x}

(Figure 2d).

Figure 2. Multimedia features represented as a Graph Code index (a–d) ^[25].

GCs contain a dictionary

d i c t_{G C}

of feature vocabulary terms (

F V T

) and represent the relationships between these terms using the matrix field

m_{i, j}

. A similarity metric triple

M_{G C} = (M_{F}, M_{F R}, M_{R T})

has been defined for GCs. The feature metric

M_{F}

is based on the vocabulary, the feature relationship metric

M_{F R}

is based on the possible relationships, and the feature relationship type metric

M_{R T}

is based on the actual relationship types.

Semantic Graph Codes (SGCs) ^[20] are an extension of GCs that incorporate additional semantic structures using annotation with semantic systems such as RDF, RDFS, ontologies, or Knowledge Organization Systems ^[26]^[27]^[28]. This additional semantic information can help to bridge the semantic gap between the technical representations of multimedia feature graphs and their human-understandable meanings.

With the introduction of semantics to the MMFGs in ^[21], the researchers introduced additional metrics to improve the efficiency and effectiveness of Graph Codes for MMIR. First, the feature discrimination

M_{D I S}

is defined as the difference in the number of nonzero Graph Code fields for two feature vocabulary terms of a given Graph Code or Semantic Graph Code. TFIDF ^[29] is a numerical statistic used in natural language processing to evaluate the importance of a word in a document. An adapted TFIDF measure for Graph Codes can use

M_{D I S}

to reveal how representative a term is for a single document—in this case, an SGC. The Semantic Graph Code collection corresponds to the TFIDF documents. With

M_{R E L}

, it can be used to define a threshold for a collection to exclude less relevant features from the retrieval process. Alternatively,

M_{R E L}

can be used to weight terms according to the use case. This requires the pre-processing of the Graph Codes by removing the non-relevant vocabulary terms. This step needs to be performed when the relevance threshold is changed, or whenever a multimedia content object is added to or removed from the collection.

Introduced in ^[22], the basic algorithm for the comparison and order of Graph Codes in a collection is listed in pseudocode below.

for each GC in collection

--- parallelize ---

calculate the intersection matrices

of GC_Query and~GC

--- parallelize each ---

calculate M_F of GC_Query and GC

calculate M_FR of GC_Query and GC

calculate M_RT of GC_Query and GC

--- end parallelize each ---

compare

--- end parallelize~---

order result list according to

value of M_F

value of M_FR where M_F is equal

value of M_RT where M_F and M_FR are equal

return result list

Experiments with a proof-of-concept (POC) implementation of the algorithm showed that parallel instances can process individual parts of a Graph Code collection in less time on multicore CPUs compared to a single instance. Compared to a single instance with an execution time of 635 s, 16 instances could process the same volume of data in 65 s ^[30].

5. Parallel Computing

The scaling of algorithm processing can either be achieved by higher-performance computing resources or by executing parts of the algorithm in parallel. Higher performance is usually an upgrade of hardware, which is called vertical scaling. Parallel computing can be achieved in many ways. In a single computer, multiple processing units can exist in the form of multicore Central Processing Units (CPUs) ^[31] or coprocessors such as Graphics Processing Units (GPUs) ^[32] or Field-Programmable Gate Arrays (FPGAs) ^[33]. While multicore CPUs work in Multi-Instructions Multiple Data (MIMD) ^[34] fashion, GPUs usually work in a Single-Instruction Multiple Data (SIMD) ^[34] method. Although MIMD is suitable for general purpose computing, it is limited for massive parallelization ^[35] (p. 181). SIMD is optimal for massive parallelization, but only in cases where the same instructions are being applied to the data. Both concepts can be found in modern processors, e.g., Apple M-series ^[36] and A-series ^[37], Nvidia G200 ^[35] (p. xii), CPU AVX extension ^[38]. Additionally, systems can contain multiple processing units, such as multi GPUs. State-of-the-art approaches ^[39]^[40]^[41] mainly apply GPUs for performance improvements.

Another option for in-system parallelization is distributed computing. Instead of spreading the operations on the data to different processing units in the system, the data and the instructions are distributed to many systems. This is called horizontal scaling. Many frameworks are available to support coordination in distributed computing. Frameworks such as Hadoop ^[42] or TensorFlow ^[43] can be used to coordinate parallel execution on large clusters of computers.

On a high level, a task to parallelize can be classified as Task-Level Parallelization (TLP) ^[31] or Data-Level Parallelization (DLP) ^[31]. While tasks in TLP can be general purpose and very different from each other, in DLP, the same operation is applied to different data, similar to a matrix multiplication. Given Flynn’s taxonomy ^[34], for the parallel computation of multiple data, SIMD or MIMD processors can be used. TLP works well with MIMD processors, while, for DLP, SIMD processors are more suitable. As mentioned above, modern processors cannot simply be categorized as SIMD or MIMD, because they often have features of both categories. Multi-Core CPUs such as an Intel i9 ^[38] work as MIMD but have SIMD extensions such as AVX. GPUs operate as SIMD, but Nvidia CUDA ^[44] GPUs can also operate as MIMD. Hence, a detailed analysis is needed to find the most suitable processor for a certain task.

To take advantage of the potential for parallel computing, applications and algorithms need to be modified. The method of algorithm decomposition ^[45] (p. 95) can be applied to identify sections that can run in parallel. Recursive decomposition searches for options for a divide-and-conquer approach. Data decomposition looks for parts of the algorithm that apply the same operation to parts of the data to be processed. Further techniques exist, and they can be applied in hybrid. The result of the algorithm decomposition is a Task Dependency Graph (TDG). A TDG is a directed acyclic graph that signifies the execution process of a task-oriented application. In this graph, the algorithm’s tasks are depicted as nodes, while edges symbolize the interdependencies among tasks. This relationship denotes that a task can only commence its execution once its preceding tasks, represented by incoming edges, have been successfully completed. The TDG can be used to organize an algorithm for the targeted system.

According to ^[46], the efficiency gain of parallel executions is defined as the speedup S, which is the ratio of the sequential execution time

t_{s}

and the execution time on n processors

t_{p}

Amdahl’s ^[47] and Gustafson’s ^[48] laws can be used to calculate the theoretical speedup, based on the fractions of the program, which can be parallelized or not, and the number of execution units. According to Gustafson’s law, the speedup S on 10 cores N would be

9.1

However, the following decomposition of the algorithm to calculate and order the Graph Code metrics will produce a TDG. The characteristics of the tasks can indicate which execution model provides the best acceleration yields.

References

Richter, F. Smartphones Cause Photography Boom. 2017. Available online: https://www.statista.com/chart/10913/number-of-photos-taken-worldwide/ (accessed on 29 January 2023).
DCTV Productions Comparing Streaming Services. (1 January 2022). Available online: http://web.archive.org/web/20220101112312/https://dejaview.news/comparing-streaming-services/ (accessed on 21 January 2023).
DCTV Productions Comparing Streaming Services. (21 August 2022). Available online: http://web.archive.org/web/20220828145828/https://dejaview.news/comparing-streaming-services/ (accessed on 21 January 2023).
Demner-Fushman, D.; Antani, S.; Simpson, M.; Thoma, G. Design and Development of a Multimodal Biomedical Information Retrieval System. J. Comput. Sci. Eng. 2012, 6, 168–177.
National Library of Medicine. What Is Open-i? Available online: https://openi.nlm.nih.gov/faq#collection (accessed on 21 January 2023).
Jenik, C. A Minute on the Internet in 2021. Statista. 2022. Available online: https://www.statista.com/chart/25443/estimated-amount-of-data-created-on-the-internet-in-one-minute/ (accessed on 17 October 2022).
Meta Platforms Ireland Limited. Instagram Homepage. 2022. Available online: https://www.instagram.com/ (accessed on 21 January 2023).
Google. YouTube. Available online: http://www.youtube.com (accessed on 21 January 2023).
Cloud Computing—Wikipedia. Page Version ID: 1128212267. 2022. Available online: https://en.wikipedia.org/w/index.php?title=Cloud_computing&oldid=1128212267 (accessed on 26 December 2022).
Big Data—Wikipedia. Page Version ID: 1126395551. 2022. Available online: https://en.wikipedia.org/w/index.php?title=Big_data&oldid=1126395551 (accessed on 16 December 2022).
Machine Learning—Wikipedia. Page Version ID1128287216. 2022. Available online: https://en.wikipedia.org/w/index.php?title=Machine_learning&oldid=1128287216 (accessed on 19 December 2022).
Deep Learning. Wikipedia. Page Version ID1127713379. 2022. Available online: https://en.wikipedia.org/w/index.php?title=Deep_learning&oldid=1127713379 (accessed on 16 December 2022).
Dasiopoulou, S.; Mezaris, V.; Kompatsiaris, I.; Papastathis, V.; Strintzis, M. Knowledge-Assisted Semantic Video Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2005, 15, 1210–1224. Available online: http://ieeexplore.ieee.org/document/1512239/ (accessed on 21 January 2023).
Raieli, R. Multimedia Information Retrieval: Theory and Techniques; Chandos Publishing: Cambridge, UK, 2013; ISBN 978-1843347224.
CXL.com. Reduce Your Server Response Time for Happy Users, Higher Rankings. 2021. Available online: https://cxl.com/blog/server-response-time/ (accessed on 12 October 2021).
Singhal, A. Modern information retrieval: A brief overview. IEEE Data Eng. Bull. 2001, 24, 35–43.
Davies, J.; Studer, R.; Warren, P. Semantic Web technologies: Trends and Research in Ontology-Based Systems; John Wiley & Sons: Hoboken, NJ, USA, 2006; OCLC: ocm64591941.
Wagenpfeil, S.; McKevitt, P.; Hemmje, M. AI-Based Semantic Multimedia Indexing and Retrieval for Social Media on Smartphones. Information 2021, 12, 43.
Gurski, F.; Komander, D.; Rehs, C. On characterizations for subclasses of directed co-graphs. J. Comb. Optim. 2021, 41, 234–266.
Wagenpfeil, S.; Hemmje, M. Towards AI-based Semantic Multimedia Indexing and Retrieval for Social Media on Smartphones. In Proceedings of the 15th International Workshop on Semantic and Social Media Adaptation And Personalization (SMA), Zakynthos, Greece, 29–30 October 2020; pp. 1–9.
Wagenpfeil, S.; Mc Kevitt, P.; Cheddad, A.; Hemmje, M. Explainable Multimedia Feature Fusion for Medical Applications. J. Imaging 2022, 8, 104.
Wagenpfeil, S.; Vu, B.; Mc Kevitt, P.; Hemmje, M. Fast and Effective Retrieval for Large Multimedia Collections. Big Data Cogn. Comput. 2021, 5. Available online: https://www.mdpi.com/2504-2289/5/3/33 (accessed on 11 October 2022).
Wagenpfeil, S.; McKevitt, P.; Hemmje, M. Graph Codes-2D Projections of Multimedia Feature Graphs for Fast and Effective Retrieval. ICIVR. 2021. Available online: https://publications.waset.org/vol/180 (accessed on 2 February 2022).
Sciencedirect.com. Adjacency Matrix. 2020. Available online: https://www.sciencedirect.com/topics/mathematics/adjacency-matrix (accessed on 3 April 2023).
Wagenpfeil, S.; Mc Kevitt, P.; Hemmje, M. Towards Automated Semantic Explainability of Multimedia Feature Graphs. Information 2021, 12, 502. Available online: https://www.mdpi.com/2078-2489/12/12/502. (accessed on 3 January 2023).
Asim, M.N.; Wasim, M.; Ghani Khan, M.U.; Mahmood, N.; Mahmood, W. The Use of Ontology in Retrieval: A Study on Textual. IEEE Access 2019, 7, 21662–21686.
Domingue, J.; Fensel, D.; Hendler, J.A. (Eds.) Introduction to the Semantic Web Technologies. In Handbook of Semantic Web Technologies; SpringerLink: Berlin, Germany, 2011.
W3C. SKOS Simple Knowledge Organisation System. 2021. Available online: https://www.w3.org/2004/02/skos/ (accessed on 2 February 2022).
Silge, J.; Robinson, D. Text Mining with R: A Tidy Approach. (O’Reilly, 2017). OCLC: ocn993582128. Available online: https://www.tidytextmining.com/tfidf.html (accessed on 20 March 2023).
Wagenpfeil, S. Smart Multimedia Information Retrieval. (University of Hagen, 2022). Available online: https://nbn-resolving.org/urn:nbn:de:hbz:708-dh11994 (accessed on 9 February 2023).
Rauber, T.; Rünger, G. Parallel Programming; Springer: Berlin/Heidelberg, Germany, 2013; Section 3.3; pp. 98–112.
Kirk, D.; Hwu, W. Programming Massively Parallel Processors: A Hands-On Approach; Elsevier, Morgan Kaufmann: Amsterdam, The Netherlands, 2013.
Tanenbaum, A. Structured Computer Organization; Pearson Prentice Hall: Hoboken, NJ, USA, 2006; OCLC: Ocm57506907.
Flynn, M. Very high-speed computing systems. Proc. IEEE 1966, 54, 1901–1909. Available online: http://ieeexplore.ieee.org/document/1447203/ (accessed on 17 December 2021).
Keckler, S.; Hofstee, H.; Olukotun, K. Multicore Processors and Systems; Springer: Berlin/Heidelberg, Germany, 2009.
Apple Inc. M1 Pro and M1 Max. 2021. Available online: https://www.apple.com/newsroom/2021/10/introducing-m1-pro-and-m1-max-the-most-powerful-chips-apple-has-ever-built/ (accessed on 29 January 2023).
Wikipedia. Apple A14. 2022. Available online: https://en.wikipedia.org/wiki/Apple_A14 (accessed on 12 January 2022).
Intel Deutschland GmbH. Intel® Core™ i9-12900KF Processor. Available online: https://www.intel.de/content/www/de/de/products/sku/134600/intel-core-i912900kf-processor-30m-cache-up-to-5-20-ghz/specifications.html (accessed on 18 December 2021).
Harish, P.; Narayanan, P. Accelerating large graph algorithms on the GPU using CUDA. In Proceedings of the High Performance Computing—HiPC 14th International Conference, Goa, India, 18–21 December 2007; pp. 197–208.
Johnson, J.; Douze, M.; Jégou, H. Billion-scale similarity search with GPUs. arXiv 2017, arXiv:1702.08734.
Kusamura, Y.; Kozawa, Y.; Amagasa, T.; Kitagawa, H. GPU Acceleration of Content-Based Image Retrieval Based on SIFT Descriptors. In Proceedings of the 19th International Conference On Network-Based Information Systems (NBiS), Ostrava, Czech Republic, 7–9 September 2016; pp. 342–347. Available online: http://ieeexplore.ieee.org/document/7789781/ (accessed on 4 January 2022).
Apache™ Hadoop® Project Apache Hadoop. Available online: https://hadoop.apache.org/ (accessed on 13 January 2022).
Google Ireland Limited. TensorFlow Home Page. 2022. Available online: https://www.tensorflow.org/ (accessed on 18 December 2022).
NVIDIA CUDA-Enabled Products. CUDA Zone. Available online: https://developer.nvidia.com/cuda-gpus (accessed on 18 December 2022).
Grama, A. Introduction to Parallel Computing; Addison-Wesley: Boston, MA, USA, 2003.
Rauber, T.; Rünger, G. Parallel Programming; Springer: Berlin/Heidelberg, Germany, 2013; Section 4.2.1; pp. 162–164.
Amdahl, G. Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, Reprinted from the AFIPS Conference Proceedings, Vol. 30 (Atlantic City, N.J., Apr. 18–20), AFIPS Press, Reston, Va., 1967, pp. 483–485, When Dr. Amdahl Was at International Business Machines Corporation, Sunnyvale, California. IEEE Solid-State Circuits Newsl. 2007, 12, 19–20. Available online: http://ieeexplore.ieee.org/document/4785615/ (accessed on 27 March 2022).
Gustafson, J. Reevaluating Amdahl’s law. Commun. ACM 1988, 31, 532–533.

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply. By using this site, you agree to the Terms and Conditions and Privacy Policy.

Upload a video for this entry

Information

Subjects: Computer Science, Theory & Methods

Contributors MDPI registered users' name will be linked to their SciProfiles pages. To register with us, please refer to https://encyclopedia.pub/register :

Patrick Steinert

Stefan Wagenpfeil

Paul Mc Kevitt

Ingo Frommholz

Matthias Hemmje

View Times: 107

Update Date: 19 Jan 2024

Table of Contents

Video Upload Options

Confirm