Spatial Keyword Group Query: Comparison
Please note this is a comparison between Version 1 by Jing Li and Version 2 by Sirius Huang.

Location-based services have been commonly used in daily life. Spatial keyword query, as one of the important technologies in location-based services, has been widely used in many fields, such as intelligent navigation systems and spatial positioning systems. Different types of spatial keyword query problems have been studied in depth by scholars.基于位置的服务在日常生活中已普遍使用。空间关键词查询作为基于位置的服务中的重要技术之一,已广泛应用于智能导航系统、空间定位系统等多个领域。学者们对不同类型的空间关键词查询问题进行了深入研究。

  • time-aware
  • exclusion keywords
  • Huffman coding
  • group query
  • spatial keyword query

. Introduction

With the rapid development of Internet technology and sensor technology in recent years, location-based services have been commonly used in daily life. In particular, spatial keyword query, as one of the important technologies in location-based services, has been widely used in many fields, such as intelligent navigation systems and spatial positioning systems. Different types of spatial keyword query problems have been studied in depth by scholars at home and abroad, such as spatial keyword nearest neighbor query problems [1][2][3], TOP-k spatial keyword queries [4][5][6][7][8], inverse nearest neighbor queries [9][10][11][12] and spatial keyword group queries [13][14][15]. In the spatial database, a number of spatial–text objects, also known as points of interest (POI), are stored. A spatial keyword query returns one or some points of interest, which need to meet the various requirements of the query and be the best distance. Among the various branches of spatial keyword queries, the application of spatial keyword group queries in daily life has gradually spread in recent years and attracted the attention of many scholars. For example, when a tourist travels to a certain city, planning the desired location of the hotel needs to include activities such as eating, visiting the park, watching movies and so on for a period of time.
Temporal information is also an indispensable consideration in the field of spatial keyword queries. Since each POI such as stores or parks has its own opening hours, users cannot access the POI during non-opening hours. In the query system or recommendation system, different POIs have different opening hours, for example, the breakfast store is open from 6:30 to 10:00 and the shopping mall is open from 9:30 to 20:00, so for the different needs of different users, it is necessary to return different POI objects. Specifically, when the user wants to eat breakfast at 7:00, the query system should return the above breakfast store instead of the mall. Although there may be restaurants in the mall, the mall’s time does not meet the user’s needs. However, most researchers have focused on textual constraints and spatial constraints and have not considered the impact of temporal constraints in daily life.
Furthermore, with the development of science and technology in society, the needs of users are also increasing. For spatial keyword queries, users may not only want to query to meet the required keyword objects, but may also hope that, in the query process, they can be excluded from containing the user’s exclusion keyword objects. Moreover, some users around the world also have many exclusionary matters due to their religious beliefs and other reasons. However, few studies on spatial keyword querying have considered users’ exclusion preferences. Therefore, spatial keyword querying with exclusion keywords has great research value.

2. Spatial Keyword Group Query空间关键词组查询

The spatial keyword group query problem 空间关键字组查询问题[13][14][15], as a special for作为空间关键词查询的一种特殊形式,在现实生活中有着共同的应用。m of spatial keyword query, has common applications in real life. The concept of the mCK (m-closest keywords) query was first proposed in the literature CK(m-最接近的关键字)查询的概念最早是在文献中提出的[16]. Given a set of query keywords, the 给定一组查询关键字,mCK query finds m interest points and the keyword information of this set of interest points jointly covers the query keywords and minimizes the maximum pairwise distance of the objects in the group. However, Zhang et al. 查询查找m个兴趣点,这组兴趣点的关键字信息共同覆盖查询关键字,最小化组中对象的最大成对距离。然而,Zhang等人。[16] only focused on the fact that each object contains only one query keyword and its proposed exact algorithm has low adaptability for large data sets. Considering the possibility of multiple keywords, 只关注每个对象只包含一个查询关键字的事实,其提出的精确算法对大型数据集的适应性较低。考虑到多个关键词的可能性,Guo et al. [17]放宽了每个对象只有一个关键字的约束,并提出了因子为 relaxed the constraint of having only one keyword per object and proposed an approximation algorithm with a factor of (1.15 + ε). Deng et al) 的近似算法。邓等. [18]提出了一个最佳关键字覆盖率问题,这是 presented an optimal keyword coverage problem, which is a variant of the mCK query. This study considered both intra-group distance and keyword weights and incorporated both factors into a linear cost function, thereby proposing an accurate algorithm to solve this problem. Since the results of mCK queries do not provide sufficient support in some application scenarios, a new spatial keyword query, called the mCK 查询的变体。本研究同时考虑了组内距离和关键词权重,并将这两个因素合并到线性成本函数中,从而提出了一种精确的算法来解决这个问题。由于mCK查询的结果在一些应用场景中没有提供足够的支持,文献中提出了一种新的空间关键词查询,称为SK-Cover query (查询(spatial keyword cover query), was proposed in the literature [19].此查询考虑了结果集中的对象数,并为此查询问题放置了时间复杂度为 This query considered the number of objects in the results set and put an approximate algorithm with a time complexity of O (O (logm) for this query problem. Effective access policies and pruning rules were laid out to improve the efficiency and scalability of the algorithm in 的近似算法。制定了有效的访问策略和剪枝规则,以提高算法的效率和可扩展性。[19]. Subsequently, 随后,Li et al. [20] studied spatial keyword group queries and proposed a parametric approximation algorithm that allows the approximation ratio to be adaptive and the user to assign arbitrary query precision.研究了空间关键字组查询,并提出了一种参数化近似算法,该算法允许近似比率自适应,用户分配任意查询精度。 The literature 文献[13][14][15][16][17][18][19][20] are traditional spatial keyword group queries, focusing only on the study of keyword constraints and distance constraints. Nevertheless, some researchers have extended conventional spatial keyword group queries to some extent. 是传统的空间关键词组查询,只关注关键词约束和距离约束的研究。然而,一些研究人员在一定程度上扩展了传统的空间关键字组查询。Singh et al. 等人。[21] considered objects labeled with keywords and embedded in a vector space to study the multi-objective query problem in a multi-dimensional space for the closest set of points satisfying a given set of keywords. 考虑用关键字标记并嵌入到向量空间中的对象,以研究多维空间中的多目标查询问题,以获得满足给定关键字集的最接近的点集。在低维空间中,In a low-dimensional space, IR trees are mainly used to support query processing. But in a high-dimensional space (with dimensions larger than 10), different MBRs (minimum bounding rectangles) of IR trees can contain a large amount of overlapping data, resulting in a reduced index performance. So, this literature studied indexing mechanisms and query algorithms in a high-dimensional space, using random projection and hash-based indexing structures, and achieved high scalability and speedup ratios. The CoSKQ (collective spatial keyword query) query model introduces query points based on mCK, and Gao, Cao and Zhao et al. R 树主要用于支持查询处理。但在高维空间(维数大于 10)中,IR 树的不同 MBR(最小边界矩形)可能包含大量重叠数据,导致索引性能降低。因此,本文研究了高维空间中的索引机制和查询算法,使用随机投影和基于哈希的索引结构,并实现了高可扩展性和加速率。CoSKQ(集体空间关键词查询)查询模型引入了基于mCK的查询点,Gao、Cao和Zhao等人。[22][23][24] studied the 研究了路网空间中的CoSKQ query model in a road network space. Unlike most studies, Zhao et al. 查询模型。与大多数研究不同,Zhao等人。[24] proposed a popularity-aware aggregated keyword in road networks, aiming to find a set of popular 在道路网络中提出了一个流行感知聚合关键词,旨在找到一组流行POIs (i.e., a popularity region). The POIs cover the query keywords and satisfy the distance requirements from each node to the query node and between each node pair, such that the sum of the scores of these nodes for the query keywords is maximized. For this reason, a scaling technique for scoring was raised to reduce the search space, and a redundant computation reduction technique was proposed to reduce the redundant computations in query processing. Su et al(即一个流行区域)。POI覆盖了查询关键字,满足了从每个节点到查询节点之间以及每个节点对之间的距离要求,从而使查询关键字的这些节点的分数之和最大化。为此,提出了一种用于评分的缩放技术来减少搜索空间,并提出了一种冗余计算约简技术来减少查询处理中的冗余计算。Su 等. [25] also studied ensemble spatial keyword queries in a road network environment, but the difference is that it is based on group-based ensemble queries, i.e., 还研究了路网环境下的集成空间关键词查询,但不同的是它基于基于组的集成查询,即GBCK (group-based collective keyword). Considering the importance of the keyword levels for decision support, Zhang et al. )。考虑到关键词水平对决策支持的重要性,Zhang等人。[26] proposed the 提出了Level Aware Set Space Keyword Query (LCSK) to find a set of POIs that jointly covers the query keywords with threshold constraints and minimal spatial distance cost. An exact algorithm and approximation algorithm with provable approximations related to this problem were designed for the LCSK. To better express more fine-grained preferences for cost-aware and distance-constrained keyword queries in the set space, Chan et al. (LCSK),以找到一组共同覆盖具有阈值约束和最小空间距离成本的查询关键字的POI。为LCSK设计了一种与该问题相关的精确算法和具有可证明近似的近似算法。为了更好地表达在设定空间中对成本感知和距离约束的关键字查询的更细粒度的偏好,Chan 等人。[27]提出了新的准则,优化了成本函数,设计了精确近似算法。徐, proposed new criteria, optimized the cost function and designed an exact and approximate algorithm. Xu et al. [28] differed from other works in that it studied mobile set space keyword queries from the dynamics of query points and raised two approximation algorithms based on the safety zone technique.与其他工作不同的是,它从查询点的动力学角度研究了移动集空间关键字查询,并提出了两种基于安全区技术的近似算法。 考虑到时态信息在空间关键词查询中的重要性,一些研究者提出了具有时态感知的查询模型。Considering the importance of temporal information in spatial keyword queries, some researchers have proposed query models with temporal awarenesshen 等. Chen et al. [29] first studied temporally aware Boolean keyword queries, known as a 首先研究了时间感知布尔关键字查询,称为TABSKQ (time-aware Boolean spatial keyword query), and proposed an efficient index structure TA-tree and its corresponding query algorithm for this problem model. The index can effectively prune the search space and take into account both keyword information and temporal information. Chen et al),并针对该问题模型提出了一种高效的索引结构TA树及其对应的查询算法。索引可以有效地修剪搜索空间,并同时考虑关键字信息和时间信息。Chen 等. [30] proposed two evaluation functions for time-aware aggregate keyword queries to meet different types of query requirements and both proposed corresponding processing algorithms. 提出了两种用于时间感知聚合关键字查询的评估函数,以满足不同类型的查询需求,并都提出了相应的处理算法。Chan et al. 等人。[31] studied the model of indoor keyword routing with time constraints, i.e., the 研究了具有时间约束的室内关键词路由模型,即TIKRQ problem (time-constrained indoor keyword-aware routing query), and proposed a series of pruning rules and corresponding solution algorithms. Considering that most of the existing works cannot solve the spelling error cases and thus focus on the time-aware approximate set keyword search in traffic networks, Feng et al. 问题(时间约束的室内关键词感知路由查询),并提出了一系列剪枝规则和相应的求解算法。考虑到现有著作大多无法解决拼写错误问题,因此主要集中在交通网络中的时间感知近似集关键词搜索上,Feng等人认为。[32] proposed a 提出了一种用于查询对象距离剪枝的TDAG-tree index for the distance pruning of query objects and designed two approximation algorithms to improve the processing efficiency to a large extent. 树索引,并设计了两种近似算法,以在很大程度上提高处理效率。

References

  1. Zhang, L.; Li, S.; Guo, Y.; Hao, X. A Method for k Nearest Neighbor Query of Line Segment in Obstructed Spaces. J. Inf. Process. Syst. 2020, 16, 406–420.
  2. Zhang, L.; Ren, L.; Hao, X.; Li, S. Query Method for Nearest Region of Spatial Line Segment Based on Hilbert Curve Grid. Int. J. Innov. Comput. Inf. Control 2019, 15, 1287–1307.
  3. Yang, R.; Niu, B. Continuous k Nearest Neighbor Queries over Large-Scale Spatial–Textual Data Streams. ISPRS Int. J. Geo-Inf. 2020, 9, 694.
  4. Li, S.; Hu, Y.; Hao, X.; Zhang, L.; Hao, Z. Approximate k-Nearest Neighbor Query of High Dimensional Data Based on Dimension Grouping and Reducing. J. Comput. Res. Dev. 2021, 58, 609–623.
  5. Dias de Almeida, J.P.; Durão, F.A. Personalizing the Top-k Spatial Keyword Preference Query with Textual Classifiers. Expert Syst. Appl. 2020, 162, 113841.
  6. Wang, J.; Xiong, Z.; Han, Q.; Han, X.; Yang, D. Top-k Socially Constrained Spatial Keyword Search in Large SIoT Networks. IEEE Internet Things J. 2022, 9, 9280–9289.
  7. Zhang, D.; Tan, K.-L.; Tung, A.K.H. Scalable Top-k Spatial Keyword Search. In Proceedings of the 16th International Conference on Extending Database Technology—EDBT’13, Genoa, Italy, 18–22 May 2013; ACM Press: Genoa, Italy, 2013; p. 359.
  8. Pan, X.; Yu, Q.-D.; Ma, A.; Sun, Y.; Wu, L.; Guo, J. Efficient algorithm of top-k spatial keyword search with OR semantics. J. Softw. 2020, 31, 3197–3215.
  9. Cheong, O.; Vigneron, A.; Yon, J. Reverse Nearest Neighbor Queries in Fixed Dimension. Int. J. Comput. Geom. Appl. 2011, 21, 179–188.
  10. Allheeib, N.; Islam, M.S.; Taniar, D.; Shao, Z.; Cheema, M.A. Density-Based Reverse Nearest Neighbourhood Search in Spatial Databases. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 4335–4346.
  11. Islam, M.S.; Shen, B.; Wang, C.; Taniar, D.; Wang, J. Efficient Processing of Reverse Nearest Neighborhood Queries in Spatial Databases. Inf. Syst. 2020, 92, 101530.
  12. Pan, X.; Nie, S.; Hu, H.; Yu, P.S.; Guo, J. Reverse Nearest Neighbor Search in Semantic Trajectories for Location-Based Services. IEEE Trans. Serv. Comput. 2022, 15, 986–999.
  13. Cao, X.; Cong, G.; Jensen, C.S.; Ooi, B.C. Collective Spatial Keyword Querying. In Proceedings of the 2011 International Conference on Management of Data—SIGMOD’11, Athens, Greece, 12–16 June 2011; ACM Press: Athens, Greece, 2011; p. 373.
  14. Song, X.; Xu, J.; Zhou, R.; Liu, C.; Zheng, K.; Zhao, P.; Falkner, N. Collective Spatial Keyword Search on Activity Trajectories. Geoinformatica 2020, 24, 61–84.
  15. Chan, H.K.-H.; Long, C.; Wong, R.C.-W. On Generalizing Collective Spatial Keyword Queries. IEEE Trans. Knowl. Data Eng. 2018, 30, 1712–1726.
  16. Zhang, D.; Chee, Y.M.; Mondal, A.; Tung, A.K.H.; Kitsuregawa, M. Keyword Search in Spatial Databases: Towards Searching by Document. In Proceedings of the 2009 IEEE 25th International Conference on Data Engineering, Shanghai, China, 29 March–2 April 2009; IEEE: Shanghai, China, 2009; pp. 688–699.
  17. Guo, T.; Cao, X.; Cong, G. Efficient Algorithms for Answering the M-Closest Keywords Query. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, 31 May–4 June 2015; ACM: Melbourne, Australia, 2015; pp. 405–418.
  18. Deng, K.; Li, X.; Lu, J.; Zhou, X. Best Keyword Cover Search. IEEE Trans. Knowl. Data Eng. 2015, 27, 61–73.
  19. Choi, D.-W.; Pei, J.; Lin, X. Finding the Minimum Spatial Keyword Cover. In Proceedings of the 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 16–20 May 2016; IEEE: Helsinki, Finland, 2016; pp. 685–696.
  20. Li, J.; Xu, M. A Parametric Approximation Algorithm for Spatial Group Keyword Queries. IDA 2021, 25, 305–319.
  21. Singh, V.; Zong, B.; Singh, A.K. Nearest Keyword Set Search in Multi-Dimensional Datasets. IEEE Trans. Knowl. Data Eng. 2016, 28, 741–755.
  22. Gao, Y.; Zhao, J.; Zheng, B.; Chen, G. Efficient Collective Spatial Keyword Query Processing on Road Networks. IEEE Trans. Intell. Transport. Syst. 2016, 17, 469–480.
  23. Cao, X.; Chen, L.; Cong, G.; Xiao, X. Keyword-Aware Optimal Route Search. Proc. VLDB Endow. 2012, 5, 1136–1147.
  24. Zhao, S.; Cheng, X.; Su, S.; Shuang, K. Popularity-Aware Collective Keyword Queries in Road Networks. Geoinformatica 2017, 21, 485–518.
  25. Su, S.; Zhao, S.; Cheng, X.; Bi, R.; Cao, X.; Wang, J. Group-Based Collective Keyword Querying in Road Networks. Inf. Process. Lett. 2017, 118, 83–90.
  26. Zhang, P.; Lin, H.; Yao, B.; Lu, D. Level-Aware Collective Spatial Keyword Queries. Inf. Sci. 2017, 378, 194–214.
  27. Chan, H.K.-H.; Liu, S.; Long, C.; Wong, R.C.-W. Cost-Aware and Distance-Constrained Collective Spatial Keyword Query. IEEE Trans. Knowl. Data Eng. 2021, 35, 1324–1336.
  28. Xu, H.; Gu, Y.; Sun, Y.; Qi, J.; Yu, G.; Zhang, R. Efficient Processing of Moving Collective Spatial Keyword Queries. VLDB J. 2020, 29, 841–865.
  29. Chen, G.; Zhao, J.; Gao, Y.; Chen, L.; Chen, R. Time-Aware Boolean Spatial Keyword Queries. IEEE Trans. Knowl. Data Eng. 2017, 29, 2601–2614.
  30. Chen, Z.; Zhao, T.; Liu, W. Time-Aware Collective Spatial Keyword Query. ComSIS 2021, 18, 1077–1100.
  31. Chan, H.K.-H.; Liu, T.; Li, H.; Lu, H. Time-Constrained Indoor Keyword-Aware Routing. In Proceedings of the 17th International Symposium on Spatial and Temporal Databases, online, 23 August 2021; pp. 74–84.
  32. Feng, Z.; Jin, C.; Kim, H.; Cui, X. Time-Aware Approximate Collective Keyword Search in Traffic Networks. Knowl.-Based Syst. 2021, 229, 107367.
More
Video Production Service