There are several definitions for “geosocial network” or “location-based social network”: the first formal definition was given by Quercia et al. in 2010, who defined it as “a type of social networking in which geographic services and capabilities such as geocoding and geotagging are used to enable additional social dynamics”. One year later, Zheng refined this definition by stating that “a location-based social network (LBSN) does not only mean adding a location to an existing social network so that people in the social structure can share location embedded information but also consists of the new social structure made up of individuals connected by the interdependency derived from their locations in the physical world as well as their location-tagged media content, such as photos, video, and texts”. In 2013, Roick and Heuser defined LBSNs simply as “social network sites that include location information into shared contents”. Finally, one most recent definition is given by Armenatzoglou and Papadias and is the following: “geosocial network (GeoSN) is an online social network augmented by geographical information”.
1. Definitions of LBSN or Geosocial Networks
There are several definitions for “geosocial network” or “location-based social network”: the first formal definition was given by Quercia et al.
[1] in 2010, who defined it as “a type of social networking in which geographic services and capabilities such as geocoding and geotagging are used to enable additional social dynamics”. One year later, Zheng
[2] refined this definition by stating that “a location-based social network (LBSN) does not only mean adding a location to an existing social network so that people in the social structure can share location embedded information but also consists of the new social structure made up of individuals connected by the interdependency derived from their locations in the physical world as well as their location-tagged media content, such as photos, video, and texts”. In 2013, Roick and Heuser
[3] defined LBSNs simply as “social network sites that include location information into shared contents”. Finally, one most recent definition is given by Armenatzoglou and Papadias
[4] and is the following: “geosocial network (GeoSN) is an online social network augmented by geographical information”.
From the above definitions, it is evident that the peculiarity of LBSNs is the coupling of geographical information/services with social network sites that allow LBNSN users to benefit from the communication and sharing functionalities provided by social networks, enhanced with geographic positions of users to locate contents, people, and activities in a physical space.
To model both the social and geographical relationships in it, a LBSN is often represented through a multilevel geosocial model, with a geosocial graph G(V, E); i.e., an undirected graph with vertex set V and edge set E. Each vertex
v ∈
V represents a user and has one or more spatial locations (
v.xi,
v.yi) with 1 ≤
i ≤
n in the two-dimensional space associated with the
n locations visited by the corresponding user, and has one or more geo-located media content
mj(
v.xi,
v.yi) with 1 ≤
j ≤
p associated to the
ith location visited by the corresponding user. Each edge
e = (
u,
v) ∈
E denotes a relationship (e.g., friendship, common interest, shared knowledge, etc.) between two users
v and
u ∈
V. A graphical representation of a geosocial graph G(V, E) representing an LBSN is given in
Figure 1. Three layers can be differentiated, as also suggested by Gao and Liu
[5]. The first layer, named
social layer, contains the users of the LBSN and the relationships among them. The second layer, named
location or
geographical layer, consists of the geographical information in the two-dimensional space associated with the locations visited by the users. The last layer, named
media content layer, contains information about the media contents produced/shared by the users when visiting the locations.
Figure 1. Multilevel geosocial model representing an LBSN with the three layers associated.
2. The Process of Querying Geosocial Data
To process the geosocial queries, different kinds of query primitives are defined in the literature as fundamental operations that can be further combined to answer a wide range of general-purpose geosocial queries. As suggested by Saleem et al.
[6], these kinds of query primitives can be grouped in three categories according to the layer of the geosocial graph that is exploited by the query primitive: social query primitives that exploit the data over the social graph, spatial query primitives that exploit the data over the spatial graph, and activity query primitives that exploit the data over the media content graph. A brief description of the query primitives used in geosocial query processing literature is provided in
Table 12.
Table 12. Query primitives.
Primitive |
Description |
Filter |
Removes some vertices or edges from the graph that do not satisfy a selection condition. |
Partitioning |
Compute a partition of the vertex set into n parts of size c. |
Clustering |
Partition the vertex set into a certain number of clusters so that vertices in the same cluster should be similar to each other, |
Pruning |
Simplify a graph by reducing the number of edges while preserving the maximum path quality metric for any pair of vertices in the graph. |
In addition to the query primitives, several basic heuristics or algorithms are applied to retrieve the geosocial data. Some examples found in the literature on geosocial querying are:
-
Best-first search algorithm: it allows to explore paths to search in the geosocial graphs by using an evaluation function to decide which among the various available nodes is the most promising to explore
[7];
-
Depth-first search algorithm: it allows to explore paths to search in the geosocial graphs by starting at a given node and exploring as far as possible along each branch before backtracking
[8];
Scoring/Ranking |
Ranks the vertices based on a scoring function to predict the values associated with each vertex. |
Sorting |
Re-arrange the vertices on the graph according to one or more keys. |
Join |
Compute the join between two vertex sets if a condition defined on their features is satisfied. |
-
-
Dijkstra search algorithm: it allows to find, for a given source node in the geosocial graph, the shortest path between that node and every other node
[9];
-
Branch and bound algorithm: it allows to explore branches of the geosocial graphs, which represent subsets of the solution set, by checking against upper and lower estimated bounds on the optimal solution and then enumerates only the candidate solutions of a branch that can produce a better solution
[10];
-
Measure and conquer algorithm: it allows to explore branches of the geosocial graphs, by using a (standard) measure of the size of the subsets of the solution set (e.g., number of vertices or edges of graphs, etc.) to lower bound the progress made by the algorithm at each branching step
[11].
Several query indexing approaches have also been developed in the literature to optimise the processing of geosocial queries and quickly retrieve all of the data that a query requires. Existing indexing methods can be roughly categorised into three classes: the spatial-first, the social-first, and the hybrid indexing methods. The spatial-first indexing methods prioritise the spatial factor for the index construction and then improve it with the social factor. For example, MR-Tree
[12], GIM-tree
[13], TaR-tree
[14], and SIL-Quadtree
[15] employ a spatial index (e.g., R-tree, Quad-tree, G-tree) and integrate it with the textual and social information of objects. The social-first indexing methods prioritise social relationships among objects for the index construction and then improve it with the spatial information of objects. Representatives of these methods are the Social R-tree
[16], B-tree
[17], and 3D Friends Check-Ins R-tree
[18], which index each user along with their social relationships and then integrate the spatial information. Finally, hybrid indices are developed to store both the spatial and social information of objects giving them the same priority. For example, NETR-tree
[19], CD-tree
[20], and SaR-tree
[21][22] encode both social information and spatial information into two major pieces of information that are used to prune the search space during the query time.