Human mobility, the movement of human beings in space and time, reflects the spatial-temporal characteristics of human behavior. With big data analytics, human mobility research can be used to facilitate smart city development in multiple disciplines, such as smart traffic, smart urban planning, smart health, smart safety, and smart commerce. A framework for linking international academic research and city-level management policy was established and applied to the case of Hong Kong. The literature on human mobility research using big data is reviewed. These studies contribute to (1) discovering spatial-temporal phenomena, (2) identifying differences in human behaviour or spatial attributes, (3) explaining the dynamics of mobility, and (4) applying the findings to city management. Then, the application of the research to smart city development is scrutinised based on email queries to various governmental departments in Hong Kong. With further improvement in the practical value of data analytics and the utilization of data sourced from multiple sectors, paths to achieve smarter cities from policymaking perspectives are highlighted.
With information and communication technologies permeating various aspects of people’s daily lives, big data sets concerning human mobility provide a powerful social microscope to support the discovery of complex knowledge[1]. The new sources of incessantly produced big data shed light on human activity from both individual and urban perspectives, helping to discover the relationships between human mobility and resource use in the social and spatial spheres of cities [2]. Big data analytics (e.g., exploratory data mining and machine learning) are applied to intelligent decision-making, and the interplay between technology, social change and policy strategies creates enormous opportunities for smart city development [3].
The concept of the smart city is an expression of the multitude of urban living domains to which technology and policy interventions can be applied. These include “hard” domains, such as office and residential buildings, energy grids, natural resources, energy and water management, waste management, environment, transport, mobility and logistics, as well as “soft” domains, such as education, culture, e-government and policies that foster economic and social welfare [4]. The spatial-temporal characteristics of human mobility, as the most frequently investigated application of big data analytics[5], can be used to facilitate smart city development in multiple disciplines, such as smart traffic, smart urban planning, smart health, smart safety, and smart commerce[6][7]. For example, the convergence of transportation and information technologies plays a key role in facilitating cooperative transportation, where data streams from sensed transportation, geographic information science and social media can be fused for human engagement and shared decision making[8]. The analytics of geospatial data, both in traditional forms of raster, vector, or graph and from advanced sources of sensors and mobile devices, enhances the power of location, which has been applied for fuel and time savings, revenue increase, and health care[9].
With the growth of big data and the advancement of technologies, city development can be better understood and promoted based on data-driven knowledge. On the data analytics side, the practical value of data-driven research is yet to be fully discovered, whereas on the city development side, transitions towards smarter applications in multiple domains and disciplines are constantly emerging. How governments and policy makers can apply data-driven research to smart city development is worth further exploration.
Based on a literature review, the studies are primarily categorised by data source, including mobile phone, mass media, taxi or car, bank card, metro card, website, wireless devices, and multiple sources. Mobile phone data (trajectories or call records) is the source that produces the richest information, since it constantly tracks the locations of individuals and passengers. It is similar to taxi or car data, which reflects the locations of the vehicles as well as the drivers. Mass media data is related to digital apps and human activities, while bank card data is related to transactions and metro card data reflects travel behaviour. Data obtained by wireless devices facilitates the understanding of mobility patterns at the smallest scales. Research based on multiple sources strengthens the power of big data by integrating different categories of information.
The literature is also categorised by four levels of aims: discovering the phenomenon, identifying the differences, explaining the differences, and applying the findings to city management. Humans and space are the two objects normally targeted in research, since human behaviour is influenced by both personal attributes (who people are) and spatial attributes (where they live or work). Table 1 summarizes these studies, categorised by data source and research objective.
Table 1. The applications of “human mobility” research using big data.
Research Objective | Sub-objective (total in count rows) | Mobile Phone | Mass Media | Taxi and Car | Bank Card | Metro Card | Website | Multi-Source Data |
---|---|---|---|---|---|---|---|---|
Discover the phenomenon | human mobility pattern | convergence and divergence pattern, accessibility | spatial-temporal path | mobility pattern, crashes | mobility patterns | travel pattern along railway | mobility pattern | |
space feature | activity pattern, land use pattern, street, green space usage, traffic condition | heat maps | congestion pattern | spatiotemporal structure, spatial pattern | ||||
Identify the differences | population clustering | central behaviour, collaborative behaviour, clustering by personality, distance, demographic feature, use feature, transport mode | micro-populations clustered by demographic feature or by personality | driver/passenger behaviour clustered by trip purpose or pattern | social bridge, human consumption pattern | rhythms of urban life | ||
space clarification | Job–housing balance, locational uncertainty | tourism market segmentation | interaction parcels | city ability | vibrant areas, station functions, urban form | the effectiveness of growth boundary | ||
Explain the differences | dynamics of human behaviour in space usage | social ecology, social and economic activity, human-interest, deep knowledge, cyberspace and physical space relation | cultural mechanism | migration effect | mobility features and app usage behaviour | |||
Apply to city management | for quality of life | health-disease, health-air quality, crime prediction | health-disease, crime prediction | public facility | activity pattern and prediction | collective travel time | ||
for resource use | infrastructure | touristic potential | bus route plan, electric cabs | environmental management, transport system | ||||
Count (excluding COVID-19 research) | 84 | 39 | 12 | 14 | 4 | 6 | 2 | 7 |
COVID-19 and human mobility | behavioural change, forecast of epidemic risk, impacts of mobility restriction | - | - | - | factors of pandemic transmission, forecast of epidemic risk, control measures, impacts of mobility restriction | |||
factors of pandemic transmission, control measures | post-pandemic economic recovery | |||||||
Count (COVID-19 research) | 40 | 20 | 4 | 0 | 0 | 0 | 11 | 5 |
Understanding the spatio-temporal patterns of human mobility is one of the basic targets of this research and is fundamental to follow-up analysis. Spatial-temporal paths that are consistent for each individual are mined from mobile data, whilst relationships among the generated hotspots in the temporal dimension are discovered[10]. Large-scale collective urban mobility was analysed to elaborate people flows between areas, as well as the exceptional patterns that are associated with real-world events such as soccer matches[11]. With taxi GPS data, the distributions of origin/destination data are visualised, based on which trips are confined to a certain distance or orientation range and drivers’ choices of routes can be inferred[12]. From massive event-tracking app data, heat maps of popular routes can be generated and visualised according to client preferences[13]. Apart from visualization, the movement pattern can be assessed by features such as stability, in terms of the convergence and divergence of individuals[14].
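Confining origin/destination trips to a distance range, as described for taxi GPS data, can be sketched as follows. This is only a minimal illustration and not the procedure of [12]: the trip layout `((o_lat, o_lon), (d_lat, d_lon))`, the function names, and the mean Earth radius of 6371 km are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trips_in_distance_range(trips, min_km, max_km):
    """Keep trips whose origin-destination distance lies in [min_km, max_km].

    Each trip is a hypothetical ((o_lat, o_lon), (d_lat, d_lon)) pair.
    """
    return [t for t in trips
            if min_km <= haversine_km(*t[0], *t[1]) <= max_km]
```

The straight-line distance is a lower bound on the distance actually driven; a study working with full trajectories would more likely sum the distances between consecutive GPS fixes.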
Apart from phone and vehicle data, WiFi and locational data are used to uncover the transition characteristics between dynamical regimes[15], or to identify frequencies, directions, intensity, and trajectories[16]. With the transaction data of a smart subway fare card, the mobility model was used for generating the heterogeneous flows at aggregated level as well as visitations at the individual level[17].
Human mobility, although it is a kind of human behaviour, also reveals the attributes of the specific spaces where it happens. To discover the characteristics of land use, big data analytics identify clusters such as residential, office, leisure-commerce and rush-hour clusters[18]. Inferring changes in movements facilitates the understanding of polycentric urban form based on the identified city hubs, centres, and borders[19]. Moreover, by involving information on land use functions, the spatial variation of self-containment of employment and jobs–housing balance can be demonstrated[20].
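As a rough illustration of the two indicators mentioned above, the sketch below computes a jobs-housing ratio and a self-containment share per zone from commute pairs. The definitions used here (jobs located in a zone divided by workers living there, and the share of resident workers who also work in their home zone) are common textbook forms and are an assumption; they are not necessarily the measures used in [20]. The input layout and function name are hypothetical.

```python
from collections import Counter


def jobs_housing_indicators(commutes):
    """Return {zone: (jobs_housing_ratio, self_containment)}.

    commutes: iterable of (home_zone, work_zone) pairs, one per worker
    (a hypothetical layout, e.g. inferred from night/day phone locations).
    """
    commutes = list(commutes)
    residents = Counter(home for home, _ in commutes)
    jobs = Counter(work for _, work in commutes)
    stay = Counter(home for home, work in commutes if home == work)

    result = {}
    for zone in set(residents) | set(jobs):
        r = residents[zone]
        # A zone with jobs but no resident workers has an unbounded ratio.
        ratio = jobs[zone] / r if r else float("inf")
        containment = stay[zone] / r if r else 0.0
        result[zone] = (ratio, containment)
    return result
```

A ratio near 1 suggests a balanced zone, while a high ratio with low self-containment suggests an employment centre that draws commuters from elsewhere.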
The interaction between human behaviour and the urban environment can be re-interpreted in more quantitative ways with big data. For example, population exposure to urban greenery was highly correlated with land use distribution and the dynamics of human mobility[21]. On a worldwide scale, tourism hotspots in US and EU cities are identified based on geotagged photographs, which in turn reveals a power-law distribution of the attractiveness of tourist attractions[22].
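A power-law claim of this kind is usually checked by estimating the exponent from the data. The sketch below uses the standard maximum-likelihood estimator for a continuous power law, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$. It is an illustration only: the source does not state how [22] fitted the distribution, and the function name and the choice of `x_min` are assumptions.

```python
import math


def power_law_alpha(values, x_min):
    """MLE exponent for a continuous power law p(x) ~ x^(-alpha), x >= x_min.

    values could be, for example, photo counts per attraction (hypothetical).
    Observations below x_min are ignored.
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; exponent undefined")
    return 1 + len(tail) / log_sum
```

The estimate is sensitive to `x_min`, so in practice it is chosen by a goodness-of-fit criterion rather than by eye.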
To achieve the research objectives of discovering the phenomenon, methods of spatial analysis and visual analytics are typically applied. Software tools with different functions are used, including but not limited to ArcGIS for spatial data analyses and visualizations, Java/Python for data processing, Tableau for visual analytics, SPSS for integrity and validity checks, PostgreSQL/PostGIS for data management, and the R platform for graphing.
To identify the differences in human mobility and space utilization, clustering analysis is commonly adopted to distinguish groups by parameters such as the demographic features of gender, age, income, and nationality. It was found that pedestrian mobility differs between locals and foreigners, implying an impact of tourist flows on residents’ lives [23]. Via Twitter, human mobility measures are found to differ most between race/ethnicity groups, followed by age groups[24], while gender differences are observed in the location-based check-in data of Weibo[25].
In addition to demographic parameters, categorizations are made by mode, demand, or other behavioural profiles, either observed or self-defined. Mobile sensed data analytics contribute to the recognition of user transport modes, e.g., bike, bus, car, foot, and train[26]. Travel demands of taxi passengers are categorised by taking into consideration distance and time constraints and geographical context[27]. Mobile users are clustered by both communication behaviour and average revenue per user[28], or by individual space–time profiles with different movements and behaviours[29]. Using data from the wide area network, by monitoring traffic in a few cities, vehicle trajectories are mined and categorised by time or location features[30].
The discovery of differences enriches the understanding of human behaviour and enhances the quality of services. It was found that people who have a larger diameter of area of influence (DAI) also spend more money on mobile communications, indicating the potential to use DAI as a feature for classifying the socioeconomic levels of cities[31]. Tourism market segmentation based on crowdsourced data facilitates the provision of more personalised tourism services[32].
Systematic analysis combining differences and similarities among clusters improves the allocation of resources. To enhance collaborative mobility (CM), such as the sharing of travelling resources, a system using extracted GPS traces can automatically find sharing opportunities and recommend behavioural changes for conducting sharing services[33].
Methods of machine learning are widely adopted among this group of studies, such as support vector machines (SVMs), C-means fuzzy clustering, K-means clustering, hierarchical clustering, ant colony algorithm, etc.
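Of the methods listed, K-means is the simplest to show in full. The sketch below is a plain implementation of Lloyd's algorithm with a deterministic initialisation (the first `k` points), written for illustration under the assumption that each traveller is described by a small numeric feature vector; the feature choice in the usage comment is hypothetical and not taken from any reviewed study.

```python
def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm.

    Returns (labels, centroids). Initial centroids are the first k points,
    which keeps the example deterministic; real studies typically use
    random restarts or k-means++ seeding.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return labels, centroids


# Hypothetical features: (mean trip distance in km, trips per day)
# labels, centroids = kmeans([(1.2, 6), (15.0, 2), (0.8, 7), (14.1, 2)], k=2)
```

Because K-means relies on Euclidean distance, features on different scales should be standardised before clustering.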
Although complex, the differences among groups or individuals are to some extent explicable, based on personal properties, spatial attributes, or the interactive effects of humans and the environment.
At the individual level, attributes such as personality, interest, habit, and memory influence human mobility. Personality traits help to explain individual differences in the usage of location-based social networks and the types of places visited[34]. The big five personality traits are commonly applied, and it was found that extraverted individuals have more diverse routines, while individuals high in neuroticism and openness tend to have less stable routines[35]. The interest dynamics of humans, referring to preferential return to previously visited interests, the inertial effect, and exploration of new places, are causally connected to mobility behaviour [36]. Preference itself has rich connotations. For example, differences in lifestyle between big city and small city residents were observed through the latent patterns of human preference, which are related to both age group and city context[37].
Another interesting discovery is that individual properties are determinative of movement in both physical space and cyberspace, the latter defined as changes in online activities, typically corresponding to switching between the websites being explored. It was detected that movements in the two spaces were similar in terms of exploration of new sites and preferential return to highly visited sites [38]. The unified underpinning of mobility in the real world and cyberspace is also explained as a memory-preferential mechanism in limited space[39].
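The two mechanisms named above, exploration of new places and preferential return to frequently visited ones, can be illustrated with a toy simulation. This is a generic exploration and preferential return process and not the specific model of [36], [38] or [39]; the inertial effect is not modelled, and the parameter values `rho` and `gamma` are illustrative assumptions.

```python
import random


def simulate_epr(steps, rho=0.6, gamma=0.2, seed=0):
    """Simulate a visit sequence and return the list of visited place ids.

    At each step the walker explores a new place with probability
    rho * S**(-gamma), where S is the number of distinct places seen so
    far; otherwise it returns to a known place chosen with probability
    proportional to its past visit count.
    """
    rng = random.Random(seed)
    visits = [0]        # start at place 0
    counts = {0: 1}     # visit count per place id
    for _ in range(steps):
        n_distinct = len(counts)
        if rng.random() < rho * n_distinct ** (-gamma):
            place = n_distinct  # explore: assign a brand-new place id
        else:
            places = list(counts)
            weights = [counts[p] for p in places]
            place = rng.choices(places, weights=weights, k=1)[0]
        counts[place] = counts.get(place, 0) + 1
        visits.append(place)
    return visits
```

Since the exploration probability decays as more places are discovered, long runs are dominated by returns to a few heavily visited places, which is the qualitative behaviour described for both physical and online movement.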
Spatial or regional attributes, such as economic-social features, cultural sociology, land use function, etc. are important drivers for different mobility behaviours. Distinctive patterns highlight the similarity or the difference in behaviour among regions in Uganda, distinctly related to economic and social regional characteristics [40]. Social ecology reflected by dominant urban functions is also determinative[41], and one example is about intangible cultural ties, e.g., the existence and persistence of dialect borders identified with Weibo data affect human mobility for social interactions[42].
The integration of multiple data sources enhances the interpretation of different space attributes. Based on subway transaction records and points of interest (POI) information, the latent functions of subway stations and the correlation between mobility patterns and station functions were illustrated[43]. Similarly, building functions are inferred with WeChat location records, taxi GPS trajectories, POI, and building footprint from high-resolution Quickbird images[44].
Latent features of either human behaviour or place attributes, as reflected in the economic activities that accompany movement, are also noteworthy. High-income taxi drivers are more likely to refuse passengers[45]. The integration of individual economic spending with mobility indicates the ability of cities to attract foreign visitors[46]. People may become “social bridges” between communities, since it is found that those who work at nearby locations may show similar purchase behaviour even when they live in different communities[47].
In the physical world, mobility is more like a result of interaction between individuals and the space or environment. Methods of statistical analysis are applied for explaining the drivers and their effects, including correlation analysis (mostly used), regression models, preferential attachment models to reproduce the statistical scaling of behaviours, etc.
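Since correlation analysis is the most frequently used of these methods, a minimal Pearson correlation is sketched below. The function name and the variables in the usage comment are hypothetical; the reviewed studies would normally rely on a statistics package rather than a hand-written routine.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical use: correlate a zone-level income index with mean trip length.
# r = pearson_r(income_index, mean_trip_km)
```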
The results of some studies can be applied to city management in multiple disciplines, categorised here into four aspects: transportation systems, public health, urban governance, and environmental management and policy.
Transportation is the domain where applications of big data are broad. Strategies are proposed in the literature to predict the travel time of any trip, to provide real time traffic information, to predict movement patterns based on personality or spatio-temporal routines, to improve crash analysis, to plan bus routes, to improve taxi dispatch, and to optimise traffic during big events.
The consensual 3D speed map revealing urban congestion patterns is used to predict the travel time of any trip within the city[48]. With information from individual participants or from diary and trajectory generators, the overall traffic can be compared or monitored in real time[49][50]. With multiple data sources, the impact of big events on traffic can be estimated, and strategies for reducing collective travel time and congestion are proposed by evaluating different route choice scenarios at the urban scale[51]. Based on the established relationship between human mobility and personality, movement patterns can be predicted[52]. Twitter-based activity information can be incorporated into the spatial analysis of crashes in urban areas, and geographically weighted regression (GWR) models were developed to establish a relationship between reported crash counts and various contributing factors[53]. To plan night bus routes, taxi GPS traces were analysed to maximise the number of passengers served under given conditions and constraints[54]. Agent-based simulation of taxi behaviour with probe vehicle data helps to gain a deeper understanding of that behaviour, supporting better management and dispatch of the taxi fleet and higher earnings for drivers[55].
Increasing attention has been paid to the integration of human mobility data into the policy of public health, particularly in supporting environmental exposure assessment and programs designed for spatiotemporal disease control.
With population density data derived from mobile phone traffic and station-based air quality data, dynamic population exposure to urban air pollution (e.g., PM2.5 concentrations) was investigated in detail and compared among residents, tourists and commuters [56][57]. To guide disease control, the intra-urban transmission risk of dengue fever was mapped with hourly cell phone data, considering local human movements, temporal scale, and positioning uncertainty[58], while the degree of variation in the transmission coefficient, a designated spatially and temporally dependent variable, was inferred using geolocated tweets[59]. Drivers of endemic schistosomiasis were unveiled with socioeconomic factors, environmental factors and human mobility features, based on which the effects of control measures, for example improved access to safe water or educational campaigns, were analysed [60]. However, it is argued that even though mobility and infectious diseases have been connected by quantifying disease-relevant travel patterns on fine spatial and temporal scales, how such a system can currently be implemented against disease incidence remains questionable; the models need to be formally statistically evaluated and tested[61].
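The idea of dynamic population exposure can be made concrete with a population-weighted mean, $E_t = \sum_i P_{i,t} C_{i,t} / \sum_i P_{i,t}$, where $P_{i,t}$ is the population in zone $i$ at time $t$ and $C_{i,t}$ is the pollutant concentration there. This is a common formulation and is offered as an assumption; the source does not state the exact measure used in [56][57]. The function names and input layout are hypothetical.

```python
def population_weighted_exposure(population, concentration):
    """Population-weighted mean concentration over zones for one time slot.

    population: people present in each zone (e.g. from phone traffic).
    concentration: pollutant level in each zone (e.g. PM2.5 in ug/m3).
    """
    if len(population) != len(concentration):
        raise ValueError("one concentration is required per zone")
    total = sum(population)
    if total == 0:
        raise ValueError("total population is zero")
    weighted = sum(p * c for p, c in zip(population, concentration))
    return weighted / total


def hourly_exposure(population_by_hour, concentration_by_hour):
    """Apply the weighted mean hour by hour to obtain a dynamic profile."""
    return [population_weighted_exposure(p, c)
            for p, c in zip(population_by_hour, concentration_by_hour)]
```

Replacing the time-varying populations with a fixed residential population gives the static estimate, so the gap between the two profiles shows what accounting for mobility adds.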
The increasing involvement of human mobility data in public health research is well reflected in the recent intensive mobility-related studies on the COVID-19 pandemic. In particular, these studies have been facilitated by the public, aggregated, massive mobile phone data recently provided by major location-based service vendors, such as Google COVID-19 Community Mobility Reports[62] and Baidu Migration data[63]. One focus of mobility-related COVID-19 studies is the interaction between human mobility and the pandemic. These interactions include the role played by human mobility in pandemic transmission[64][65]; the combined association of mobility and other environmental and socioeconomic factors with the pandemic spread [66]; and people’s changes in mobility behaviour and in the use of urban spaces, usually because of mobility restrictions and social distancing policies[67][68]. Based on these mobility-pandemic interactions, studies have further incorporated human mobility data into the forecast of epidemic risk[69][70], as well as the evaluation and recommendation of various epidemic control measures[71][72]. The environmental and socioeconomic impacts of the mobility restrictions resulting from the pandemic have also been investigated. While reduced human mobility and activity have been found to result in a general reduction of pollution[73] [74], they have also been found to lead to great economic loss and the aggravation of social issues, such as inequality[75]. Thanks to the availability of timely human mobility data, various online and mobile platforms have been developed for near-real-time mapping of epidemic cases and predicted epidemic risks, to support anti-pandemic policy making and the self-protection of the public[76]. A list of technical tasks in geographic information systems to understand, predict, and support the control measures of COVID-19 is summarised in[77].
Human mobility patterns reveal the movement of a large population, potential applications of which are raised in domains from big events management, disaster evacuation, crime prediction, and safety estimation, to public space/facility provision.
Knowledge about the occurrence of big events derived from bus trajectories, including event start time, end time, and the temporal patterns of arrival and departure of event attendees, can be usefully applied in event planning and management[78]. For predicting human evacuation following disasters, human emergency behaviour can be modelled using an intelligent deep learning system such as DeepMob [79]. Regarding the classification of crime hotspots, aggregated human behavioural data captured from the mobile network infrastructure, in combination with basic demographic information, can describe the crime level of a geographical area and assist crime prediction[80]. The safety of places in new developments or public areas can be estimated by operationalizing relevant concepts from criminological and urban theories, with prediction features derived from census data, Foursquare, subway usage, and taxi usage [81]. The spatial interaction characteristics and service areas of different types and sizes of public facilities are identified based on taxi pick-up and drop-off data, for optimizing the spatial configuration of medical or other types of public facilities[82].
With regard to environment-related policies, big data contributes to the management of environmental projects as well as the analysis of policy adoption. Data science and environmental management draw upon the results of national projects. The LENVIS project in Italy enables users to upload videos of localised pollution events and personal narratives, forming a network for monitoring the interplay between the concentration levels of a set of airborne pollutants and hospital health care [83]. To assess whether a policy of energy-saving vehicles can be implemented, real mobility information is used as decision support for the taxi fleet manager to promote the adoption of electric taxi cabs [84]. Focusing on the economic feasibility of the policy, quantitative calculations show that introducing electric vehicles into the taxi fleet increases the purchase cost of the taxis; however, service, maintenance, repair and fuel costs can be reduced, resulting in a reduction in total costs [85].
In summary, the hierarchy of research is illustrated by objective, with data requirements and methods briefly marked. The upper level simply shows what happened in the real world (discovering the phenomenon), mainly using methods of basic spatial analysis and visualization. The middle levels attempt to explain how and why it happened, using more data that reflects social-economic factors and their connections with spatial mobility, together with mathematical methods such as statistical analysis and machine learning. The bottom level, that of applications, is based on prediction, a platform for information exchange, and knowledge generated from subject data that can be used to explore what can be done for city development and management. Although the methods applied at the bottom level were not always complex, more data are required to achieve the objectives at the deeper levels; see Figure 2.
Figure 2. Hierarchy of research by objective (data requirement and method).
Even though the advantages of applying big data analytics in the city are emphasised, the linkage between the concepts of human mobility and the smart city is still fuzzy. Moreover, the current trends and evolution patterns of any individual smart city depend on local context factors, and policy makers are therefore urged to understand these factors in order to shape appropriate strategies for particular cities[4].
To identify the gaps between data-driven research and its potential implementation in the various domains of smart city development, communication with government is essential. In the case of Hong Kong (HK), since only a few questions (not more than four) were targeted at each governmental department, emails rather than interviews were adopted for communication.
Focusing on the management scopes of different departments, including the Transport Department, the Hong Kong Police Force, the Department of Health, the Environmental Protection Department, the Hong Kong Tourism Board, and the Planning Department, questions about how big data and the advanced analytics and technologies proposed in research can be applied in real work were sent, together with a brief introduction to the related literature. Table 2 lists the fields of the questions for the corresponding government sectors. For details of the emails and the responses, refer to the Supplementary Materials.
Table 2. Enquiries sent to departments of Hong Kong (HK).
Department | Question |
---|---|
Transport Department (TD) | 1. About Route Planning of Public Transport 2. About Big Event Transportation Management 3. About Taxi Dispatching 4. About Vehicle Crash Analysis |
Hong Kong Police Force (HKPF) | 5. About Crime Prediction |
Department of Health (DH) | 6. About Disease Control |
Environmental Protection Department (EPD) | 7. About Air Pollution Exposure 8. About Electric Vehicle Adoption |
Hong Kong Tourism Board (HKTB) | 9. About Tourism Behaviour 10. About Market Segmentation 11. About Old Town Heritage Tourism |
Planning Department (PD) | 12. About Land Use Function |
According to the responses, the importance of applying big data to improve city governance was confirmed, and many smart initiatives associated with human mobility were mentioned. Nevertheless, significant barriers remain to the application of the analytics introduced. The difficulties mentioned in the replies are interpreted in five aspects in the following Section 2.2.
Data isolation. The databases managed by different departments are not well integrated, and therefore the methodologies used in research could not be applied due to data limitation. In addition, data from sub-institutional sectors under the management of the same department may also be isolated. For example, regarding taxi dispatch, a few taxi hailing platforms are being used. Thus, the TD indicates that setting up or integrating these platforms is the primary condition to improve the operational efficiency and service quality of taxis.
Data unavailability. According to the reviewed literature, data from mobile phones is advantageous in reflecting human behaviour and discovering knowledge of people’s mobility. However, in the case of HK, these data are not currently available to the government, restricting the application of related research in the real world. One example is the analysis of the real-time influence of air quality on daily life. In the literature, real-time population exposure was measured in the studied areas using mobile- and station-based big data, considering the spatiotemporal variability of both population distribution and pollutant concentrations. In contrast, the studies conducted by the EPD treated the air quality of the city as homogeneous, without sufficient consideration of the spatial differences in pollution exposure.
Costs versus quality of data. The costs of both short-term installation and long-term operation are non-negligible barriers to obtaining big data, particularly data of high quality. With regard to air quality monitoring, even though some lower-cost air monitoring devices are capable of providing quality data, their operating cost is not much lower than that of the existing stations, and thorough consideration is essential in planning the long-term deployment of additional monitoring.
Limited knowledge derived from rich data. Even with integrated big data, generating deeper knowledge through more scientific analytics requires much effort. In the planning of bus routes, franchised bus companies submit proposals for service adjustment through the annual route planning programmes; however, the impacts are considered case by case rather than through systematic analytics that take all the routes in the city into consideration as a whole. There is considerable scope to turn the collected big data into generalized knowledge of human behaviour and space use. For instance, the HKTB has collaborated with social and digital media players such as Google, TripAdvisor and Facebook; nevertheless, the obtained data has not been thoroughly mined to elaborate tourism behaviours, which, if done, would be helpful in clarifying destination branding as well as characterising the tourism market.
Estrangement between public and private sectors. The involvement of non-public sectors and cooperation among government, companies, institutions and even individuals are essential to the implementation of smart initiatives, yet prove difficult to achieve. Taking the adoption of electric vehicles (EVs) as an instance, chargers for public EVs are being installed to test the real-time electronic information through the Government Electronic Platform. However, how to extend the promotion of EVs to the private sector and individual car owners remains an open question for further discussion.
In summary, the achievements in smart city development are laudable, yet development is still at the start-up stage, which mainly focuses on the base of smart infrastructure, such as the Common Spatial Data Infrastructure (CSDI) by the Development Bureau, the map service platform by the Lands Department, Intelligent Transport Systems (ITS) by the TD, and the smart travel platform by the HKTB. Among the smart sub-domains, transportation is the area most related to human mobility and has applied more data-driven analytics and technologies than other areas. To accomplish the mission to “maximise innovation, knowledge and value creation for the government, business and the community”, the penetration of smart technologies into other development and management domains still has a long way to go.