In recent years, artificial intelligence (AI) and its subarea of deep learning have drawn the attention of many researchers. At the same time, advances in technologies enable the generation or collection of large amounts of valuable data (e.g., sensor data) from various sources in different applications, such as those for the Internet of Things (IoT), which in turn aims towards the development of smart cities. With the availability of sensor data from various sources, sensor information fusion is in demand for effective integration of big data.
Recent advances in technology have increased the popularity of artificial intelligence (AI), which aims to build “intelligent agents” with the ability to correctly interpret external data, learn from these data, and use the learned knowledge for cognitive tasks like reasoning, planning, problem solving, decision making, motion, and manipulation. Subareas of AI include robotics, computer vision, natural language processing (NLP), and machine learning. Within the latter, deep learning has attracted the focus of many researchers. For instance, the development of AlphaGo (which uses deep reinforcement learning) for the board game of Go has drawn the attention of researchers and the general public. In general, deep learning uses deep neural networks (DNNs), convolutional neural networks (CNNs), as well as recurrent neural networks (RNNs) like long short-term memory (LSTM) for supervised, semi-supervised, or unsupervised learning tasks in various application areas like computer vision and NLP. Recently, deep learning has also been applied to the transportation domain, albeit for tasks like traffic flow forecasting, automatic vehicle detection, autonomous driving, and classification of speeding drivers.
Moreover, recent advances in technology have also enabled the generation or collection of large amounts of valuable data from a wide variety of sources in different real-life applications. For instance, different types of sensor data can be easily generated and collected in various Internet of Things (IoT) applications, such as smart homes, smart grids, smart retail, smart cars, and smart cities. As an example, sensors (e.g., cameras; digital scanners; light imaging, detection, and ranging (LIDAR)) mounted on aircraft, small unmanned aerial vehicles (UAVs), and other moving objects such as vehicles have created large volumes of remotely sensed data, geospatial data, spatial-temporal data, and geographic information for the geographic information system (GIS). As another example, sensors on the global navigation satellite system (GNSS), such as the Global Positioning System (GPS), GLONASS, Galileo, and BeiDou (which originated in the USA, Russia, the EU, and China, respectively), as well as other regional systems, have also created large volumes of geolocation and time information. With big sensory data from these sources and other sensors, sensor information fusion (which integrates sensor data and information from a collection of these heterogeneous or homogeneous sensors to produce accurate, consistent, and useful knowledge) is in demand.
Urban data mining helps discover useful knowledge from urban data, which in turn helps solve some urban issues. For instance, the discovery of popular transportation modes (e.g., bicycles) of residents in a city helps city planners to take appropriate actions (e.g., add more bike lanes). To mine these urban data, researchers have traditionally been using paper-based and telephone-based travel surveys. Unfortunately, these travel surveys can be biased and contain inaccurate data about the movements of their participants. For instance, participants tend to under-report short trips, irregular trips, and car trips. They also tend to over-report public transit trips.
Alternatively, researchers have also been using commute diaries, which capture data about people’s daily commutes. Unfortunately, these diaries can be error-prone. When people are asked to use a diary to keep track of their commutes, they often forget to record their commutes throughout the day. When trips are instead recorded at the end of the day, diary studies inherit the same problems as paper-based and telephone-based travel surveys. Moreover, these diaries can also be a mental burden on the study participants, and thus cannot be used long term. Furthermore, as people’s willingness to record trips accurately throughout the day declines with each day of participation, the corresponding accuracy of the commute diaries also drops.
Recent advances in technology have led to the availability of sensor data, which in turn has led to better approaches for urban data mining. To elaborate, sensors enable researchers to track a large number of movement trajectories that are collected by participants of a study who use GNSS/GPS trackers or other sensors (e.g., accelerometers, barometers, Bluetooth, Wi-Fi, etc.). Hence, these GPS-based travel surveys are more accurate than the traditional travel surveys and commute diaries. However, the challenge of labeling trip purposes and classifying transportation modes persists. For instance, the manual segmentation of trajectories based on transportation mode can be labor intensive and is likely to be impracticable for big data. Any AI approach to automating such a task would be beneficial to travel studies and other applications (e.g., IoT applications) that rely on contextual knowledge (e.g., the current travel mode of a person). For example, a driver would benefit from receiving a notification from their smartphone or smartwatch about an estimated arrival time for their trip (computed based on their current location, destination, and their interactions or saved frequently visited locations). As another example, urban analysts would benefit from an automatic trip transportation mode labeling method similar to the Timeline feature in Google Maps (which keeps track of a user’s location history and attempts to automatically classify trips with the major transportation mode). However, existing trip transportation mode labeling methods are not very accurate, need corrections by the user, and do not track when transportation modes change. Hence, a more accurate method is needed.
Consider the use of standalone tracking and logging devices, which enable the participants of travel surveys to log sensor data accurately, reliably, and consistently, because the researchers have full control over the device and the hardware and software platforms are the same on every device. These devices can log data to local device storage, from which the data are later retrieved. These devices can also connect to a smartphone application on a participant’s phone via Bluetooth and transfer data at regular intervals for further processing. Going further, transportation mode classification could happen on the smartphone, which could reduce the computational burden on the logger device, decrease both architecture cost (as the logger requires weaker processing units) and power consumption, and thus increase the battery life. Among related works, Zheng et al. used supervised decision trees and graph-based post-processing after classification to classify transportation modes from GPS data only.
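To make the idea of classifying from GPS data alone more concrete, the following is a minimal sketch of one common preprocessing step: deriving per-segment speeds from consecutive GPS fixes. It is an illustration under stated assumptions, not the feature extraction of any cited work. The function names `haversine_m` and `segment_speeds`, the `(timestamp_s, lat, lon)` tuple layout, and the spherical-Earth radius are all assumptions introduced here.

```python
import math

# Assumption: mean Earth radius in metres for a spherical-Earth approximation.
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_speeds(fixes):
    """Speeds in m/s between consecutive fixes.

    `fixes` is a hypothetical list of (timestamp_s, lat, lon) tuples sorted by time.
    Pairs with a non-positive time difference are skipped.
    """
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # duplicate or out-of-order timestamp
        speeds.append(haversine_m(la0, lo0, la1, lo1) / dt)
    return speeds
```

Speed sequences of this kind could then be summarized (e.g., mean or maximum per trip segment) and passed to a classifier such as a decision tree.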
In contrast, Hemminki et al. used only accelerometer data to classify transportation modes (“stationary”, “walk”, “bus”, “tram”, “train”, “metro”). To elaborate, three different classifiers were trained with a combination of AdaBoost and Hidden Markov Model (HMM) for three different classes of modes. Shafique and Hato also used accelerometer data only. They applied multiple machine learning algorithms to perform transportation mode classification and found that the Random Forest algorithm gave accurate classifications.
Instead of using only GPS data or only accelerometer data, Ellis et al. applied the Random Forest to both GPS data and accelerometer data for successful transportation mode classification.
Other than using both GPS data and accelerometer data, Hosseinyalamdary et al. used both GIS and GPS data (together with an inertial measurement unit (IMU)). However, they used these data for tracking three-dimensional (3D) moving objects rather than classifying transportation modes. On the other hand, Chung and Shalaby developed a system that uses both GPS and GIS data to classify four transportation modes (“walk”, “bicycle”, “bus”, and “car”) for GPS-based travel surveys by using a rule-based algorithm and a map-matching algorithm to detect the exact roads people moved on. However, the accuracy of the system depends on the quality of the corresponding GIS data. Similarly, Stenneth et al. also used both GPS and GIS data when building their real-time transportation mode classification system. To perform the classification, they used the Random Forest as the supervised learning algorithm to identify a person’s current transportation mode.
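As a rough illustration of how a rule-based classifier might combine GPS-derived speed with a GIS-derived cue, consider the toy sketch below. It is not the algorithm of Chung and Shalaby or of Stenneth et al.; the function name `classify_segment`, the three input features, and every threshold value are hypothetical choices made only for this example.

```python
def classify_segment(mean_speed_mps, max_speed_mps, near_bus_stop_ratio):
    """Toy rule-based labelling of one trip segment.

    All inputs and thresholds are assumptions for illustration:
    - mean_speed_mps, max_speed_mps: GPS-derived speeds in m/s.
    - near_bus_stop_ratio: fraction (0..1) of the segment's stopping points
      that lie close to a bus stop in the GIS layer.
    """
    if max_speed_mps < 2.5:
        return "walk"
    if max_speed_mps < 9.0 and mean_speed_mps < 6.0:
        return "bicycle"
    # Motorized segment: use the GIS cue to separate bus from car.
    if near_bus_stop_ratio >= 0.6:
        return "bus"
    return "car"
```

The sketch also shows why such systems depend on the GIS data: if the bus-stop layer is incomplete, `near_bus_stop_ratio` is underestimated and bus segments would be labelled as car.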
To recap, traditional methods for urban data mining include paper-based and telephone-based travel surveys, as well as commute diaries. To reduce the human workload and to utilize sensors and AI technologies for automatic processes, GPS-based travel surveys were used. In recent years, advances in technologies have enabled the use of combinations of data from different sensors (e.g., GNSS/GPS, accelerometers) and other modern smartphone sensors (e.g., barometer, magnetometer, etc.). Some related works use only GPS data, while some others use only accelerometer data. In addition, some related works integrate both GPS and accelerometer data (i.e., an example of sensor information fusion), while some others integrate both GPS and GIS data (i.e., another example of sensor information fusion). However, none of the aforementioned works combines GNSS/GPS, accelerometer, and GIS data in a single system. Hence, a system that integrates GNSS/GPS, accelerometer, and GIS data for urban data analytics and machine learning is in demand.
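One simple way to picture the integration of GNSS/GPS, accelerometer, and GIS data is feature-level fusion, in which summary statistics from each source are merged into a single record per time window. The sketch below is a minimal illustration under assumptions and does not describe any particular system: the function names `accel_magnitudes` and `fuse_window`, the chosen statistics, and the use of distance to the nearest bus stop as the GIS feature are all hypothetical.

```python
import math
import statistics


def accel_magnitudes(samples):
    """Euclidean magnitude of each (x, y, z) accelerometer sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def fuse_window(speeds_mps, accel_samples, dist_to_bus_stop_m):
    """Fuse one time window of data from three sources into a feature record.

    Hypothetical inputs (all non-empty):
    - speeds_mps: GPS-derived speeds in m/s.
    - accel_samples: list of (x, y, z) accelerometer readings.
    - dist_to_bus_stop_m: GIS-derived distances (m) from each GPS fix
      to its nearest bus stop.
    """
    mags = accel_magnitudes(accel_samples)
    return {
        "speed_mean": statistics.fmean(speeds_mps),
        "speed_max": max(speeds_mps),
        "accel_mean": statistics.fmean(mags),
        "accel_std": statistics.pstdev(mags),
        "bus_stop_dist_min": min(dist_to_bus_stop_m),
    }
```

A record like this could serve as the input row for a supervised learner; which statistics are actually informative would need to be determined empirically.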