With the advent of smart mobile devices, end users have become accustomed to transmitting and storing private data on them, which has inevitably raised prominent security concerns. Numerous researchers have therefore proposed implicit authentication techniques that exploit motion sensors.
1. Introduction
With the advancement of mobile communication technology and hardware, mobile intelligent terminal devices have become increasingly popular and are widely used in people’s daily lives. The mobile application market has an anticipated annual consumer spending of USD 233 billion on the Apple App Store and Google Play store in 2022–2026 [1]. According to Grand View Research, the global mobile application market is expected to grow at a compound annual growth rate (CAGR) of 13.4% from 2022 to 2030 [2]. As more and more users transmit and even store private data on mobile devices, preventing information leakage under network attack becomes critically important. This is especially true when users belong to organizations such as enterprises, governments, and national infrastructure operators: information leakage caused by advanced persistent threat attacks, which are conducted mainly by accessing mobile devices to collect valuable confidential information, would be catastrophic. Therefore, to safeguard user privacy and security, it is urgent to design suitable and robust user authentication models that fit both the application scenarios and the characteristics of mobile devices.
The current authentication methods for mobile users can be broadly categorized into two types: knowledge-based and biometric-based. Knowledge-based authentication methods [3] require users to explicitly input information such as passwords or patterns. Although these methods are widely used due to their low cost, they face challenges in terms of both usability and security [4,5]. For example, (1) they are prone to various attacks such as brute force, shoulder surfing, smudge, inference, and social engineering attacks, and (2) repeatedly entering the same password in small dialog boxes degrades the physical user experience. In contrast, biometric-based authentication methods [6] can mitigate these issues to some extent. However, frequent use of facial or fingerprint recognition may also cause psychological discomfort, especially in ungoverned scenarios. Therefore, both usability and security should be priorities for authentication systems. Recently, there has been extensive research on dynamic user authentication methods based on motion sensors. These methods use machine learning or deep learning to discern unique behavioral patterns from users’ gait or gestures, since motion sensors require no privacy-related permissions. Researchers therefore collect non-privacy motion sensor data, such as accelerometer, gravity sensor, and gyroscope readings (any application can obtain these data through the system-level interface). However, in real-world, complex environments, it is hard to distinguish users’ micro features. The major challenges are as follows:
- (1) In mobile user authentication, the majority of sensory signal transformation methods rely on expert knowledge [7,8]. They lack in-depth data mining of motion sensor signals and cannot effectively learn the complex nonlinear relationships among invariant features.
- (2) Label noise is widely present in the training phase, e.g., when device owners lend their phones to others. Most research has paid more attention to signal denoising than to reducing label noise [9,10]. If a classifier is trained with incorrect labels, errors accumulate continuously; even if labeled samples are obtained from the target person, the classifier may still fail to authenticate the device owner.
- (3) The quality of handcrafted features is crucial for the performance of most classifiers. However, when dealing with complex mixed motion sensor signals, relying solely on statistical features can lead to the loss of critical information [7,11,12,13] (a minimal sketch of such handcrafted statistical features follows this list). Additionally, the feature extraction process is typically fixed by a predetermined algorithm, whereas the parameters of the classification model are updated through iterative optimization; this mismatch can hinder the improvement of algorithms for sensor-based mobile user authentication.
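For reference, the sketch below illustrates the kind of sliding-window segmentation and handcrafted statistical features that challenge (3) refers to. It assumes a 6-channel stream (3-axis accelerometer plus 3-axis gyroscope) and illustrative window parameters; neither the feature set nor the window length is taken from the paper.

```python
import numpy as np

def sliding_windows(signal, win_len=128, step=64):
    """Segment a (T, C) multi-channel sensor stream into overlapping windows.

    The window length and overlap are illustrative choices, not values from the paper.
    """
    windows = []
    for start in range(0, len(signal) - win_len + 1, step):
        windows.append(signal[start:start + win_len])
    return np.stack(windows)            # shape: (N, win_len, C)

def statistical_features(window):
    """Typical handcrafted statistics computed per channel of one window."""
    feats = []
    for ch in window.T:                 # iterate over the C channels
        feats += [ch.mean(), ch.std(), ch.min(), ch.max(),
                  np.median(ch), np.sqrt(np.mean(ch ** 2))]   # last entry: RMS
    return np.array(feats)

# Example: a simulated 6-channel stream (3-axis accelerometer + 3-axis gyroscope).
stream = np.random.randn(1000, 6)
windows = sliding_windows(stream)
X = np.array([statistical_features(w) for w in windows])
print(X.shape)                          # (N, 6 channels * 6 statistics) = (N, 36)
```

Such fixed statistics are computed once per window and never adapted during training, which is exactly the limitation the third challenge points out.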
2. Authentication Methods
Mobile device user authentication methods are essential for protecting user data. Currently, they can primarily be classified into three types: knowledge-based methods [5], static feature-based methods [6], and dynamic behavior-based methods [7]. Knowledge-based methods require users to explicitly enter a digital password or gesture pattern to unlock the mobile device or log into an application. These methods can only verify whether a user knows the credential but cannot determine whether the user is the device owner. Furthermore, they pose certain risks, such as a poor human–computer interaction experience and privacy leakage. Previous studies have shown that they can easily be broken by brute force attacks [15], smudge attacks [16], shoulder surfing attacks [17], and sensor inference [18]. In contrast, static biometric authentication methods are based on fingerprints and faces and can achieve relatively high recognition accuracy [19,20,21]. However, beyond the concerns about user experience and privacy leakage mentioned in Section 1, recent research has shown that misusing fingerprint APIs on Android can make applications vulnerable to various attacks [19], and face recognition methods based on deep learning algorithms have been shown to be circumvented by sophisticated attackers [20,21]. Dynamic behavior-based authentication methods access data from built-in sensors in mobile devices, including environmental locations, keystroke behaviors, finger movements, etc., and they work through a combination of feature engineering and model training. These methods may require privacy-related permissions to obtain private user data.
Static and dynamic feature-based authentication methods thus depend on credential technology and carry privacy risks. Given these inherent drawbacks, numerous researchers [8,11,12,13,22,23,24,25,26,27] have proposed dynamic user authentication based on motion sensors, which still has the following limitations. First, most motion-sensor-based user authentication requires a user to hold the phone in a fixed position or perform specific actions (in a lab environment), which is unrealistic and results in a large number of noisy labels in complex (non-lab) environments. Second, the credibility of the data collected in complex environments is often questionable; for example, unreliable labels are generated if the phone is used by someone other than the owner. To achieve both a good user experience and high authentication accuracy simultaneously, this paper proposes an effective sensor-based mobile user authentication method for complex environments.
Data denoising method for sensor-based mobile user authentication. Currently, most dynamic authentication methods utilize motion sensors [8,9,10,11,12,13,22,23,24,25,26,27] but do not consider the impact of hardware noise. Consequently, it is very difficult for them to handle unlabeled (noisy) data in real environments, so overfitting occurs and authentication accuracy decreases. To address the noise, some researchers [8] proposed noise elimination algorithms that produce an effective dataset in the data preprocessing stage, but these algorithms cannot precisely separate mislabeled samples from the training samples. To overcome this, researchers have used semi-supervised methods that combine noisy data with a set of clean labels [7]. Zhu et al. [7,8,9] observed that flat data collected in this way cannot reflect the discrepancies between different user patterns, yet removing them outright means their usability is never analyzed. Additionally, when collecting users’ data, researchers find that the data are often mislabeled, as with the unreliable labels mentioned above. This paper considers noisy and mislabeled data simultaneously during the training process, fitting the training requirements to the greatest extent in order to provide high-quality data.
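As an illustration only, the following sketch shows one simple way to treat flat windows and suspect labels jointly: windows with near-zero variance are dropped, and owner-labeled samples that a current model rejects with high confidence are flagged for review. The variance threshold, confidence level, and helper names are assumptions for the example, not the method proposed in this paper.

```python
import numpy as np

def filter_flat_windows(windows, var_threshold=1e-3):
    """Drop windows whose per-channel variance is uniformly below a threshold.

    Flat (near-constant) windows carry little user-specific information;
    the threshold is an illustrative value, not one taken from the paper.
    """
    variances = windows.var(axis=1)                 # (N, C) per-channel variance
    keep = (variances > var_threshold).any(axis=1)  # keep if any channel varies
    return windows[keep], keep

def flag_suspect_labels(pred_probs, labels, confidence=0.9):
    """Flag samples whose label disagrees with a high-confidence prediction.

    A simple stand-in for label-noise handling: owner-labeled windows that a
    current model rejects with high confidence are marked for review or
    down-weighting instead of being trusted blindly.
    """
    predicted = (pred_probs >= 0.5).astype(int)
    confident = np.maximum(pred_probs, 1 - pred_probs) >= confidence
    return confident & (predicted != labels)

# Example: keep informative windows, then flag likely mislabeled ones.
windows = np.random.randn(200, 128, 6)
clean_windows, kept = filter_flat_windows(windows)
suspects = flag_suspect_labels(np.random.rand(len(clean_windows)),
                               np.ones(len(clean_windows), dtype=int))
```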
Mobile user authentication model based on motion sensors. Existing research methods using motion sensors [8,9,10,11,12,13,22,23,24,25,26,27] continuously collect sensor data and establish corresponding models to verify users’ identities. Lu et al. [25] used unsupervised learning algorithms to process unlabeled data, but this method resulted in high latency. Additionally, unsupervised clustering algorithms require costly parameter tuning, and the generalization of the chosen parameters needs to be verified. Zhu et al. [7] designed a semi-supervised online learning algorithm that achieves high accuracy and low latency when processing unlabeled data in relatively complex environments, but the classifier they used (a binary-class SVM) is not suitable for time series data because it ignores the context of user behaviors. Furthermore, most existing studies [8,9,10,11,12,13,22,23,24,25,26,27] assume the input data are sufficient, which does not hold in real complex environments. In contrast, this paper proposes transforming 1D signals into 2D images; considering the spatio-temporal characteristics of sensory signals, we extract spatio-temporal features with a CNN to achieve high mobile user authentication accuracy.
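The following sketch outlines the general idea of mapping a 1D multi-channel sensor window to a 2D “image” and extracting features with a small 2D CNN. The channel-by-time layout, layer sizes, and PyTorch implementation are illustrative assumptions; they are not the exact transformation or architecture of the proposed model.

```python
import numpy as np
import torch
import torch.nn as nn

def window_to_image(window):
    """Turn one (win_len, C) sensor window into a 1 x C x win_len 'image'.

    Treating channels and time steps as the two image axes is one simple
    1D-to-2D mapping; the paper's actual transformation may differ.
    """
    return torch.tensor(window.T, dtype=torch.float32).unsqueeze(0)  # (1, C, T)

class SensorCNN(nn.Module):
    """A small 2D CNN that extracts spatio-temporal features from sensor images."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # halves both spatial axes
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1)),         # global pooling to a 32-d vector
        )
        self.classifier = nn.Linear(32, 1)        # owner / not-owner score

    def forward(self, x):                         # x: (N, 1, C, T)
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.classifier(h))

# Example: score a batch of simulated windows.
windows = np.random.randn(8, 128, 6)
batch = torch.stack([window_to_image(w) for w in windows])   # (8, 1, 6, 128)
scores = SensorCNN()(batch)                                  # (8, 1) owner probabilities
```

Arranging channels along one axis and time along the other lets the 2D convolutions capture both cross-channel (spatial) and temporal structure in a single pass, which is the intuition behind extracting spatio-temporal features with a CNN.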
This entry is adapted from the peer-reviewed paper 10.3390/math11173708