Sign language recognition is challenging due to the communication gap between normal and affected people. A speaking or hearing disability creates many social and physiological impacts. Many techniques of different dimensions have been proposed previously to overcome this gap. A sensor-based smart glove for sign language recognition (SLR) proved helpful in generating data based on various hand movements related to specific signs. A detailed comparative review of all types of available techniques and sensors used for sign language recognition is presented in this article. The focus of this paper is to explore emerging trends and strategies for sign language recognition and to point out deficiencies in existing systems. This paper will act as a guide for other researchers to understand the materials and techniques used for sign language recognition to date, namely flex-resistive-sensor-based, vision-sensor-based, and hybrid technologies; a hybrid system combines a vision-sensor-based model with a model based on a combination of different sensors.
1. Introduction
A speaking or hearing disability may affect people naturally or accidentally. Of the world's whole population, there are approximately 72 million deaf-mute people. A lack of communication exists between ordinary and deaf-mute people, and this communication gap affects their whole lives. A unique language based on hand gestures and facial expressions lets these affected people interact with the environment and society. This language is known as sign language. Sign language varies according to region and native language. However, when we speak of standards, American Sign Language is considered the standard for digit and alphabet recognition. This standard serves as a communication tool for affected people only: an average healthy person with full abilities to speak and hear is typically entirely unfamiliar with these signs. There are two ways to make communication feasible between a healthy and an affected person: first, convince the healthy person to learn all sign language gestures for communication with the deaf-mute person; or second, make the deaf-mute person capable of translating gestures into some normal speaking format so everyone can understand sign language easily. Considering the first option, it looks almost impossible to convince every healthy person to learn sign language for communication; this is also the main drawback of sign language. Therefore, technologists and researchers have focused on the second option, making deaf-mute people capable of converting their gestures into meaningful voice or text information. For sign language recognition, a smart glove embedded with sensors was introduced that can convert handmade gestures into meaningful information easily understandable by ordinary people.
Smart technology-based sign language interpreters that remove the communication gap between normal and affected people use different techniques. These techniques are based on image processing or vision sensors, on sensor-fusion-based smart data gloves, or on hybrid combinations of the two. Each has its own limitations: in an image- or vision-sensor-based recognition system, extracting the required features from an image usually creates problems due to foreground and background environmental conditions. A sensor-based smart data glove, in contrast, has no such foreground or background limitation and no limitation on portability, as it is mobile, lightweight, and flexible. Research has shown that many applications based on vision sensors, flex-based sensors, or hybrid combinations of different sensors are currently being used as communication tools. These applications also act as a learning tool for normal people to comfortably communicate with deaf or mute people. The latest technologies, like robotics, virtual reality, visual gaming, intelligent computer interfaces, and health-monitoring components, use sign language-based applications. The goal of this sign language recognition-based article was to understand in depth the current developments and emerging techniques in sign language recognition systems. This article reflected on the evolution of gesture recognition-based systems and their performance, keeping in mind the limitations and the pros and cons of each module. The aim of this study was to understand technological gaps and provide analysis to researchers so they can work on the highlighted limitations in the future. The aims and objectives of the study were fulfilled by considering published articles based on the specified domain, the technology used, the gestures and hand movements recognized, and the sensor types and languages targeted for recognition purposes. This paper also reflected on the methods of performance evaluation and the effectiveness achieved by previously used sign language techniques.
People with speech and hearing disabilities use sign language based on hand gestures. Communication is performed with specific finger motions that represent the language. A smart glove is designed to create a communication link for individuals with speech disorders: it is an electronic device that translates sign language into text, designed to make communication feasible between mute people and the public. This article provides a close analysis of the engineering and scientific aspects of such systems, taking the fundamentals into account for the social inclusion of these individuals. Sign language recognition-based techniques consist of three main streams, listed below: vision-sensor-based systems, flex-sensor-based systems, and systems fusing both vision and flex sensors.
1. Vision-sensor based SLR system
2. Sensor-based SLR system
3. Hybrid SLR system
A comprehensive study related to sign language recognition was also conducted previously. Review-based articles played a vital role in understanding the domain. The general review papers reflected an application-based discussion with several pros and cons: emerging trends, developing technologies, sign-detection approaches, and device characteristics were the focus of general review articles. Review articles covered most techniques achieved for gesture recognition, and the focus of these articles was on technology. The choice of the right sensory material and the limitations of existing systems are reflected in the articles. So, these articles provide a deep understanding of the recognition materials and methods used to obtain an efficient sign dataset with maximum accuracy.
Basically, any machine can be made intelligent with the help of the machine learning approach, and machine learning techniques are counted under the tree of artificial intelligence. The data by which such an algorithm learns to perform its operations were provided by the sensor readings collected in a file. These data were used for training purposes. In this way, the gesture is recognized, and the algorithm's efficiency can be computed through testing operations. With this technique, the barrier faced by mute people in communicating with society can be reduced to a great extent when someone wants to share with people who are not able to speak due to a vocally impaired disability. The communication gap creates a problem; therefore, sign language is used for communication. However, several people cannot understand these sign gestures, and this communication hindrance between the public and mute people is the main problem to be addressed.
2. Different Recognition Models
A lot of image processing and sensor-based techniques have been applied for sign language recognition, and recent studies have shown the latest frameworks for sign recognition developed with the advent of time. Detailed literature analysis and a deep understanding of sign language recognition allow this process to be categorized into different sub-sections. The further division is completely application-based, depending on the type of sensors used in the formation of data gloves. These subdivisions are based on non-commercial prototypes for data acquisition, commercial data acquisition prototypes, bi-channel-based systems, and hybrid systems. Regarding non-commercial systems, these prototypes are self-made systems that use sensors to gather data. The sensor values are transmitted to another device, usually a processor or controller, which interprets the transmitted sensor data and converts these data into their respective sign language format. Most of the sensor-based systems discussed in this review are non-commercial prototype-based systems. In non-commercial systems, most of the authors worked on finger-bend detection for any sign made, so a large variety of solo sensors or combinations of different sensors were used to detect this finger bending. SLR models can thus be divided into the following categories:
1. Sensor-based models
2. Vision-based models
3. Non-commercial models for data glove
4. Commercial data glove models
5. Hybrid recognition models
6. Framework-based recognition models
All details of these recognition models are discussed in the subsections below.
2.1. Sensor-Based Models
An American Sign Language recognition system with flex and motion sensors was implemented in ref. [1]. This system produced better results than a cyber glove embedded with a Kinect camera. The authors of ref. [2] succeeded in proposing a model that performs better recognition of signs using different machine learning algorithms. Response time and accuracy were increased by using better sensors and an efficient algorithm for the specified task. The traditional approach to sign language recognition was changed by embedding sensor networks for recognition purposes. A proposed model combining an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) was implemented for sign language recognition. This combined algorithm produced better results than the Hidden Markov Model (HMM) [3].
A smart data glove named "E-Voice" was developed for alphabetical gesture recognition of ASL. The prototype was designed using flex sensors and accelerometer sensors, and the data glove succeeded in recognizing sign gestures with improved accuracy and increased efficiency [4]. Sign language is a subjective matter, so a new method of recognition was developed using surface electromyography (sEMG). Here, sensors were connected to the right forearm of the subject and collected data for training and testing purposes. The authors used the Support Vector Machine (SVM) algorithm and obtained better results for real-time gesture recognition [5].
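The sEMG pipeline of ref. [5] is described only at a high level. A minimal sketch of the usual approach is shown below: window the raw signal, extract simple time-domain features per channel, and train an SVM on the feature vectors. The window length, feature set, channel count, and synthetic data here are illustrative assumptions, not details taken from the cited work.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def emg_features(window):
    """Simple time-domain features per sEMG channel:
    mean absolute value, waveform length, zero crossings."""
    mav = np.mean(np.abs(window), axis=0)
    wl = np.sum(np.abs(np.diff(window, axis=0)), axis=0)
    zc = np.sum(np.diff(np.sign(window), axis=0) != 0, axis=0)
    return np.concatenate([mav, wl, zc])

# Illustrative data: 200 recordings, 256 samples x 4 forearm channels each.
rng = np.random.default_rng(0)
raw = rng.standard_normal((200, 256, 4))   # placeholder sEMG windows
labels = rng.integers(0, 5, size=200)      # 5 hypothetical gestures

X = np.array([emg_features(w) for w in raw])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X[:150], labels[:150])             # train split
print("held-out accuracy:", clf.score(X[150:], labels[150:]))
```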
Another model was proposed as a combination of three types of sensors: flex, motion, and pressure sensors were included to determine the impact of SVM on sign language recognition [6]. Daily activity was also recognized using a smart data glove, where two basic techniques for gesture interpretation were applied through data glove interaction [7].
A more advanced approach including a deep learning model was developed in ref. [8]. Static gestures were converted into American Sign Language alphabets using the Hough transform; this technique was applied on 15 samples per alphabet and obtained 92% accuracy. A combination of a motion tracker and a smart sensor was used in sign language recognition, and an Artificial Neural Network approach was implemented to obtain the desired results [9]. An Artificial Neural Network translating American Sign Language into alphabets was presented in ref. [10]: a sensory glove called a smart hand glove, with a motion detection mechanism, was used for data collection, and the transmitter-receiver network processed the input data to control home appliances and generate recognition results. Hand-body language-based data analysis was performed using a machine learning approach in ref. [11]: a sensor glove embedded with 10 sensors was used to capture 22 different kinds of postures, and KNN, SVM, and PNN algorithms were applied to perform sign language posture recognition. The authors in ref. [12] presented a device named the "Electronic Speaking Glove". This device was developed using a combination of flex sensors whose data were fed into a low-power, high-performance, fast 8-bit ATmega32L AVR microcontroller. The reduced instruction set computer (RISC)-based AVR microcontroller used a "template matching" algorithm for sign language recognition.
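Several gloves in this section, ref. [12] and the prototypes that follow, rely on "template matching" over flex-sensor readings, but the cited papers do not spell out the matching rule. The sketch below shows one common and plausible formulation: store one averaged sensor vector per sign and pick the stored template with the smallest Euclidean distance, subject to a rejection threshold. All names and values are illustrative.

```python
import numpy as np

# Hypothetical templates: one averaged 5-value flex reading per sign.
TEMPLATES = {
    "A": np.array([870, 820, 845, 860, 510]),
    "B": np.array([300, 280, 295, 310, 620]),
    "C": np.array([550, 560, 540, 555, 570]),
}

def match_template(reading, max_distance=120.0):
    """Return the closest stored sign, or None if nothing is close enough."""
    best_sign, best_dist = None, float("inf")
    for sign, template in TEMPLATES.items():
        dist = np.linalg.norm(reading - template)
        if dist < best_dist:
            best_sign, best_dist = sign, dist
    return best_sign if best_dist <= max_distance else None

print(match_template(np.array([860, 815, 850, 858, 505])))  # -> "A"
```

On an 8-bit AVR the same idea is typically implemented with integer sums of absolute differences rather than floating-point norms.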
Another sign language recognition-based system was developed in ref. [13]. This virtual image interaction-based sensor system succeeded in recognizing six letters, i.e., "A, B, C, D, F, and K," and one digit, "8". The prototype was developed using two flex sensors attached to the index and middle fingers of the right hand, and the sensor data were transmitted to an Arduino Uno microcontroller. In this experimental setup, MATLAB-based fuzzy logic was implemented. A sign gesture recognition-based prototype was developed by the authors in ref. [14]. This prototype consisted of a smart glove embedded with five flex sensors. The acquired data were sent to an Arduino Nano microcontroller, and a template matching algorithm was used for gesture recognition. This experimental setup succeeded in recognizing four gestures made for sign language. A Liquid Crystal Display (LCD) and a speaker were used to display and speak the recognized gestures, respectively. The authors in ref. [15] developed an experimental model for standard American Sign Language (ASL) alphabet recognition. A programmable intelligent computer (PIC) was used to store the predefined ASL alphabet data, and this experimental setup was also based on template matching. For data acquisition, a smart prototype based on three flex sensors, along with an analog-to-digital converter (ADC), an LCD, and an INA126 instrumentation amplifier, was utilized. In this setup, a PIC16F877A microcontroller was used, which succeeded in recognizing 70% of ASL alphabet gestures.
A more advanced, intelligent, and smart system was implemented by the authors in ref. [16]. Their experimental setup included eight contact sensors and nine flex sensors placed on the inside and outside of the fingers: the outer five sensors were deployed to detect bending changes, and the inner four sensors were attached to measure hand orientation. This system was also based on a template matching algorithm, in which a unique 36-gesture standard ASL dataset was matched against the input data. An ATmega328P microcontroller was used for matching and succeeded in producing 83.1% and 94.5% accuracy for alphabets and digits, respectively.
In ref. [17], the authors made a standard sign language recognition prototype consisting of five flex sensors embedded with an ATmega328-based microcontroller. The sensor-acquired data were compared with an already stored ASL dataset, and this experimental setup succeeded in producing 80% overall accuracy. To facilitate the deaf-mute community, another important contribution was presented in ref. [18]. The authors succeeded in developing a prototype that translates sign language into its perfectly matched alphabet or digit. Eleven resistive sensors were used to measure the bending of each finger, and two separate sensors were utilized to detect wrist bending. The developed smart device worked perfectly on static gestures and produced good results in alphabet recognition, with an overall accuracy of 90%. A Vietnamese Sign Language recognition system was developed using six accelerometer sensors [19]. This prototype was designed for 23 local Vietnamese gestures, including the two extra postures of "space" and "punctuation". Gesture classification was performed using fuzzy logic-based algorithms. This device, named "AcceleGlove", succeeded in producing 92% overall accuracy in Vietnamese Sign Language recognition. A posture recognition system was developed in ref. [20]. This innovative glove-based system was assembled using flex sensors, force sensors, and an MPU6050 accelerometer: five flex and five force sensors were attached to the fingers, and the accelerometer was attached to the wrist. The experimental setup comprised data from the flex, force, gyro, accelerometer, and IMU sensors, all connected to an Arduino Mega for data acquisition. Based on the data classification, the output was displayed on an LCD. This system achieved 96% accuracy on average. A real-time sign-to-speech translator was developed to convert static signs into speech using the "Sign Language trainer & Voice converter" software [21]. Data were acquired using five flex sensors and a 3-axis accelerometer sensor connected to an Arduino-based microcontroller.
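Most of the Arduino-based gloves above stream one frame of analog readings per sample to a host program over a serial link. The cited papers do not publish their wire formats, so the snippet below assumes a simple hypothetical format: one comma-separated line per frame carrying five flex values and three accelerometer axes. The port name is a placeholder.

```python
import serial  # pyserial

PORT, BAUD = "/dev/ttyUSB0", 9600   # adjust for the actual board

def read_frames(port=PORT, baud=BAUD):
    """Yield one (flex[5], accel[3]) tuple per serial line."""
    with serial.Serial(port, baud, timeout=2) as link:
        while True:
            line = link.readline().decode("ascii", errors="ignore").strip()
            values = line.split(",")
            if len(values) != 8:      # skip malformed frames
                continue
            numbers = [float(v) for v in values]
            yield numbers[:5], numbers[5:]

for flex, accel in read_frames():
    print("flex:", flex, "accel:", accel)
    break
```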
A handmade sign recognition system was developed with the help of the LabVIEW software using an Arduino board: the user interacted with the environment through the LabVIEW-provided Graphical User Interface (GUI), and recognition was performed with the help of the Arduino [22]. Another smart-sensor-embedded glove was developed in ref. [23]. A combination of flex sensors, contact sensors, and a 3-axis ADXL335 accelerometer was used for recognition. Flex sensors were attached to each finger of the hand, and contact sensors were placed between consecutive fingers. The sign language gestures obtained using the described smart glove produced analog data that were transferred to the Arduino Mega environment for recognition. Classified sign gestures were displayed on a 16 × 2 Liquid Crystal Display (LCD) and converted into speech through a speaker. A smart glove based on five flex sensors and an accelerometer was designed for sign language recognition in ref. [24]. This data glove transferred analog signal data to a microcontroller for recognition, and the output was presented as a pre-recorded voice matched with the recognized sign. A sign language recognition system based on numeric data was developed in ref. [25]. The authors used a combination of Hall sensors and a 3-axis accelerometer. The smart data glove was composed of four Hall sensors attached to the fingers only; hand orientation was measured with the accelerometer, and finger bend was detected using the Hall sensors. The analog sensor data were passed to MATLAB code to recognize the signs made with the smart glove. This experimental setup was tested only on the numbers 0 to 9, and the developed system succeeded in producing an accuracy of 96% in digit recognition.
Beyond traditional sensor-based smart data gloves, another advanced approach was utilized in ref. [26]. A smart glove for gesture recognition was created using LTE-4602 light-emitting diodes (LEDs), photodiodes, and polymeric fibers. This combination was used only to detect finger bending; hand motion was captured using a 3-axis accelerometer and gyroscope. This portable smart glove succeeded in recognizing hand gestures made for sign language translation. The authors also built regional sign language systems from different origins. An Urdu Sign Language-based system was developed in ref. [27]. The smart data glove was composed of five flex sensors attached to the fingers, with a 3-axis accelerometer placed at the palm, and a 16 × 2 liquid crystal display (LCD) was used to display the output. The authors succeeded in creating a dataset of 300 × 8 dimensions, and the Principal Component Analysis (PCA) technique was applied in the MATLAB software to detect static sign gestures. Using PCA, the authors succeeded in achieving 90% accuracy.
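Ref. [27] reports a 300 × 8 dataset (eight sensor values per sample: five flex readings plus three accelerometer axes) reduced with PCA. The exact downstream classifier is not specified, so the sketch below shows one plausible reading of that pipeline: project the eight-dimensional samples onto a few principal components and classify in the reduced space with a nearest-centroid rule. The shapes mirror the paper; the data, label count, and component count are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 8))   # placeholder for the 300 x 8 glove dataset
y = rng.integers(0, 10, size=300)   # hypothetical static-sign labels

pca = PCA(n_components=3)           # keep the strongest 3 directions
X_reduced = pca.fit_transform(X)
print("explained variance:", pca.explained_variance_ratio_.round(3))

clf = NearestCentroid().fit(X_reduced[:240], y[:240])
print("held-out accuracy:", clf.score(pca.transform(X[240:]), y[240:]))
```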
Another regional sign language recognition system was presented in ref. [28], where the authors made a prototype to convert Malaysian Sign Language. A smart data glove made of tilt sensors and a 3-axis accelerometer was developed for recognition, and a microcontroller and a Bluetooth module were included to classify the detected signs and transmit them to a smartphone. The microcontroller operated on template-matching phenomena and succeeded in recognizing a few Malaysian Sign Language gestures, with overall system accuracy ranging from 78.33% to 95%. Flex sensor and accelerometer-based smart gloves can also perform alphanumeric data classification: using such a prototype, 26 alphabets and ten digits were recognized with a template-matching algorithm [29]. Five flex sensors attached to the fingers produced an analog signal of the performed gesture, which was transferred to an Arduino Uno microcontroller; with an accelerometer added for hand motion detection, the authors obtained eight-valued data for each sign gesture. In ref. [30], the authors developed a two-glove model containing ten flex sensors attached to the fingers of both hands and a 9-degree-of-freedom (DoF) accelerometer for motion detection. The two-glove system was tested on phonetic letters, including a, b, c, ch, and zh. With the help of a matching algorithm, the authors performed static sign recognition with approximately 88% accuracy.
An American Sign Language classification and recognition system based on probabilistic segmentation was presented in ref. [31]. This system was divided into two main modules: the first performed segmentation based on a Bayesian Network (BN), with the data obtained during this stage used for training, while the second performed classification using a Support Vector Machine (SVM) classifier combined with a multilayer Conditional Random Field (CRF). This system succeeded in producing 89% accuracy on average. The authors in ref. [32] brought some innovation to existing sign language recognition systems by combining data obtained from sensor gloves with data obtained using hand-tracking systems. The well-known Dempster-Shafer theory of evidence was applied to the fused data for evidence assembling. This fused system achieved 96.2% recognition accuracy on 100 two-handed ArSL signs.
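Dempster-Shafer fusion, as used in ref. [32], combines per-sensor "mass" assignments over candidate signs into a joint belief. The sketch below implements plain Dempster's rule for two mass functions over singleton hypotheses plus the universal set "ALL"; the two example sources (a glove and a hand tracker, each voting over three hypothetical signs) and their masses are invented for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions defined over singleton signs plus 'ALL'."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            if a == "ALL" and b == "ALL":
                key = "ALL"
            elif a == "ALL":
                key = b
            elif b == "ALL" or a == b:
                key = a
            else:
                conflict += wa * wb      # disjoint singletons: conflicting evidence
                continue
            combined[key] = combined.get(key, 0.0) + wa * wb
    scale = 1.0 - conflict               # Dempster's normalization
    return {k: round(v / scale, 3) for k, v in combined.items()}

glove   = {"A": 0.6, "B": 0.3, "ALL": 0.1}   # glove evidence (illustrative masses)
tracker = {"A": 0.5, "C": 0.2, "ALL": 0.3}   # hand-tracker evidence
print(dempster_combine(glove, tracker))       # belief concentrates on "A"
```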
Hand motion and tilt sensor-based sign data were collected using a CyberGlove in ref. [33]. The classification of 27 hand-shapes based on Signing Exact English (SEE) was performed using Fisher's linear discriminant embedded with a linear decision tree, and Vector Quantization Principal Component Analysis (VQPCA) was used as the classification tool for sign language recognition. This system was successful in obtaining 96.1% overall accuracy.
An Arabic Sign Language recognition deep learning framework focusing on a signer-independent isolated model was discussed in ref. [34]. The main focus of this research was on regional sign gestures: among the vast variety of regional domains, these authors focused only on Arabic sign gestures and implemented deep learning-based approaches to achieve the desired results. An implementation of hand gesture recognition for posture classification was presented in ref. [35]. The prototype was purely based on real-time hand gesture recognition; an IMU-based data glove embedded with different sensors was used to achieve the desired results. Another advancement in the field of sensor-based gesture recognition was implemented in ref. [36]: a dual Leap Motion controller (LMC)-based prototype was designed to capture and identify data, and Gaussian Mixture Model and Linear Discriminant-based approaches were implemented to achieve results. A case study-related implementation based on regional data was presented in ref. [37], where the authors focused on Pakistani Sign Language models and worked on Multiple Kernel Learning-based approaches. Work with signal-based sensor values for the classification of real-time gestures was implemented in ref. [38]: the authors worked on wrist-worn, real-time hand and surface postures, with EMG and IMU-based sensors embedded to obtain the desired sign posture values. An armband EMG sensor-based approach was implemented by the authors in ref. [39]. The main focus was to classify finger language by utilizing ensemble-based artificial neural network learning; the sensor values helped the ANN classify gestures accurately. A sign language interpretation-based smart glove was designed by the authors in ref. [40], in which a sensor-fused data glove was used to recognize and classify SL postures. Another novel approach to capturing sign gestures was discussed in ref. [41]. The developers of this smart data glove named it SkinGest, as it completely grips the skin with no detachment; filmy stretchable strain sensors were used to capture gesture and posture data. Leap Motion-based identification of sign gestures was implemented with the help of a modified LSTM model in ref. [42], where continuous sign gestures were accurately classified using the LSTM model to get the desired results.
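As a rough illustration of LSTM-based recognition of the kind used in ref. [42], the sketch below classifies fixed-length sequences of hand keypoint frames with a small PyTorch LSTM. The frame size, sequence length, and class count are invented; the cited paper's architecture is more elaborate.

```python
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    def __init__(self, n_features=42, hidden=64, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)   # final hidden state summarizes the sequence
        return self.head(h_n[-1])

model = GestureLSTM()
frames = torch.randn(8, 30, 42)     # 8 clips, 30 frames, 21 keypoints x (x, y)
logits = model(frames)
print(logits.shape)                 # torch.Size([8, 10])
```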
Another novel approach working with key frame sampling was implemented in ref. [43]; the authors also focused on skeletal features to utilize an attention-based sign language recognition network. A Turkish sign language dataset was processed using baseline methods in ref. [44], where large-scale multimodal data were classified based on regional postures to achieve good recognition results. Similarly, the authors of ref. [45] used Multimodal Spatiotemporal Networks to classify sign language postures. The development of a low-cost model for translating sign gestures was targeted in ref. [46], the main focus being a smart wearable device at a very reasonable price.
2.2. Vision-Based Models
The authors in ref. [47] developed a model using vision-sensor-based techniques to extract temporal and spatial features from video sequences. A CNN algorithm was applied to the extracted features to identify the recognized activity, with an American Sign Language dataset used for feature extraction and activity recognition. An Intel RealSense camera was used to translate American Sign Language (ASL) gestures into text: the proposed system included an Intel RealSense camera-based setup and applied SVM and Neural Network (NN) algorithms to recognize sign language [48]. Due to the large set of classes, inter-class complexity increased to a large extent. This issue was resolved using a Convolutional Neural Network (CNN)-based approach: depth images were captured using a high-definition Kinect camera and processed using a CNN to obtain the alphabets [49].
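The depth-image classifiers in refs. [47][49] are described only at the architecture-family level. A minimal PyTorch CNN of the kind typically used for such alphabet classification, stacked convolution/pooling blocks feeding a small classifier head, is sketched below; the input resolution, channel counts, and 24-class output are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Minimal CNN for 64x64 single-channel depth images of hand signs.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 24),          # 24 static ASL letters (J and Z require motion)
)

depth_batch = torch.randn(4, 1, 64, 64)   # placeholder Kinect depth crops
print(model(depth_batch).shape)           # torch.Size([4, 24])
```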
Real-time sign language was interpreted using a CNN performing real-time sign detection in ref. [50]. This approach did not rely on outdated or predefined image datasets; the authors implemented a real-time data analysis mechanism rather than the traditional approach of using predefined datasets. In vision-sensor-based recognition, 20 alphabets along with numbers were recognized using a Neural Network-based Hough transform [51]. Given the image dataset, a specific threshold value of 0.25 was used in the Canny edge detector for efficiency, and this system succeeded in achieving 92.3% accuracy.
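For an edge-based pipeline like that of ref. [51], the sketch below reproduces the usual preprocessing chain with OpenCV: Canny edges followed by a Hough transform whose detected lines become features for a downstream classifier. Note that the 0.25 threshold reported in ref. [51] follows the MATLAB normalized convention; OpenCV expects absolute gradient thresholds, so the values below are illustrative equivalents, and the input file name is a placeholder.

```python
import cv2
import numpy as np

image = cv2.imread("hand_sign.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
blurred = cv2.GaussianBlur(image, (5, 5), 0)

# OpenCV Canny takes absolute thresholds; 0.25 of the 0-255 range is ~64.
edges = cv2.Canny(blurred, 64, 128)

# Standard Hough transform: each returned pair (rho, theta) is a detected line.
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=60)
features = np.zeros(0) if lines is None else lines.reshape(-1, 2).ravel()
print("edge pixels:", int(edges.sum() / 255), "| hough features:", features.shape)
```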
Fifty samples of alphabets and numbers were recognized by an Indian Sign Language system using a vision-sensor-based technique [52]. A Support Vector Machine (SVM)-based classifier with B-spline approximation was used, which achieved 91% accuracy on average. A hybrid pulse-coupled neural network (PCNN) embedded with a nondeterministic finite automaton (NFA) algorithm was used to identify image-based gesture data [53]; this prototype achieved 96% accuracy based on best-match phenomena. Principal component analysis (PCA) along with local binary patterns (LBP) extracted Hidden Markov Model (HMM) features with 99.97% accuracy in ref. [54]. In ref. [55], hand segmentation based on skin color detection was used: a skin blob tracking system served for hand identification and tracking, and the system achieved 97% accuracy on 30 recognized words. In ref. [56], Arabic Sign Language recognition was performed using various transformation techniques like the Log-Gabor, Fourier, and Hartley transforms; the Hartley transform with Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) classifiers helped produce 98% accuracy. Combined orientation histogram and statistical (COHST) features along with wavelet features were used in ref. [57]. These techniques succeeded in recognizing static signs made for the numbers zero to nine in ASL: a neural network produced efficient results based on the COHST, wavelet, and histogram feature values, with 98.17% accuracy. Static gesture recognition of alphabets was performed using a neural network-based wavelet transform, achieving 94.06% accuracy in recognizing Persian Sign Language [58]. Manual signs were detected using the fingers, palm, and place of articulation in ref. [59]: the equipment arranged for manual signs extracted data from a video sequence and matched it with 2D images of the standard American Sign Language alphabet, and the proposed setup resulted in accurate sign detection of alphabets.
Deep learning-based SLR models have also focused on vision-based approaches. The authors in ref. [60] focused on current deep learning-based techniques, trends, and issues in deep models for SL generation. Keeping in mind standard American Sign Language models, the authors in ref. [61] focused on the development of a deep image-based user-independent approach; their main work was based on PCANet features with depth analysis. Another edge computing-based thermal image detection system was presented by the authors in ref. [62], who worked on a digit-based sign recognition model using deep learning approaches. Different computer vision-based techniques were applied for SLR tasks: a camera sensor-based prototype was used by the authors in ref. [63] to correctly identify sign postures, and a convolutional neural network-based approach was implemented using video sequences in ref. [64], in which a three-dimensional attention-based model was designed for a very large vocabulary to acquire data from video sequences and classify them using a 3D-CNN model. Similarly, the same authors implemented a boundary-adaptive encoder using an attention-based method on a regional Chinese language dataset in ref. [65]. A novel key-frame-centered clip-based approach was implemented on the same Chinese Sign Language dataset in ref. [66]: the regional Chinese sign dataset was classified using video sequences in the form of images, and this vision-based novel approach produced challenging results in CSL. Another fingerspelling-based smart model was developed by the authors in ref. [67]. They focused on the development of an Indian Sign Language quiz portfolio based on camera-oriented posture classification; the main point of identification was based on ASLR models using a vision-based approach. A vision sensor-based three-dimensional approach was implemented by the authors in ref. [68]: three-dimensional sign language representation was classified with the help of spatial three-dimensional relational geometric features (S3DRGF), and these 3D data were classified and recognized quite efficiently with the S3DRGF-based technique. Another vision-based technique focusing on color mapping-based classification and recognition was developed by the authors in ref. [69]. A CNN-based deep learning model was trained on three-dimensional sign data, and color texture-coded joint angular displacement maps were classified efficiently with the help of a 3D deep CNN model. Another advanced approach based on three-dimensional data manipulation for sign gestures was implemented in ref. [70]: the authors focused on the classification and recognition of angular velocity maps with the help of a deep ResNet model, with Connived Feature ResNet deployed specifically to classify and recognize the 3D sign data. Another video sequence-based novel approach to classifying sign gestures was implemented in ref. [71], where a BiLSTM-based three-dimensional residual neural network was used to capture video sequences and identify the posture data. A novel deep learning-based hand gesture recognition approach was implemented by the authors in ref. [72]: image-based fine postures were captured and accurately recognized using a deep learning-based architecture. A virtual sign channel for visual communication was developed in ref. [73], the authors' main focus being a virtual communication channel between deaf-mute and hearing individuals. Another three-dimensional data representation for Indian Sign Language was developed in ref. [74], in which the authors used an adaptive kernel-based motionlets-matching technique to classify gesture data. A video sequence and text embedding-based continuous sign language model was implemented in ref. [75]: joint latent space-based data were processed using cross-modal alignment in a continuous sign language recognition model.
2.3. Non-Commercial Models for Data Glove
In non-commercial systems, most authors worked on finger-bend detection for any sign made, so a large variety of solo sensors or combinations of different sensors were used to detect this finger bending. The authors in ref. [76] developed a non-commercial prototype for sign language recognition based entirely on the finger bending method. To detect finger bending, ten flex sensors were used, with a pair of sensors attached to the two joints of each finger. To handle the analog flex data, an MPU-506A multiplexer was used, and the selected data coming from the multiplexer were sent to an MSP430G2231 microcontroller. A Bluetooth module transmitted the data to a smart cell phone, where the captured data were compared with a sign language database and the sorted result was converted into speech using a text-to-speech converter. The authors in ref. [77] also succeeded in developing a non-commercial sign language recognition prototype. It included five ADXL335 accelerometer sensors connected to an ATmega2560 microcontroller system. Based on axis orientation, the sign was identified and transmitted via a Bluetooth module to a mobile application for text-to-speech conversion. In ref. [78], a prototype was developed to help handicapped people by converting finger orientation into actions. For this purpose, five optical fiber sensors were used to collect finger bending data. These 8-bit analog data were used to train a multilayered neural network (NN) in MATLAB, and six hand gesture-based operations were performed using the backpropagation training algorithm. For data validation, a tenfold validation method was applied to 800 sample records.
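The training scheme in ref. [78], a backpropagation-trained multilayer network validated with tenfold cross-validation on 800 samples, maps directly onto a few lines of scikit-learn. The layer sizes, gesture count, and synthetic data below are stand-ins, not the paper's actual configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.random((800, 5))             # 800 samples of five optical-fiber readings
y = rng.integers(0, 6, size=800)     # six hand-gesture classes

# Backpropagation-trained multilayer perceptron, scored with 10-fold CV.
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
scores = cross_val_score(mlp, X, y, cv=10)
print("fold accuracies:", scores.round(2), "| mean:", scores.mean().round(3))
```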
Similarly, for sign language recognition, the authors of ref. [79] made a non-commercial prototype based on five flex sensors. An MSP430F149 microcontroller was used to classify the incoming analog data, which were compared with standard American Sign Language (ASL) data, and the output was displayed on a Liquid Crystal Display (LCD). Using text-to-speech methodology, the recognized letter was converted into speech through a good-quality speaker. The authors in ref. [80] developed a glove-based Sign-to-Letter (S2L) system. This system contained six flex sensors and a combination of discrete-valued components and a microcontroller: five flex sensors were attached to the five fingers of the hand, and one sensor was attached to the wrist of the same hand. This combination of two different bending-based sensor placements succeeded in converting signs into letters, with the output selected via programmed "IF-ELSE" conditions. A combination of Light Emitting Diode-Light Dependent Resistor (LED-LDR) sensors was used in ref. [81], with an MSP430G2553 microcontroller detecting the signs made by finger bending. Using the mentioned microcontroller, the analog data were converted into digital form and into the ASCII codes of 10 sign language alphabets. The converted data were transmitted using a ZigBee wireless module, the recognized ASCII code was displayed on a computer screen, and the code was also converted into speech.
Another fingerspelling system was developed in ref. [82]. This prototype included four flex sensors and an accelerometer sensor, the main idea of the design being to translate handmade signs into their corresponding American Sign Language (ASL) alphabets. For data acquisition, four deaf-mute individuals were recruited, and the system succeeded in recognizing 21 gestures out of 26. A hand gesture recognition system was developed by measuring inertial and attitude values in ref. [83]. For data acquisition, six Inertial Measurement Units (IMUs) were used in the prototype: one IMU attached to each finger and one to the wrist. This experimental setup collected hand gesture data from the values provided by the accelerometer, gyroscope, and magnetometer sensors. These values were refined using a Kalman filter and processed through the Linear Discriminant Analysis (LDA) algorithm. Overall, 85% accuracy was achieved using this prototype for hand gesture recognition.
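Ref. [83] smooths raw IMU values with a Kalman filter before LDA classification. The fragment below shows the same two stages in simplified form: a scalar constant-state Kalman filter applied per sensor channel, followed by scikit-learn's LDA on the smoothed feature vectors. The noise parameters, feature layout, and data are placeholders.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def kalman_smooth(z, q=1e-3, r=1e-1):
    """1-D constant-state Kalman filter over a noisy channel z."""
    x, p = z[0], 1.0
    out = np.empty_like(z)
    for i, measurement in enumerate(z):
        p += q                          # predict step (state assumed constant)
        k = p / (p + r)                 # Kalman gain
        x += k * (measurement - x)      # update with the new measurement
        p *= 1 - k
        out[i] = x
    return out

rng = np.random.default_rng(3)
raw = rng.standard_normal((120, 60))    # 120 gestures x 60 IMU channel samples
X = np.array([kalman_smooth(row) for row in raw])[:, ::10]  # subsampled features
y = rng.integers(0, 8, size=120)        # eight hypothetical gestures

lda = LinearDiscriminantAnalysis().fit(X[:90], y[:90])
print("held-out accuracy:", lda.score(X[90:], y[90:]))
```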
2.4. Commercial Data Glove Based Models
Besides following the traditional way of making cheap data gloves, some authors used a commercial data glove named "CyberGlove". This commercial glove was specifically designed for deaf-mute people, and many affected communities and research centers used it for communication and research purposes. The CyberGlove was manufactured precisely with a combination of 22 sensors embedded in the glove: the basic structure contained four sensors attached between the fingers and three sensors attached to each finger, with palm sensors and wrist-bending measurement sensors also included. This smart, thin-layer, elastic fiber-based sensor glove cost approximately $40,000 per pair. Using this CyberGlove, the authors in ref. [84] applied a combination of neural network-based algorithms to measure the accuracy and efficiency of the system. Finger orientation and hand motion projection were captured with the CyberGlove embedded with a 3D motion-tracker sensor, and the analog signal data were transferred to a pair of word recognition and velocity network algorithms. These algorithms worked on 60 American Sign Language (ASL) combinations and obtained accuracies of 92% and 95%, respectively. A posture recognition system based on a 3D hand posture model was developed in ref. [85]. A Java 3D-based model helped in the classification and segmentation of real-time input posture data, which were compared with pre-recorded CyberGlove data with the help of an index tree algorithm. Another CyberGlove, embedded with a 3D motion tracker named Flock of Birds, was used for sign language recognition in ref. [86]. The CyberGlove data containing bend, axis, motion, and hand orientation were fed into a multilayered neural network, and the Levenberg-Marquardt backpropagation algorithm was used for segmentation and sign classification. This prototype succeeded in producing 90% accuracy in American Sign Language (ASL) recognition [86].
In the sensor-based sign language recognition domain, another advancement was made by introducing a new five-dimensional technology commercial data glove, commonly known as the 5DT data glove. This 5DT commercial glove was made in two variants, one with five fiber optic sensors and the other with fourteen. The 5DT manufacturers named this fiber-optic smart data device "Ultra Motion"; internationally, the glove cost approximately $995. In the five-sensor data glove, one optical sensor is attached to each finger, and one sensor detects hand orientation. In the 14-sensor version, two sensors are kept in contact with each finger, and a sensor is also attached between the fingers to check finger abduction; two-axis measurement-based sensors are attached to determine axis and orientation, including the pitch and roll of the hand. These 5DT smart data gloves were used for Japanese Sign Language recognition in ref. [87]. The main idea behind this system was to automate the learning process: a 3D model based on the 5DT 14-sensor smart data glove was built for simulating signs. The system highlights motion errors for beginners and helps them understand hand motion completely via the 3D model. To facilitate communication for deaf and mute people, another advancement used a 5DT data glove with five embedded sensors [88]. The data obtained using the glove were trained in the MATLAB simulator: a multilayered neural network with five inputs and 26 outputs was utilized to train the sign language recognition model, and a series of NN-based algorithms, like resilient, back, quick, and Manhattan propagation, including scaled conjugate gradient, was used for training [88].
Another advancement in sign language recognition was seen in ref. [89], where the authors used a DG5-VHand data glove for data acquisition. The internal structure of the DG5-VHand data glove contains five flex (bending) sensors, one three-axis accelerometer, and three contact sensors; the glove transmits the acquired data wirelessly and runs on a battery, making the overall system remotely functional. The DG5-VHand commercial data glove was used for American and Arabic Sign Language recognition systems: the authors focused on Arabic Sign Language, whereas this data glove had already been used previously for American Sign Language. A single left-hand glove cost $750. A pair of DG5-VHand data gloves was used for Arabic Sign Language recognition: the two-glove model succeeded in acquiring data for 40 sentences, and this dataset was classified using a modified K-Nearest Neighbor (MKNN) algorithm, with the overall system producing 98.9% accuracy. A hand gesture cannot be fully recognized without knowing hand orientation and posture; therefore, an advancement over the traditional system was brought by fusing Electromyography and inertial sensors within the system [90]. Using a combination of an accelerometer (ACC) sensor with Electromyography (EMG), the authors achieved multiple degrees of freedom for hand movement. This setup was used for Chinese Sign Language recognition: the EMG sensors were attached at five muscle points over the forearm, and an MMA7361 3-axis accelerometer was attached over the wrist. A multi-layered Hidden Markov Model and decision tree algorithms were used for recognition, producing 72.5% accuracy.
The same setup of accelerometer and Electromyography was used for German Sign Language in ref. [91]. The authors used a single EMG with a single ACC sensor to recognize a small database of German vocabulary; training was performed on seven words with seventy samples per word. K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers were used, and the system achieved average accuracies of 88.75%, and 99.82% in the case of subject dependency [91]. A similar hybrid approach of accelerometer and Electromyography was used for a Greek Sign Language recognition system [92]. The experimental setup consisted of five-channel Electromyography and an accelerometer sensor, and the experiment was conducted on the signer using intrinsic mode entropy. Experiments repeated ten times on three native signers produced the training data, and the system was trained using intrinsic mode entropy in MATLAB. The system's overall accuracy was 93% collectively (without the personal effect of the native signers involved in data collection) [92].
2.5. Hybrid Recognition Models
A vision-sensor-based approach was also adopted in sign language recognition: the previously used combination of electromyography with an accelerometer was replaced with a vision-sensor-based hybrid approach. In the hybrid approach, the authors used a variety of accelerometers together with vision-sensor cameras, the purpose being to enhance data acquisition and accuracy. The vision-sensor-based hybrid prototype contained red, green, and blue (RGB) color model cameras, depth sensors, and accelerometer-based axis and orientation sensors. This smart hybrid combination was used for gesture identification purposes. The experimental setup included seven IMU accelerometer sensors attached to the arm, wrist, and fingers; for data acquisition, five sign language speakers from different age groups performed forty gestures, each repeated ten times. A Parallel Hidden Markov Model (PaHMM) succeeded in producing 99.75% accuracy [93]. Another combination of an accelerometer-based glove and a camera sensor was used for American Sign Language recognition [94]. The experimental setup contained a camera attached to a hat for detecting correctly made signs, and nine accelerometer sensors used for gesture formation: five attached to the fingers, two on the shoulder and arm to detect arm and shoulder movement, and two attached to the back of the palm for hand orientation measurement. This setup was tested on 665 gestures using the Hidden Markov Model (HMM) and produced a per-sign accuracy of 94% [94].
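Both hybrid systems above score gesture sequences with hidden Markov models [93][94]. A common implementation pattern, shown below with the hmmlearn library, trains one Gaussian HMM per sign and labels a new sequence by whichever model assigns it the highest log-likelihood. The state counts, feature dimensions, and random training data are illustrative, and the PaHMM of ref. [93] additionally runs parallel channel-wise HMMs, which this sketch does not reproduce.

```python
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(4)

def make_sequences(offset, n=20, t=30, d=6):
    """Toy accelerometer sequences for one sign class."""
    return [offset + rng.standard_normal((t, d)) for _ in range(n)]

models = {}
for sign, offset in {"HELLO": 0.0, "THANKS": 2.0, "YES": -2.0}.items():
    seqs = make_sequences(offset)
    X = np.vstack(seqs)
    lengths = [len(s) for s in seqs]
    m = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=30)
    m.fit(X, lengths)                       # Baum-Welch training per sign
    models[sign] = m

test = 2.0 + rng.standard_normal((30, 6))  # unseen "THANKS"-like sequence
scores = {sign: m.score(test) for sign, m in models.items()}
print(max(scores, key=scores.get), scores)
```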
2.6. Framework-Based Recognition Models
Most of the articles [95][96][97][98] followed a predefined framework for sign language recognition, the main objective of using the same framework being to enhance data accuracy and dataset efficiency. The authors in ref. [99] developed a sign language system and implemented it using different classification and recognition algorithms. The authors in ref. [100] succeeded in creating a Vietnamese Sign Language framework that worked wirelessly. A two-handed wireless smart data glove was designed and developed using bend and orientation measurement. The experimental setup included MEMS accelerometer sensors attached just like the AcceleGlove, with one additional sensor attached to the palm of each hand for orientation measurement. Wireless communication was made feasible by a Bluetooth module attached to a cellphone. The user-generated sign was compared with a standard sign language database, and the correctly found result was displayed on the cellphone screen. Finally, a text-to-speech Google translator was utilized to convert the recognized sign alphabet into speech. This sign language framework succeeded in producing reasonable accuracy.
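Several frameworks above, including refs. [76] and [100], close the loop by speaking the recognized letter aloud. The snippet below shows the simplest offline way to do this in Python with the pyttsx3 library; any text-to-speech service, such as the Google translator TTS mentioned in ref. [100], can be substituted.

```python
import pyttsx3

def speak(text):
    """Voice the recognized sign using the local TTS engine."""
    engine = pyttsx3.init()
    engine.setProperty("rate", 150)   # slow down slightly for clarity
    engine.say(text)
    engine.runAndWait()

speak("A")  # e.g., voice a recognized ASL letter
```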
Similarly, the authors in ref. [101] developed an Arabic Sign Language recognition system. The main purpose of developing another framework for static sign analysis was to minimize the number of sensors on the data gloves. This experiment was simulated in the Proteus software; the two-handed glove system contained six flex sensors, four contact sensors, one gyroscope, and one accelerometer on each hand. Another algorithmic sign language recognition framework was designed in ref. [102]. Stream segmentation-based sign descriptors and a text auto-correction-based algorithm were utilized, and the system also provided a software architecture of descriptors for hand gesture recognition. A sign language-based interpolator converting text into speech was designed in ref. [103]. The overall system framework contained four basic modules: the smart data glove, the training algorithms for the input sign dataset, a wirelessly visible sign application, and a sign language database for matching the input sign against the standard repository. A very simple resistor-based framework was developed and implemented in ref. [104]. The authors used ten resistors and detected finger movement only; this was a medical application used solely for finger flexion and extension, and it was a very simple, low-cost, efficient, reliable, and low-power trigger. The data glove containing the resistor-based framework was directly connected to a microcontroller, which further transmitted the captured data to a computer for finger movement analysis. Another simple gesture recognition-based framework was developed in ref. [105]. The smart spelling data glove consisted of three bending sensors attached to three fingers, and the authors worked on only five gestures, including thumbs-up and rest. Input gesture data were fed into a microcontroller for recognition, and the analyzed gestures were combined in a row to form meaningful data before being transmitted to the receiver. A detailed review of all the frameworks based on Chinese Sign Language was given in ref. [106], discussing in detail the technical approaches related to regional Chinese Sign Language recognition and classification mechanisms. Another detailed review, covering the wearable frameworks and prototypes related to sign gesture classification, was presented in ref. [107]. The authors focused on the frameworks that are related to, and had been previously used in, the same field; this is also a review article with good depth on technologies and frameworks in SLR.
3. Analysis
Sign language recognition is one of the emerging trends in today's modern era. Much research has already been conducted, and currently, many researchers are working in this very domain. The focus of this article is to provide a brief analysis of all the related work done to date. For this purpose, a complete breakdown of all research activities was developed. Some authors worked on a general discussion of sign language: most of their work was based on introductions and hypotheses for dealing with sign language scenarios, with no practical implementation of the proposed hypotheses, so these authors fall into the general-article category. A group of authors worked on developing systems able to recognize sign gestures; this group is categorized in the developer domain. A good combination of sensors was used to develop sign language recognition systems, and most of these authors used sensor-based gloves to recognize sign gestures. Another group of authors worked on existing sensor-based models and improved the accuracy and efficiency of the systems; their focus was on using a good combination of machine learning and neural network-based algorithms to achieve accuracy. Considering the authors' intentions, machine learning-based algorithms were generally used by authors working on sensor-based models like sensory gloves, while neural network-based models were used by authors working on vision-sensor-based models for sign gesture recognition.
Considering the literature, some trends were also kept under consideration. Most of the authors in the sign language domain preferred to develop their own sensor-based models; the focus of authors following this trend was to develop their own cheap and efficient models that could detect and recognize gestures easily. These models were not made for commercial use. Other authors followed the trend of developing commercial gloves. These gloves contained a large number of sensors, e.g., 18-20 sensors, to detect sign gestures; cost and efficiency were the main problems with commercial gloves. Analyzing these research articles, the advantages and disadvantages of vision-sensor-based, sensor-based, and hybrid recognition models were listed. Additionally, the last trend of focus in sign language articles, including this one, was the group of authors who worked on surveys and review articles on sign recognition. These authors provided a deep understanding of the research work done previously and detailed knowledge of hardware modules, sensor performance, efficiency analysis, and accuracy comparisons. The advantage of review and survey articles over general and development research articles is the filtered knowledge consolidated in one article. Survey-based research articles proved to be a good help for learners and newcomers to a specific topic, and they also provided researchers with upcoming challenges, trends, motivations, and future recommendations. A detailed comparative study helps determine the uses, limitations, benefits, and advancements in the sign language domain.
4. Conclusion
Designing an automatic machine-based SL translation system that transforms SL into speech and text, or vice versa, is particularly helpful in improving intercommunication. Progress in pattern recognition promises automated translation systems, but many complex problems need to be solved before they become reality. Several aspects of SLR technology, particularly the glove-based approach, have been previously explored and investigated by researchers. In this paper, an in-depth comparative analysis of the different sensors used in addressing SLR was presented, describing the challenges, benefits, and recommendations related to SLR. The paper discussed the literature work of other researchers by targeting the available glove types, the sensors used for capturing data, the techniques adopted for recognition purposes, the identification of the dataset in each article, and the specification of the processing unit and output devices of the recognition systems. This comparative analysis should be helpful for exploring and designing translation systems capable of interpreting different sign languages. Finally, the datasets generated from these sensors can be used for tasks of classification and segmentation to assist continuous recognition.