Sign Language Recognition Models

Sign language recognition is challenging because of the communication gap between hearing people and people with speech or hearing disabilities. Many social and physiological impacts arise from a speaking or hearing disability, and many different techniques have been proposed to overcome this gap. A sensor-based smart glove for sign language recognition (SLR) has proved helpful for generating data based on the hand movements related to specific signs, and a hybrid system combines vision-sensor-based models with models based on a combination of different sensors. This article presents a detailed comparative review of the available techniques and sensors used for sign language recognition. Its focus is to explore emerging trends and strategies for sign language recognition and to point out deficiencies in existing systems. It is intended as a guide for other researchers to the materials and techniques, such as flex resistive sensor-based, vision sensor-based, and hybrid system-based technologies, used for sign language recognition until now.

sign language
sign language review
American Sign Language
Sensor Recognition
Vision recognition models

1. Sensor-Based Models

1. Introduction

A speaking or hearing disability can affect people from birth or as the result of an accident. Approximately 72 million people in the world’s population are deaf-mute. A communication gap exists between hearing and deaf-mute people, and this gap affects their whole lives. A unique language based on hand gestures and facial expressions lets deaf-mute people interact with their environment and society. This language is known as sign language. Sign language varies according to region and native language. However, in terms of standards, American Sign Language is considered a standard for number and alphabet recognition. This standard serves as a communication tool mainly among deaf-mute people; a hearing person is usually not required to learn it and is therefore unfamiliar with its signs. There are two ways to make communication feasible between a hearing person and a deaf-mute person. The first is to convince the hearing person to learn all sign language gestures. The second is to enable the deaf-mute person to translate gestures into an ordinary spoken or written format so that everyone can understand them easily. The first option looks almost impossible, because it is hard to convince every hearing person to learn sign language; this is also the main drawback of sign language. Therefore, technologists and researchers have focused on the second option, making deaf-mute people capable of converting their gestures into meaningful voice or text information. For sign language recognition, a smart glove embedded with sensors was introduced that can convert hand gestures into meaningful information easily understood by hearing people.

Smart technology-based sign language interpreters that remove the communication gap between hearing and deaf-mute people use different techniques. These techniques are based on image processing or vision sensors, on sensor fusion in a smart data glove, or on hybrid combinations of the two. Each technique has its own characteristics. In an image or vision-sensor-based system, extracting the required features from an image can be difficult because of foreground and background environmental conditions. A sensor-based smart data glove does not depend on the foreground or background, and it places little burden on the user because it is mobile, lightweight, and flexible. Research has shown that many applications based on vision sensors, flex sensors, or hybrid techniques with different combinations of sensors are currently being used as communication tools. These applications also act as learning tools that help hearing people communicate comfortably with deaf or mute people. Recent technologies such as robotics, virtual reality, visual gaming, intelligent computer interfaces, and health-monitoring components use sign language-based applications. The goal of this article was to understand in depth the current state and emerging techniques of sign language recognition systems. The article reflects on the evolution of gesture recognition-based systems and their performance, keeping in mind the limitations, pros, and cons of each module. The aim of this study was to identify technological gaps and provide analysis so that researchers can work on the highlighted limitations in the future. To fulfil these aims and objectives, published articles were considered according to the specified domain, the technology used, the gestures and hand movements recognized, the sensor types, and the languages targeted for recognition. This paper also reflects on the methods of performance evaluation and the level of effectiveness achieved by previously used sign language techniques.

People with speech and hearing disabilities use sign language based on hand gestures. Communication is performed with specific finger motions to represent the language. A smart glove is designed to create a communication link for individuals with speech disorders. It provides a close analysis of the engineering and scientific aspects of the system. The fundamentals are taken into account for the social inclusion of such individuals. A smart glove is an electronic device that translates sign language into text. This system is designed to make communication feasible between the mute people and the public. Sign language recognition-based techniques consist of three main streams that are vision-sensor based, flex sensor-based, and a combination of both vision sensor and flex sensor fused systems and are listed below.

1. Vision-sensor based SLR system

2. Sensor-based SLR system

3. Hybrid SLR system

An American Sign Language recognition system with flex and motion sensors was implemented in ref. ^[1]. This system produced better results than a cyber glove embedded with a Kinect camera. Authors succeeded in proposing a model that performs better recognition of signs using different machine learning algorithms ^[2]. Response time and accuracy were increased by using better sensors and an efficient algorithm for the specified task. The traditional approach to sign language was changed by embedding sensor networks for recognition purposes. A proposed model was implemented by combining an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) for sign language recognition. This combined algorithm produced better results than the Hidden Markov Model (HMM) ^[3]. A smart data glove named “E-Voice” was developed for alphabetical gesture recognition of ASL. The prototype was designed using flex sensors and accelerometer sensors. The data glove was successful in recognizing sign gestures with improved accuracy and increased efficiency ^[4]. Sign language is a subjective matter, so a new method of recognition was developed using surface electromyography (sEMG). Here, sensors were connected to the right forearm of the subject and collected data for training and testing purposes. The authors used the Support Vector Machine (SVM) algorithm for recognition and obtained better results for real-time gesture recognition ^[5]. Another model was proposed as a combination of three types of sensors. These sensors included flex, motion, and pressure sensors to determine the impact of SVM on sign language recognition ^[6]. Daily activity was recognized using a smart data glove. Two basic techniques for gesture interpretation were applied using data glove interaction ^[7].

A more advanced approach including a deep learning model was developed in ref. ^[8]. Static gestures were converted into American Sign Language alphabets using the Hough Transform; this technique was applied to 15 samples per alphabet and obtained 92% accuracy. A combination of a motion tracker and a smart sensor was used in sign language recognition. An Artificial Neural Network approach was implemented to obtain the desired results ^[9]. The Artificial Neural Network translates American Sign Language into alphabets. A sensory glove called a smart hand glove with a motion detection mechanism was used for data collection purposes, and as a result, the transmitter-receiver network processed input data to control home appliances and generate recognition results ^[10]. Hand-body language-based data analysis was performed using a machine-learning approach. A sensor glove embedded with 10 sensors was used to capture 22 different kinds of postures. KNN, SVM, and PNN algorithms were applied to perform sign language posture recognition ^[11]. The authors in ref. ^[12] presented a device named the “Electronic Speaking Glove”. This device was developed using a combination of flex sensors. Flex sensor data were fed into a low-power, high-performance, and fast 8-bit ATmega32L AVR microcontroller. This reduced instruction set computer (RISC)-based AVR microcontroller used a “template matching” algorithm for sign language recognition.
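The template matching idea mentioned above can be illustrated with a minimal sketch. It is not the implementation used in ref. ^[12]; it assumes that each stored template is a vector of normalized flex readings (one per finger) and that the closest template by Euclidean distance wins. The template values, the `match_template` name, and the rejection threshold are hypothetical.

```python
import math

# Hypothetical templates: one normalized flex reading per finger
# (0.0 = straight, 1.0 = fully bent). Values are illustrative only.
TEMPLATES = {
    "A": [0.1, 0.9, 0.9, 0.9, 0.9],
    "B": [0.8, 0.1, 0.1, 0.1, 0.1],
    "L": [0.1, 0.1, 0.9, 0.9, 0.9],
}


def match_template(reading, templates=TEMPLATES, max_distance=0.5):
    """Return the label of the closest template, or None if none is near enough."""
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        # Euclidean distance between the live reading and the stored template.
        dist = math.dist(reading, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Reject the gesture when even the best match is too far away (assumed threshold).
    return best_label if best_dist <= max_distance else None
```

A reading close to a stored vector returns that template's label, while an ambiguous reading such as a half-bent hand returns None. The rejection threshold is a design choice that trades missed gestures against false recognitions.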

Another sign language recognition-based system was developed by ^[13]. This virtual image interaction-based sensor system succeeded in recognizing six letters, i.e., “A, B, C, D, F, and K,” and a digit, “8”. The prototype was developed using two flex sensors attached to the index and middle finger of the right hand. The sensor data were transmitted to an Arduino Uno microcontroller. In this experimental setup, MATLAB-based fuzzy logic was implemented. A sign gesture recognition-based prototype was developed by the authors in ref. ^[14]. This prototype consisted of a smart glove embedded with five flex sensors. The acquired data were sent to an Arduino Nano microcontroller, and a template matching algorithm was used for gesture recognition. This experimental setup succeeded in recognizing four sign language gestures. A Liquid Crystal Display (LCD) and a speaker were used to display and speak the recognized gestures, respectively. The authors in ref. ^[15] developed an experimental model for standard American Sign Language (ASL) alphabet recognition. A programmable intelligent computer (PIC) was used to store the predefined data of ASL alphabets. This experimental setup was also based on template matching. For data acquisition, a smart prototype based on three flex sensors along with an analog-to-digital converter (ADC), an LCD, and an INA 126 instrumentation speaker was utilized. In this setup, the 16F377A modeled microcontroller was used, which succeeded in recognizing 70% of ASL alphabet gestures.

A more advanced, intelligent, and smart system was implemented by the authors in ref. ^[16]. Their experimental setup included eight contact sensors and nine flex sensors. These sensors were placed inside and outside of the fingers. The outer five sensors were deployed to detect bending changes, and the inner four sensors were attached to measure hand orientation. This system was also based on a template matching algorithm, in which a unique standard ASL dataset of 36 gestures was matched with the input data. The ATmega328P microcontroller was used for matching, and the system produced 83.1% and 94.5% accuracy for alphabets and digits, respectively.

In ref. ^[17], the authors made a standard sign language recognition prototype. This prototype consisted of five flex sensors connected to an ATmega328-based microcontroller. The acquired sensor data were compared with an already stored ASL dataset. This experimental setup produced 80% overall accuracy. To facilitate the deaf-mute community, another important contribution was presented by ref. ^[18]. The authors succeeded in developing a prototype that translates sign language into its matching alphabet or digit. Eleven resistive sensors were used to measure the bending of each finger. Two separate sensors were utilized to detect wrist bending. The developed smart device worked well on static gestures and produced good results in alphabet recognition, with an overall accuracy of 90%. A Vietnamese Sign Language recognition system was developed using six accelerometer sensors ^[19]. This prototype was designed for 23 local Vietnamese gestures, including two extra postures for “space” and “punctuation”. Gesture classification was performed using fuzzy logic-based algorithms. This device, named “AcceleGlove”, produced 92% overall accuracy in Vietnamese Sign Language recognition. A posture recognition system was developed in ref. ^[20]. This innovative glove-based system was assembled using flex sensors, force sensors, and an MPU6050 accelerometer sensor. Five flex sensors and five force sensors were attached to the fingers, and an accelerometer was attached to the wrist. The experimental setup comprised data from flex sensors, force sensors, gyro sensors, accelerometer sensors, and IMU sensors. All these sensors were connected to an Arduino Mega for data acquisition. Based on data classification, the output was displayed on an LCD. This system achieved 96% accuracy on average. A real-time sign-to-speech translator was developed to convert static signs into speech by using the “Sign Language trainer & Voice converter” software ^[21]. Data were acquired using five flex sensors and a 3-axis accelerometer sensor connected to an Arduino-based microcontroller.
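Several of the gloves above read resistive flex sensors through a microcontroller's analog input. As a rough sketch of how such a reading can become a bend estimate, the code below assumes the flex sensor sits on the high side of a voltage divider, a 10-bit ADC, and a linear resistance-to-angle calibration. The resistor value, the calibration resistances, and the function names are hypothetical and are not taken from any of the cited prototypes.

```python
def flex_resistance(adc_count, r_fixed=10_000.0, adc_max=1023):
    """Resistance of a flex sensor on the high side of a voltage divider.

    Assumes Vout = Vcc * r_fixed / (r_flex + r_fixed), sampled by an ADC
    whose full-scale count is adc_max (10-bit by assumption).
    """
    if not 0 < adc_count <= adc_max:
        raise ValueError("adc_count must be in (0, adc_max]")
    # Invert the divider equation: r_flex = r_fixed * (Vcc / Vout - 1).
    return r_fixed * (adc_max / adc_count - 1.0)


def bend_angle(adc_count, r_flat=10_000.0, r_bent=30_000.0):
    """Map resistance linearly onto 0..90 degrees (calibration values are hypothetical)."""
    r_flex = flex_resistance(adc_count)
    fraction = (r_flex - r_flat) / (r_bent - r_flat)
    # Clamp so that readings outside the calibrated range stay within 0..90 degrees.
    return 90.0 * min(1.0, max(0.0, fraction))
```

In practice the flat and bent resistances would be measured per sensor, since flex sensors of the same model can differ noticeably from one another.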

A handmade sign recognition system was developed with the help of LabVIEW software using an Arduino board. The user interacted with the environment using the Graphical User Interface (GUI) provided by LabVIEW, and recognition was performed with the help of the Arduino ^[22]. Another smart-sensor-embedded glove was developed by ^[23]. A combination of flex sensors, contact sensors, and a 3-axis ADXL335 accelerometer was used for recognition. Flex sensors were attached to each finger of the hand, and contact sensors were placed between consecutive fingers. The sign language gestures were captured using this smart glove, and the analog data were transferred to an Arduino Mega for recognition. Classified sign gestures were displayed on a 16 × 2 Liquid Crystal Display (LCD) and converted into speech with the help of a speaker. A smart glove based on five flex sensors and an accelerometer was designed for sign language recognition ^[24]. This data glove transferred analog signal data to the microcontroller for recognition. Lastly, the output was played as a pre-recorded voice matched to the recognized sign. A sign language recognition system based on numeric data was developed in ref. ^[25]. The authors used a combination of Hall sensors and a 3-axis accelerometer. The smart data glove was composed of four Hall sensors attached to the fingers only. Hand orientation was measured with the help of the accelerometer, and finger bend was detected by using the Hall sensors. These analog sensor data were passed to MATLAB code to recognize the signs made with the smart glove. This experimental setup was only tested on numbers ranging from 0 to 9. The developed system achieved a maximum accuracy of 96% in digit recognition.
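Measuring hand orientation with a 3-axis accelerometer, as several of these gloves do, usually relies on the direction of gravity. The sketch below shows one common way to estimate pitch and roll from a single static reading. It assumes the hand is roughly at rest and that readings are expressed in units of g; the axis convention and the `tilt_from_accel` name are assumptions, not details from the cited systems.

```python
import math


def tilt_from_accel(ax, ay, az):
    """Estimate (pitch, roll) in degrees from a static 3-axis accelerometer reading.

    Assumes gravity is the only acceleration acting on the sensor.
    """
    # Pitch: rotation that tips the x axis toward or away from gravity.
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    # Roll: rotation around the x axis, from the y and z gravity components.
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll
```

A flat hand, read as (0, 0, 1), gives zero pitch and roll, and a reading of (0, 1, 0) gives a roll of 90 degrees. During fast hand motion the gravity assumption breaks down, which is one reason some prototypes add a gyroscope.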

In contrast to traditional sensor-based smart data gloves, another advanced approach was utilized by ^[26]. A smart glove for gesture recognition was created by using LTE-4602 modeled light emitting diodes (LEDs), photodiodes, and polymeric fibers. This combination was used only to detect finger bending. Hand motion was captured using a 3-axis accelerometer and gyroscope. This portable smart glove succeeded in recognizing hand gestures made for sign language translation. Authors have also built sign language systems for different regions. An Urdu Sign Language-based system was developed in ref. ^[27]. The smart data glove was composed of five flex sensors, one attached to each finger, and a 3-axis accelerometer placed at the palm. To display the output, a 16 × 2 liquid crystal display (LCD) was used. The authors created a dataset of 300 × 8 dimensions, and the Principal Component Analysis (PCA) technique was applied using MATLAB to detect static sign gestures. Using PCA, the authors achieved 90% accuracy. Another regional sign language recognition system was presented in ref. ^[28]. The authors made a prototype to translate Malaysian Sign Language. A smart data glove made of tilt sensors and a 3-axis accelerometer was developed for recognition. A microcontroller and a Bluetooth module were also included in this prototype to classify detected signs and transmit them to a smartphone. The microcontroller operated on template matching and succeeded in recognizing a few Malaysian Sign Language gestures. Overall system accuracy ranged from 78.33% to 95%. Flex sensor and accelerometer-based smart gloves can perform alphanumeric data classification. Using such a prototype, 26 alphabets and ten digits can be recognized using a template-matching algorithm ^[29]. Five flex sensors, one attached to each finger, produced an analog signal for a performed gesture, which was transferred to an Arduino Uno microcontroller. Including an accelerometer for hand motion detection, the authors obtained eight-valued data for a sign gesture. In ref. ^[30], the authors developed two-glove-based models. These models contained ten flex sensors attached to the fingers of both hands and a 9 degree of freedom (DoF) accelerometer for motion detection. The two-glove-based systems were tested on phonetic letters, including a, b, c, ch, and zh. With the help of a matching algorithm, the authors performed static sign recognition with approximately 88% accuracy.

An American Sign Language classification and recognition system based on probabilistic segmentation was presented in ref. ^[31]. This system was divided into two main modules. The first module performed segmentation based on a Bayesian Network (BN). Data obtained during this stage were used for training. The second module performed classification using a combination of a Support Vector Machine (SVM) classifier and a multilayer Conditional Random Field (CRF). This system produced 89% accuracy on average. The authors in ref. ^[32] brought some innovation to existing sign language recognition systems by combining data obtained from sensor gloves with data obtained from hand-tracking systems. The well-known Dempster-Shafer theory was applied to the fused data for evidence combination. This fused system achieved 96.2% recognition accuracy on 100 two-handed Arabic Sign Language (ArSL) signs. Hand motion and tilt sensor-based sign data were collected using a Cyber Glove ^[33]. The classification of 27 hand shapes based on Signing Exact English (SEE) was performed using Fisher’s linear discriminant embedded with a linear decision tree. Vector Quantization Principal Component Analysis (VQPCA) was used as a classification tool for sign language recognition. This system obtained 96.1% overall accuracy.
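To make the evidence-combination step concrete, the following sketch applies Dempster's rule of combination to two sources, for example a glove classifier and a hand tracker. It is only an illustration of the rule itself, not the fusion pipeline of ref. ^[32]. The sign labels, the mass values, and the `combine` function name are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two hypotheses.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Illustrative masses over two candidate signs (values are made up).
HELLO, THANKS = frozenset({"hello"}), frozenset({"thanks"})
EITHER = HELLO | THANKS
glove = {HELLO: 0.6, EITHER: 0.4}
tracker = {HELLO: 0.7, THANKS: 0.1, EITHER: 0.2}
fused = combine(glove, tracker)
```

Here both sources lean toward "hello", so the fused mass on that sign is higher than either source assigned alone, while the mass left on the undecided set shrinks.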

An Arabic Sign Language recognition-based deep learning framework focusing on a signer-independent isolated model was discussed in ref. ^[34]. The main focus of this research was on regional sign gestures. Among the wide variety of regional domains, these authors focused only on Arabic sign gestures and implemented deep learning-based approaches to achieve the desired results. Hand gesture recognition for posture classification was implemented in ref. ^[35]. The prototype was purely based on real-time hand gesture recognition. For implementation, an IMU-based data glove embedded with different sensors was used to achieve the desired results. Another advancement in the field of sensor-based gesture recognition was implemented by ref. ^[36]. A dual leap motion controller (LMC)-based prototype was designed to capture and identify data. Gaussian Mixture Model and Linear Discriminant-based approaches were implemented to achieve results. A case study based on regional data was implemented in ref. ^[37]. The authors focused on Pakistani Sign Language models and worked on Multiple Kernel Learning-based approaches. Classification of real-time gestures from signal-based sensor values was implemented in ref. ^[38]. The authors worked on wrist-worn-based real-time hand and surface postures. EMG and IMU-based sensors were embedded to obtain the desired values of sign postures. An armband EMG sensor-based approach was implemented by the authors in ref. ^[39]. The main focus was to classify finger language by utilizing ensemble-based artificial neural network learning. The sensor values helped the ANN to classify gestures accurately. A sign language interpretation-based smart glove was designed by the authors in ref. ^[40]. A sensor-fused data glove was used to recognize and classify SL postures. Another novel approach to capturing sign gestures was discussed in ref. ^[41]. The developers of the smart data glove named it SkinGest, as it completely grips the skin with no detachments. For capturing gesture and posture data, filmy stretchable strain sensors were used. Leap motion-based identification of sign gestures was implemented with the help of a modified LSTM model in ref. ^[42]. Continuous sign gestures were classified using an LSTM model to get the desired results. Another novel approach based on key frame sampling was implemented in ref. ^[43]. The authors also focused on skeletal features to utilize an attention-based sign language recognition network. A Turkish sign language dataset was processed using baseline methods in ref. ^[44]. Large-scale multimodal data were classified based on regional postures to achieve good recognition results. Similarly, the authors used Multimodal Spatiotemporal Networks to classify sign language postures in ref. ^[45]. Development of a low-cost model for translating sign gestures was targeted in ref. ^[46]. The main focus was the development of a smart wearable device at a very reasonable price.

2. Vision-Based Models

The authors in ref. ^[47] developed a model using vision-sensor-based techniques to extract temporal and spatial features from video sequences. The CNN algorithm was applied to the extracted features to identify the recognized activity. An American Sign Language dataset was used for feature extraction and activity recognition. An Intel RealSense camera was used to translate American Sign Language (ASL) gestures into text. The proposed system included an Intel RealSense camera-based setup and applied SVM and Neural Network (NN) algorithms to recognize sign language ^[48]. Due to the large set of classes, inter-class complexity was increased to a large extent. This issue was resolved using a Convolutional Neural Network (CNN)-based approach. Depth images were captured using a high-definition Kinect camera. The obtained images were processed using a CNN to obtain alphabets ^[49]. Real-time sign language was interpreted using a CNN to perform real-time sign detection. This approach did not rely on outdated or predefined image datasets; the authors implemented a real-time data analysis mechanism rather than the traditional approach of using predefined datasets in ref. ^[50]. In vision-sensor-based recognition, 20 alphabets with numbers were recognized using a Neural Network-based Hough Transform ^[51]. Because of the image dataset, a specific threshold value of 0.25 was used in the Canny edge detector to achieve efficiency. This system achieved 92.3% accuracy. Fifty samples of alphabets and numbers were recognized by an Indian sign language system using a vision-sensor-based technique ^[52]. A support vector machine (SVM)-based classifier with B-spline approximation was used, which achieved 91% accuracy on average. A hybrid pulse-coupled neural network (PCNN) embedded with a nondeterministic finite automaton (NFA) algorithm was used to identify image-based gesture data ^[53]. This prototype achieved 96% accuracy based on best-match selection.

Principal component analysis (PCA) along with local binary patterns (LBP) extracted Hidden Markov Model (HMM) features with 99.97% accuracy in ref. ^[54]. In ref. ^[55], hand segmentation based on skin color detection was used. For hand identification and tracking, a skin blob tracking system was used. This system achieved 97% accuracy on 30 recognition words. In ref. ^[56], Arabic Sign Language recognition was performed using various transformation techniques such as the Log-Gabor, Fourier, and Hartley transforms. The Hartley transform with Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) classifiers helped produce 98% accuracy. Combined orientation histogram and statistical (COHST) features along with wavelet feature techniques were used in ref. ^[57]. These techniques succeeded in recognizing static signs made for the numbers zero to nine in ASL. The neural network produced efficient results based on the feature values of COHST, wavelet, and histogram, with 98.17% accuracy. Static gesture recognition based on alphabets was performed using a neural network-based wavelet transform. This system achieved 94.06% accuracy in recognizing Persian sign language ^[58]. Manual signs were detected using the finger, palm, and place of articulation. The equipment arranged for manual signs extracted data from a video sequence and matched them with a 2D image of standard American Sign Language alphabets. The proposed setup resulted in accurate sign detection of alphabets ^[59].
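Skin-color-based hand segmentation, as mentioned for ref. ^[55], can be sketched with a simple per-pixel rule. The thresholds below are a widely quoted rule of thumb for RGB images in daylight and are assumed here purely for illustration; they are not the values used in ref. ^[55], and the function names are hypothetical.

```python
def is_skin(r, g, b):
    """Rule-of-thumb RGB skin test for daylight images (heuristic, assumed thresholds)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # enough color spread, not gray
        and abs(r - g) > 15                   # red clearly separated from green
        and r > g and r > b                   # red is the dominant channel
    )


def skin_mask(image):
    """Return a binary mask (rows of 0/1) for an image given as rows of (R, G, B) tuples."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]
```

Connected regions of the resulting mask would then be treated as candidate hand or face blobs for tracking. Fixed thresholds like these are sensitive to lighting and skin tone, so practical systems often calibrate them or work in another color space.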

Deep learning-based SLR models are also focused on vision-based approaches. The authors in ref. ^[60] focused on current deep learning-based techniques, trends, and issues in deep models for SL generation. Keeping in mind standard American Sign Language models, the authors in ref. ^[61] focused on the development of a deep image-based, user-independent approach. Their main work was based on PCANet features derived from depth analysis. Another edge computing-based thermal image detection system was presented by the authors in ref. ^[62]. They worked on a digit-based sign recognition model using deep learning approaches. Different computer vision-based techniques were applied for SLR tasks. A camera sensor-based prototype was used by the authors in ref. ^[63] to correctly identify sign postures. A convolutional neural network-based approach was implemented using video sequences in ref. ^[64]. A three-dimensional attention-based model was designed for a very large vocabulary to acquire data from video sequences and classify them using a 3D-CNN model.

2. Different Recognition Models

A lot of image processing and sensor-based techniques have been applied for sign language recognition. Recent studies have shown the latest framework for sign recognition with the advent of time. Detailed literature analysis and a deep understanding of sign language recognition categorized this process into different sub-sections. The further division was completely application-based depending on the type of sensors used in the formation of data gloves. So, these subdivisions were based on non-commercial prototypes for data acquisition, commercial data acquisition prototypes, Bi channel-based systems, and hybrid systems. Introducing non-commercial systems, these prototypes are self-made systems that use sensors to gather data. These sensor values are transmitted to another device, usually, any processor or controller, to understand transmitted sensor data and convert these data into their respective sign language format. Most of the sensor-based systems are non-commercial prototype-based systems discussed in the literature review. In non-commercial systems, most of the authors worked on finger bend detection regarding any sign made. So, a large variety of different solo sensors or combinations of different sensors were used to detect this finger bending. So, SLR models can be further divided into non-commercial prototype-based and framework-based prototypes.

1. Sensor-based models

2. Vision-based models

3. Non-commercial models for data glove

4. Commercial data glove models

5. Hybrid recognition models

6. Frameworks based recognition models

All details of recognition models are discussed in the paper referenced below.

3. Analysis

Sign language recognition is one of the emerging trends in today’s modern era. Much research has already been conducted, and currently, most researchers are working on this very domain. The focus of this article is to provide a brief analysis of all related work that has been done until now. For this purpose, a complete breakdown of all research activities was developed. Some authors worked on a general discussion about sign language. Most of their work was based on introduction and hypothesis to deal with sign language scenarios. There was no practical implementation of the proposed hypothesis; therefore, these authors lie in the general article category. A group of authors worked on developing systems that are able to recognize sign gestures. This group of authors is categorized in the developer domain. A good combination of sensors was used to develop a Sign language recognition system. Most of the authors used sensor-based gloves to recognize sign gestures. Another group of authors worked on existing sensor-based models and improved the accuracy and efficiency of the system. Their focus was to use a good combination of Machine Learning and Neural Network-based algorithms for accuracy achievement. Considering the author’s intentions, Machine Learning based algorithms were used by authors working on sensor-based models like sensory gloves. The authors used Neural Network-based models working on vision-sensor-based models for sign gesture recognition.

Considering the literature, some trends were also noted. Most of the authors in the sign language domain preferred to develop their own sensor-based models. The focus of authors following this trend was to develop their own cheap and efficient model that could detect and recognize gestures easily. These models were not made for commercial use. Another trend was the development of commercial gloves. These gloves contained a large number of sensors, e.g., 18–20 sensors, to detect sign gestures. Cost and efficiency were the main problems with commercial gloves. From the analysis of these research articles, the advantages and disadvantages of vision-sensor-based, sensor-based, and hybrid recognition models were listed. Additionally, the last trend in sign language articles, which includes this article, was the group of authors who worked on surveys and review articles on sign recognition. These authors provided a deep understanding of previous research work and detailed knowledge of hardware modules, sensor performance, efficiency analysis, and accuracy comparisons. The advantage of review and survey articles over general and development research articles was that they gather filtered knowledge in one article. Survey-based research articles proved to be a good help for learners and newcomers to the topic. Survey and review articles also provided researchers with upcoming challenges, trends, motivations, and future recommendations. A detailed comparative study helps us determine the uses, limitations, benefits, and advancements in the sign language domain.

The same authors implemented a boundary adaptive encoder using an attention-based method on a regional Chinese language dataset in ref. ^[65]. A novel key-frame-centered clip-based approach was implemented on the same Chinese Sign Language (CSL) dataset in ref. ^[66]. The regional Chinese sign dataset was classified using video sequences in the form of images. This novel vision-based approach produced challenging results in CSL. Another fingerspelling-based smart model was developed by the authors in ref. ^[67]. They focused on the development of an Indian quiz portfolio that was based on camera-oriented posture classification. The main point of identification was based on ASLR models using a vision-based approach. A vision sensor-based three-dimensional approach was implemented by the authors in ref. ^[68]. Three-dimensional sign language representation was classified with the help of spatial three-dimensional relational geometric features. These 3-D data were classified and recognized efficiently with an S3DRGF-based technique. Another vision-based technique focusing on color mapping-based classification and recognition was developed by the authors in ref. ^[69]. A CNN-based deep learning model was trained on the three-dimensional data of signs. Color texture coded-based joint angular displacement maps were classified efficiently with the help of a 3-D deep CNN model. Another advanced approach based on three-dimensional data manipulation for sign gestures was implemented in ref. ^[70]. The authors focused on classification and recognition of angular velocity maps with the help of the deep ResNet model. Connived Feature ResNet was deployed specifically to classify and recognize 3-D sign data. Another novel approach to classifying sign gestures from video sequences was implemented in ref. ^[71]. A BiLSTM-based three-dimensional residual neural network was used to capture video sequences and identify the posture data. A novel deep learning-based hand gesture recognition approach was implemented by the authors in ref. ^[72]. Image-based fine postures were captured and accurately recognized using a deep learning-based architecture. A virtual sign channel for visual communication was developed in ref. ^[73]. The authors’ main focus was to create a virtual communication channel for deaf-mute and hearing individuals. Another three-dimensional data representation for Indian sign language was developed in ref. ^[74]. The authors used an adaptive kernel-based motionlets-matching technique to classify gesture data. A video sequence and text embedding-based continuous sign language model was implemented in ref. ^[75]. Joint latent space-based data were processed using cross-modal alignment in a continuous sign language recognition model.

4. Conclusion

Developing an automatic machine-based translation system that transforms SL into speech and text or vice versa is particularly helpful in improving intercommunication. Progress in pattern recognition promises automated translation systems, but many complex problems need to be solved before they become a reality. Several aspects of SLR technology, particularly SLR that uses a glove sensor approach, have been previously explored and investigated by researchers. In this paper, an in-depth comparative analysis of different sensors in addressing and describing the challenges, benefits, and recommendations related to SLR was presented. The paper discussed the literature work of other researchers, mainly targeting the available glove types, the sensors used for capturing data, the techniques adopted for recognition purposes, the identification of the dataset in each article, and the specification of the processing unit and output devices of the systems. The comparative analysis would be helpful to explore and develop a translation system capable of interpreting different sign languages. Finally, datasets generated from these sensors can be used for tasks of classification and segmentation to assist in continuous gesture recognition.

3. Non-Commercial Models for Data Glove

In non-commercial systems, most authors worked on detecting finger bend for each sign made. A large variety of single sensors or combinations of different sensors was used to detect this finger bending. The authors in ref. ^[76] developed a non-commercial prototype for sign language recognition. This system was based entirely on the finger bending method. To detect finger bending, ten flex sensors were used, with a pair of sensors attached to two joints of each finger. To deal with the analogue flex data, a MPU-506A multiplexer was used. Selected data coming from the multiplexer were sent to the MSP430G2231 microcontroller. A Bluetooth module was used to transmit data to a smartphone. The captured data were then compared with the sign language database, and the matched result was converted into speech using a text-to-speech converter. The authors in ref. ^[77] also succeeded in developing a non-commercial sign language recognition prototype. This prototype included five ADXL335 accelerometer sensors connected to an ATmega2560 microcontroller system. Based on axis orientation, signs were identified and transmitted via a Bluetooth module to a mobile application for text-to-speech conversion. In ref. ^[78], a prototype was developed to help handicapped people. This prototype converted finger orientation into actions. For this purpose, five optical fiber sensors were used to collect finger bending data. These 8-bit data were used to train multilayered neural networks (NN) using MATLAB, and six hand gesture-based operations were performed using the backpropagation training algorithm. For data validation, a tenfold validation method was implemented on 800 sample records. Similarly, for sign language recognition, the authors made a non-commercial prototype based on five flex sensors ^[79]. The MSP430F149 microcontroller was used to classify incoming analog data. These data were compared with standard American Sign Language (ASL) data, and the output was displayed on a Liquid Crystal Display (LCD). Using text-to-speech methodology, the recognized letter was converted into speech and played through a good-quality speaker. The authors in ref. ^[80] developed the Sign-to-Letter (S2L) system. This system contained six flex sensors, a combination of discrete-valued components, and a microcontroller. Five flex sensors were attached to the five fingers of the hand, and one sensor was attached to the wrist of the same hand. This combination of finger and wrist bending sensors succeeded in converting signs into letters. The output letter was selected using programmed “IF-ELSE” conditions. A combination of Light Emitting Diode and Light Dependent Resistor (LED-LDR) sensors was used by ^[81]. An MSP430G2553 microcontroller was used to detect signs made by finger bending. Using this microcontroller, analog data were converted into digital form and then into ASCII codes related to 10 sign language alphabets. The converted data were transmitted wirelessly using a ZigBee module, and the recognized ASCII code was displayed on a computer screen. This code was also converted into speech.
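The comparison of digitized flex readings with a stored sign database, as described for the prototypes above, can be sketched as a nearest-template lookup. This is a minimal illustration and not the method of any cited work: the 10-bit ADC range, the template values, and the names `SIGN_TEMPLATES`, `normalize`, and `match_sign` are all assumptions made for the example.

```python
import math

# Hypothetical templates: one bend value per finger (thumb to little finger),
# 0.0 = straight, 1.0 = fully bent. Values are illustrative, not calibrated ASL data.
SIGN_TEMPLATES = {
    "A": (0.1, 0.9, 0.9, 0.9, 0.9),
    "B": (0.8, 0.0, 0.0, 0.0, 0.0),
    "L": (0.0, 0.0, 0.9, 0.9, 0.9),
}


def normalize(raw, lo=0, hi=1023):
    """Map raw ADC counts (assumed 10-bit) to bend values clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, (r - lo) / (hi - lo))) for r in raw)


def match_sign(bends, templates=SIGN_TEMPLATES, max_distance=0.5):
    """Return the label of the closest template, or None if none is close enough."""
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        dist = math.dist(bends, template)  # Euclidean distance over the five fingers
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None


reading = normalize((100, 920, 930, 900, 910))
letter = match_sign(reading)
```

A distance threshold such as `max_distance` lets the system reject hand shapes that match no stored sign instead of forcing a letter, which a chain of fixed IF-ELSE ranges would also have to handle explicitly.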

Another fingerspelling system was developed in ref. ^[82]. This prototype included four flex sensors and an accelerometer sensor. The main idea in this prototype design was to translate handmade signs into their corresponding American Sign Language (ASL) alphabets. For data acquisition, four deaf-mute individuals were recruited. This system succeeded in recognizing 21 gestures out of 26. A hand gesture recognition system was developed by measuring inertial measurements along with altitude values ^[83]. For data acquisition, six Inertial Measurement Units (IMUs) were used in this prototype. One IMU was attached to each finger, and one IMU was attached to the wrist. This experimental setup collected hand gesture data from the accelerometer, gyroscope, and magnetometer sensors. These values were refined using a Kalman filter and processed through the Linear Discriminant Analysis (LDA) algorithm. Overall, 85% accuracy was achieved by using this prototype in hand gesture recognition.
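The Kalman filtering step mentioned for the IMU-based prototype can be illustrated with a scalar filter applied to one sensor channel. This is a simplified sketch under stated assumptions: a constant-state model, hand-picked noise variances, and a hypothetical function name `kalman_smooth`. The filter design used in ref. ^[83] is not described in this text and may differ.

```python
def kalman_smooth(samples, process_var=1e-3, measurement_var=0.1):
    """Smooth a non-empty sequence of readings with a scalar Kalman filter.

    Assumes the underlying value is roughly constant between samples;
    both variances are illustrative tuning parameters.
    """
    estimate = samples[0]  # initial state guess
    error = 1.0            # initial estimate variance
    smoothed = []
    for z in samples:
        # Predict: the state is assumed unchanged, uncertainty grows slightly.
        error += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error / (error + measurement_var)
        estimate += gain * (z - estimate)
        error *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0]
filtered = kalman_smooth(noisy)
```

In a full pipeline, each accelerometer, gyroscope, and magnetometer axis would be filtered this way (or with a multivariate filter) before the smoothed features are passed to a classifier such as LDA.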

4. Commercial Data Glove Based Models

Besides following the traditional way of making cheap data gloves, some authors used a commercial data glove named “CyberGlove”. This commercial glove was specifically designed for deaf-mute people, and many affected communities and research centers used it for communication and research purposes. The CyberGlove was manufactured with 22 sensors embedded in the glove. The basic structure contained four sensors placed between the fingers and three sensors on each finger. Palm sensors and wrist bending sensors were also included. This smart, thin-layer, elastic fiber-based sensor glove had an approximate cost of $40,000 per pair. Using the CyberGlove, the authors in ref. ^[84] applied a combination of neural network-based algorithms to measure the accuracy and efficiency of the system. Finger orientation and hand motion projection were captured with a CyberGlove fitted with a 3D motion-tracker sensor. The analog signal data were passed to a pair of networks, a word recognition network and a velocity network. These networks worked on 60 American Sign Language (ASL) combinations and obtained accuracies of 92% and 95%, respectively. A posture recognition system based on a 3D hand posture model was developed in ref. ^[85]. A Java 3D-based model helped in the classification and segmentation of real-time input posture data. These data were compared with pre-recorded CyberGlove-based data with the help of an index tree algorithm. Another CyberGlove, combined with a 3D motion tracker named Flock of Birds, was used for sign language recognition. CyberGlove-based data containing bend, axis, motion, and hand orientation were fed into a multilayered neural network. The Levenberg-Marquardt backpropagation algorithm was used for segmentation and sign classification. This prototype achieved 90% accuracy in American Sign Language (ASL) recognition ^[86].

In the sensor-based sign language recognition domain, another advancement was made by introducing a commercial data glove from Fifth Dimension Technologies, commonly known as the 5DT data glove. This commercial glove was made in two variants, one with five fiber optic sensors and the other with fourteen. The manufacturer named this fiber optic smart data device ultra-motion. Internationally, this data glove cost approximately $995. In the five-sensor variant, one optical sensor is attached to each finger, and one sensor is attached for hand orientation detection. In the 14-sensor variant, two sensors are placed on each finger, and a sensor is placed between adjacent fingers to measure finger abduction. Two-axis measurement sensors are also attached to the glove to determine axis and orientation, including the pitch and roll of the hand. These 5DT smart data gloves were used for Japanese Sign Language recognition ^[87]. The main idea of developing this system was to automate the learning system. A 3D model for simulating signs was built on the 14-sensor 5DT data glove. This system highlights motion errors for beginners and helps them understand hand motion completely via the 3D model. To facilitate communication for deaf and mute people, another system was built using 5DT data gloves with five embedded sensors. Data obtained with the ultra-motion glove were used for training in the MATLAB simulator. A multilayered neural network with five inputs and 26 outputs was used as the training model for sign language recognition. A series of NN training algorithms, namely resilient propagation, backpropagation, quick propagation, and Manhattan propagation, as well as scaled conjugate gradient, was used to train the model ^[88].
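The shape of a multilayered network with five inputs (one per glove sensor) and 26 outputs (one per letter) can be sketched as a single forward pass. This is only an architectural illustration: the hidden layer size of 10, the tanh and softmax activations, and the random, untrained weights are assumptions, and none of the training algorithms listed above is implemented here.

```python
import math
import random
import string


def make_layer(n_in, n_out, rng):
    """Create random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer):
    """Weighted sum plus bias for every unit in the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(bends, hidden_layer, output_layer):
    """Five sensor values in, 26 letter probabilities out."""
    hidden = [math.tanh(v) for v in dense(bends, hidden_layer)]
    return softmax(dense(hidden, output_layer))


rng = random.Random(0)                  # fixed seed so the sketch is repeatable
hidden_layer = make_layer(5, 10, rng)   # hidden size 10 is an arbitrary choice
output_layer = make_layer(10, 26, rng)
probs = forward([0.1, 0.9, 0.9, 0.9, 0.9], hidden_layer, output_layer)
predicted = string.ascii_uppercase[probs.index(max(probs))]
```

With untrained weights the predicted letter is meaningless; training would adjust `hidden_layer` and `output_layer` so that the highest probability falls on the signed letter.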

Another advancement in sign language recognition was seen in ref. ^[89]. The authors used a DG5 VHand data glove for data acquisition. The internal structure of the DG5 VHand data glove contained five flex (bending) sensors, one three-axis accelerometer, and three contact sensors. This data glove was capable of transmitting acquired data wirelessly, and the overall system was made remotely functional by using a battery. The DG5 VHand commercial data glove was used for American and Arabic Sign Language recognition systems. The authors focused on Arabic Sign Language, whereas this data glove had previously been used for American Sign Language. The left-hand glove alone cost $750. A pair of DG5 VHand data gloves was used for Arabic Sign Language recognition. The two-glove model succeeded in acquiring data for 40 sentences. This dataset was classified using a modified K-Nearest Neighbor (MKNN) algorithm, and the overall system achieved 98.9% accuracy. A hand gesture cannot be fully recognized without knowing hand orientation and posture. Therefore, the traditional system was advanced by fusing electromyography and inertial sensors within the system ^[90]. Using a combination of an accelerometer (ACC) sensor with electromyography (EMG), the authors achieved multiple degrees of freedom for hand movement. This setup was used for Chinese Sign Language recognition. The EMG sensors were attached at five muscle points on the forearm, and an MMA7361 3-axis accelerometer was attached at the wrist. Multi-layered Hidden Markov Model and decision tree algorithms were used for recognition, which achieved 72.5% accuracy.
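Nearest-neighbor classification of glove feature vectors can be sketched as follows. Note that this is the plain K-Nearest Neighbor rule, swapped in for illustration; the modification used in the MKNN algorithm of ref. ^[89] is not described in this text and is not reproduced. The two-dimensional feature vectors and sentence labels are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy training set: hypothetical 2-D features and labels.
train = [
    ((0.0, 0.0), "hello"),
    ((0.1, 0.1), "hello"),
    ((1.0, 1.0), "thanks"),
    ((0.9, 1.0), "thanks"),
    ((1.1, 0.9), "thanks"),
]
label = knn_predict(train, (0.05, 0.0), k=3)
```

In practice each feature vector would hold the flex, accelerometer, and contact sensor readings of both gloves, possibly summarized over the duration of a sentence.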

The same setup of accelerometer and electromyography was used for German Sign Language. The authors used a single EMG sensor with a single ACC sensor to recognize a small database of German vocabulary. Training was performed on seven words with seventy samples for each word. K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers were used. The system achieved an average accuracy of 88.75%, and 99.82% in the subject-dependent case ^[91]. A similar hybrid approach of accelerometer and electromyography was used for a Greek Sign Language recognition system. The experimental setup consisted of five-channel electromyography and an accelerometer sensor. The experiment was conducted on signers with the intrinsic entropy mode. Experiments repeated ten times with three native signers produced the training data, and the system was trained using the intrinsic entropy mode in MATLAB. The system’s overall accuracy was 93% collectively (without the personal effect of the native signers involved in data collection) ^[92].

5. Hybrid Recognition Models

A vision-sensor-based approach was also adopted in sign language recognition. The previously used combination of electromyography with an accelerometer was replaced with a vision-sensor-based hybrid approach. In the hybrid approach, the authors used a variety of accelerometers with vision-sensor cameras. The purpose of a hybrid system was to enhance data acquisition and accuracy. The vision-sensor-based hybrid prototype contained red, green, and blue (RGB) color model cameras, depth sensors, and accelerometer-based axis and orientation sensors. This combination was used for gesture identification. The experimental setup included seven IMU accelerometer sensors attached to the arm, wrist, and fingers. For data acquisition, five signers from different age groups each performed forty gestures ten times. The Parallel Hidden Markov Model (PaHMM) achieved 99.75% accuracy ^[93]. Another combination of an accelerometer-based glove and a camera sensor was used for American Sign Language recognition. The experimental setup contained a camera attached to a hat for detecting correctly made signs. Nine accelerometer sensors were used for gesture formation: five on the fingers (one per finger), two on the shoulder and arm to detect arm and shoulder movement, and two on the back of the palm for hand orientation measurement. This setup was tested on 665 gestures using the Hidden Markov Model (HMM) and produced a per-sign accuracy of 94% ^[94].
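HMM-based recognition typically keeps one model per gesture and picks the model that assigns the highest likelihood to the observed sequence. The sketch below shows that idea with the forward algorithm on a discrete HMM. It is a plain HMM and not the parallel variant (PaHMM) of ref. ^[93], and the two toy models, their probabilities, and the gesture names are invented for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the name of the model with the highest sequence likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Two hypothetical left-to-right gesture models over quantized symbols 0 and 1.
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "wave": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),
    "push": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}
result = classify([0, 0, 1, 1], models)
```

Here the symbols stand for quantized sensor or image features; real systems would train the transition and emission probabilities from recorded gestures rather than set them by hand.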

6. Framework-Based Recognition Models

Most of the articles ^{[95][96][97][98]} followed a predefined framework for sign language recognition. The main objective of using the same framework was to enhance data accuracy and dataset efficiency. The authors in ref. ^[99] successfully developed a sign language system and implemented it using different classification and recognition algorithms. The authors in ref. ^[100] succeeded in creating a Vietnamese Sign Language framework that worked wirelessly. A two-handed wireless smart data glove was designed and developed using bend and orientation measurement. The experimental setup included MEMS accelerometer sensors attached in the same way as in the AcceleGlove, with one additional sensor attached to the palm of each hand for orientation measurement. Wireless communication was made feasible by a Bluetooth module connected to a cellphone. The user-generated sign was compared with the standard sign language database, and the matched result was displayed on the cellphone screen. Finally, Google's text-to-speech translator was used to convert the recognized sign alphabet into speech. This sign language framework achieved reasonable accuracy. Similarly, the authors in ref. ^[101] developed an Arabic Sign Language recognition system. The main purpose of developing another framework for static sign analysis was to minimize the number of sensors on data gloves. This experiment was simulated in the Proteus software. The two-handed glove system contained six flex sensors, four contact sensors, one gyroscope, and one accelerometer sensor on each hand.

Another algorithm-based sign language recognition framework was designed in ref. ^[102]. Stream segmentation-based sign descriptors and a text auto-correction algorithm were used. The system also provided a software architecture of descriptors for hand gesture recognition. The Sign Language-based Interpolator, which converted text into speech, was also designed in ref. ^[103]. The overall system framework contained four basic modules: the smart data glove, training algorithms for the input sign dataset, a wirelessly visible sign application, and a sign language database for matching the input sign with the standard repository. A very simple resistor-based framework was developed and implemented by ref. ^[104]. The authors used ten resistors and detected finger movement only. This was a medical application used only for finger flexion and extension, and it served as a very simple, low-cost, efficient, reliable, and low-power trigger. The data glove containing the resistor-based framework was directly connected to a microcontroller, which transmitted the captured data to a computer for finger movement analysis. Another simple gesture recognition framework was developed by ref. ^[105]. The smart spelling data glove consisted of three bending sensors attached to three fingers. The authors worked on only five gestures, including thumbs-up and rest. Input gesture data were fed into the microcontroller for recognition, and the analyzed gestures were combined in a row to form meaningful data before being transmitted to the receiver. A detailed review of the frameworks based on Chinese Sign Language was presented in ref. ^[106]. All the technical approaches related to regional Chinese Sign Language recognition and classification mechanisms were discussed in detail. Another detailed review of wearable frameworks and prototypes related to sign gesture classification was presented in ref. ^[107]. The authors covered as many frameworks as possible that had previously been used in the field. This is also a review article with a good depth of coverage of technologies and frameworks in SLR.
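Text auto-correction of the kind mentioned for ref. ^[102] is often built on edit distance: a fingerspelled word with a missed or misread letter is replaced by the closest vocabulary entry. The sketch below uses Levenshtein distance as an assumed stand-in, since the actual algorithm of ref. ^[102] is not described in this text; the vocabulary and the `max_edits` threshold are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def autocorrect(word, vocabulary, max_edits=2):
    """Return the closest vocabulary word, or the input if nothing is close."""
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_edits else word


vocabulary = ["HELLO", "WORLD", "THANKS"]
corrected = autocorrect("HELO", vocabulary)
```

Leaving far-off inputs unchanged avoids replacing names or unknown words with unrelated vocabulary entries.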