Visual Tracking Related to Age or Gender Information

Visual tracking of multiple targets, also referred to as multiple object tracking (MOT) since the target can be any moving object or entity, is a well-investigated computer vision task. The goal is to detect one or more targets in a time-varying scene and then obtain their trajectories by following their tracklets over a given video sequence. This is accomplished by associating newly detected instances with existing ones. Typically, the association part involves a prediction task whose aim is to find the most likely correspondence among detections of consecutive frames for a given target. When the targets of interest are people, the detections produced by this procedure are usually post-processed to extract useful information related, for instance, to their age or gender.

  • multi-object tracking
  • age estimation
  • gender classification
  • multi-attribute classification
  • images

1. Multi-Object Tracking

One taxonomy of multi-object tracking methods concerns the way an MOT system processes video sequences, namely, online versus offline. Online methods operate on the video on a frame-by-frame basis and thus perform tracking using only information up to the current frame, a design principle that makes them suitable for real-time and streaming applications. In contrast, offline methods have access to the entire video sequence, including future frames, as they process videos in batches. While the latter can handle the data-association problem more effectively by exploiting future information, their applicability is limited in scenarios with real-time and scalability requirements.
Methods can also be separated by the strategy they follow to handle the different aspects of the MOT task. Most approaches to multi-object tracking follow the tracking-by-detection framework, which formulates tracking as a two-stage workflow: a detection step that localizes targets in an image, followed by a data-association step whose goal is to match the detections with existing, corresponding trajectories, generate new trajectories when a new target appears in the scene, or discard old ones when a target is no longer visible. The second step constitutes the actual tracking part and typically consists of several subtasks, such as motion forecasting, embedding extraction (image representation), and data association, among others.
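To make the association step concrete, the following is a minimal Python sketch of one plausible realization, using the Hungarian algorithm over an IoU-based cost; the cost definition, threshold, and function names are illustrative assumptions, not a specific published tracker.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(iou_matrix, iou_threshold=0.3):
    """iou_matrix[t, d]: overlap between existing track t and new detection d."""
    cost = 1.0 - iou_matrix                           # high overlap -> low cost
    rows, cols = linear_sum_assignment(cost)          # optimal one-to-one pairs
    matched = [(t, d) for t, d in zip(rows, cols)
               if iou_matrix[t, d] >= iou_threshold]  # keep only confident pairs
    matched_t = {t for t, _ in matched}
    matched_d = {d for _, d in matched}
    lost = [t for t in range(iou_matrix.shape[0]) if t not in matched_t]
    new = [d for d in range(iou_matrix.shape[1]) if d not in matched_d]
    return matched, lost, new  # update trajectories / discard / spawn new ones

# toy example: two existing tracks, three detections in the next frame
matched, lost, new = associate(np.array([[0.80, 0.10, 0.00],
                                         [0.05, 0.60, 0.10]]))
```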
Most approaches that follow this strategy initially utilized two separate models, usually deep-learning-based architectures, owing to their success in both detection and feature extraction. A popular choice is convolutional neural networks (CNNs) [1][2][3], although, more recently, graph neural networks [4][5] and transformers [6][7][8] have also been used. The descriptive capabilities of deep networks have enabled these methods to achieve remarkable results by continuously improving upon either of the two models, as both are important to the final tracking performance. However, using two computationally intensive models entails some drawbacks. Most importantly, the computational overhead of running both models prohibits their application in real-time scenarios because of slow running speeds. In addition, since two resource-intensive neural networks are typically used, two separate training processes are required, and a significant amount of largely redundant computation is performed that could otherwise be avoided.
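As an illustration of how the second (embedding) model's output is typically consumed in such two-model pipelines, the sketch below compares appearance embeddings by cosine distance and blends the result with a spatial IoU cue before assignment; the fusion weight and function names are assumptions for illustration only.

```python
import numpy as np

def cosine_cost(track_embs, det_embs):
    """Pairwise cosine distance between rows of two embedding matrices."""
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    return 1.0 - t @ d.T  # shape: (num_tracks, num_detections)

def fused_cost(appearance_cost, iou_matrix, weight=0.7):
    """Blend the appearance cue with a spatial (IoU) cue before assignment."""
    return weight * appearance_cost + (1.0 - weight) * (1.0 - iou_matrix)

# embeddings would come from the separate re-identification network
cost = fused_cost(cosine_cost(np.random.rand(2, 128), np.random.rand(3, 128)),
                  iou_matrix=np.random.rand(2, 3))
```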
To tackle some of these shortcomings, an approach has emerged that utilizes a single model to perform both detection and target tracking, avoiding some of the aforementioned issues. Such methods jointly train a unified network to handle both tasks and are known as joint-detection-and-tracking methods [9][10][11][12]. Apart from applications in MOT, this design has also been applied to human pose estimation [13]. As in the previous strategy, CNNs remain the most prevalent models for this task, owing to research advances over the last few years that enable them to handle both steps with steadily improving accuracy and running speeds.
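The essence of the joint design can be sketched as a shared backbone feeding two task heads, so that image features are computed only once; the PyTorch sketch below uses illustrative layer sizes and an output format that is our assumption, not any particular published architecture.

```python
import torch
import torch.nn as nn

class JointTracker(nn.Module):
    """One shared backbone feeding a detection head and an embedding head."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(  # shared features, computed once
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.det_head = nn.Conv2d(64, 5, 1)        # objectness + box offsets
        self.emb_head = nn.Conv2d(64, emb_dim, 1)  # identity embedding per cell

    def forward(self, images):
        feats = self.backbone(images)
        return self.det_head(feats), self.emb_head(feats)

det_map, emb_map = JointTracker()(torch.randn(1, 3, 256, 256))
```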
More recently, target association methods that rely solely on detector outputs to associate all detected bounding boxes have been proposed [14][15]. These methods match new detections with existing tracklets without the embedding extraction step that was typically handled by a deep learning network (e.g., [2]). Consequently, the use of a single computationally intensive module in the overall pipeline reduces the system's required resources and latency, rendering such methods accurate trackers with real-time capabilities, depending on detector performance. This design also benefits from a simplified training procedure: the only trainable component is the detector, and without an embedding extraction network a second dataset is no longer necessary, reducing the amount of training data and enabling faster deployment.
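A distinctive ingredient of this family, following the idea in [14], is to associate detections in two passes split by confidence: strong detections are matched first, and weak ones are used only to keep existing tracks alive. The sketch below shows that control flow only; the `match` routine (e.g., the Hungarian step sketched earlier), the detection format, and the threshold are placeholders.

```python
def associate_by_confidence(tracks, detections, match, high_conf=0.6):
    """Two-pass association: strong detections first, weak ones as fallback."""
    high = [d for d in detections if d["score"] >= high_conf]
    low = [d for d in detections if d["score"] < high_conf]

    matched, unmatched_tracks, new_candidates = match(tracks, high)
    # second pass: keep existing tracks alive using the weaker detections,
    # but do not spawn new tracklets from them
    matched_low, still_unmatched, _ = match(unmatched_tracks, low)
    return matched + matched_low, still_unmatched, new_candidates
```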

2. Age Estimation

The task of human age estimation has been studied for decades. Age estimation techniques are often based on shape- and texture-based cues from faces, followed by traditional classification or regression methods. Earlier approaches utilized classic computer vision methods for feature extraction, such as Gabor filters [16][17], histograms of oriented gradients (HoG) [18], or local binary patterns (LBP) [19][20].
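As a concrete example of this classic pipeline, the sketch below pairs HoG features with a support vector regressor; the data arrays are random placeholders and the parameter choices are illustrative, not those of any cited work.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVR

def hog_features(faces):
    """Stack the HoG descriptor of each grayscale face crop."""
    return np.array([hog(f, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for f in faces])

faces = np.random.rand(20, 64, 64)    # placeholder for real face crops
ages = np.random.uniform(10, 70, 20)  # placeholder age labels

model = SVR(kernel="rbf").fit(hog_features(faces), ages)
predicted_age = model.predict(hog_features(faces[:1]))
```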
Currently, with advances in machine learning research as well as hardware capabilities, the predominant approach is to apply deep learning methods to the feature extraction problem. CNNs have been widely adopted as capable feature extractors that obtain powerful representations of the input data. For instance, the works presented in [21][22][23][24] utilized convolution-based networks and structures, whereas Pei and co-workers [25] proposed an end-to-end architecture that combines CNNs with recurrent neural networks (RNNs). Duan et al. [26] combined a CNN with an extreme learning machine (ELM) [27], a feed-forward neural network that achieves very fast training speeds and can outperform SVMs in many applications, while the authors of [28][29][30] explored more compact, low-resource convolutional models. Other deep learning methods, such as auto-encoders [21] and deep differentiable random forests [31], have also been adopted.
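The fast training of the ELM mentioned above comes from its scheme: the hidden-layer weights are random and fixed, and only the output weights are solved in closed form by least squares. A minimal sketch under those assumptions, with random placeholders standing in for CNN features:

```python
import numpy as np

class ELM:
    """Random fixed hidden layer; output weights solved by least squares."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_in, n_hidden))  # random, never trained
        self.b = rng.normal(size=n_hidden)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # closed-form least-squares solution: no gradient descent needed
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

features = np.random.rand(100, 128)    # e.g., CNN features of face crops
ages = np.random.uniform(10, 70, 100)  # placeholder labels
predictions = ELM(128, 256).fit(features, ages).predict(features[:5])
```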
Most of these works use face images because faces provide more descriptive information about age: as people get older, certain common changes in facial characteristics can be observed, leading to better representations and higher age-estimation accuracy. Additionally, the majority of available corpora in the literature comprise media that depict faces exclusively, or at least contain face images, which are utilized after a detection and cropping step, discarding any other information.
Using images of the full body for this task has largely been an unexplored research topic, partly because of challenges in associating visual information from the body with apparent age, but also due to the lack of large publicly available datasets. Consequently, very few works have been proposed that use whole-body images to estimate just the age of a person; earlier examples include [20][32][33], in which hand-crafted features were used. More recently, CNNs have been applied to the problem [23], obtaining accurate results and demonstrating that full-body images provide adequate visual information and can be successfully used for this problem.
A subcategory of this problem is apparent age estimation, in which the actual age of a person is not known beforehand; the ground truth is instead based on the subjective estimates of the annotator(s). In these methods, evaluation is performed on apparent ground-truth data [34]. Due to the nature of real-world data, apparent age estimation is a well-suited subclass for real-time applications where the visual perception of age plays an important role.
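Because apparent-age labels come from multiple annotators, evaluation typically weights a prediction's error by annotator agreement. The sketch below reflects our reading of the Gaussian-weighted error used in the ChaLearn benchmark [34]; the exact formula should be checked against that reference.

```python
import numpy as np

def apparent_age_error(prediction, annotator_votes):
    """Error shrinks when annotators disagree (large sigma) and grows
    sharply when a confident consensus (small sigma) is missed."""
    mu, sigma = np.mean(annotator_votes), np.std(annotator_votes)
    return 1.0 - np.exp(-((prediction - mu) ** 2) / (2.0 * sigma ** 2))

print(apparent_age_error(31.0, [28, 30, 33, 35]))  # small error: near consensus
```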

3. Gender Classification

The task of classifying the gender of people who appear in images is similar in nature to that of estimating their age. Over the last decades, a number of works focusing solely on this task have been proposed. Conventional methods rely on shallow, hand-crafted features, such as histograms of oriented gradients [35][36] or local binary patterns [37][38][39], for feature extraction, and on support vector machines for classification [40][41]; such approaches remain popular and widely used.
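To illustrate the conventional pipeline, the sketch below computes local binary pattern histograms and trains an SVM on them; the data are random placeholders and the parameters are illustrative assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(image, points=8, radius=1):
    """Histogram of uniform LBP codes, a classic texture descriptor."""
    codes = local_binary_pattern(image, points, radius, method="uniform")
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2),
                           density=True)
    return hist

faces = (np.random.rand(20, 64, 64) * 255).astype(np.uint8)  # placeholder crops
genders = np.random.randint(0, 2, 20)                        # placeholder labels

features = np.array([lbp_histogram(face) for face in faces])
classifier = SVC(kernel="rbf").fit(features, genders)
```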
As with most image processing and computer vision problems, CNNs have also been adopted for gender classification, usually to obtain robust representations [42]. For example, Aslam and colleagues [43] proposed wavelet-based convolutional neural networks for gender classification, while isolated facial features and foggy faces are used as CNN inputs in [44][45]. Ref. [46] provides a comparison of traditional and deep-learned features for gender recognition from faces in the wild, and [47] explored several popular convolutional architectures from other tasks for identifying the gender of people wearing masks.
Since images of the face contain more gender-relevant information than the full body, they lead to better accuracy, and therefore most methods proposed for this task utilize datasets containing face images. This is also a factor contributing to the lack of publicly available full-body datasets. In contrast to age estimation, using full-body images for this task has received some attention [35][36][40], but it still remains an open area of research. Some methods deviate from the standard approach of using two-dimensional images and instead apply three-dimensional data for gender recognition, alleviating some difficulties present in 2D data [48][49].
A different avenue of research for this problem is the combination of different modalities, taking advantage of features from different sources to improve performance. More specifically, multi-modal data of the body, such as depth [50] or thermal images [51][52][53], have been explored as auxiliary inputs to classification systems, improving performance and helping to overcome the challenges that arise when only RGB images of the body are available.
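A common way to exploit such auxiliary inputs is late fusion: each modality gets its own feature branch, and the branch outputs are concatenated before the classifier. The PyTorch sketch below uses illustrative branch architectures and does not mirror any specific cited system.

```python
import torch
import torch.nn as nn

class FusionGenderNet(nn.Module):
    """RGB and thermal branches fused by concatenation (late fusion)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(nn.Conv2d(in_ch, 16, 3, 2, 1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(16, feat_dim), nn.ReLU())
        self.rgb = branch(3)      # visible-light branch
        self.thermal = branch(1)  # thermal branch
        self.classifier = nn.Linear(2 * feat_dim, 2)  # fused decision

    def forward(self, rgb, thermal):
        fused = torch.cat([self.rgb(rgb), self.thermal(thermal)], dim=1)
        return self.classifier(fused)

logits = FusionGenderNet()(torch.randn(1, 3, 128, 64),
                           torch.randn(1, 1, 128, 64))
```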
Apart from the aforementioned approaches, a few works have focused on gait [54][55][56] as an indicator of gender. Gait-based methods rely on information accrued from a person's gait, i.e., the change of pose across consecutive frames. They typically assume a controlled environment where multiple views of the subjects are available so that the change in pose can be determined [57]. As a consequence, this limitation prevents gait-based methods from being employed for practical consumer demographic estimation.

4. Related Age and Gender Multi-Attribute Classification Methods

Both the age and the gender of a person can be estimated from face images with great accuracy, and therefore several works have attempted to solve both tasks jointly. Due to the challenges of using body images discussed previously, as well as dataset availability, the data used by these works strongly favor facial images. One of the earliest methods can be found in [58][59], where classic image processing techniques are employed to extract information based on wrinkle textures and colors. More recently, Eidinger et al. [60] proposed an SVM-based approach for age and gender classification from face images in the wild.
With advances in deep learning, various CNNs have been adopted for predicting age along with gender, replacing older methods; they typically serve either as feature extractors within larger systems or as end-to-end models that also handle classification. For example, the works presented in [61][62][63][64][65][66] used only convolution-based architectures to tackle both problems with face images as inputs, whereas Uricár and co-workers [67] combined a CNN feature extractor with an SVM classifier. In a similar fashion, Duan et al. [68] developed a hybrid technique that utilizes CNNs for feature extraction, while classification is handled by an extreme learning machine (ELM) for faster training and more accurate predictions. Another hybrid method, which fuses the decisions of non-convolutional neural networks and CNNs for a final prediction, is presented in [69]. Lately, owing to their success in various tasks, vision transformers have also been explored for age and gender classification [70].
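A typical end-to-end realization of the joint task is a shared backbone with one head per attribute, trained with a summed loss; the PyTorch sketch below makes that structure explicit, with illustrative layer sizes and an unweighted loss sum as assumptions.

```python
import torch
import torch.nn as nn

class AgeGenderNet(nn.Module):
    """Shared CNN features with an age (regression) and a gender head."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.age_head = nn.Linear(64, 1)     # scalar age regression
        self.gender_head = nn.Linear(64, 2)  # two-class logits

    def forward(self, x):
        feats = self.backbone(x)
        return self.age_head(feats).squeeze(1), self.gender_head(feats)

model = AgeGenderNet()
age, gender_logits = model(torch.randn(4, 3, 128, 128))
loss = nn.functional.mse_loss(age, torch.rand(4) * 60) \
     + nn.functional.cross_entropy(gender_logits, torch.randint(0, 2, (4,)))
```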
Using full-body images is a much rarer approach, and in this case most works that classify age as well as gender do so as part of a multi-attribute classification problem, where the goal is to predict a larger set of attributes. Analogous to gender-only estimation, gait-based methods have also been developed for the combined task, featuring multiple views of a person's entire body [71], operating on a single image in real time [72], or employing data from wearable sensors [73]. However, such approaches often assume a controlled monitoring environment for the subjects of interest and are not readily applicable to real-time consumer tracking.

This entry is adapted from the peer-reviewed paper 10.3390/s23239510

References

  1. Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple Online and Realtime Tracking. In Proceedings of the 2016 IEEE 23rd International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468.
  2. Wojke, N.; Bewley, A.; Paulus, D. Simple Online and Realtime Tracking With a Deep Association Metric. In Proceedings of the 2017 IEEE 24th International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649.
  3. Wan, X.; Wang, J.; Kong, Z.; Zhao, Q.; Deng, S. Multi-Object Tracking Using Online Metric Learning with Long Short-Term Memory. In Proceedings of the 2018 IEEE 25th International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 788–792.
  4. Liu, Q.; Chu, Q.; Liu, B.; Yu, N. GSM: Graph Similarity Model for Multi-Object Tracking. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan, 7–15 January 2021; pp. 530–536.
  5. Li, J.; Gao, X.; Jiang, T. Graph Networks for Multiple Object Tracking. In Proceedings of the 2020 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 2–5 March 2020; pp. 719–728.
  6. Chu, P.; Wang, J.; You, Q.; Ling, H.; Liu, Z. TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking. arXiv 2021, arXiv:2104.00194.
  7. Zeng, F.; Dong, B.; Zhang, Y.; Wang, T.; Zhang, X.; Wei, Y. MOTR: End-to-End Multiple-Object Tracking with Transformer. In Proceedings of the 17th European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 659–675.
  8. Tsai, C.Y.; Shen, G.Y.; Nisar, H. Swin-JDE: Joint Detection and Embedding Multi-Object Tracking in Crowded Scenes Based on Swin-Transformer. Eng. Appl. Artif. Intell. 2023, 119, 105770.
  9. Wang, Z.; Zheng, L.; Liu, Y.; Li, Y.; Wang, S. Towards Real-Time Multi-Object Tracking. In Proceedings of the 16th European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 107–122.
  10. Peng, J.; Wang, C.; Wan, F.; Wu, Y.; Wang, Y.; Tai, Y.; Wang, C.; Li, J.; Huang, F.; Fu, Y. Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking. In Proceedings of the 16th European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 145–161.
  11. Pang, B.; Li, Y.; Zhang, Y.; Li, M.; Lu, C. TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model. In Proceedings of the 2020 IEEE/CVF 33rd Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6308–6318.
  12. Zhang, Y.; Wang, C.; Wang, X.; Zeng, W.; Liu, W. FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking. Int. J. Comput. Vis. 2021, 129, 3069–3087.
  13. Zhang, Y.; Wang, C.; Wang, X.; Liu, W.; Zeng, W. VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2613–2626.
  14. Zhang, Y.; Sun, P.; Jiang, Y.; Yu, D.; Weng, F.; Yuan, Z.; Luo, P.; Liu, W.; Wang, X. ByteTrack: Multi-object Tracking by Associating Every Detection Box. In Proceedings of the 17th European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 1–21.
  15. Cao, J.; Weng, X.; Khirodkar, R.; Pang, J.; Kitani, K. Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking. arXiv 2022, arXiv:2203.14360.
  16. Gao, F.; Ai, H. Face Age Classification on Consumer Images with Gabor Feature and Fuzzy LDA Method. In Proceedings of the Third International Conference on Biometrics (ICB), Alghero, Italy, 2–5 June 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 132–141.
  17. Guo, G.; Mu, G.; Fu, Y.; Huang, T.S. Human age estimation using bio-inspired features. In Proceedings of the 2009 IEEE 22nd Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 112–119.
  18. Hajizadeh, M.A.; Ebrahimnezhad, H. Classification of age groups from facial image using histograms of oriented gradients. In Proceedings of the 2011 7th Iranian Conference on Machine Vision and Image Processing, Tehran, Iran, 16–17 November 2011; pp. 1–5.
  19. Gunay, A.; Nabiyev, V.V. Automatic age classification with LBP. In Proceedings of the 2008 23rd International Symposium on Computer and Information Sciences (ISCIS), Istanbul, Turkey, 27–29 October 2008; pp. 1–4.
  20. Ge, Y.; Lu, J.; Fan, W.; Yang, D. Age estimation from human body images. In Proceedings of the 2013 IEEE 38th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, Canada, 26–31 May 2013; pp. 2337–2341.
  21. Zaghbani, S.; Boujneh, N.; Bouhlel, M.S. Age estimation using deep learning. Comput. Electr. Eng. 2018, 68, 337–347.
  22. Ranjan, R.; Zhou, S.; Chen, J.C.; Kumar, A.; Alavi, A.; Patel, V.M.; Chellappa, R. Unconstrained Age Estimation with Deep Convolutional Neural Networks. In Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 7–13 December 2015; pp. 109–117.
  23. Yuan, B.; Wu, A.; Zheng, W.S. Does A Body Image Tell Age? In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 2142–2147.
  24. Xie, J.C.; Pun, C.M. Deep and Ordinal Ensemble Learning for Human Age Estimation From Facial Images. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2361–2374.
  25. Pei, W.; Dibeklioğlu, H.; Baltrušaitis, T.; Tax, D.M. Attended End-to-End Architecture for Age Estimation From Facial Expression Videos. IEEE Trans. Image Process. 2019, 29, 1972–1984.
  26. Duan, M.; Li, K.; Li, K. An Ensemble CNN2ELM for Age Estimation. IEEE Trans. Inf. Forensics Secur. 2017, 13, 758–772.
  27. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
  28. Yang, T.Y.; Huang, Y.H.; Lin, Y.Y.; Hsiu, P.C.; Chuang, Y.Y. SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 13–19 July 2018; Volume 5, pp. 1078–1084.
  29. Zhang, C.; Liu, S.; Xu, X.; Zhu, C. C3AE: Exploring the Limits of Compact Model for Age Estimation. In Proceedings of the 2019 IEEE/CVF 32nd Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12587–12596.
  30. Deng, Y.; Teng, S.; Fei, L.; Zhang, W.; Rida, I. A Multifeature Learning and Fusion Network for Facial Age Estimation. Sensors 2021, 21, 4597.
  31. Shen, W.; Guo, Y.; Wang, Y.; Zhao, K.; Wang, B.; Yuille, A. Deep Differentiable Random Forests for Age Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 404–419.
  32. Ge, Y.; Lu, J.; Feng, X.; Yang, D. Body-based human age estimation at a distance. In Proceedings of the 2013 IEEE 14th International Conference on Multimedia and Expo Workshops (ICMEW), San Jose, CA, USA, 15–19 July 2013; pp. 1–4.
  33. Wu, Q.; Guo, G. Age classification in human body images. J. Electron. Imaging 2013, 22, 033024.
  34. Escalera, S.; Fabian, J.; Pardo, P.; Baró, X.; Gonzalez, J.; Escalante, H.J.; Misevic, D.; Steiner, U.; Guyon, I. ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results. In Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 7–13 December 2015; pp. 1–9.
  35. Cao, L.; Dikmen, M.; Fu, Y.; Huang, T.S. Gender recognition from body. In Proceedings of the 16th ACM International Conference on Multimedia, Vancouver, BC, Canada, 26–31 October 2008; pp. 725–728.
  36. Guo, G.; Mu, G.; Fu, Y. Gender from Body: A Biologically-Inspired Approach with Manifold Learning. In Proceedings of the 9th Asian Conference on Computer Vision (ACCV), Xi’an, China, 23–27 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 236–245.
  37. Tianyu, L.; Fei, L.; Rui, W. Human face gender identification system based on MB-LBP. In Proceedings of the 2018 30th Chinese Control and Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 1721–1725.
  38. Omer, H.K.; Jalab, H.A.; Hasan, A.M.; Tawfiq, N.E. Combination of Local Binary Pattern and Face Geometric Features for Gender Classification from Face Images. In Proceedings of the 2019 IEEE 9th International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, 29 November–1 December 2019; pp. 158–161.
  39. Fekri-Ershad, S. Developing a gender classification approach in human face images using modified local binary patterns and tani-moto based nearest neighbor algorithm. arXiv 2020, arXiv:2001.10966.
  40. Kakadiaris, I.A.; Sarafianos, N.; Nikou, C. Show me your body: Gender classification from still images. In Proceedings of the 2016 IEEE 23rd International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3156–3160.
  41. Moghaddam, B.; Yang, M.H. Gender classification with support vector machines. In Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580), Grenoble, France, 28–30 March 2000; pp. 306–311.
  42. Dammak, S.; Mliki, H.; Fendri, E. Gender estimation based on deep learned and handcrafted features in an uncontrolled environment. Multimed. Syst. 2022, 9, 421–433.
  43. Aslam, A.; Hayat, K.; Umar, A.I.; Zohuri, B.; Zarkesh-Ha, P.; Modissette, D.; Khan, S.Z.; Hussian, B. Wavelet-based convolutional neural networks for gender classification. J. Electron. Imaging 2019, 28, 013012.
  44. Aslam, A.; Hussain, B.; Cetin, A.E.; Umar, A.I.; Ansari, R. Gender classification based on isolated facial features and foggy faces using jointly trained deep convolutional neural network. J. Electron. Imaging 2018, 27, 053023.
  45. Afifi, M.; Abdelhamed, A. AFIF4: Deep gender classification based on AdaBoost-based fusion of isolated facial features and foggy faces. J. Vis. Commun. Image Represent. 2019, 62, 77–86.
  46. Althnian, A.; Aloboud, N.; Alkharashi, N.; Alduwaish, F.; Alrshoud, M.; Kurdi, H. Face Gender Recognition in the Wild: An Extensive Performance Comparison of Deep-Learned, Hand-Crafted, and Fused Features with Deep and Traditional Models. Appl. Sci. 2020, 11, 89.
  47. Rasheed, J.; Waziry, S.; Alsubai, S.; Abu-Mahfouz, A.M. An Intelligent Gender Classification System in the Era of Pandemic Chaos with Veiled Faces. Processes 2022, 10, 1427.
  48. Tang, J.; Liu, X.; Cheng, H.; Robinette, K.M. Gender Recognition Using 3-D Human Body Shapes. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2011, 41, 898–908.
  49. Tang, J.; Liu, X.; Cheng, H.; Robinette, K.M. Gender recognition with limited feature points from 3-D human body shapes. In Proceedings of the 2012 IEEE 42nd International Conference on Systems, Man, and Cybernetics (SMC), Seoul, Republic of Korea, 14–17 October 2012; pp. 2481–2484.
  50. Linder, T.; Wehner, S.; Arras, K.O. Real-time full-body human gender recognition in (RGB)-D data. In Proceedings of the 2015 IEEE 35th International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 3039–3045.
  51. Nguyen, D.T.; Kim, K.W.; Hong, H.G.; Koo, J.H.; Kim, M.C.; Park, K.R. Gender Recognition from Human-Body Images Using Visible-Light and Thermal Camera Videos Based on a Convolutional Neural Network for Image Feature Extraction. Sensors 2017, 17, 637.
  52. Nguyen, D.T.; Park, K.R. Body-Based Gender Recognition Using Images from Visible and Thermal Cameras. Sensors 2016, 16, 156.
  53. Nguyen, D.T.; Park, K.R. Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body. Sensors 2016, 16, 1134.
  54. Lu, J.; Wang, G.; Huang, T.S. Gait-based gender classification in unconstrained environments. In Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012), Tsukuba Science City, Japan, 11–15 November 2012; pp. 3284–3287.
  55. Lu, J.; Wang, G.; Moulin, P. Human Identity and Gender Recognition from Gait Sequences with Arbitrary Walking Directions. IEEE Trans. Inf. Forensics Secur. 2013, 9, 51–61.
  56. Hassan, O.M.S.; Abdulazeez, A.M.; Tiryaki, V.M. Gait-Based Human Gender Classification Using Lifting 5/3 Wavelet and Principal Component Analysis. In Proceedings of the 2018 First International Conference on Advanced Science and Engineering (ICOASE), Duhok, Zakho, Kurdistan Region of Iraq, 9–11 October 2018; pp. 173–178.
  57. Isaac, E.R.; Elias, S.; Rajagopalan, S.; Easwarakumar, K. Multiview gait-based gender classification through pose-based voting. Pattern Recognit. Lett. 2019, 126, 41–50.
  58. Hayashi, J.i.; Yasumoto, M.; Ito, H.; Niwa, Y.; Koshimizu, H. Age and gender estimation from facial image processing. In Proceedings of the 41st SICE Annual Conference (SICE 2002), Osaka, Japan, 5–7 August 2002; Volume 1, pp. 13–18.
  59. Hayashi, J.I.; Koshimizu, H.; Hata, S. Age and Gender Estimation Based on Facial Image Analysis. In Proceedings of the 7th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2003), Oxford, UK, 3–5 September 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 863–869.
  60. Eidinger, E.; Enbar, R.; Hassner, T. Age and Gender Estimation of Unfiltered Faces. IEEE Trans. Inf. Forensics Secur. 2014, 9, 2170–2179.
  61. Levi, G.; Hassner, T. Age and gender classification using convolutional neural networks. In Proceedings of the 2015 IEEE 7th Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA, 7–12 June 2015; pp. 34–42.
  62. Zhang, K.; Gao, C.; Guo, L.; Sun, M.; Yuan, X.; Han, T.X.; Zhao, Z.; Li, B. Age Group and Gender Estimation in the Wild With Deep RoR Architecture. IEEE Access 2017, 5, 22492–22503.
  63. Lee, S.H.; Hosseini, S.; Kwon, H.J.; Moon, J.; Koo, H.I.; Cho, N.I. Age and gender estimation using deep residual learning network. In Proceedings of the 2018 International Workshop on Advanced Image Technology (IWAIT), Chiang Mai, Thailand, 7–9 January 2018; pp. 1–3.
  64. Boutros, F.; Damer, N.; Terhörst, P.; Kirchbuchner, F.; Kuijper, A. Exploring the Channels of Multiple Color Spaces for Age and Gender Estimation from Face Images. In Proceedings of the 2019 22nd International Conference on Information Fusion (FUSION), Ottawa, ON, Canada, 2–5 July 2019; pp. 1–8.
  65. Debgupta, R.; Chaudhuri, B.B.; Tripathy, B.K. A Wide ResNet-Based Approach for Age and Gender Estimation in Face Images. In Proceedings of the International Conference on Innovative Computing and Communications, Delhi, India, 2 August 2020; Advances in Intelligent Systems and Computing, Vol. 1165; Springer: Singapore, 2020; pp. 517–530.
  66. Sharma, N.; Sharma, R.; Jindal, N. Face-Based Age and Gender Estimation Using Improved Convolutional Neural Network Approach. Wirel. Pers. Commun. 2022, 124, 3035–3054.
  67. Uricár, M.; Timofte, R.; Rothe, R.; Matas, J.; Van Gool, L. Structured Output SVM Prediction of Apparent Age, Gender and Smile from Deep Features. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 25–33.
  68. Duan, M.; Li, K.; Yang, C.; Li, K. A hybrid deep learning CNN–ELM for age and gender classification. Neurocomputing 2018, 275, 448–461.
  69. Rwigema, J.; Mfitumukiza, J.; Tae-Yong, K. A hybrid approach of neural networks for age and gender classification through decision fusion. Biomed. Signal Process. Control 2021, 66, 102459.
  70. Kuprashevich, M.; Tolstykh, I. MiVOLO: Multi-input Transformer for Age and Gender Estimation. arXiv 2023, arXiv:2307.04616.
  71. Makihara, Y.; Mannami, H.; Yagi, Y. Gait Analysis of Gender and Age Using a Large-Scale Multi-view Gait Database. In Proceedings of the 10th Asian Conference on Computer Vision (ACCV 2010), Queenstown, New Zealand, 8–12 November 2010; Springer: Berlin/Heidelberg, Germany, 2011; pp. 440–451.
  72. Xu, C.; Makihara, Y.; Liao, R.; Niitsuma, H.; Li, X.; Yagi, Y.; Lu, J. Real-Time Gait-Based Age Estimation and Gender Classification from a Single Image. In Proceedings of the 2021 IEEE 9th Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 3460–3470.
  73. Ahad, M.A.R.; Ngo, T.T.; Antar, A.D.; Ahmed, M.; Hossain, T.; Muramatsu, D.; Makihara, Y.; Inoue, S.; Yagi, Y. Wearable Sensor-Based Gait Analysis for Age and Gender Estimation. Sensors 2020, 20, 2424.