Deep neural network augmentation: generating faces for affect analysis

Article


Kollias, D., Cheng, S., Ververas, E., Kotsia, I. and Zafeiriou, S. 2020. Deep neural network augmentation: generating faces for affect analysis. International Journal of Computer Vision. 128 (5), pp. 1455-1484. https://doi.org/10.1007/s11263-020-01304-3
Type: Article
Title: Deep neural network augmentation: generating faces for affect analysis
Authors: Kollias, D., Cheng, S., Ververas, E., Kotsia, I. and Zafeiriou, S.
Abstract

This paper presents a novel approach for synthesizing facial affect, either in terms of the six basic expressions (i.e., anger, disgust, fear, joy, sadness and surprise), or in terms of valence (i.e., how positive or negative an emotion is) and arousal (i.e., the power of the emotion's activation). The proposed approach accepts the following inputs: (i) a neutral 2D image of a person; (ii) a basic facial expression or a pair of valence-arousal (VA) emotional state descriptors to be generated, or a path of affect in the 2D VA space to be generated as an image sequence. To synthesize affect in terms of VA for this person, 600,000 frames from the 4DFAB database were annotated. The affect synthesis is implemented by fitting a 3D Morphable Model to the neutral image, then deforming the reconstructed face to add the input affect, and finally blending the new face with the given affect into the original image. Qualitative experiments illustrate the generation of realistic images when the neutral image is sampled from fifteen well-known lab-controlled or in-the-wild databases, including Aff-Wild, AffectNet and RAF-DB; comparisons with generative adversarial networks (GANs) show the higher quality achieved by the proposed approach. Quantitative experiments are then conducted, in which the synthesized images are used for data augmentation in training deep neural networks to perform affect recognition over all databases; in all cases, greatly improved performance is achieved compared with state-of-the-art methods, as well as with GAN-based data augmentation.
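
The deformation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple linear blendshape model, a nearest-neighbour lookup from a VA point to expression weights, and a straight-line path in VA space. All function names (`va_path`, `nearest_weights`, `deform`) and the toy data are hypothetical, and the 3DMM fitting and image blending stages are omitted.

```python
import numpy as np

def va_path(start, end, steps):
    """Linearly interpolate a path between two (valence, arousal) points."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    return (1.0 - t) * start + t * end

def nearest_weights(va, annotated_va, annotated_weights):
    """Return the expression weights of the annotated sample closest to `va`.

    Assumption: annotated samples are (VA point, blendshape weights) pairs.
    """
    idx = np.argmin(np.linalg.norm(annotated_va - va, axis=1))
    return annotated_weights[idx]

def deform(neutral_vertices, blendshapes, weights):
    """Linear blendshape model: neutral (V, 3) plus sum_k w_k * B_k."""
    # blendshapes has shape (K, V, 3); weights has shape (K,)
    return neutral_vertices + np.tensordot(weights, blendshapes, axes=1)

# Toy data standing in for a fitted neutral mesh and annotated samples.
rng = np.random.default_rng(0)
neutral = rng.normal(size=(5, 3))
blendshapes = rng.normal(size=(2, 5, 3))
annotated_va = np.array([[0.0, 0.0], [0.8, 0.6]])
annotated_weights = np.array([[0.0, 0.0], [1.0, 0.5]])

# One deformed mesh per point on the VA path, i.e. one per output frame.
path = va_path((0.0, 0.0), (0.8, 0.6), steps=3)
frames = [
    deform(neutral, blendshapes,
           nearest_weights(va, annotated_va, annotated_weights))
    for va in path
]
```

In this sketch each point on the VA path yields one deformed mesh, which would correspond to one frame of the generated image sequence once rendered and blended into the original image.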

Keywords: Article, Special Issue on Generating Realistic Visual Data of Human Behavior, Dimensional, Categorical affect, Valence, Arousal, Basic emotions, Facial affect synthesis, 4DFAB, Blendshape models, 3DMM fitting, DNNs, StarGAN, GANimation, Data augmentation, Affect recognition, Facial expression transfer
Publisher: Springer
Journal: International Journal of Computer Vision
ISSN: 0920-5691
Electronic ISSN: 1573-1405
Publication dates
Online: 22 Feb 2020
Print: 31 May 2020
Publication process dates
Deposited: 08 Jun 2020
Submitted: 31 Oct 2018
Accepted: 05 Feb 2020
Output status: Published
Publisher's version
License
File Access Level: Open
Copyright Statement

© The Author(s) 2020.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Digital Object Identifier (DOI): https://doi.org/10.1007/s11263-020-01304-3
Language: English
Permalink: https://repository.mdx.ac.uk/item/88z87
