Kollias, D.; Tzirakis, P.; Nicolaou, M.; Papaioannou, A.; Zhao, G.; Schuller, B.; Kotsia, I.; Zafeiriou, S.

Deep affect prediction in-the-wild: Aff-wild database and challenge, deep architectures, and beyond

Article

Kollias, D., Tzirakis, P., Nicolaou, M., Papaioannou, A., Zhao, G., Schuller, B., Kotsia, I. and Zafeiriou, S. 2019. Deep affect prediction in-the-wild: Aff-wild database and challenge, deep architectures, and beyond. International Journal of Computer Vision. 127 (6-7), pp. 907-929. https://doi.org/10.1007/s11263-019-01158-4

Publication dates
Type	Article
Title	Deep affect prediction in-the-wild: Aff-wild database and challenge, deep architectures, and beyond
Authors	Kollias, D., Tzirakis, P., Nicolaou, M., Papaioannou, A., Zhao, G., Schuller, B., Kotsia, I. and Zafeiriou, S.
Abstract	Automatic understanding of human affect using visual signals is of great importance in everyday human–machine interactions. Appraising human emotional states, behaviors and reactions displayed in real-world settings, can be accomplished using latent continuous dimensions (e.g., the circumplex model of affect). Valence (i.e., how positive or negative is an emotion) and arousal (i.e., power of the activation of the emotion) constitute popular and effective representations for affect. Nevertheless, the majority of collected datasets this far, although containing naturalistic emotional states, have been captured in highly controlled recording conditions. In this paper, we introduce the Aff-Wild benchmark for training and evaluating affect recognition algorithms. We also report on the results of the First Affect-in-the-wild Challenge (Aff-Wild Challenge) that was recently organized in conjunction with CVPR 2017 on the Aff-Wild database, and was the first ever challenge on the estimation of valence and arousal in-the-wild. Furthermore, we design and extensively train an end-to-end deep neural architecture which performs prediction of continuous emotion dimensions based on visual cues. The proposed deep learning architecture, AffWildNet, includes convolutional and recurrent neural network layers, exploiting the invariant properties of convolutional features, while also modeling temporal dynamics that arise in human behavior via the recurrent layers. The AffWildNet produced state-of-the-art results on the Aff-Wild Challenge. We then exploit the AffWild database for learning features, which can be used as priors for achieving best performances both for dimensional, as well as categorical emotion recognition, using the RECOLA, AFEW-VA and EmotiW 2017 datasets, compared to all other methods designed for the same goal. The database and emotion recognition models are available at http://ibug.doc.ic.ac.uk/resources/first-affect-wild-challenge.
Publisher	Springer
Journal	International Journal of Computer Vision
ISSN	0920-5691
Electronic	1573-1405
Online	13 Feb 2019
Print	01 Jun 2019
Publication process dates
Deposited	01 May 2019
Accepted	29 Jan 2019
Output status	Published
Publisher's version	Kollias2019_Article_DeepAffectPredictionIn-the-Wil.pdf Kollias2019_Article_DeepAffectPredictionIn-the-Wil_VoR.pdf License CC BY 4.0
Copyright Statement	© The Author(s) 2019. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Digital Object Identifier (DOI)	https://doi.org/10.1007/s11263-019-01158-4
Language	English

Permalink -

https://repository.mdx.ac.uk/item/88407

Related outputs

Deep neural network augmentation: generating faces for affect analysis

Kollias, D., Cheng, S., Ververas, E., Kotsia, I. and Zafeiriou, S. 2020. Deep neural network augmentation: generating faces for affect analysis. International Journal of Computer Vision. 128 (5), pp. 1455-1484. https://doi.org/10.1007/s11263-020-01304-3

Unconstrained face recognition

Zafeiriou, S., Kotsia, I. and Pantic, M. 2018. Unconstrained face recognition. in: Management Association, I. (ed.) Computer Vision: Concepts, Methodologies, Tools, and Applications Hershey, PA IGI Global. pp. 1640-1661

Dense 3D face decoding over 2500FPS: Joint texture and shape convolutional mesh decoders

Zhou, Y., Deng, J., Kotsia, I. and Zafeiriou, S. 2019. Dense 3D face decoding over 2500FPS: Joint texture and shape convolutional mesh decoders. International Conference on Computer Vision and Pattern Recognition. Long Beach, California, USA 16 - 20 Jun 2019 IEEE. pp. 1097-1106 https://doi.org/10.1109/CVPR.2019.00119

GANFIT: Generative adversarial network fitting for high fidelity 3D face reconstruction

Gecer, B., Ploumpis, S., Kotsia, I. and Zafeiriou, S. 2019. GANFIT: Generative adversarial network fitting for high fidelity 3D face reconstruction. 2019 IEEE/CVF International Conference on Computer Vision and Pattern Recognition. Long Beach, California 16 - 20 Jun 2019 IEEE. pp. 1155-1164 https://doi.org/10.1109/CVPR.2019.00125

The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking

Deng, J., Roussos, A., Chrysos, G., Ververas, E., Kotsia, I., Shen, J. and Zafeiriou, S. 2019. The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking. International Journal of Computer Vision. 127, pp. 599-624. https://doi.org/10.1007/s11263-018-1134-y

4DFAB: a large scale 4D facial expression database for biometric applications

Cheng, S., Kotsia, I., Pantic, M. and Zafeiriou, S. 2018. 4DFAB: a large scale 4D facial expression database for biometric applications. CVPR 2018: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA 18 - 22 Jun 2018 Institute of Electrical and Electronics Engineers (IEEE). pp. 5117-5126

Dynamic probabilistic linear discriminant analysis for video classification

Fabris, A., Nicolaou, M., Kotsia, I. and Zafeiriou, S. 2017. Dynamic probabilistic linear discriminant analysis for video classification. ICASSP 2017: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New Orleans, USA 05 - 09 Mar 2017 Institute of Electrical and Electronics Engineers (IEEE). pp. 2781-2785 https://doi.org/10.1109/ICASSP.2017.7952663

Recognition of affect in the wild using deep neural networks

Kollias, D., Nicolaou, M., Kotsia, I., Zhao, G. and Zafeiriou, S. 2017. Recognition of affect in the wild using deep neural networks. CVPRW 2017: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, HI, USA Institute of Electrical and Electronics Engineers (IEEE). pp. 1972-1979 https://doi.org/10.1109/CVPRW.2017.247

Aff-Wild: Valence and Arousal ‘in-the-wild’ Challenge

Zafeiriou, S., Kollias, D., Nicolaou, M., Papaioannou, A., Zhao, G. and Kotsia, I. 2017. Aff-Wild: Valence and Arousal ‘in-the-wild’ Challenge. CVPRW 2017: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, HI, USA 21 - 26 Jul 2017 Institute of Electrical and Electronics Engineers (IEEE). pp. 1980-1987 https://doi.org/10.1109/CVPRW.2017.248

AgeDB: the first manually collected, in-the-wild age database

Moschoglou, S., Papaioannou, A., Sagonas, C., Deng, J., Kotsia, I. and Zafeiriou, S. 2017. AgeDB: the first manually collected, in-the-wild age database. CVPRW 2017: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, HI, USA 21 - 26 Jul 2017 Institute of Electrical and Electronics Engineers (IEEE). pp. 1997-2005 https://doi.org/10.1109/CVPRW.2017.250

Facial affect "in the wild": a survey and a new database

Zafeiriou, S., Papaioannou, A., Kotsia, I., Nicolaou, M. and Zhao, G. 2016. Facial affect "in the wild": a survey and a new database. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Affect "in-the-wild" Workshop. Las Vegas, USA 26 Jun - 01 Jul 2016 Institute of Electrical and Electronics Engineers (IEEE). pp. 1487-1498 https://doi.org/10.1109/cvprw.2016.186

Unconstrained face recognition

Zafeiriou, S., Kotsia, I. and Pantic, M. 2014. Unconstrained face recognition. in: De Marsico, M., Nappi, M. and Tistarelli, M. (ed.) Face recognition in adverse conditions IGI.

Support tensor action spotting

Kotsia, I. and Patras, I. 2012. Support tensor action spotting. IEEE International Conference on Image Processing (ICIP 2012). Orlando, FL, USA 30 Sep - 03 Oct 2012

Exploring the similarities of neighboring spatiotemporal points for action pair matching

Kotsia, I. and Patras, I. 2012. Exploring the similarities of neighboring spatiotemporal points for action pair matching. 11th Asian Conference on Computer Vision (ACCV 2012). Daejeon, Korea 05 - 09 Nov 2012

Recognition of facial expressions in presence of partial occlusion

Buciu, I., Kotsia, I. and Pitas, I. 2003. Recognition of facial expressions in presence of partial occlusion. 9th Panhellenic Conference in Informatics (PCI 2003). Thessaloniki 21 - 23 Nov 2003

Facial expression analysis under partial occlusion

Buciu, I., Kotsia, I. and Pitas, I. 2005. Facial expression analysis under partial occlusion. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005). Philadelphia 18 - 23 Mar 2005

Real time facial expression recognition from image sequences using support vector machines

Kotsia, I. and Pitas, I. 2005. Real time facial expression recognition from image sequences using support vector machines. Visual Communications and Image Processing (VCIP 2005). Beijing, China 12 - 15 Jul 2005

Multimodal caricatural mirror

Martin, O., Adell, J., Huerta, A., Kotsia, I., Savran, A. and Sebbe, R. 2005. Multimodal caricatural mirror. eNTERFACE 2005. Belgium 18 Jul - 12 Aug 2005

Real time facial expression recognition from image sequences using support vector machines

Kotsia, I. and Pitas, I. 2005. Real time facial expression recognition from image sequences using support vector machines. IEEE International Conference on Image Processing (ICIP 2005). Genova, Italy 11 - 14 Sep 2005

Affective gaming: a comprehensive survey

Kotsia, I., Zafeiriou, S. and Fotopoulos, S. 2013. Affective gaming: a comprehensive survey. Conference on Computer Vision and Pattern Recognition Workshops (CVPR) Behavior Analysis in Games and modern Sensing devices (BAGS) workshop.

On one-shot kernels: explicit feature maps and properties

Zafeiriou, S. and Kotsia, I. 2013. On one-shot kernels: explicit feature maps and properties. Proceedings of IEEE Int’l Conf. on Computer Vision (ICCV 2013).

The eNTERFACE'05 audio-visual emotion database

Martin, O., Kotsia, I., Macq, B. and Pitas, I. 2006. The eNTERFACE'05 audio-visual emotion database. 22nd International Conference on Data Engineering Workshops (ICDEW 2006). Atlanta, USA 03 - 07 Apr 2006

Facial expression recognition using shape and texture information

Kotsia, I. and Pitas, I. 2006. Facial expression recognition using shape and texture information. IFIP TC12 and WG12.5: Conference and Symposium on Artificial Intelligence. Santiago, Chile 21 - 24 Aug 2006

Fusion of geometrical and texture information for facial expression recognition

Kotsia, I., Nikolaidis, N. and Pitas, I. 2006. Fusion of geometrical and texture information for facial expression recognition. IEEE International Conference on Image Processing (ICIP 2006). Atlanta, USA 08 - 11 Oct 2006

Affective gaming: beyond using sensors

Kotsia, I., Patras, I. and Fotopoulos, S. 2012. Affective gaming: beyond using sensors. 5th International Symposium on Communications Control and Signal Processing (ISCCSP 2012). Rome, Italy 02 - 04 May 2012

Higher rank support tensor machines

Kotsia, I., Guo, W. and Patras, I. 2012. Higher rank support tensor machines. 8th International Symposium on Visual Computing (ISVC 2012). Crete, Greece 16 - 18 Jul 2012

Tensor learning for regression

Guo, W., Kotsia, I. and Patras, I. 2012. Tensor learning for regression. IEEE Transactions on Image Processing. 21 (2), pp. 816-827.

Max-margin non-negative matrix factorization

Kumar, B., Kotsia, I. and Patras, I. 2012. Max-margin non-negative matrix factorization. Image and Vision Computing. 30 (4-5), pp. 279-291.

Higher rank support tensor machines for visual recognition

Kotsia, I., Guo, W. and Patras, I. 2012. Higher rank support tensor machines for visual recognition. Pattern Recognition. 45 (12), pp. 4192-4203.

Facial expression recognition in videos using a novel multi-class support vector machines variant

Kotsia, I., Nikolaidis, N. and Pitas, I. 2007. Facial expression recognition in videos using a novel multi-class support vector machines variant. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). Honolulu, Hawaii, USA 15 - 20 Apr 2007

Multiclass support vector machines and metric multidimensional scaling for facial expression recognition

Kotsia, I., Zafeiriou, S., Nikolaidis, N. and Pitas, I. 2007. Multiclass support vector machines and metric multidimensional scaling for facial expression recognition. IEEE Workshop on Machine Learning for Signal Processing (MLSP 2007). Thessaloniki, Greece 27 - 29 Aug 2007

Texture and shape information fusion for facial action unit recognition

Kotsia, I., Zafeiriou, S., Nikolaidis, N. and Pitas, I. 2008. Texture and shape information fusion for facial action unit recognition. The First International Conference on Advances in Computer-Human Interaction (ACHI 2008). Sainte Luce, Martinique 10 - 15 Feb 2008

Discriminant Non-negative Matrix Factorization and projected gradients for frontal face verification

Kotsia, I., Zafeiriou, S. and Pitas, I. 2008. Discriminant Non-negative Matrix Factorization and projected gradients for frontal face verification. The First COST 2101 Workshop on Biometrics and Identity Management (BIOID 2008). Roskilde University, Denmark 07 - 09 May 2008

Multi-modal emotion-related data collection within a virtual earthquake emulator

Ververidis, D., Kotsia, I., Kotropoulos, C. and Pitas, I. 2008. Multi-modal emotion-related data collection within a virtual earthquake emulator. 6th Language Resources and Evaluation Conference (LREC 2008). Marrakech, Morocco

Frontal view recognition in multiview video sequences

Kotsia, I., Nikolaidis, N. and Pitas, I. 2009. Frontal view recognition in multiview video sequences. IEEE International Conference on Multimedia and Expo (ICME 2009). Cancun, Mexico 28 Jun - 03 Jul 2009

Multiplicative update rules for Multilinear Support Tensor Machines

Kotsia, I. and Patras, I. 2010. Multiplicative update rules for Multilinear Support Tensor Machines. 20th International Conference on Pattern Recognition (ICPR 2010). Istanbul, Turkey 23 - 26 Aug 2010

Relative Margin Support Tensor Machines for gait and action recognition

Kotsia, I. and Patras, I. 2010. Relative Margin Support Tensor Machines for gait and action recognition. ACM International Conference on Image and Video Retrieval (CIVR 10). Xidian, China 05 - 07 Jul 2010

Higher order support tensor regression for head pose estimation

Guo, W., Kotsia, I. and Patras, I. 2011. Higher order support tensor regression for head pose estimation. 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2011). Delft, The Netherlands 13 - 15 Apr 2011

Support tucker machines

Kotsia, I. and Patras, I. 2011. Support tucker machines. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011). Providence, RI 20 - 25 Jun 2011

Action spotting exploiting the frequency domain

Kotsia, I. and Argyriou, V. 2011. Action spotting exploiting the frequency domain. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2011). Colorado Springs, USA 20 - 25 Jun 2011

Max-margin semi-NMF

Kumar, B., Kotsia, I. and Patras, I. 2011. Max-margin semi-NMF. The 22nd British Machine Vision Conference (BMVC 2011). University of Dundee 29 Aug - 02 Sep 2011

A novel discriminant non-negative matrix factorization algorithm with applications to facial image characterization problems

Kotsia, I., Zafeiriou, S. and Pitas, I. 2007. A novel discriminant non-negative matrix factorization algorithm with applications to facial image characterization problems. IEEE Transactions on Information Forensics and Security. 2 (3), pp. 588-595.