Datasets
Code
Frames from the Aff-Wild database, showing subjects of different ethnicities in different emotional states, under a variety of head poses, illumination conditions and occlusions.
The Aff-Wild database has been extended with more videos and with annotations in terms of action units, basic expressions and valence-arousal. The newly formed database is called Aff-Wild2. More details can be found here.
The training and test video samples, annotation files, bounding boxes and landmarks can be downloaded from here.
The source code and the trained weights of various models can be found in the following GitHub repository:
https://github.com/dkollias/Aff-Wild-models
If you use the above data or the source code or the model weights, please cite the following papers:
@article{kollias2019deep,
  title={Deep affect prediction in-the-wild: Aff-wild database and challenge, deep architectures, and beyond},
  author={Kollias, Dimitrios and Tzirakis, Panagiotis and Nicolaou, Mihalis A and Papaioannou, Athanasios and Zhao, Guoying and Schuller, Bj{\"o}rn and Kotsia, Irene and Zafeiriou, Stefanos},
  journal={International Journal of Computer Vision},
  pages={1--23},
  year={2019},
  publisher={Springer}
}

@inproceedings{zafeiriou2017aff,
  title={Aff-wild: Valence and arousal ‘in-the-wild’ challenge},
  author={Zafeiriou, Stefanos and Kollias, Dimitrios and Nicolaou, Mihalis A and Papaioannou, Athanasios and Zhao, Guoying and Kotsia, Irene},
  booktitle={Computer Vision and Pattern Recognition Workshops (CVPRW), 2017 IEEE Conference on},
  pages={1980--1987},
  year={2017},
  organization={IEEE}
}

@inproceedings{kollias2017recognition,
  title={Recognition of affect in the wild using deep neural networks},
  author={Kollias, Dimitrios and Nicolaou, Mihalis A and Kotsia, Irene and Zhao, Guoying and Zafeiriou, Stefanos},
  booktitle={Computer Vision and Pattern Recognition Workshops (CVPRW), 2017 IEEE Conference on},
  pages={1972--1979},
  year={2017},
  organization={IEEE}
}
These data accompany the paper Deep Affect Prediction in-the-wild: Aff-Wild Database and Challenge, Deep Architectures, and Beyond.
In this paper, we introduce the Aff-Wild benchmark, report the results of the First Affect-in-the-wild Challenge, and design state-of-the-art deep neural architectures, including AffWildNet, the best-performing network on Aff-Wild. We also exploit the Aff-Wild database for learning features that can be used as priors for achieving strong performance in dimensional and categorical emotion recognition.
In the download link, you will find a tar.gz file, which contains 4 folders named: videos, annotations, bboxes and landmarks.
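As a convenience, a minimal Python sketch for extracting the archive (the filename aff_wild.tar.gz below is a placeholder for whatever the downloaded file is called):

import tarfile

# Extract the downloaded archive; the filename is a placeholder.
# Extraction should produce the four folders: videos, annotations, bboxes, landmarks.
with tarfile.open("aff_wild.tar.gz", "r:gz") as archive:
    archive.extractall("aff_wild")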
The videos folder contains the training and testing videos, named #.avi or #.mp4, where # is the video name (a numeric id).
The annotations folder contains annotations only for the training videos. The annotations for the test videos are not provided, in order to preserve the integrity of the challenge. We are also currently extending Aff-Wild in order to rerun the challenge this year (2019).
The annotations folder contains two sub-folders named valence and arousal, each containing the corresponding annotation files, named #.txt, where # is the video name/id. Each line in these files corresponds to the valence/arousal annotation of a specific video frame. For instance, the first line of the valence annotation file 450.txt contains the valence value of frame 0 of video 450.mp4, the second line contains the valence value of frame 1, and so on.
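For illustration, a small Python sketch for loading the per-frame annotations of one training video; the folder layout follows the description above, and the root path used in the example is an assumption:

import os

def load_annotations(root, video_id):
    """Load per-frame valence and arousal annotations for one training video.

    `root` is the extracted annotations folder, which contains the
    `valence` and `arousal` sub-folders; each #.txt file holds one
    value per line, line i corresponding to frame i of video #.
    """
    values = {}
    for dimension in ("valence", "arousal"):
        path = os.path.join(root, dimension, f"{video_id}.txt")
        with open(path) as f:
            values[dimension] = [float(line) for line in f if line.strip()]
    return values

# Example: annotations for video 450 (450.mp4); "annotations" is a placeholder path.
anno = load_annotations("annotations", 450)
print(len(anno["valence"]), anno["valence"][0], anno["arousal"][0])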
The bboxes and landmarks folders contain, for each video and each frame, .pts files with the corresponding extracted bounding boxes (4 points) and landmarks (68 points).
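A sketch of a .pts reader, assuming the files follow the common landmark .pts layout (a version/n_points header with the coordinates enclosed in braces); if the files are plain coordinate lists, the header and brace handling simply skips nothing. The example paths are hypothetical:

def read_pts(path):
    """Read a .pts file into a list of (x, y) points.

    Assumes the common landmark .pts layout; bounding-box files then
    contain 4 points and landmark files 68 points.
    """
    points = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or ":" in line or line in ("{", "}"):
                continue  # skip header lines and braces
            x, y = map(float, line.split()[:2])
            points.append((x, y))
    return points

# Hypothetical example paths: bounding box and landmarks of frame 0 of video 450.
bbox = read_pts("bboxes/450/0.pts")
landmarks = read_pts("landmarks/450/0.pts")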
If you want to evaluate your models on the Aff-Wild's test set, send an email with your predictions to: dimitrios.kollias15@imperial.ac.uk
Regarding the format, name each file after the corresponding video. Each line of each file should contain the valence and arousal values for the corresponding frame, separated by a comma, e.g. for file 271.csv:
line 1 should be: valence_of_first_frame,arousal_of_first_frame
line 2 should be: valence_of_second_frame,arousal_of_second_frame
...
last line: valence_of_last_frame,arousal_of_last_frame
Note that your files should include predictions for all frames of the video (regardless of whether bounding box extraction succeeded or failed).
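A minimal sketch of writing a prediction file in this format; the prediction lists below are placeholders for your model's per-frame outputs:

import csv

def write_predictions(video_id, valence, arousal, out_dir="."):
    """Write one prediction file per video, named after the video (e.g. 271.csv).

    `valence` and `arousal` must contain one value per frame, in frame
    order, including frames where face detection failed.
    """
    assert len(valence) == len(arousal)
    with open(f"{out_dir}/{video_id}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for v, a in zip(valence, arousal):
            writer.writerow([v, a])  # one line: valence,arousal

# Hypothetical example: predictions for a 3-frame video 271.
write_predictions(271, [0.1, 0.2, 0.3], [-0.4, -0.3, -0.2])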
The Affect-in-the-Wild Challenge will be held in conjunction with the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, Hawaii, USA.
Stefanos Zafeiriou, Imperial College London, UK s.zafeiriou@imperial.ac.uk
Mihalis Nicolaou, Goldsmiths University of London, UK m.nicolaou@gold.ac.uk
Irene Kotsia, Hellenic Open University, Greece, drkotsia@gmail.com
Fabian Benitez-Quiroz, Ohio State University, USA benitez-quiroz.1@osu.edu
Guoying Zhao, University of Oulu, Finland, gyzhao@ee.oulu.fi
Dimitris Kollias, Imperial College London, UK dimitrios.kollias15@imperial.ac.uk
Athanasios Papaioannou, Imperial College London, UK a.papaioannou11@imperial.ac.uk
The human face is arguably the most studied object in computer vision. Recently, tens of databases have been collected under unconstrained conditions (also referred to as “in-the-wild”) for many face-related tasks, such as face detection, face verification and facial landmark localisation. However, well-established “in-the-wild” databases and benchmarks do not exist for problems such as the estimation of affect in a continuous dimensional space (e.g., valence and arousal) in videos displaying spontaneous facial behaviour. In CVPR 2017, we propose to take a significant step further and introduce new comprehensive benchmarks for assessing the performance of facial affect/behaviour analysis and understanding “in-the-wild”. To the best of our knowledge, this is the first attempt to benchmark the estimation of valence and arousal “in-the-wild”.
For the analysis of continuous emotion dimensions (such as valence and arousal), we propose to advance previous work by providing around 300 videos (over 15 hours of data) annotated with regard to valence and arousal, all captured “in-the-wild” (the main source being YouTube videos). 252 videos will be provided for training and the remaining 46 for testing.
Even though the majority of the videos are under the Creative Commons licence (https://support.google.com/youtube/answer/2797468?hl=en-GB), the subjects have been notified about the use of their videos in our study.
The training data contain the videos and their corresponding annotations (#_arousal.txt and #_valence.txt, where # is the video number). Furthermore, to facilitate training, especially for participants who do not have access to face detectors/tracking algorithms, we provide bounding boxes and landmarks for the face(s) in the videos.
Participants will have their algorithms tested on other videos, which will be provided on a predefined date (see below). This dataset aims at testing the ability of current systems to estimate valence and arousal in unseen subjects. To facilitate testing, we provide bounding boxes and landmarks for the face(s) present in the testing videos.
Performance will be assessed using the standard concordance correlation coefficient (CCC), as well as the mean squared error (MSE).
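For reference, a sketch of these two measures using the standard CCC definition; this is not the official evaluation script:

import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient:
    CCC = 2*cov(y_true, y_pred) / (var(y_true) + var(y_pred) + (mean_true - mean_pred)^2)
    """
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)

# Example on toy valence values.
print(ccc([0.1, 0.4, -0.2], [0.0, 0.5, -0.1]), mse([0.1, 0.4, -0.2], [0.0, 0.5, -0.1]))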
Our aim is to accept up to 10 papers to be orally presented at the workshop.
Challenge participants should submit a paper to the Faces-in-the-wild Workshop, which summarises the methodology and the achieved performance of their algorithm. Submissions should adhere to the main CVPR 2017 proceedings style. The workshop papers will be published in the CVPR 2017 proceedings. Please sign up in the submissions system to submit your paper.
Workshop Administrator: dimitrios.kollias15@imperial.ac.uk
This challenge has been supported by a distinguished fellowship to Dr. Stefanos Zafeiriou by TEKES.