RoCKIn Datasets

Spoken Language Understanding Dataset

The Spoken Language Understanding Dataset provides data for the Spoken Language Understanding task, specifically for Functional Benchmark #3 (FBM3) in the @Home track of the RoCKIn Competitions. It consists of audio files representing possible commands given to a robot in a home service scenario. Each audio file is paired with its correct transcription. A semantic representation of the action expressed in each command is also provided, following the formalism described in the @Home Rulebook. The resource can therefore be used for different purposes, ranging from Speech Recognition to Natural Language Processing. The dataset was collected on several occasions, both within the RoCKIn project (at camps and competitions) and outside it, for example during RoboCup 2013.
The dataset can be downloaded here. The .zip file contains the directories listed below, each representing a portion of the dataset. Each directory contains all of its audio files, plus a file called transcriptions that associates each audio file name with the corresponding transcription.

• The Robocup directory contains the audio files gathered during RoboCup 2013 in Eindhoven. This portion was provided to the participants in the RoCKIn@Home Competition 2014 as a benchmarking resource to train and test their systems for Functional Benchmark #3.
• The Rockin1 directory contains the audio files recorded for use in Functional Benchmark #3 of the RoCKIn@Home Competition 2014. This directory also contains a file called interpretations, which holds the semantic interpretation associated with each transcription.
• The Rockin2 directory contains audio files collected during the RoCKIn Camp 2015, with simulated interactions between users and robots in the Casa Domotica at the Service Robotics and Ambient Assisted Living Lab in Peccioli.
• The Rockin2014_Toulouse directory contains the audio files recorded by the microphone systems of the robots participating in FBM3, during the live part of the benchmark. A list of commands, uttered by human speakers, was played back through a loudspeaker, and the robots had to capture the audio with their own microphones and analyse it. This directory also contains an interpretations file with the semantic interpretations of the sentences.
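
The pairing of audio files and transcriptions described above could be loaded along the following lines. This is only a minimal sketch: the page does not specify the layout of the transcriptions file, so the code assumes one entry per line in the form "file name, separator, transcription", with "|" as a placeholder separator. The function names and the sample file names are hypothetical and should be adapted after inspecting the actual files.

```python
from pathlib import Path


def parse_transcriptions(text, sep="|"):
    """Map each audio file name to its transcription.

    Assumption: one entry per line, '<file name><sep><transcription>'.
    The real layout of the transcriptions file may differ.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank lines and lines without the separator
        name, transcription = line.split(sep, 1)
        entries[name.strip()] = transcription.strip()
    return entries


def load_portion(directory, sep="|"):
    """Read the 'transcriptions' file of one dataset directory, e.g. Rockin1."""
    path = Path(directory) / "transcriptions"
    return parse_transcriptions(path.read_text(encoding="utf-8"), sep=sep)


# Hypothetical sample content, not taken from the dataset.
sample = "cmd_001.wav|bring me the mug\ncmd_002.wav|go to the kitchen\n"
pairs = parse_transcriptions(sample)
```

If the interpretations file turns out to follow the same line layout, the same parser could be reused on it, which would give the audio file, its transcription and its semantic interpretation under a shared key.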

