
ASIST Study 3 Dataset

The ASIST Study 3 dataset was collected in 2022 as part of DARPA's Artificial Social Intelligence for Successful Teams (ASIST) program.

Experiment description

In this experiment, teams of three participants conduct urban search-and-rescue missions in Minecraft, with some teams receiving advice from AI agents.

Each team participates in three 'missions' (or 'trials'; the two terms are used interchangeably here): one 'training' mission, followed by two 'real' missions.

The preregistration for the experiment describes the motivation for the experiment as well as details on the data collection.

How to access the dataset

Option 1: ASU Dataverse

The dataset is publicly available via the ASU Dataverse Research Data Repository.

However, this option is inconvenient if you need to access different subsets of the data frequently or work with the data programmatically. We recommend it only if you do not have SSH access to the copy of this dataset on the lab servers.

Option 2: Access via the lab servers

This dataset is currently hosted on one of the lab's network storage volumes, mule. Specifically, the raw ASIST Study 3 data is located in the following directory:

/media/mule/projects/tomcat/protected/study-3_2022

The mule volume is mounted on the orca, kraken, and leviathan VMs (see Compute and Storage). You should be able to access the data if you have SSH access to any of these VMs and permission to read the data. If you have SSH access to a VM but cannot access the data, please contact Adarsh.
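As a starting point for working with the data programmatically, the following minimal sketch counts the files under the dataset directory by extension. The directory is the one given above; the function name count_files_by_suffix is illustrative, and the script assumes it runs on a VM where mule is mounted and readable.

```python
from collections import Counter
from pathlib import Path

# Location of the raw data on mule, as given above.
DATASET_ROOT = Path("/media/mule/projects/tomcat/protected/study-3_2022")


def count_files_by_suffix(root: Path) -> Counter:
    """Count regular files under `root` (recursively) by final extension.

    Path.suffix only returns the last extension, so a file named
    'logs.tar.gz' is counted under '.gz'.
    """
    return Counter(p.suffix for p in root.rglob("*") if p.is_file())


if DATASET_ROOT.is_dir():
    for suffix, n in sorted(count_files_by_suffix(DATASET_ROOT).items()):
        print(f"{suffix or '(no extension)'}: {n}")
else:
    print(f"{DATASET_ROOT} not found; is mule mounted on this machine?")
```

Walking the tree recursively avoids making any assumption about how the files are organized into subdirectories.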

File descriptions

There are multiple types of files in the dataset. They are described below.

Metadata

  • HSRData_MetaData_Study-3.csv: Metadata about the experiment, filenames, etc.
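The column names of the metadata file are not listed here, so a reasonable first step is to load it and print the header. The sketch below uses only the standard library. It assumes that the file sits directly under the dataset directory and that it is UTF-8 encoded; both are assumptions, and the helper name read_csv_rows is illustrative.

```python
import csv
from pathlib import Path


def read_csv_rows(path):
    """Return (column_names, rows); each row is a dict keyed by column name."""
    # utf-8-sig transparently drops a byte-order mark if the export has one.
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return reader.fieldnames or [], rows


# Assumption: the metadata file is located at the top level of the dataset.
metadata_path = (
    Path("/media/mule/projects/tomcat/protected/study-3_2022")
    / "HSRData_MetaData_Study-3.csv"
)

if metadata_path.is_file():
    columns, rows = read_csv_rows(metadata_path)
    print(f"{len(rows)} rows; columns: {columns}")
```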

Message bus data

  • *.metadata: Messages sent on the message bus, one file per trial. Each line of the file is a JSON object. The messages contain information about participant positions, actions, etc. They also include automated transcriptions of the participants' dialog, generated in real time via Google Cloud Speech.

Documentation of the message formats can be found here.
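Because each line of a .metadata file is a JSON object, the files can be read one line at a time without loading a whole trial into memory. The sketch below does that and, as a schema-agnostic first look, counts how often each top-level key occurs. It deliberately assumes nothing about the message fields; the function names are illustrative.

```python
import json
from collections import Counter


def iter_messages(path):
    """Yield one parsed JSON object per non-empty line of a .metadata file."""
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                # Report the offending line instead of silently dropping it.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from err


def top_level_key_counts(path):
    """Count how often each top-level key appears across all messages."""
    counts = Counter()
    for message in iter_messages(path):
        if isinstance(message, dict):
            counts.update(message.keys())
    return counts
```

For example, top_level_key_counts("some_trial.metadata") (a hypothetical filename) returns a Counter that shows which keys every message shares, which is a useful starting point before consulting the message format documentation.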

Video recordings

  • *.mp4: Individual and team video recordings of the missions.

Audio recordings

  • *.m4a: Zoom audio recordings, one per team.
  • *.wav: Individual participant audio recordings. These recordings are captured via the participants' browsers rather than Zoom, in order to provide real-time, source-separated audio streams for automated speech recognition.
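To check the basic properties of an individual participant recording, the standard library's wave module is sufficient. This is a minimal sketch that assumes the .wav files are uncompressed PCM (wave.open raises wave.Error otherwise); the function name wav_summary is illustrative.

```python
import wave


def wav_summary(path):
    """Return channel count, sample rate (Hz) and duration (s) of a PCM WAV file."""
    with wave.open(str(path), "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Only the file header is consulted, so the summary is cheap to compute even for long recordings.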

Survey data

  • HSRData_Surveys*.csv: Data from Qualtrics surveys filled out by participants.
  • *.sav: Alternate file format for Qualtrics survey exports.

Other data

  • *.tar.gz: Docker logs from the different testbed components. One .tar.gz compressed archive per team.
  • *.txt: Quality control reports for the .metadata files.
  • *.vtt: Automated transcriptions of the experimental sessions generated by Zoom (one per team).
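The .vtt transcripts can be turned into timed cues with a small parser. The sketch below assumes the Zoom files follow the standard WebVTT layout (a WEBVTT header, then cues separated by blank lines, each with an optional identifier line, a timing line, and one or more text lines). It does not strip any speaker prefix from the cue text. The names Cue and parse_vtt are illustrative.

```python
import re
from typing import NamedTuple


class Cue(NamedTuple):
    start_s: float
    end_s: float
    text: str


# Matches lines such as "00:00:01.000 --> 00:00:04.500" (hours are optional).
_TIMING = re.compile(
    r"^\s*((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2}\.\d{3})"
)


def _to_seconds(timestamp: str) -> float:
    """Convert '[hh:]mm:ss.ttt' to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_vtt(text: str) -> list:
    """Extract cues from WebVTT text; the header, notes and cue ids are skipped."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n")):
        lines = block.strip().split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line)
            if match:
                start, end = (_to_seconds(t) for t in match.groups())
                body = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(start, end, body))
                break
    return cues
```

A transcript can then be loaded with parse_vtt(open(path, encoding="utf-8").read()), where path points to one of the .vtt files.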