Paralinguistic datasets


ROMANIAN DEVA CRIMINAL INVESTIGATION AUDIO RECORDINGS (RODeCAR) DATABASE

License

Licensed under Creative Commons BY-NC-ND 3.0.

Description

The Romanian Deva Criminal Investigation Audio Recordings (RODeCAR) database consists of approximately 7.5 hours of recorded audio, of which approximately 5 hours is actual speech content. The recordings were acquired from 20 speakers (4 female, 16 male) during interviews and questionings conducted by Romanian law enforcement agencies, in which all participants were persons of interest (guilty parties, suspects, witnesses, etc.). Of the participants’ speech content (excluding the prosecutors), 39.5% is deceptive (false). Truthfulness was determined objectively through a thorough review of the recordings and associated case notes, carried out together with the original prosecutors, using subsequent confessions and timeline reconstructions as evidence of the truthfulness of each participant’s statements.

Content summary:
No. of files: Nfiles = 26 (duration between 04:39 and 57:40)
No. of speakers: Nspkrs = 20 (16 male, 4 female) + 2 (male; the prosecutors)
Total content duration: Tdur0 = 07:32:40 <- all speech segments (including pauses)
Total speech content duration: Tdur0_s = 04:45:51 <- all speech segments (excluding pauses)
Participant speech content duration: Tdur_s = 03:27:29 <- participants' speech segments (excluding the prosecutors)
Truthful speech content duration: Tdur_T = 02:05:36 <- participants' truthful speech segments (excluding the prosecutors); 60.5% of Tdur_s
Deceptive speech content duration: Tdur_D = 01:21:53 <- participants' deceptive speech segments (excluding the prosecutors); 39.5% of Tdur_s
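
The truthful and deceptive durations above should add up to the participants' speech duration, and the quoted percentages follow from the same figures. A minimal sketch of that check, using only the values listed in the summary; the helper name to_seconds and the variable names are illustrative and not part of the database or its tooling:

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Durations copied from the content summary (prosecutors excluded).
tdur_s = to_seconds("03:27:29")  # participants' speech content
tdur_t = to_seconds("02:05:36")  # truthful speech content
tdur_d = to_seconds("01:21:53")  # deceptive speech content

# Shares of the participants' speech content, rounded to one decimal place.
truthful_pct = round(100 * tdur_t / tdur_s, 1)
deceptive_pct = round(100 * tdur_d / tdur_s, 1)

print(tdur_t + tdur_d == tdur_s)    # do the two classes cover Tdur_s exactly?
print(truthful_pct, deceptive_pct)  # percentage split between the classes
```

The roughly 60/40 split indicates a moderate class imbalance, which may be worth accounting for when training or evaluating classifiers on this corpus.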

If you use this corpus in your research, please cite one of the following papers:

  • S. Mihalache, Gh. Pop, and D. Burileanu, “Introducing the RODeCAR Database for Deceptive Speech Detection,” in the Proceedings of the 10th International Conference on Speech Technology and Human-Computer Dialogue (SpeD), Timișoara, Romania, pp. 1-6, 10-12 Oct. 2019, ISBN: 978-1-7281-0983-1, DOI:10.1109/SPED.2019.8906542
  • S. Mihalache and D. Burileanu, “Using Voice Activity Detection and Deep Neural Networks with Hybrid Speech Feature Extraction for Deceptive Speech Detection,” in Sensors, vol. 22, iss. 3, 1228, 6 Feb. 2022, ISSN: 1424-8220, DOI:10.3390/s22031228

To obtain the RODeCAR database, please fill out the End User License Agreement available here and send it to serban.mihalache@upb.ro for download details.