2010 NIST Speaker Recognition Evaluation Test Set
hdl:11272/V7OXL
Version: 1 – Released: Fri Apr 28 13:10:50 PDT 2017
Data Citation
If you use these data, please add the following citation to your scholarly references.
Data Citation Details
Title: 2010 NIST Speaker Recognition Evaluation Test Set
Study Global ID: hdl:11272/V7OXL
Other ID: Linguistic Data Consortium: LDC2017S06; ISBN: 1-58563-795-5; ISLRN: 429-091-121-265-4
Authors: Greenberg, Craig; Martin, Alvin; Graff, David; Brandschain, Linda; Walker, Kevin
Producer: Linguistic Data Consortium (LDC), University of Pennsylvania
Production Date: April 17, 2017
Production Place: Philadelphia
Distributor: Linguistic Data Consortium (LDC), University of Pennsylvania
Deposit Date: April 28, 2017
Series: LDC, Linguistic Data Consortium
Version: v1.0, April 17, 2017
Description and Scope
Description

Introduction

The 2010 NIST Speaker Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST). It contains 2,255 hours of American English telephone speech and speech recorded over a microphone channel in an interview scenario, all of which was used as test data in the NIST-sponsored 2010 Speaker Recognition Evaluation (SRE).

The ongoing series of yearly SRE evaluations conducted by NIST is intended to be of interest to researchers working on the general problem of text-independent speaker recognition. To this end, the evaluations are designed to be simple, to focus on core technology issues, to be fully supported, and to be accessible to those wishing to participate.

Like the 2008 evaluation, the 2010 evaluation included in the training and test conditions for the core test not only conversational telephone speech (CTS) recorded over ordinary telephone channels, but also CTS and conversational interview speech recorded over a room microphone channel. Unlike prior evaluations, some of the conversational telephone-style speech was collected in a manner intended to produce particularly high, or particularly low, vocal effort on the part of the speaker of interest.

Data

The speech recordings in this release were collected in 2009 and 2010 by LDC at its Human Subjects Collection facility in Philadelphia. This collection was part of the Mixer 6 project, which was designed to support the development of robust speaker recognition technology by providing carefully collected and audited speech from a large pool of speakers recorded simultaneously across numerous microphones.

The telephone speech segments include two-channel excerpts of approximately 10 seconds and of approximately 5 minutes. There are also summed-channel excerpts of roughly 5 minutes. The microphone excerpts are 3 to 15 minutes in duration. As in prior evaluations, intervals of silence were not removed. The data included in this release are 8-bit u-law (ulaw) encoded with a sample rate of 8000 Hz.
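
The stated encoding (8-bit ulaw at a sample rate of 8000) can be illustrated with a minimal Python sketch. It assumes that "ulaw" refers to standard ITU-T G.711 u-law companding and that any file header has already been stripped, since this record does not name the container format. The function names and the `payload` parameter are hypothetical and are not part of the release.

```python
# Minimal sketch: decode 8-bit u-law bytes to signed linear samples and
# estimate excerpt duration. Assumes G.711 u-law and header-free payloads;
# neither assumption is confirmed by the catalog record.

SAMPLE_RATE_HZ = 8000   # sample rate stated in the release description
BYTES_PER_SAMPLE = 1    # 8-bit u-law stores one byte per sample


def ulaw_byte_to_pcm(byte: int) -> int:
    """Decode one G.711 u-law byte to a signed linear sample."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("u-law values must fit in one byte")
    inverted = ~byte & 0xFF            # u-law bytes are stored bit-inverted
    sign = inverted & 0x80             # top bit marks a negative sample
    exponent = (inverted >> 4) & 0x07  # 3-bit segment number
    mantissa = inverted & 0x0F         # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(payload: bytes) -> list[int]:
    """Decode a header-free u-law payload into a list of linear samples."""
    return [ulaw_byte_to_pcm(b) for b in payload]


def duration_seconds(payload_bytes: int, channels: int = 1) -> float:
    """Duration implied by a payload size at 8000 samples per second."""
    return payload_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels)


if __name__ == "__main__":
    print(decode_ulaw(bytes([0xFF, 0x00, 0x80])))
    print(duration_seconds(2_400_000))
```

Under these assumptions, a single-channel excerpt of 5 minutes corresponds to 8000 x 300 = 2,400,000 payload bytes, and a two-channel excerpt of the same length to twice that.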

In addition to evaluation data, this package also includes answer keys, trial and train files, development data, and evaluation documentation.

Description Date: April 17, 2017
Keywords: Linguistics (ACV)
Time Period Covered: 2009 - 2010
Date of Collection: 2009 - 2010
Country/Nation: United States (US)
Kind of Data: Linguistic data
Data Collection / Methodology
Data Sources: Microphone speech, Telephone speech
Data Availability
Number of Files: 2
Terms of Use
Conditions

Linguistic Data Consortium Data Use Agreement

A. Except as to the extent prohibited by any user agreement, the user shall have the right to

  1. incorporate portions of the LDC (Linguistic Data Consortium) data into its own work products for internal, non-commercial use and not for redistribution,
  2. incorporate small excerpts of text or audio data from the LDC data for display or publication in a scientific or technical context, but only for the purpose of describing the research and related issues, and
  3. publish statistics and other summaries of the LDC data.

B. License

Except as otherwise provided herein, the user shall have no right to copy, redistribute, transmit, publish, sell, transfer, or otherwise use the LDC data for any purpose. The user shall give appropriate attribution to the LDC data in all scholarly or similar publications for which the LDC data or portions thereof have been used.

C. Access to Individual Users

Only individuals who are then-current faculty, students or staff members of LDC Member institutions or consultants or individuals providing services or doing research for Member institutions shall have access to the LDC data.

D. Copyright

The LDC data is protected by copyright as a collective work or compilation under the laws of the United States and other countries. All content, material, and other elements comprising LDC data are also copyrighted works. Users must abide by all additional copyright notices or restrictions contained in the LDC data license agreement supplements.

Dataverse Terms of Use

Terms of Use

  1. Introduction

    1. The "Service" means, collectively, all aspects of the Abacus / NESSTAR and associated services and websites.
    2. The term "Content" means the data, text, graphics, photos, sounds, music, videos, audiovisual combinations, interactive features, software, scripts, and any other electronic materials you may view on or access through the Service.
  2. Your Acceptance of this Agreement

    1. By clicking you agree to the terms and conditions of this Agreement, which supplement the policies, rules and requirements of your institution.
    2. If you do not agree to these Terms of Use you must not log in, access, browse or otherwise use the Service. If you have questions or concerns, please contact research.data@ubc.ca.
  3. Use of the Service and Content

    Use of the Service and Your Content. You may access and use the Content uploaded on the Service strictly in compliance with the copyright terms identified on or associated with such Content.
  4. General Conditions of Use

    1. Without limiting the foregoing and the prohibited uses set out in Policy #104, Acceptable Use and Security of UBC Electronic Information and Systems, which is hereby incorporated by reference, the following is not permitted:
      1. using any automated system, including without limitation, "robots," "spiders," or "offline readers," to harvest or scrape information from the Service or any part(s) thereof, or to send more request messages in a given period of time than a human can reasonably produce in the same period by using a conventional on-line web browser; or
      2. in any way intentionally placing undue burden on the technical systems or networks connected to the Service.
    2. UBC may suspend your account, or access to the Service, if it learns or is credibly notified (as determined by UBC) that your conduct is in violation of these Terms of Use.
  5. Liability and Indemnity

    1. The Service and the Content are provided to you AS IS. You understand that UBC does not endorse any Content submitted to the Service by any user, or any opinion, recommendation, or advice expressed therein, and UBC expressly disclaims any and all liability in connection with Content, including without limitation all direct, indirect, special, incidental or consequential damage or any other damages whatsoever and howsoever caused, arising out of or in connection with the use of the Service or any Content, or in reliance on the Service or the Content.
    2. In addition, the Service may contain links to third party websites. UBC has no control over, and assumes no responsibility for, the content, privacy policies, or practices of any third party websites.
    3. You agree to indemnify and hold harmless UBC, its Board of Governors, agents, contractors, licensors, and licensees against any and all claims arising from or in any way relating to your use of the Service.
  6. Trademarks

    Certain words, phrases, names, designs or logos used on the Site may constitute trademarks, service marks or trade names of the UBC or other entities. The display of any such marks or names on the Site does not imply that UBC or other entities have granted a license or authorization of any kind to use such marks or names. You may not use any of UBC's trademarks, service marks or trade names without UBC's prior written permission.
  7. Choice of Law

    The laws of the Province of British Columbia and the laws of Canada applicable therein shall govern as to the interpretation, validity and effect of this document, notwithstanding any conflict of laws provisions of your domicile, residence or physical location. You hereby consent and submit to the exclusive jurisdiction of the courts of the Province of British Columbia in any action or proceeding instituted under or related to your use of the Service.
Other Information
Notes: Info (DCMI type) Sound; Info (Sample type) ulaw; Info (Sample rate) 8000; Info (Application) Speech recognition; Info (Language) English; Info (Language ID) eng; Info (Project) NIST LRE
Abacus Dataverse Network - British Columbia Research Library Data Services - Hosted at the University of British Columbia © 2017

"2010 NIST Speaker Recognition Evaluation Test Set", hdl:11272/V7OXL