Multi-Language Conversational Telephone Speech 2011 -- Turkish
hdl:11272/ACOKK
Version: 1 – Released: Thu May 18 11:02:19 PDT 2017
Data Citation
If you use these data, please add the following citation to your scholarly references.
Data Citation Details
Title: Multi-Language Conversational Telephone Speech 2011 -- Turkish
Study Global ID: hdl:11272/ACOKK
Other ID: Linguistic Data Consortium: LDC2017S09; ISBN: 1-58563-799-8; ISLRN: 466-022-433-410-8
Authors: Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie
Producer: Linguistic Data Consortium (LDC), University of Pennsylvania
Production Date: May 15, 2017
Production Place: Philadelphia
Distributor: Linguistic Data Consortium (LDC), University of Pennsylvania
Deposit Date: May 18, 2017
Series: LDC, Linguistic Data Consortium
Version: v1.0, May 15, 2017
Original Dataverse
Description and Scope
Description

Introduction

Multi-Language Conversational Telephone Speech 2011 -- Turkish was developed by the Linguistic Data Consortium (LDC) and comprises approximately 18 hours of telephone speech in Turkish.

The data were collected primarily to support research and technology evaluation in automatic language identification, and portions of these telephone calls were used in the NIST 2011 Language Recognition Evaluation (LRE). LRE 2011 focused on language pair discrimination for 24 languages/dialects, some of which could be considered mutually intelligible or closely related.

LDC has released the following as part of the Multi-Language Conversational Telephone Speech 2011 series:

  • Slavic Group (LDC2016S11)
  • Turkish (LDC2017S09)

Data

Participants were recruited by native speakers who contacted acquaintances in their social network. Those native speakers made one call, up to 15 minutes, to each acquaintance. The data were collected using LDC's telephone collection infrastructure, which comprises three computer telephony systems. Human auditors labeled calls for callee gender, dialect type and noise. Demographic information about the participants was not collected.

All audio data are presented in FLAC-compressed MS-WAV (RIFF) file format (*.flac); when uncompressed, each file is 2 channels, recorded at 8000 samples/second with samples stored as 16-bit signed integers, representing a lossless conversion from the original mu-law sample data as captured digitally from the public telephone network. The following table summarizes the total number of calls, total number of hours of recorded audio, and the total size of compressed data:

group    lng  #calls  #hours  #MB
turkish  tur  87      18.6    975
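
As a rough cross-check of the format description and the table, the following Python sketch derives the uncompressed data rate, the mean call length, and an approximate FLAC compression ratio, and shows why 16-bit samples can hold decoded mu-law data without loss. It is only an illustration under stated assumptions: that #hours counts call duration with both channels recorded together, that 1 MB means 10**6 bytes, and that the mu-law encoding is the standard G.711 variant. The function name ulaw_to_linear is hypothetical and is not part of the corpus or its tools.

```python
# Format parameters stated in the description above.
SAMPLE_RATE_HZ = 8000
CHANNELS = 2
BYTES_PER_SAMPLE = 2  # 16-bit signed integers

# Totals from the summary table.
CALLS = 87
TOTAL_HOURS = 18.6
COMPRESSED_MB = 975

# Assumptions (not stated in the record): #hours is call duration with
# both channels recorded together, and 1 MB = 10**6 bytes.
bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
mean_call_minutes = TOTAL_HOURS * 60 / CALLS
uncompressed_bytes = TOTAL_HOURS * 3600 * bytes_per_second
flac_ratio = COMPRESSED_MB * 10**6 / uncompressed_bytes


def ulaw_to_linear(code: int) -> int:
    """Decode one 8-bit mu-law code to linear PCM (assumes G.711 mu-law)."""
    u = ~code & 0xFF                 # G.711 transmits the bits inverted
    exponent = (u >> 4) & 0x07       # 3-bit segment number
    mantissa = u & 0x0F              # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


# Every possible mu-law code decodes to a value inside the int16 range,
# which is why storing the decoded samples as 16-bit integers loses nothing.
decoded = [ulaw_to_linear(code) for code in range(256)]
fits_in_int16 = all(-32768 <= value <= 32767 for value in decoded)

print(f"uncompressed rate: {bytes_per_second} bytes/s")
print(f"mean call length:  {mean_call_minutes:.1f} min")
print(f"uncompressed size: {uncompressed_bytes / 10**6:.0f} MB")
print(f"FLAC size ratio:   {flac_ratio:.2f}")
print(f"decoded range:     {min(decoded)} .. {max(decoded)}")
print(f"fits in int16:     {fits_in_int16}")
```

Under these assumptions the mean call length comes out just under 13 minutes, which is consistent with the 15-minute limit on calls described above.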

Keywords: Linguistics (ACV)
Time Period Covered: 2011 - 2011
Date of Collection: 2011 - 2011
Country/Nation: United States (US); Turkey (TR)
Kind of Data: Linguistic data
Data Collection / Methodology
Data Sources: Telephone conversations
Data Availability
Number of Files: 3
Terms of Use
Conditions

Linguistic Data Consortium Data Use Agreement

A. Except as to the extent prohibited by any user agreement, the user shall have the right to

  1. incorporate portions of the LDC (Linguistic Data Consortium) data into its own work products for internal, non-commercial use and not for redistribution,
  2. incorporate small excerpts of text or audio data from the LDC data for display or publication in a scientific or technical context, but only for the purpose of describing the research and related issues, and
  3. publish statistics and other summaries of the LDC data.

B. License

Except as otherwise provided herein, the user shall have no right to copy, redistribute, transmit, publish, sell, transfer, or otherwise use the LDC data for any purpose. The user shall give appropriate attribution to the LDC data in all scholarly or similar publications for which the LDC data or portions thereof have been used.

C. Access to Individual Users

Only individuals who are then-current faculty, students or staff members of LDC Member institutions or consultants or individuals providing services or doing research for Member institutions shall have access to the LDC data.

D. Copyright

The LDC data is protected by copyright as a collective work or compilation under the laws of the United States and other countries. All content, material, and other elements comprising LDC data are also copyrighted works. Users must abide by all additional copyright notices or restrictions contained in the LDC data license agreement supplements.

Dataverse Terms of Use

Terms of Use

  1. Introduction

    1. The "Service" means, collectively, all aspects of the Abacus / NESSTAR and associated services and websites.
    2. The term "Content" means the data, text, graphics, photos, sounds, music, videos, audiovisual combinations, interactive features, software, scripts, and any other electronic materials you may view on or access through the Service.
  2. Your Acceptance of this Agreement

    1. By clicking you agree to the terms and conditions of this Agreement, which supplement the policies, rules and requirements of your institution.
    2. If you do not agree to these Terms of Use you must not log in, access, browse or otherwise use the Service. If you have questions or concerns, please contact research.data@ubc.ca.
  3. Use of the Service and Content

    Use of the Service and Your Content. You may access and use the Content uploaded on the Service strictly in compliance with the copyright terms identified on or associated with such Content.
  4. General Conditions of Use

    1. Without limiting the foregoing and the prohibited uses set out in Policy #104, Acceptable Use and Security of UBC Electronic Information and Systems, which is hereby incorporated by reference, the following is not permitted:
      1. using any automated system, including without limitation, "robots," "spiders," or "offline readers," to harvest or scrape information from the Service or any part(s) thereof, or to send more request messages in a given period of time than a human can reasonably produce in the same period by using a conventional on-line web browser; or
      2. in any way intentionally placing undue burden on the technical systems or networks connected to the Service.
    2. UBC may suspend your account, or access to the Service, if it learns or is credibly notified (as determined by UBC) that your conduct is in violation of these Terms of Use.
  5. Liability and Indemnity

    1. The Service and the Content are provided to you AS IS. You understand that UBC does not endorse any Content submitted to the Service by any user, or any opinion, recommendation, or advice expressed therein, and UBC expressly disclaims any and all liability in connection with Content, including without limitation all direct, indirect, special, incidental or consequential damage or any other damages whatsoever and howsoever caused, arising out of or in connection with the use of the Service or any Content, or in reliance on the Service or the Content.
    2. In addition, the Service may contain links to third party websites. UBC has no control over, and assumes no responsibility for, the content, privacy policies, or practices of any third party websites.
    3. You agree to indemnify and hold harmless UBC, its Board of Governors, agents, contractors, licensors, and licensees against any and all claims arising from or in any way relating to your use of the Service.
  6. Trademarks

    Certain words, phrases, names, designs or logos used on the Site may constitute trademarks, service marks or trade names of the UBC or other entities. The display of any such marks or names on the Site does not imply that UBC or other entities have granted a license or authorization of any kind to use such marks or names. You may not use any of UBC's trademarks, service marks or trade names without UBC's prior written permission.
  7. Choice of Law

    The laws of the Province of British Columbia and the laws of Canada applicable therein shall govern as to the interpretation, validity and effect of this document, notwithstanding any conflict of laws provisions of your domicile, residence or physical location. You hereby consent and submit to the exclusive jurisdiction of the courts of the Province of British Columbia in any action or proceeding instituted under or related to your use of the Service.
Other Information
Notes: Info (DCMI type) Sound; Info (Sample type) pcm; Info (Sample rate) 8000; Info (Application) Language identification; Info (Language) Turkish; Info (Language ID) tur; Info (Project) NIST LRE
Abacus Dataverse Network - British Columbia Research Library Data Services - Hosted at the University of British Columbia © 2017    

"Multi-Language Conversational Telephone Speech 2011 -- Turkish", hdl:11272/ACOKK