CorpusPhon


 

Organizers

Eleanor Chodroff
University of Zurich
(Switzerland)

Christian DiCanio 
University at Buffalo
(United States)

Morgan Sonderegger
McGill University
(Canada)

Márton Sóskuthy 
University of British Columbia
(Canada)

 

Workshop information

Date/Time: 09:00-16:50, Wednesday 26 June 2024
Location: HIT, Hanyang University


Program

Time | Event | Title | Authors
9:00-9:10 | Intro | |
9:10-10:00 | Invited speaker | Montreal Forced Aligner 3.0 | Michael McAuliffe
10:00-10:20 | BREAK | |
10:20-10:40 | Talk 1 | Informativity effects can be probability effects in disguise | Vsevolod Kapatsinski
10:40-11:00 | Talk 2 | Applying Big Data and Automation Techniques in Phonetics: A Case Study on Hyperarticulation in Korean Word-Initial Stops | Cheonkam Jeong and Andrew Wedel
11:00-11:20 | Talk 3 | Predictability and phonological context interact in conditioning the acoustic reduction of Seoul Korean lenis obstruents | Seung Suk Lee
11:20-11:40 | Talk 4 | Corpus Phonetics in the Signed Modality: One Approach | Kathleen Currie Hall, Kaili Vesik, Anushka Asthana, Maggie Reid, Grace Zhang, Yiran Gao, Grace Hobby, Stanley Nam and Oksana Tkachman
11:40-12:00 | Talk 5 | Large-scale assessment of speech intelligibility | Seung-Eun Kim, Matthew Goldrick and Ann R. Bradlow
12:00-13:00 | LUNCH | |
13:00-13:10 | Lightning Talk 1 | Introducing the Speech Maturity Dataset: Research opportunities for speech scientists and linguistic fieldworkers | Margaret Cychosz, Kasia Hitczenko, William Havard, Loann Peurey, Madurya Suresh, Theo Zhang and Alex Cristia
13:10-13:20 | Lightning Talk 2 | Creating a corpus of web-data with Pyrlato. A demonstration. | Giuseppe Magistro and Claudia Crocco
13:20-13:30 | Lightning Talk 3 | The Multi-ethnic Hong Kong Cantonese Corpus for the Study of Child-Directed Speech | Alan Yu, Nathan Delisle, Nicholas Martin, Vivienne Zhang, Yao Yao and Carol To
13:30-13:40 | Lightning Talk 4 | The XPF Corpus: Rule-based grapheme to phoneme translation schemes for hundreds of languages | Uriel Cohen Priva
13:40-13:50 | Lightning Talk 5 | Creating Multimodal Corpora for Co-Speech Gesture Research | Walter Dych, Karee Garvin and Kathryn Franich
13:50-14:00 | Lightning Talk 6 | AutoRPT: Automatic Detection of Prosodic Prominence and Boundary | Seth Heiney and Jonathan Howell
14:00-15:00 | Walkabout: Poster session + demos | See Poster List |
15:00-15:20 | BREAK | |
15:20-15:40 | Talk 6 | Language-specific /s/ acoustics for early Cantonese-English bilinguals? | Molly Babel, Victor Wong, Sabrina Luk, Kai Fong and Ragul Loganathan
15:40-16:00 | Talk 7 | Cross-linguistic differences in the phonetic implementation of /s/ | Massimo Lipari, Morgan Sonderegger and Meghan Clayards
16:00-16:20 | Talk 8 | Harvesting spontaneous speech data from digital reservoirs to study prosody | Aviad Albert, Constantijn Kaland, T. Mark Ellison, Francesco Cangemi, Bodo Winter and Martine Grice
16:20-16:40 | Talk 9 | A corpus phonetics study of nominal prominence marking in two Australian languages | Catalina Torres and Sarah Babinski
16:40-16:50 | Closing | |
       
Poster List (14:00-15:00)

Poster | Title | Authors
Poster 1 | Patterns of misaccentuation of unaccented words in English speakers’ Japanese | Kakeru Yazawa
Poster 2 | Variable /s/ weakening in Canary Islands Spanish – a sociophonetic corpus study | Karolina Broś
Poster 3 | Vowel Classification in Conversational Speech Corpus | Hyun Jin Hwangbo
Poster 4 | Attention-LSTM Autoencoder for Phonotactics Learning from Raw Audio Input | Frank Lihui Tan and Youngah Do
Poster 5 | AnglistikVoices: an L2 English speech dataset for educational and technological advancement in speech technology | Akhilesh Kakolu Ramarao and Anna Sophia Stein
Poster 6 | Automatic analysis of phonemic context-dependent cue productions in acoustic cue-labeled speech | Jeung-Yoon Elizabeth Choi, Sofie Chung and Stefanie Shattuck-Hufnagel
Poster 7 | A phonetic comparison of lexical /i/ and epenthetic /i/ in Korean speech corpus | Hyunjin Lee
Poster 8 | Using corpus methods to evaluate implicational universals | Jahnavi Narkar
Poster 9 | Spectral energy properties of non-modal phonations | Yuan Chai, Padmini Bhagavatula, Serene Wong and Patricia Keating
Poster 10 | Investigating the Predictability of an Upcoming Code-switch in Cantonese-English Bilinguals | Nikolai Andrés Schwarz-Acosta
Poster 11 | A sociophonetic study of tones on Jeju Island | Moira Saltzman
Poster 12 | On the Advantages and Challenges of Working with Large Corpora of Naturalistic Speech | Johanna Cronenberg and Ioana Chitoran

 

Call for papers

The production of speech can be examined in both laboratory and non-laboratory settings. While the former allows researchers to carefully target specific, controlled aspects of production, the latter allows them to examine speech in more ecologically valid settings. Alongside advances in computational power and increased access to automated techniques, this perspective has elevated corpus phonetics to a major approach to research in phonetics and phonology. Corpus phonetic methods are now used in a wide range of contexts, from the analysis of fieldwork data from small numbers of speakers to the automated processing of cross-linguistic speech data sets representing hundreds or thousands of speakers. The primary goal of the CorpusPhon workshop is to create an inclusive forum for this diverse set of practitioners, bringing together researchers who use corpus phonetic tools with a view towards building a cohesive community.

The workshop will be held alongside LabPhon 19 in Seoul, South Korea at Hanyang University on June 26, 2024. It will offer a venue for discussing methodological best practices in corpus phonetics, demonstrating a diversity of approaches, examining the relevance of corpus data to laboratory phonology and phonetics, analyzing problems relating to collecting or analyzing corpus data at different scales, presenting results of corpus studies, and showcasing data and tools. We are pleased to welcome Dr. Michael McAuliffe, developer of the Montreal Forced Aligner, as an invited speaker.


Areas of interest

We are soliciting original, unpublished research on topics related to corpus phonetics, as well as tutorials on existing data/tools and strong work in progress. Appropriate sub-topics include (but are not limited to) the following:

  • Corpus phonetic studies, including studies involving smaller speech corpora, endangered/underdocumented language data, prosody, sociophonetics, cross-linguistic/dialectal variation, longitudinal data, historical data, or large-scale corpora;
  • Processing tools, such as forced alignment, grapheme-to-phoneme conversion, automated annotation, and automated phonetic measurement;
  • Quantitative analysis (statistical methods, visualization) for corpus/observational data;
  • Issues in corpus development, such as validation and quality control; issues related to data storage, management, and metadata; and ethical issues;
  • Presentation of new corpora appropriate for research in laboratory phonology.

Submissions should specify whether the presentation is better suited for a standard conference talk (~20 min + 10 min questions) or a demonstration (10-min lightning talk + participation in a 1-hour walk-about session). For example, a talk could report new research using an existing corpus, summarize a “closed” corpus (e.g. co-developed with a language community), or discuss broader methodological and conceptual considerations for corpus phonetics. A demonstration could present a tool for automatic speech analysis, show a new “open” corpus, or give a quick tutorial.


Submission instructions

Abstracts should be 1 page long, with a second page for figures and references. The formatting should adhere to the LabPhon abstract formatting requirements (Times New Roman, 12pt font, single spacing, 1-inch margins). Abstracts should be submitted on EasyChair.

Link for submission: https://easychair.org/conferences/?conf=corpusphon2024 

Please specify whether your abstract should be considered for a demonstration slot or a standard talk slot. Demonstrations should be given in person. We might be able to offer a hybrid presentation option for a limited number of presenters who are giving a standard talk.


Important dates

  • Submissions are due by Wednesday, March 13 (extended from March 6), 11:59 PM, Anywhere on Earth (AoE)
  • Notifications will be sent out by March 22, 2024 (extended from March 15).
  • Date/Time (Tentative): 09:00-16:50, Wednesday 26 June 2024
  • Location: TBA (but the same place as the conference venue, HIT, Hanyang University)


Workshop structure

Participants can submit an abstract for two types of presentation:

  • Talks: ~20 min + 10 min questions
  • Demonstrations: ~10-min lightning talk + participation in a 1-hour walk-about demo session
