Kyle Gorman from Google AI and CUNY will be visiting the Department the week of November 12th. He will be giving a talk from 15:30 to 17:00 on Monday in Room 117, 1085 Dr. Penfield (title and abstract will be sent out soon), and a tutorial on Pynini, a Python library he developed for weighted finite-state grammar compilation, on Wednesday from 12:00 to 15:00 in Ferrier room 230.
Grammar engineering in text-to-speech synthesis
Many speech and language applications, including speech recognition and speech synthesis, require mappings between “written” and “spoken” representations of language. Despite substantial progress in applied machine learning, it is still the case that real-world industrial text-to-speech (TTS) synthesis systems largely depend on language-specific hand-written rules for these conversions. These may require a great deal of development effort and linguistic sophistication, and as such represent substantial barriers for quality control and internationalization. I first consider the case of number names, where the goal is to map written forms like 328 to three hundred twenty eight. I propose two computational models for learning this mapping. The first uses end-to-end recurrent neural networks. The second, inspired by prior literature on cross-linguistic variation in number naming, uses an induction strategy based on finite-state transducers. While both models achieve near-perfect performance, the latter model is trained using several orders of magnitude less data, making it particularly useful for low-resource languages. The latter model is being used at Google to produce number grammars for dozens of languages and locales. I then consider the case of grapheme-to-phoneme conversion, where the task is to map written words onto their phonemic transcriptions. I describe a model in which the grammar engineering is performed by providing input and output vocabularies; in Spanish for instance, the input vocabulary includes digraphs like ll and rr, which denote single phonemes, and for Japanese kana, the output vocabulary includes entire syllables. This grammatical information, incorporated into a finite-state generative model, results in a significant improvement over a baseline system which lacks direct access to such information.
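For readers unfamiliar with the number-name task mentioned in the abstract, the following is a minimal sketch of the kind of language-specific hand-written rule set that the abstract says industrial systems currently depend on. It is not the speaker's model (neither the neural network nor the finite-state induction approach); the function name `number_name`, the tables, and the restriction to English and to the range 0 to 999 are assumptions made purely for illustration.

```python
# Hand-written English number-name rules for 0..999 (illustrative only).
# The output style follows the abstract's example: no "and", no hyphens.

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = [
    "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
    "eighty", "ninety",
]


def number_name(n: int) -> str:
    """Map a written number (as an int) to its spoken English form."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        return ONES[n]
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words += [ONES[hundreds], "hundred"]
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        words.append(TENS[tens])
        if ones:
            words.append(ONES[ones])
    elif rest > 0:
        words.append(ONES[rest])
    return " ".join(words)


print(number_name(328))  # three hundred twenty eight
```

Every language needs its own tables and its own combination rules, which is the development burden the abstract describes.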
Pynini: Finite-state grammar development in Python
Finite-state transducers are abstract computational models of relations between sets of strings, widely used in speech and language technologies and studied as computational models of morphophonology. In this tutorial, I will introduce the finite-state transducer formalism and Pynini (Gorman 2016; http://pynini.opengrm.org), a Python library for compiling and processing finite-state grammars. In the first part of the tutorial, we will cover the finite-state formalism in detail. In the second part, we will install the Pynini library and survey its basic functionality. In the third, we will tackle case studies including Finnish vowel harmony rules and decoding ambiguous text messages. Participants are assumed to be familiar with the Python programming language, but I do not assume any experience with finite-state methods or natural language processing.
Note to participants: You are encouraged to bring a working laptop. We will reserve some time to install the necessary libraries so that you can follow along and participate in a few select exercises. This software has been tested on Linux, Mac OS X (with an up-to-date version of XCode), and Windows 10 (with the Ubuntu flavor of Windows Subsystem for Linux). In case you wish to get a head start, installation instructions are available here: http://wellformedness.com/courses/PyniniTutorial/installation-instructions.html
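As a taste of what a finite-state transducer does, here is a toy two-state transducer in plain Python for a heavily simplified version of Finnish vowel harmony, one of the case studies named in the tutorial description. It does not use Pynini or reproduce the tutorial's grammar; the function name `harmonize`, the archiphoneme symbols `A`, `O`, `U`, and the simplifications (no compounds, no loanword exceptions, front harmony assumed when a stem has only neutral vowels) are all assumptions for illustration.

```python
# Toy deterministic finite-state transducer for simplified Finnish vowel
# harmony. Plain Python, illustrative only; this is not Pynini's API.

BACK = {"a", "o", "u"}
FRONT = {"ä", "ö", "y"}
# Hypothetical archiphonemes for suffix vowels: (back form, front form).
ARCHI = {"A": ("a", "ä"), "O": ("o", "ö"), "U": ("u", "y")}


def harmonize(underlying: str) -> str:
    """Rewrite archiphonemes according to the most recent harmonic vowel."""
    state = "front"  # assumed default when only neutral vowels (e, i) occur
    output = []
    for symbol in underlying:
        # State transitions: harmonic vowels switch state, others keep it.
        if symbol in BACK:
            state = "back"
        elif symbol in FRONT:
            state = "front"
        # Output function: archiphonemes are realized according to state.
        if symbol in ARCHI:
            back_form, front_form = ARCHI[symbol]
            output.append(back_form if state == "back" else front_form)
        else:
            output.append(symbol)
    return "".join(output)


for form in ("talossA", "kylässA", "tiellA"):
    print(form, "->", harmonize(form))
```

The machine reads one symbol at a time, keeps track of a single state (front or back), and emits one output symbol per input symbol, which is the defining behavior of a finite-state transducer.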
The next meeting of the Word Structure Research Group will take place Monday, 12 November, 3:30-5 PM at UQAM, room DS-3470, Pavillon J.-A.-DeSève, 320 Sainte-Catherine East.
Topic: Subsyllabic morphemes in Mandarin: Demonstratives zhei and nei, presented by Isabelle Boyer.
This Friday, Jason Borga will be leading a discussion on Rudin’s (2018) “Head-Based Syntactic Identity in Sluicing”. As usual, the meeting will take place in Room 117, from 3pm to 4:30pm. All are welcome to attend!
Congratulations to Lisa Travis, who will be retiring at the end of this year. Last week, current and former students and colleagues gathered for a surprise party in Lisa’s honour. McGill alums Laura Kalin, Ileana Paul and Jozina Vander Klok presented Lisa with a book of 44 papers written on the occasion of her retirement entitled Heading in the right direction: Linguistic treats for Lisa Travis, which was published by McGill Working Papers in Linguistics and will be available online shortly.
McGill will also host a workshop on parameters in honour of Lisa’s retirement this May. See details and a call for papers here.
In this week’s meeting, Jessica Coon will be giving a talk titled “Headless relative clauses and (possible?) free-choice free relatives in Ch’ol”. Jessica will present new work on Ch’ol headless relatives (collaborative with Juan Jesús Vázquez Álvarez, CIMSUR-UNAM), arguing that maximal and existential free relatives share an identical core structure, and receive different interpretations based on the environments in which they appear. Jessica will also present some puzzling data on a possible free-choice morpheme. As usual, the meeting will take place in Room 117 from 3pm to 4:30pm. All are welcome to attend!
Junko Shimoyama gave a colloquium talk on Friday, Nov. 2 at the University of Ottawa, on positively biased negative polar questions in Japanese and their embeddability. This is joint work with Dan Goodhue (PhD 2018) and Mako Hirotani at Carleton University.
The Fieldwork Lab will meet this Thursday from 4:00 to 5:30 in Linguistics room 117. This week we will hear short presentations on data elicitation and data gathering puzzles by Natalia Brambatti Guzzo, Henrison Hsieh, Matthew Schuurman (UQÀM), Michaela Socolof, and Simon Theriault (UdeM).
This week the Semantics Group won’t meet at the usual time on Friday, because the department’s colloquium is taking place at the same time. However, we just started a reading group on Keenan’s book “Logical Properties of Natural Language: Eliminating the Universe”. We shall be meeting on Friday at 2pm (Room: TBC). We will discuss Chapter 3. All are welcome to join the reading group! For more details, email Brendan, Justin or Francesco.
Speaker: Nico Baier
Date & Time: November 2, 3:30pm
Place: Education Bldg. rm. 211
Title: Unifying anti-agreement and wh-agreement
In this talk, I investigate the sensitivity of φ-agreement to features typically associated with Ā-extraction, including those related to wh-questioning, relativization, focus and topicalization. This phenomenon has been referred to as anti-agreement (Ouhalla 1993) or wh-agreement (Chung and Georgopoulos 1988; Georgopoulos 1991; Chung 1994) in the literature. While anti-agreement is commonly held to result from constraints on the Ā-movement of agreeing DPs, I argue that it reduces to an instance of wh-agreement, or the appearance of particular morphological forms in the presence of Ā-features. I develop a unified account of these Ā-sensitive φ-agreement effects in which they arise from the ability of φ-probes to copy both φ-features and Ā-features in the syntax. In the morphological component, partial or total impoverishment may apply to feature bundles containing both φ- and Ā-features, deleting some or all of the φ-features. Impoverishment blocks insertion of an otherwise appropriate, more highly specified agreement exponent. I present case studies of the effect of Ā-features on φ-agreement in three languages: the West Caucasian language Abaza (O’Herin 2002); the Berber language Tarifit (Ouhalla 1993; El Hankari 2010); and the Northern Italian dialect Fiorentino (Brandi and Cordin 1989; Suñer 1992). I show that in all three languages, the agreement exponents that appear in the context of Ā-features are systematically underspecified.
Morgan Sonderegger was at University of Oregon’s Department of Linguistics October 25-26, where he gave a workshop entitled “Topics in fitting and using mixed-effects regression models” and a colloquium talk, “Towards larger-scale cross-linguistic and cross-variety studies of speech”.
Junko Shimoyama gave an invited talk at the 14th Workshop on Altaic Formal Linguistics (WAFL) at MIT last week (Oct. 19-21), which was preceded by a workshop in honor of Shigeru Miyagawa (Oct. 18). Her talk was titled “Embeddability of biased negative polar questions in Japanese”, which is joint work with Dan Goodhue (PhD 2018) and Mako Hirotani (Carleton University).
McGill linguists were at New Ways of Analyzing Variation (NWAV 47) last week at NYU, where they gave talks and ran a workshop.
Workshop:
- Integrated Speech Corpus ANalysis – ISCAN: A new tool for large-scale, cross-corpus, sociolinguistic analysis – Jane Stuart-Smith (University of Glasgow), Morgan Sonderegger, James Tanner, Vanna Willerton, Michael McAuliffe (McGill University)
Talks:
- Age vectors vs. axes of intraspeaker variation for North American and Scottish English vowel formants – Mielke, Fruehwald, Thomas, McAuliffe, Sonderegger and Dodsworth
- Dialectal and social factors affect the phonetic bases of English /s/-retraction – Stuart-Smith, Sonderegger, Macdonald, McAuliffe and Mielke
The paper “Examining Factors Influencing the Viability of Automatic Acoustic Analysis of Child Speech,” by Thea Knowles (McGill BA 2012; now at Western University), Meghan Clayards and Morgan Sonderegger was published in the Journal of Speech, Language, and Hearing Research. Congrats!