Introducing visiting professor Matilde Marcolli

Matilde Marcolli is Professor of Mathematics at Caltech and visiting professor at Utrecht University. She is doing highly novel research that applies mathematical methods to linguistics. During her recent visit to the Centre for Complex Systems Studies, she sat down for a chat with CLUe Coordinator Brian Dermody.
Brian Dermody: Hello Prof. Marcolli, very nice to have you here at the Centre for Complex Systems Studies Utrecht.
Matilde Marcolli: It's great to be here, many thanks for having me.
Given your background in mathematics and physics, how did you get into linguistics?
When I was a student in Italy I had a classical education studying ancient Greek and Latin, and I really liked these ancient languages. Then, when I was a postdoc in Math at MIT, I got the opportunity to sit in on one of Noam Chomsky's classes for a semester. He was teaching on his minimalist programme at the time. So, the interest was always there, but at some point I thought I might as well do something more active about it.
And so, using your approaches, what are you trying to uncover about language?
I am interested in the large-scale structure of language. At the large scale we focus on syntax, where the sentence is the unit of structure. In the tradition of Chomsky, people have tried to classify languages by essentially a set of binary variables, each of which is a yes-no question about whether a certain construction is possible in a certain language or not. This of course is an appealing framework for mathematicians. Using this approach, we can start to determine the geometric structure of language.
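As a rough illustration of the binary-variable picture described above, each language can be written as a vector of yes/no answers, and two languages can be compared by counting the parameters on which they differ (the Hamming distance). This is only a minimal sketch: the language names and parameter values are made-up placeholders, not real linguistic data, and the choice of Hamming distance is an assumption made here for illustration, not necessarily the measure used in Prof. Marcolli's work.

```python
from itertools import combinations


def hamming(u, v):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(u) != len(v):
        raise ValueError("parameter vectors must have the same length")
    return sum(x != y for x, y in zip(u, v))


def pairwise_distances(languages):
    """Return {(name1, name2): distance} for every unordered pair of languages."""
    return {
        (p, q): hamming(languages[p], languages[q])
        for p, q in combinations(sorted(languages), 2)
    }


# Hypothetical placeholder data: 1 = "construction is possible", 0 = "not possible".
toy_languages = {
    "lang_A": [1, 0, 1, 1, 0],
    "lang_B": [1, 0, 0, 1, 0],
    "lang_C": [0, 1, 0, 0, 1],
}

if __name__ == "__main__":
    for pair, dist in pairwise_distances(toy_languages).items():
        print(pair, dist)
```

A table of pairwise distances like this is the kind of input from which a point cloud, and hence a geometric picture, can be built.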
How do you make that transformation from words and sentences to a geometric structure?
We use various methods that are aimed at detecting structure in these data of syntactic parameters of world languages. For example, you might not have an explicit description of relations between variables, but you can detect some of them statistically.
Have you seen patterns in these language groups?
Yes, one of my students has carried out topological data analysis of these datasets, which is aimed at detecting structures that go beyond the type identified by a linear classifier. So, it can illuminate whether a language is structured around a specific geometry. And we find from this analysis that different language families have different topologies. So, for example, the Indo-European language family exhibits a circular geometry which we don't see in other language families. We think this represents some phenomenon which occurred in the development of the Indo-European languages. These structures probably have a reasonable explanation in terms of historical linguistics.
Okay, so you see these specific structures but it's still a bit of a mystery what gives rise to them?
Right. This is actually a general problem with topological data analysis methods. They allow you to see the structures but then it comes to the difficult part, which is interpreting what the structure means. Usually we think it has something to do with history and the way the language developed over time within the language family.
We mentioned Noam Chomsky earlier. Do you see your methods as a way to start digging into this idea of universal grammar?
Universal grammar is obviously a working hypothesis and it states that these binary variables that specify the grammar of a certain language are somehow wired in our brains. It's difficult to pin down a certain point in the brain that corresponds to this universal grammar.
It would be interesting if we could develop dynamical models to understand language acquisition. One thing we have been working on is the development of a dynamical system model to understand how languages change via interactions between different languages. There are a lot of studies of bilingual populations, such as the English-Spanish speaking population in California, where people use both languages with similar frequency. In those situations, you find switching phenomena where people take certain constructions that are natural in one language and start to use them in the other. This, of course, then changes the structure of these languages.
Of course, if you're a physics-minded person, you tend to view phenomena like this a bit like an Ising model, where each of these parameters is like a spin variable that is either up or down, and these tend to align based on the interaction between the two languages. But interestingly there are also relations among the variables within each language. So, you are trying to flip one of the spins, but if it is tied to other spins, it won't be able to flip because it is dependent on other variables. So, you get coupling, which is the interesting part of these dynamical language systems.
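The spin picture above can be made concrete with a toy sketch. Everything in it is an assumption made for illustration, not the model from Prof. Marcolli's work: two languages are lists of spins (+1 or -1), a coupling `j_between` rewards matching parameters across the two languages, a coupling `j_within` rewards agreement between parameters that are tied inside one language (the hypothetical `deps` list), and a standard Metropolis rule decides whether a proposed flip is kept.

```python
import math
import random


def energy(a, b, deps, j_between=1.0, j_within=1.0):
    """Toy energy: lower when the languages agree and tied parameters agree."""
    between = sum(x * y for x, y in zip(a, b))
    within = sum(s[i] * s[j] for s in (a, b) for i, j in deps)
    return -j_between * between - j_within * within


def try_flip(a, b, which, k, deps, temperature, rng,
             j_between=1.0, j_within=1.0):
    """Propose flipping parameter k of language `which` ('a' or 'b').

    Returns True if the flip is kept, False if it is rejected and undone.
    """
    lang = a if which == "a" else b
    before = energy(a, b, deps, j_between, j_within)
    lang[k] = -lang[k]
    delta = energy(a, b, deps, j_between, j_within) - before
    # Metropolis rule: always keep downhill moves; keep uphill moves only
    # with probability exp(-delta / T). At T <= 0, uphill moves never pass.
    if delta > 0 and (temperature <= 0
                      or rng.random() >= math.exp(-delta / temperature)):
        lang[k] = -lang[k]  # undo the rejected flip
        return False
    return True


def metropolis_step(a, b, deps, temperature, rng, **couplings):
    """Pick a random language and a random parameter, then try to flip it."""
    which = rng.choice(("a", "b"))
    k = rng.randrange(len(a))
    return try_flip(a, b, which, k, deps, temperature, rng, **couplings)


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = [1, 1, 1], [-1, -1, -1]
    deps = [(0, 1)]  # hypothetical: parameters 0 and 1 are tied in each language
    for _ in range(50):
        metropolis_step(a, b, deps, temperature=0.5, rng=rng, j_within=2.0)
    print(a, b, energy(a, b, deps, j_within=2.0))
```

With `j_within=2.0` and zero temperature, a flip of the tied parameter 0 in the starting state above raises the energy and is rejected, while a flip of the untied parameter 2 lowers it and is accepted. That is the blocking effect described in the answer: a spin that depends on others cannot simply follow the other language.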
Hearing that, it does seem that you would regard language as having complex dynamics. But it also has a lot of structure. So how do those two things meet?
That's an interesting question because you would like to have some measure of the complexity in a language. It is a long-standing question in linguistics to find good measures of complexity in language. There are many different ways you can think about the complexity of a language. At the word level you may think about a language with a lot of cases, conjugated verbs and generally a lot of modification of words, which may be one possible measure of complexity.
At the level of syntax, it's a difficult question to come up with a good measure of complexity. This is an interesting mathematical question to think about: What is a good measure of syntactic complexity of languages?
Measuring complexity in languages. Something to think about indeed! Many thanks Prof. Marcolli for the fascinating talk.