Plato and Darwin
I interview Norbert Hornstein about the field of linguistics—I ask him about Generative work, Noam Chomsky's contributions, and why Chomsky finds the study of language so interesting.
Norbert Hornstein is a professor emeritus of linguistics at the University of Maryland—he’s been reading and thinking about Generative Grammar and its philosophical consequences since first reading Language and Mind as an undergraduate 50 years ago.
Language is a fascinating thing—Jerry Fodor apparently used to quip that cognitive science is “language, and some vision”.
And Noam Chomsky has a fascinating discussion about language on page 219 of Understanding Power. Chomsky says that we have very limited scientific insight even when it comes to simple physical problems; that human beings are qualitatively more complicated than something like the three-body problem; that understanding “declines very fast when we get past only the very simplest things”; that “nobody knows what kinds of laws apply to complex organisms”; that the problems around human beings are qualitatively more complicated than the baffling unsolved problems around the nematode, whose tiny 300-neuron brain has been fully diagrammed; and that human beings are more complicated than the nematode in a way that’s “just so qualitatively huge that the fall-off in understanding when it comes to human beings is extremely dramatic”.
Then Chomsky says that this context—about nematodes, the three-body problem, and so on—is “why the study of language is so particularly interesting”. He says that it’s curious that certain aspects of language are amenable to scientific study, though he adds that “still it’s like a little laser beam of light that goes through human behavior, leaving most things about language out”.
I love this metaphor of "a little laser beam of light that goes through human behavior, leaving most things about language out"—it's a fascinating image.
I was honored and thrilled to be able to ask Hornstein about linguistics—below is my interview with Hornstein, which I edited for flow and organized by topic. Hornstein contributed many of the hyperlinks; I contributed some as well.
Hornstein provided the following paper when I asked him for the best single paper for people to look at:
The paper seeks to clear up confusions about the Minimalist Program—Hornstein thinks that this research program has been “wildly successful” even though “anecdotal evidence suggests that conventional wisdom thinks it a failure”.
1) What are the most exciting projects that you’re currently working on?
For the last couple years, I’ve been writing a book that focuses on three topics: (1) the aims of the Minimalist Program (MP); (2) how successfully it has realized these aims; and (3) how MP fits into the overall Generative research project.
On (3), I think that most linguists—even those who are doing Generative work—are badly confused about how MP fits into the overall Generative research project. This concern partly prompted me to think about these issues—my upcoming book specifies the confusion and addresses why there’s so much of it.
To outline the confusion briefly: people think—based on a complete misunderstanding of how MP fits into previous Generative work—that Chomsky's remark that MP moves "beyond explanatory adequacy" means that MP has dumped any concern for Plato's Problem and countenances accounts of the structure of the Faculty of Language (FL) that no longer respond to the Logical Problem of Language Acquisition. I'd say that most linguists—even those working within the mentalist Chomskyan tradition—mistakenly think that MP has abandoned the learnability/acquisition concerns that drove much of the research from the mid-'70s to mid-'90s.
Responding to confusion on this matter feeds into (1)—how do you accurately specify MP’s aims? I take MP’s core research problem to be reconciling Plato’s Problem with Darwin’s Problem.
Plato’s Problem is: How can we know so much given the sparse evidence that the acquirer has to go on? Specifically, there’s a huge observable gap between the data that the child uses to acquire knowledge of its language and the knowledge that the mature speaker attains—the Generative response to that gap is to say that we come equipped with a highly structured FL that helps guide the process.
Government–Binding Theory (GB) was a huge step forward—it was the first comprehensive theory of FL’s fine structure that had the wherewithal to meet the gap’s dimensions. GB offered a decent outline of the sorts of properties FL innately contains—it also offered a decent empirically justifiable outline of the linguistic specificity of these properties, meaning that GB’s principles refer to structures, operations, and primitives that are narrowly linguistic and that aren’t special cases of more domain-general structures, principles, and operations. And critically, GB was empirically well grounded thanks largely to the explosive growth of research into comparative grammar that took place from the mid-’70s to the mid-’90s.
Now for the problem that Chomsky identified. He rightly noted that rich and linguistically bespoke accounts of FL—like GB—make it hard to explain how FL might’ve evolved. Call this Darwin’s Problem, which you can sum up as the following: the more linguistically specific the structure of FL, the harder to explain how FL could have emerged as a human mental organ.
Furthermore, Darwin’s Problem is hard if we make some simple standard assumptions that seem empirically reasonable. The first assumption is that the human language capacity differs qualitatively from any animal capacity—this assumption suggests that at least some of FL’s computational features are specifically dedicated to linguistic computations.
The second assumption is that the capacity for human language has emerged rather recently in evolutionary time—roughly in the last 100,000 years, which is an evolutionary eyeblink.
The third assumption is that FL has—since its inception—remained effectively stable across the species. This last assumption invites the reasonable speculation that FL is the evolutionary product of very few—preferably just one!—language-specific additions combined with the more domain-general cognitive and computational resources that formed the mental economy of our pre-linguistic ancestors.
Let’s suppose that these three assumptions more or less accurately frame the state of play. In that case, MP’s aim is to find a way to parse our current FL’s capacities into (A) the very, very few linguistically bespoke capacities and (B) those—potentially very numerous—cognitively and computationally general capacities characteristic of animal minds. If such a story can indeed be constructed, then (A) and (B) should—together—cover the empirical ground that the GB version of FL covered except now in a simpler and more principled fashion.
Suppose that MP actually does succeed in this aim. Then we won’t in any way have abandoned the boundary condition that Plato’s Problem sets regarding what an adequate theory of FL needs to achieve, since such a story would support a GB-like structure for FL—MP does derive such a GB-like theory after all. So on this conception of MP, moving “beyond explanatory adequacy” doesn’t somehow entail abandoning the centrality of Plato’s Problem within linguistics.
This understanding of MP also explains its timing—Chomsky’s first Minimalist Manifesto came out in 1993 after Generative research had given us a pretty decent peek into FL’s fine structure. A description of FL was available, which meant that addressing Darwin’s Problem could go beyond mere armchair speculation and actually become a practical scientific project with an actual well-defined target of explanation—asking how FL might’ve evolved was an idle question until we had a candidate description of FL.
Now for question (2) above—given this conception of the program, how successful has it been? In my opinion, it’s been wildly successful—more successful than we could’ve rationally expected when Chomsky first suggested the project in 1993.
We now have some excellent theories that explain some of human grammar's most basic FL-produced properties; have nontrivial empirical support; and provide—if true—the schema of an answer to Darwin's Problem. That's not bad for less than 30 years of research, given how sophisticated and hard these problems are!
My upcoming book also discusses why so few of the people who discuss MP see the success that I see—my answer is that people either misunderstand the problem or else think that Minimalists should address a different problem instead of the problem that MP actually sets for itself.
One can address and redress misunderstandings. But what can one say regarding the objection that Minimalists shouldn’t be addressing the problem that MP has chosen to address? There’s no reason at all why the language sciences shouldn’t contain many different kinds of projects—it’s perfectly fine that not every linguist is interested in the mentalist project that Chomsky outlined more than 50 years ago. The mentalist project doesn’t somehow become less interesting or legitimate because certain people are interested in other questions—you measure a program’s success in terms of the problems it sets for itself, not in terms of problems others want to thrust upon it.
Incidentally, I think that there’s another much worse reason why people don’t recognize MP’s success—many critics of MP know nothing or almost nothing about either MP or the larger Generative research project that it’s embedded within, demonstrate no genuine interest in learning about these things, and offer bad-faith criticisms that would be ignored in any reasonable world.
But our world isn’t reasonable, so people do indeed pay attention to these bad-faith criticisms, which have nothing to do with anything intellectually serious and instead come down to the fact that many people dislike Noam Chomsky for one or another reason.
2) What are the most exciting projects that you know of that others are working on?
I’m a huge fan of Paul Pietroski’s work on propositional meaning as well as the work on quantification that he’s done with Jeff Lidz, Justin Halberda, and some other terrific colleagues—as far as I know, this is the first work that approaches meaning from an I-language perspective and that seeks to incorporate “semantics” into the larger Chomskyan mentalist project, so that’s very exciting.
I also follow David Poeppel’s research: his stuff on how different brain waves—alpha, theta—serve to package incoming stimuli of different sizes for processing; his methodological pieces with Dave Embick, especially their enlightening discussions of the parts problem as well as the mapping problem; and his work on brain indices of syntactic structure.
Then I closely follow Randy Gallistel’s work on the neural bases of memory. I love his book with King and have learned from his discussions of intra-neuronal processing—for what it’s worth, I believe that the latter work has the potential to change how we do neuroscience.
Then there’s Lila Gleitman’s work on word learning. I would particularly single out the work that Gleitman did with John Trueswell where they make a convincing argument that the acquisition profile of both “easy” and “hard” words is very non-Bayesian and that kids guess—basically—what words “mean”, stick to the guess until it fails, and then make a new guess.
For me, the big takeaway from this work is that learning—whatever it is—strongly diverges from the classic Empiricist picture we’re familiar with from the Associationists.
I also follow my departmental colleagues’ work very closely. I’m lucky to be an emeritus member of a terrific linguistics department—at the University of Maryland—that takes a very cognitive approach to linguistics research.
Before Covid, I’d have lunch several times a week with Bill Idsardi, who works on phonological systems’ computational properties and on how these properties match up with syntactic structures.
I attend Ellen Lau's neuroscience lab meetings—she and Nina Kazanina are addressing some very deep questions right now about the cognitive mechanisms that track objects, events, and qualities, and about how these mechanisms might map onto our linguistic representations.
I also used to regularly attend—and weakly participate in—Colin Phillips’s and Jeff Lidz’s lab meetings about how linguistic competence is used in processing and acquisition. The work presented in those meetings was always firmly based on reasonable assumptions about linguistic competence, unlike much of the work I know that studies linguistic performance. And in those lab meetings, I learned over and over how very complex the relation between competence theories and performance theories really is.
I also avidly consume the excellent work of my syntax colleagues Maria Polinsky, Omer Preminger, Juan Uriagereka, and Howard Lasnik.
In my opinion, Polinsky’s work with Eric Potsdam on inverse raising and control constructions is massively important—this work has deeply influenced my own thinking about movement and construal.
So too with Preminger’s work on Binding, Uriagereka’s work on linearization, and Lasnik’s many papers on virtually every major issue in syntax.
Last but by no means least, I follow Charles Yang’s work. He’s shown that the acquisition problem is even harder than we thought, since the learning-relevant data are—due to Zipfian considerations—even sparser than previously believed.
He’s also tackled the problem of how to deal with exceptions in basically rule-based grammars. His Tolerance Principle makes very specific quantitative predictions about the course of acquisition in several domains like phonology, morphology, and syntax—to my knowledge, this is the first time that anyone has managed to coax quantitative predictions from psycholinguistic theories.
The Field of Linguistics
1) What do you find most interesting about linguistics?
I’m most interested in the field’s role as “model discipline” in the emerging Rationalist revival in the neurocognitive sciences! I got into linguistics through philosophy—unsurprisingly I’m most interested in the philosophical questions that initially motivated the Chomsky project.
I studied linguistics as an undergrad at McGill—at the time, it was a real hotbed of work on language and cognition. Harry Bracken and Jim McGilvray were in the philosophy department talking Chomsky 24/7; David Lightfoot, Glyne Piggott, and Myrna Gopnik were in the linguistics department doing the same; and John Macnamara was in the psychology department doing very well-regarded philosophy-adjacent work on language acquisition.
It was a heady time. Lots of language and cognition people trained at McGill during that period: Mark Aronoff; Alan Prince; Mark Baltin; Alison Gopnik; Renée Baillargeon; Steven Pinker; Amy Weinberg; and—most importantly to me at the time—Elan Dresher.
Dresher and I would spend hours and hours and hours discussing linguistics and how it related to the contemporary Empiricism-vs.-Rationalism debate—it was a blast and the debate fascinated me. How much cognitive mental structure is innate? How much cognitive mental structure does experience fix? How do we know what we know? What does it mean to know something? Is there a mind-vs.-body distinction? And if there is indeed a mind-vs.-body distinction, then what does it imply methodologically? How does logic connect to grammar? Dresher and I discussed all of these questions as energetically as only undergrads can.
Dresher and I came to believe that the road to reasonable answers to these questions went through linguistics and through Chomsky’s thoughts on these matters. And actually, I can say—50 years later—that we were completely correct.
What’s been the Chomsky program’s greatest intellectual contribution to contemporary scientific thought? Arguably the answer is that the program has revived Rationalist precepts about the study of the mind/brain and about hypothesis formation.
Chomsky and his colleagues—Jerry Fodor, Jerrold Katz, Thomas Bever, Lila Gleitman—raised the obvious point that the mind/brain needs a structure in order to be able to “learn” or acquire anything at all.
But these scholars also demonstrated that it was possible to empirically investigate the general philosophical precepts that Rationalists like Plato, Galileo, Descartes, and Leibniz had advanced. And Chomsky and his colleagues shifted philosophical speculations into empirical debates between competing research programs whose leading ideas could then be further refined and tested. All of this—the demonstration and the actual shift—was very heady stuff for me as a budding undergrad in philosophy.
In addition to many smaller rewards, thinking about these issues had one big payoff for me personally—it got me to see the importance of explanatory power as a methodological principle in theory development. In my opinion, the Rationalist vision’s biggest idea is the claim that theories must explain and make rational what we see rather than just track—or even predict—the data.
So linguistics has fascinated me because it’s shown how Rationalist precepts can do serious scientific work in cognitive science.
That’s why Chomskyan linguistics has served as a model for other inquiries in the mental sciences—Elizabeth Spelke, Sue Carey, Gleitman, Baillargeon, Gallistel, Fodor, Bever, Poeppel, and many more have noted how linguistics has influenced theoretical speculation in other branches of the cognitive and brain sciences.
2) What do you find most rewarding about linguistics?
I find it rewarding that we’ve come so far. The Chomsky program is the only part of linguistics that I’ve worked on—it’s only 60 or so years old and has made innumerable discoveries about the nature of linguistic competence and the structure of FL.
I also find it rewarding that Generative work is still producing interesting questions—for example, MP has helped clarify how best to frame the question about the evolution of language (EvoLang). The last 20 years have seen a fair amount of speculation about EvoLang—the discussion has been quite contentious and unfortunately hasn’t done much to advance the issue.
Minimalism’s contribution to EvoLang has been to identify what requires explanation. The question that needs answering is how FL emerged/evolved in the species and not how language—or how particular languages like English—evolved/changed over time. If this framing is correct, we can’t proceed fruitfully or intelligently until we first have some idea about what FL is and how FL is structured. But much EvoLang discussion unfortunately fails to acknowledge the very simple point that we need to first describe FL before we can study language’s emergence in humans—it should be obvious that you can’t hope to study X’s evolution until you can first describe X.
I’d actually go further and say that Minimalism has even provided us with a sketch of how we might make headway on an abstract version of the EvoLang problem. Suppose that the last 60 years of Generative work is roughly correct—which I believe it is—and that we therefore finally have an outline of FL’s structure, which would be something along the lines of the GB description of FL. If so, we now have a useful target for EvoLang explanations to shoot at—how did something with these GB-like properties emerge in the species?
More interestingly, Minimalism has suggested a route to an answer. First we have to simplify—and unify—GB’s laws and factor out the relevant unifying principles that are linguistically dedicated from those that are cognitively and computationally general. And then we can concentrate on explaining how the linguistically dedicated ones might’ve arisen.
I believe that there are even specific hypotheses on offer—one proposal is the Merge Hypothesis (MH), which I discuss along with its extensions in my current book project.
These are all only first steps of course. But they’re meaningful first steps that could set the stage for further theoretical and empirical investigation that would advance our understanding as we gain a better grasp on the neural mechanisms undergirding cognitive capacities—our current grasp is pretty weak.
Nonetheless, I take MH to be a plausible bridge between—in Poeppel’s and Embick’s parlance—the linguistic/cognitive “parts list” and the neural/physiological “parts list”. In particular, if we can reconstruct FL in Merge-like terms then we can begin to ask how brain circuits realize Merge and how this kind of brain circuit embodying Merge might’ve evolved.
3) Maybe I’m just ignorant, but my guess is that scholars quickly sorted out the rules of arithmetic—if my guess is correct, why is syntax so hard?
It takes time—and lots of hard thinking—before you can understand things simply.
First it took us a while to identify the syntactic phenomena; then it took us a while to decently describe the sorts of formal systems that would generate these kinds of syntactic structures; then it took us even more time to figure out what kinds of meta-systems would be able to acquire these kinds of first-order systems on the basis of the sparse data available to the child; and—once we did all of these things—it finally became fruitful to ask why we find meta-capacities with these properties instead of others.
And I suspect that the same thing happened with basic arithmetic—you can look at how long it actually took us to distill out the Peano axioms or fully understand the successor function and the ancestral relation.
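For what it's worth, the endpoint of that long distillation can be stated compactly—the standard first-order Peano axioms, in which all of arithmetic is built from zero, the successor function S, and induction:

```latex
\begin{align*}
&\forall n\, (S(n) \neq 0) \\
&\forall m\,\forall n\, (S(m)=S(n) \rightarrow m=n) \\
&\forall n\, (n+0=n) \qquad \forall m\,\forall n\, (m+S(n)=S(m+n)) \\
&\forall n\, (n\cdot 0=0) \qquad \forall m\,\forall n\, (m\cdot S(n)=(m\cdot n)+m) \\
&\bigl[\varphi(0) \wedge \forall n\,(\varphi(n)\rightarrow\varphi(S(n)))\bigr] \rightarrow \forall n\,\varphi(n)
\end{align*}
```

A handful of lines in the end—but, as Hornstein's point suggests, it took millennia of arithmetic practice before anyone could write them down.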
4) What’s next for linguistics?
Linguistics has benefited as new knowledge has periodically opened up new questions worth pursuing—Plato’s Problem wasn’t worth considering until we knew something about language-particular grammars, while Darwin’s Problem wasn’t worth considering until we had some idea about FL’s basic design features.
I think that the next big question is: How do brains embody linguistic computations? But we can’t advance that question until we have some solid Minimalist theory—assuming that MP has legs, FL has a simple structure underneath the superficial complexity of linguistic computation, which should greatly simplify the problem of mapping mental computation to neural circuits.
I also think that EvoLang questions will remain airy and unproductive until we have some idea about how brains realize minds and also some specific idea about how linguistic computations are realized in wetware.
Papers and Books
1) What do you think about the 2019 paper "The Achievements of Generative Syntax"? Is it a good paper for people to read if they're skeptical about Generative work?
It’s a good paper that reviews the discipline’s received wisdom.
I doubt that it’ll convince anyone who’s actually skeptical—a skeptic will want to see justifications that these are genuine achievements and that the identified mid-level generalizations are roughly correct, whereas this paper simply lists the achievements and doesn’t provide any justifications. That’s not a criticism of the paper, which doesn’t try to convince a skeptic but instead tries to review the state of the art for the professionals.
How can you learn about what Generative work has accomplished over the last 60 years? Just study Generative work in the exact same way you’d study any other domain of knowledge—grab some decent texts that cover the material and that show how the work is theoretically and empirically grounded.
2) What papers and books can people read to get up to speed on your work?
Here are some:
A Theory of Syntax (2009)
“On Merge” (2017)
“The Extended Merge Hypothesis and the Fundamental Principle of Grammar” (2021)
This material will bring people up to speed on what I’m up to.
Problems and Puzzles and Mysteries
1) What are the biggest problems and puzzles and mysteries in linguistics that you want to know the answer to?
My biggest question is: To what extent is MH correct?
MP tries to sail safely between the Scylla of Plato’s Problem and the Charybdis of Darwin’s Problem. And it’s a treacherous passage, since the two problems’ solutions pull in opposite directions—solutions to Plato’s Problem suggest an FL that has a very linguistically bespoke and language-specific computational structure, whereas solutions to Darwin’s Problem would prefer an FL that consists largely of nonlinguistic cognitive and computational principles and operations. So there’s a real tension there.
But you can resolve the tension. Maybe FL consists largely of nonlinguistic principles and operations but also includes one—in the best case only one—or just a few language-specific principles and operations. MP aims to find out whether this kind of resolution to the aforementioned tension is viable.
MP has generated at least one theory that aims to resolve the tension—MH is a theory that claims that Merge is the language-specific secret sauce that yields FL when combined with other cognitive and computational principles and operations.
MH is an empirical hypothesis that’s being empirically and theoretically investigated. So to my mind, the most interesting Minimalist question right now is whether MH is true. If it is, then we should be able to derive many—and ideally all—of FL’s properties from Merge in conjunction with some additional principles and operations that aren’t linguistically bespoke.
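The core formal idea behind MH is small enough to sketch in a few lines of code. This is a toy illustration of my own—not Hornstein's or Chomsky's formalism—treating Merge as an operation that combines two syntactic objects into an unordered set. Because its outputs can feed back in as inputs, the single operation yields unbounded hierarchy; real Minimalist proposals add labels, features, and distinctions that this sketch omits:

```python
def merge(x, y):
    """Toy Merge: combine two syntactic objects into an unordered set.
    Since outputs are themselves syntactic objects, Merge can apply to
    its own results, building unboundedly deep hierarchical structure."""
    return frozenset([x, y])

# Building a structure for "the man saw a woman", bottom-up:
dp_subj = merge("the", "man")   # {the, man}
dp_obj  = merge("a", "woman")   # {a, woman}
vp      = merge("saw", dp_obj)  # {saw, {a, woman}}
clause  = merge(dp_subj, vp)    # {{the, man}, {saw, {a, woman}}}

# Merge outputs are unordered sets: linear order must be fixed elsewhere
# (e.g. at externalization), not by the structure-building operation itself.
assert merge("the", "man") == merge("man", "the")
```

The point of the sketch is just how little machinery Merge itself adds—everything else in an MH-style account has to come from cognitively and computationally general resources.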
More concretely, suppose that GB provides an outline of FL’s basic structural features—in that case, we should expect MH to more or less derive GB’s properties. Of course, we want more than that—we also want to understand how FL is used and how it’s realized in brains, neither of which is likely to be comprehensible to us based on MH alone.
So how well has MH done? I think that it’s actually done pretty well. Chomsky has successfully managed to show that the right conception of Merge actually yields many of FL’s basic properties—I have a 2017 paper that discusses Chomsky’s progress on this front.
Can we go further still? My upcoming book tries to push this line of reasoning much further. I propose a modified version of MH that I call the Extended Merge Hypothesis (EMH) that I argue can derive many of GB’s properties—EMH proposes a strong version of MH that stipulates that all grammatical dependencies are Merge-mediated. One consequence of this assumption is that it forces particular analyses—of construal dependencies like Control and Binding—that I argue have more than a whiff of verisimilitude.
This could all be wrong. But I’d love to know how far EMH can go—if EMH were true, that would demonstrate to me that MP can be realized and that it’s possible to simultaneously solve both Plato’s Problem and Darwin’s Problem.
But as I mentioned above, this demonstration would just be a first step. Merge is a recursive function that can build unboundedly complex hierarchical objects—and as Stanislas Dehaene has observed, we don't really know how to implement recursive functions in brain structures. So if we want to move beyond the abstract question posed in Darwin's Problem and ask more detailed and interesting evolutionary questions, then I think we need to at least understand how a recursive operation like Merge could be implemented in wetware.
Having said that, even this isn’t enough—understanding how recursive operations like Merge are neurally implemented would still leave us with a very hard problem. As Chomsky has observed, we don’t have any examples of successful evolutionary explanations for complex mental capacities—I myself don’t know of any examples. I assume that the waggle dance is a simpler system than FL, but we currently have absolutely no idea how that simpler capacity evolved in bees.
So I’d love to have answers to whether MH is roughly right, whether EMH is roughly right, and what Merge’s neural analogues are—I’d even settle for some decent hints on these fronts.
2) What research and experiments might move the ball forward on these problems and puzzles and mysteries?
We want to show that Minimalist theories have empirical support.
But first we have to foreground the nonempirical analytical and theoretical work on MH and EMH, since that work seeks to demonstrate that Merge-based theories can indeed derive GB’s general principles. This demonstration is a really important first step and is—I think—the most pressing item on the MP agenda.
Of course, one also hopes that MH and EMH will have novel empirical consequences. But let me stress how important the nonempirical analytical and theoretical project is, since linguists unfortunately usually don’t pursue this type of research or highly prize it.
So have there been some novel consequences? In my opinion, the aforementioned work from Polinsky and Potsdam covers novel ground. They first found data whose analysis within the older GB theory is deeply problematic and indeed inexplicable—they then showed that this data is what we’d expect to find given standard Minimalist assumptions.
There’s a methodological precept that states that you have particularly strong evidence for a novel theory if the novel theory can explain exceptions to the prior theories—Polinsky and Potsdam succeed in showing that phenomena that are inexplicable in a GB framework follow seamlessly from Minimalist assumptions, so that’s a win for MH.
3) Is there a set of 10 carefully selected sentences that perfectly puts on display the biggest and most fascinating puzzles and mysteries in linguistics? It might be an interesting popularization to see these sentences, since it would make a very abstract and technical field more concrete for the layperson.
Almost any decent empirical paper will give some example sentences illustrating a paradigm whose structure we’d like to explain. But just 10 sentences won’t cover the full rich domain of regularities that linguists have uncovered.
I’ll show you some basic reasoning based on early work from Chomsky. And note that an asterisk indicates that the reading is unacceptable.
(1) The man saw a woman walking to the train station
Sentence (1) is at least three-ways ambiguous—its readings are paraphrased in (2):
(2a) The man saw a woman who was walking to the train station
(2b) The man saw a woman while the man was walking to the train station
(2c) The man saw the following event: a woman walking to the train station
Now consider the question in (3):
(3) Which train station did the man see a woman walking to?
(3) has only the (2c) reading—it can't mean (4a) or (4b), but it can be interpreted along the lines of (4c):
(4a) *Which train station did the man see a woman who was walking to?
(4b) *Which train station did the man see a woman while he was walking to?
(4c) Which train station did the man see a woman walk to?
But why is this the case? Why do the ambiguities somehow disappear when we turn the nominal expression "the train station" into a question? Why do we somehow go from three possible meanings to just one?
Linguists have an answer that’s based on certain locality conditions that characterize grammatical computations. And the details happen to be complicated, but the puzzle—of why question formation somehow causes us to go from three possible meanings to just one—is easy to pose to people.
The literature has hundreds of examples like this that draw from virtually any language you might care to investigate.
And here’s a simpler version of the same problem. (5) is two-ways ambiguous—it can mean that John saw a woman and used a telescope to see her or alternatively it can mean that John saw a woman who possessed a telescope. But turn into a question “a telescope” as in (6) and only the first meaning survives—why is that the case?
John saw a woman with a telescope
Which telescope did John see a woman with?
You can explain the meaning restrictions in (5) and (6) in a way that’s analogous to the way that you can explain the meaning restrictions in (1)–(4)—there are different structures in play, but the restrictions have analogous explanations.
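To make the locality idea slightly more concrete, here's a toy sketch of my own (not Hornstein's analysis) that encodes the two parses of (5) as nested bracketings. In the reading where the woman possesses the telescope, the PP sits inside a larger noun phrase—roughly the "complex NP" configuration that locality conditions bar extraction from—which is why questioning "a telescope" leaves only the other reading:

```python
# Two parses of "John saw a woman with a telescope", as nested tuples
# of the form (label, child, child, ...).
# Reading A: the PP modifies the verb phrase (John used the telescope).
vp_attach = ("VP", ("V", "saw"), ("NP", "a woman"), ("PP", "with a telescope"))
# Reading B: the PP modifies the noun phrase (the woman has the telescope).
np_attach = ("VP", ("V", "saw"), ("NP", ("NP", "a woman"), ("PP", "with a telescope")))

def path_to(tree, label):
    """Return the labels on the path from the root down to the first
    subtree with the given label, or None if no such subtree exists."""
    if not isinstance(tree, tuple):
        return None
    if tree[0] == label:
        return [label]
    for child in tree[1:]:
        sub = path_to(child, label)
        if sub is not None:
            return [tree[0]] + sub
    return None

print(path_to(vp_attach, "PP"))  # ['VP', 'PP']
print(path_to(np_attach, "PP"))  # ['VP', 'NP', 'PP']
```

In the second parse the PP is buried under an extra NP node, so a wh-phrase launched from inside it would have to cross that NP boundary—the kind of crossing that locality conditions rule out. The tuple encoding and the island characterization here are deliberate simplifications of the actual syntactic analyses.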
So Generative linguists love to study instances like these, where meanings “disappear” under sentential transformations—this is our empirical bread and butter.
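The structural ambiguity in (5) can be made concrete with a toy grammar. The sketch below is my own illustrative fragment—the rules and category names are simplifications I invented for the demo, not any linguist’s actual analysis—using a CKY-style dynamic program to count parse trees. It shows that “John saw a woman with a telescope” has exactly two structures (the PP attaches either to the verb phrase or to the noun phrase), while the unambiguous “John saw a woman” has one.

```python
from collections import defaultdict

# Toy context-free grammar in Chomsky normal form.
# (Purely illustrative -- real Generative analyses are far richer.)
RULES = [  # (parent, left_child, right_child)
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),   # "saw [a woman ...]"
    ("VP", "VP", "PP"),  # instrumental reading: used a telescope to see
    ("NP", "NP", "PP"),  # possessive reading: a woman who had a telescope
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "John": "NP", "saw": "V", "a": "Det",
    "woman": "N", "telescope": "N", "with": "P",
}

def count_parses(words):
    """CKY-style dynamic program that counts distinct parse trees."""
    n = len(words)
    # chart[(i, j)][label] = number of trees spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(words):
        chart[(i, i + 1)][LEXICON[w]] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)]["S"]

print(count_parses("John saw a woman".split()))                   # 1
print(count_parses("John saw a woman with a telescope".split()))  # 2
```

Note what this toy does not capture: a bare context-free grammar says nothing about why question formation kills one of the readings—that’s where the locality conditions on grammatical computations come in.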
Noam Chomsky’s Ideas
1) To what extent do linguists agree with the main ideas that Chomsky has put forward in linguistics? My understanding is that Chomsky’s main ideas in linguistics have always been minority views in the field.
That’s not quite correct. For a while, the field’s official line was that Chomsky was largely right—let me emphasize that I’m talking about the field’s general view and not just MIT.
But things changed in the mid-’90s when the Minimalist Program (MP) came along—I think the main reason for the change is that most linguists take linguistics to be the study of languages, whereas Chomsky doesn’t share this view. He takes the central object of study within linguistics to be the structure of the faculty of language (FL)—Chomskyan linguistics studies the underlying capacity that native speakers have that makes them native speakers of their given language, and the meta-capacity that humans have that allows them to become natively proficient in particular languages.
Pre-MP Generative work largely studied the properties of individual languages, so it wasn’t really relevant whether you agreed with Chomsky’s conception of the proper object of study—a linguist could put the principles postulated by Government-and-Binding (GB) theory to philological as well as mentalist ends, interpreting them either as grammatical regularities helpful for philological and typological description or as principles outlining a mental organ’s fine structure.
MP challenges this agnosticism. MP makes things less about why languages and language-specific grammars have certain properties and makes things more about FL’s abstract properties and why FL has a given principle or operation—there’s a much more abstract focus of investigation where individual grammars and their properties cease to be the center of inquiry.
This more abstract focus makes a lot of sense from a mentalist perspective but a lot less sense—though not zero sense—from a philological perspective. So it’s perfectly understandable that lots of linguists objected to MP and claimed that MP was raising questions that weren’t central to linguistic research—MP’s concerns will seem too abstract to be worthwhile if your main interests are philological.
For example, suppose that you want to use or refine GB’s generalizations to help describe some language-particular phenomenon, which is a typical research project for a linguist—in that case, your interest is in the particular UG principles and not whether these principles can be derived from simpler ones. So there’s a completely understandable attitude that does lead some to denigrate MP research.
It’s worth observing that there’s no genuine conflict between standard linguistic research and MP-related investigations—MP’s focus on FL doesn’t in any way invalidate other kinds of linguistic research. Interest in MP doesn’t mean that linguists shouldn’t study language-specific grammars or find novel generalizations that might reflect FL’s structure—MP enlarges the set of questions worth investigating but doesn’t negate the value or relevance of earlier questions or earlier forms of investigation.
2) Is Chomsky’s idea about universal grammar (UG) a minority view in linguistics?
It depends how you define “UG”—I suspect that most people who say “I don’t believe that UG exists” don’t know how the term “UG” has been used in the relevant literature and wouldn’t be able to define “UG” in any informed way.
I’ll offer three separate definitions—the first can’t possibly be controversial, the second is controversial even though skeptics have done nothing to actually show that their skepticism regarding it is justified, and the third pertains to a debate that’s currently very marginal and fancy in the field and that only a small number of linguists have any interest in.
Until around 1995, “UG” merely referred to FL’s organizing principles. So if that’s all that “UG” is, its existence can’t possibly be controversial unless you believe that FL doesn’t exist. And nobody really believes that, since “FL” is just the name that we give to whatever mental operations allow humans to become linguistically proficient—nobody doubts that humans have this capacity, so clearly everyone believes that FL exists if that’s all that FL refers to.
The second definition of UG assumes that there’s something cognitively special about FL’s structure—this definition is controversial, since some want to deny that FL has any uniquely linguistic characteristics, where “uniquely linguistic” means that they characterize only FL and not any other cognitive capacities.
My own view is that the weight of the evidence greatly favors the conclusion that FL has at least some special properties that are distinct from the properties that organize other cognitive activities. And indeed, I myself think that this is an obvious conclusion even though many disagree.
It’s a decidedly empirical issue as to which view of things is correct. And we even have some idea of how to resolve the matter—in Reflections on Language, Chomsky discusses some examples of what the skeptics of unique linguistic characteristics of FL need to demonstrate. Unfortunately, they’ve done absolutely zero of the work that they need to do if they want to actually show that there’s nothing cognitively special about language.
The third conception of UG reflects Minimalist concerns—here “UG” denotes FL’s linguistically special parts. Under this definition, asking whether “UG” exists means asking which—if any—features of FL are bespoke and which are recycled from general cognition and computation. And that’s a very marginal and fancy question that we’re miles away from being able to actually conclusively answer—few linguists care about this question or have an opinion on it.
3) Is Chomsky’s idea about semantic internalism a minority view in linguistics?
There are some misunderstandings about Chomsky’s views on this, so let me clarify some things before answering.
Referentialism is the idea that words refer to things of various sorts and that this reference relation is what endows propositions—and compositional sub-propositional parts—with their semantic properties.
Regarding natural language, Chomsky doesn’t think that referentialism is a very useful way to frame the empirical issues surrounding lexical or propositional meaning—he doesn’t see any serviceable notion of reference that can link words and things in a way that actually helps to explain the basic facts about lexical or propositional meaning. It’s important to clarify that Chomsky doesn’t deny that humans are able to use language referentially—he only denies that there’s a useful theoretical notion that can play the explanatory role envisaged in a theory of meaning.
How does Chomsky back up his view? He provides examples of some semantic facts, demonstrates that the technical notion of reference usually deployed doesn’t actually do any explanatory work, and observes that the available nontechnical conceptions of reference lead to a very counterintuitive sense of “object” that nobody would ontologically endorse. So the technical notion of reference does nothing to explain what’s going on, while the commonsensical notion leads to obvious paradox—Chomsky therefore concludes that there’s no serviceable notion of reference to ground the referentialist thesis, which is the thesis that claims that meaning supervenes on word–object relations.
Does Chomsky put forward any positive thesis regarding meaning? He makes some weak observations about possible positive approaches and cites Julius Moravcsik’s 1975 paper that discusses how Aristotle’s four aitia might be the basis of some semantic relations, but the overall answer is that Chomsky has no positive views. Rather, he concentrates on showing that natural-language words are weird in many respects—something that Pietroski has also recently shown—and that we’re very far from having any halfway decent theories that manage this complexity even a little.
Do many linguists agree with Chomsky about referentialism? Pietroski does and he’s a leading figure in both linguistic and philosophical semantics. And philosophers like Hartry Field and Gil Harman have written about referentialism’s limits and limited explanatory power, though not exactly in the same terms that Chomsky has.
Regarding linguistics, I don’t think that most linguists think—or care—about this issue at all. So they neither agree nor disagree—they instead just default to a kind of Fregean or Montagovian picture and don’t worry much about whether the picture really fits.
There’s a general “Shut up and calculate!” ethos in linguistic semantics where nobody questions the basic picture of things. And as an aside, what makes Pietroski so interesting is that he’s one of the few people who addresses the crucial foundational issues about meaning and links these issues to familiar linguistic concerns.
4) Is Chomsky’s idea that language evolved as a thought system—and was only externalized after the fact—a minority view? To support this idea, Chomsky cites evidence that computational efficiency wins out over communicative efficiency in every known instance where the two efficiencies conflict.
I think that most of the brouhaha is—as usual—more of a tempest in a teapot than an actual informed and serious disagreement.
It’s a staple in the EvoLang literature that language emerged for communication, but this claim is often more ornamental than central—I suspect that EvoLang people usually have no idea what “communication” means and give little thought to how to define it.
Further, if “communication” includes “communicating with oneself” then that’s confusing, since “communicating with oneself” is just another way to say “thinking”.
Last of all, it’s not clear what features of FL the fact that language can be used to communicate is supposed to explain—I’ve never seen a useful explanation of any feature of FL based on language’s communicative efficacy, at least when it comes to the formal properties of language that linguists focus on.
5) Will Chomsky’s main ideas catch on over time?
I hope many will. Most of these ideas aren’t at all controversial once you actually understand them—so much of the disagreement really does arise from people not reading or understanding the relevant literature.
Only a small number of researchers are currently pursuing research based on Chomsky’s leading ideas like the Merge Hypothesis (MH)—this research is on the frontier, so we don’t yet know whether it’ll pan out or whether these ideas are even roughly correct.
Most new ideas turn out to be false—I suspect that many of the ideas that Chomsky and others are working on right now will actually turn out to be false, but I suspect that we’ll look back and see that the ensemble of ideas had an important kernel of truth.
I personally think that Chomsky’s Minimalist proposals are indeed on the right track and really will serve as the foundation for much future research, although I disagree with certain of Chomsky’s specific proposals.
6) Which of Chomsky’s ideas do you disagree with?
I’ve been pursuing a somewhat different conception of Merge from the one that Chomsky is currently working on—I’m actually on the record as being a big fan of the idea that Labeling is the key language-specific operation. I’ve also departed from Chomsky’s approach to the analysis of construal dependencies.
But this is all inside-baseball stuff—an outsider wouldn’t be able to locate these minor departures.
In my view, Chomsky is completely right regarding the big issues, including: Plato’s Problem, Darwin’s Problem, the need to find a way to reconcile those two problems, the idea that hierarchical recursion is the core property of syntax, and the idea that syntax is autonomous in the sense that it can’t be reduced to the structures of meaning or sound.
I also think that Chomsky has made a compelling case that MH is part of the solution to the core Minimalist project of reconciling Plato’s Problem and Darwin’s Problem.
7) Is it accurate to say that Chomsky’s program seeks to find a very simple computation that accounts for language?
Language is a very complex object with many different moving parts—MP seeks the “very simple computation” that you refer to but doesn’t aim to explain every feature that language has, so there’s more to language than MP.
But narrowing down your question to the core features of syntax, I’d say that Minimalism’s main idea is that a very simple version of Merge yields grammars that have many—and maybe even most—of the properties that characterize natural-language grammars.
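To give a flavor of what “a very simple version of Merge” means: in Chomsky’s Minimalist work, Merge takes two syntactic objects and forms the unordered set containing them, and applying it recursively yields unboundedly deep hierarchical structure from one binary operation. Here’s a minimal sketch—the example sentence, the bare-set representation without labels, and the depth helper are my own illustrative choices, not Chomsky’s formalism:

```python
def merge(x, y):
    """Chomsky-style Merge: combine two syntactic objects into the
    unordered set {x, y}. The output can itself be an input to Merge,
    which is what gives grammar its recursive, hierarchical character."""
    return frozenset([x, y])

# Build "the man saw a woman" bottom-up (labels omitted -- bare sets):
np1 = merge("the", "man")   # {the, man}
np2 = merge("a", "woman")   # {a, woman}
vp  = merge("saw", np2)     # {saw, {a, woman}}
s   = merge(np1, vp)        # {{the, man}, {saw, {a, woman}}}

def depth(obj):
    """Nesting depth of a syntactic object: 0 for a word, else
    one more than the deepest constituent."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(part) for part in obj)

print(depth(s))  # 3
```

Note that the sets encode hierarchy but no linear order—on the Minimalist picture, word order is imposed later, when the structure is externalized.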
8) Let me ask a potentially odd question. Suppose that syntax research reaches some endpoint and we trace out what would’ve been the most efficient path X to that endpoint—will we then be able to say that everything departing from X wouldn’t have been useful or worthwhile or desirable to pursue had X been known in advance?
I’d say that linguistics isn’t any different from any other developing science in this regard.
So were Galileo’s laws—of inertia and projectile motion—and Kepler’s laws useful and worthwhile and desirable stepping stones on the path to Newton’s laws? Newton used these laws as targets in the sense that he wanted his theory to explain what they explained, in roughly the way they explained it. And his laws do derive theirs.
Similarly, were Newton’s laws necessary stepping stones to Einstein’s theory of gravitation or to the development of quantum mechanics? Einstein’s theory of gravitation derives Newton’s laws as a limiting case when velocities are small relative to the speed of light, which suggests that Newton’s laws were useful stepping stones to Einstein’s.
Newton and Einstein were proceeding rationally—the old laws were roughly accurate, so they served as reasonable boundary conditions for further theory.
So earlier theoretical efforts can be important stepping stones in developing later theoretical efforts. The ideal is that science is a cumulative enterprise—it’s certainly an important ideal even if the ideal isn’t always realized in practice.
And in linguistics, earlier theories serve as later theories’ targets of explanation. For example, MP has properly taken GB’s “laws” as MP’s targets of explanation and has—interestingly—actually successfully derived some of GB’s generalizations.
Consider the Principles and Parameters (P&P) framework—was the work that adopted P&P assumptions an important stepping stone or a skippable detour? Maybe P&P will be looked back on as a skippable detour if it turns out that FL doesn’t have a P&P design and that this fact about FL could’ve been discovered without doing any P&P work.
P&P played an important role in getting us to where we are today. It allowed us to start isolating construction-independent principles—pre-P&P the unit of analysis was the construction, whereas post-P&P it was the operation.
Maybe we could’ve actually arrived at our present state without the P&P detour. But that would’ve meant starting in a different place—it would’ve meant somehow skipping the idea that constructions are fundamental units of analysis, since P&P was the thing that allowed us to “kick away the ladder” regarding constructions.
9) Regarding language, I get the sense that a lot of people will associate it with communication before ever associating it with thought—is it strange that people prioritize communication over thought in this way?
It’s definitely standard to think that language’s function is communicative.
But function almost always underdetermines form—like I mentioned earlier, it’s not clear to me how far this idea about language’s communicative function will actually get you when it comes to understanding why natural-language grammars have their various structural properties. There’s a typical mistake where one thinks that one can easily explain a cognitive system’s structural properties in functional terms—I know of zero cases where functional reasoning has ever successfully fully explained anything’s structural features.
Omer Preminger has demonstrated the point when it comes to functional reasoning in linguistics. And there’s an interesting 2010 book where Fodor and Massimo Piattelli-Palmarini use this observation—that there’s only a loose fit between structure and function—to identify some problems with Darwinian accounts of evolution.
10) Chomsky says that language is a system of thought—is that a controversial idea?
It might be contentious, but it’s not clear that the question is well formed.
I’m not sure what such a comment means—the word “think” has no clear meaning, which Chomsky himself has observed when questioning whether the question “Can machines think?” is empirical.
11) Is there nonlinguistic thought?
This is a question—like “Can machines think?”—that looks empirical but is most likely definitional.
I think there’s definitely nonlinguistic cognition—many animals have remarkable and fascinating cognitive capacities that are being investigated.
And it’s clear to me that our system of linguistic knowledge isn’t really at all like what we find in nonhuman cognition. This of course doesn’t imply that humans are superior to other animals—other animals have many cognitive capacities that humans lack and that are qualitatively different from anything that humans can do.
12) In science, to what extent do we see irrationality, ideology, misrepresentation, attacks on imagined views that nobody actually holds, deception, and other bad things?
Scientists are humans—scientists do their science based on the same things that motivate humans in any other domain. Motives seldom remain pure in science—there’s money and power at stake, there’s recognition and status at stake, and there are real costs associated with failure. People work hard for recognition—people will of course also cheat for recognition.
The idea that scientists are somehow morally superior beings in virtue of being scientists has always struck me as laughable—scientists are people with all of the virtues and vices that that entails.
Scientists follow fashion just as much as they think independently and critically. We all know the standard inspiring notion that scientists follow the evidence where it leads—unfortunately, the dark observation that science advances one funeral at a time might be a more realistic perspective regarding much of science.
13) How many people have well-informed disagreements with Chomsky—disagreements based on actually reading the relevant literature and on understanding and knowing the evidence that Chomsky uses to make his arguments?
Many informed linguists have differed with Chomsky on various topics.
And in response to other linguists’ work, Chomsky has often changed his mind and shifted his views, although he hasn’t changed his mind much regarding the big questions about how to frame and tackle the problems that language poses.
14) What general lessons can one take from Chomsky’s contributions to linguistics?
The three big takeaways are: (A) human minds are natively built for language, (B) the Rationalist conception of the mind provides a good model for the mental sciences, and (C) methodological pragmatism is the best scientific approach.
Regarding linguistics, humans have a dedicated cognitively bespoke capacity for language—one of its core features is a syntactic component that recursively generates hierarchical structures that feed meaning and articulation.
Regarding the mental sciences, the Rationalist conception of the mind is the right one, which makes linguistics an excellent model for the general study of cognition.
Regarding science in general, Chomsky is a methodological pragmatist—he thinks that there are several rules of thumb for successful inquiry.
First, be ready for something to surprise you—inquiry probably isn’t for you if the capacity for surprise is beyond you.
Second, always try to explain the phenomena you find puzzling.
Third, always evaluate the current explanations to see whether they constitute real answers or simply masquerade as answers. Our “explanations” often simply redescribe a given problem—redescriptions can be useful, but you’ll spend most of your time spinning your wheels if you can’t tell the difference between explanation and redescription.
Fourth, realize that things are always more complicated. The trick is to formulate questions that actually do allow for nontrivial explanatory answers—don’t get discouraged when you fail to pull off this trick, since sometimes you’ll succeed at it. And don’t get discouraged when succeeding at this trick leads to new problems—every good answer to any good question will generate new problems.
For Chomsky, I think the core virtues for promoting productive thinking are: an open mind; a sense of wonder; an impulse to explain; a clear-eyed appreciation of your accounts’ explanatory limits; and the intellectual courage to keep asking questions and answering honestly. And I personally find these dicta useful and admirable even while it’s hard to consistently implement them.
1) To what extent can one explain to a layperson the developments in syntax that Chomsky is excited about? Chomsky seems to be excited about:
the way that head movement might be eliminated from I-language
the way that something close to the Strong Minimalist Thesis (SMT) suffices to handle the many problems posed by unbounded unstructured sequences
the way that an interesting range of core I-language properties can be genuinely explained based on SMT-accordant operations and general principles of economy and efficiency
the way that linguists are now able to make progress on the issue of how closely I-language approaches SMT
the way that linguists can—for the first time—start to resolve the issue of how UG can be rich enough to overcome poverty-of-the-stimulus (POS) problems, simple enough to have evolved, and the same for all possible languages
the idea that SMT might actually facilitate language’s richness in addition to constraining what can appear in language
work that relates to the issue of “good design” and to the issue of why—or if—certain properties hold of a well-designed system
Obviously a layperson won’t understand what Chomsky is excited about unless someone explains these things.
Anyone can get a feel for the basic nontechnical ideas in Generative work.
It’s true that you need to understand the more technical Generative material if you want to really appreciate what Chomsky is excited about, but the technical requirements actually aren’t that high at all when it comes to this material.
As for what Chomsky is excited about, we already discussed some of this.
First, MP is exciting because it puts Plato’s Problem and Darwin’s Problem—and the apparent tension between them—at the center of inquiry, which is a very ambitious move, since these two antagonistic tensions won’t be at all easy to reconcile. And more exciting still is the fact that we seem to have actually gotten some distance in explaining how the reconciliation might be possible—MH has the potential to be a big step in this direction.
Second, we’re finally developing deepish accounts of some fundamental properties of human language and grammar—interested readers can look at my 2017 paper “On Merge” where I explain how MH can actually derive seven properties that we’ve identified as characteristic of human language and grammar.
2) What’s the “Galilean challenge”? Chomsky says the following in a 2019 lecture: “I think we’re finally—maybe—in a position today to take the Galilean challenge seriously. Which—if true—is quite important.”
The Galilean challenge is to find simple theories that explain an extensive range of phenomena—science is the search for simple explanatory theories that have wide empirical reach, so I’d just refer to the “scientific challenge” instead of the “Galilean challenge”, though Galileo did emphasize simplicity.
This challenge applies to every scientific field, but the goal in linguistics is to explain why our grammars have the particular properties that they have. MH is a candidate “Galilean” proposal that’s simple and—in virtue of being simple—far-reaching. And as discussed, one of the great features of MH is that it has the potential to reconcile Plato’s Problem and Darwin’s.
The challenge excites every scientist, since scientists want to explain why things are the way they are.
Chomsky wants to understand why human syntax looks the way it does—why it has the specific properties it does. So he’s excited because he believes—with reason—that linguistics is in a position to meet the challenge.