Will Semantics Inspire You?

An interview with Paul Pietroski.

Paul Pietroski teaches philosophy and cognitive science at Rutgers University.

Pietroski’s work is absolutely mind-blowing—some might consider it today’s best work in semantics.

I strongly urge interested readers to do the following:

  • explore Pietroski’s fascinating website—and make sure to enjoy the website’s various beautifully-made PowerPoint presentations

  • watch Pietroski’s incredible 2014 talk where he discusses various cleverly-designed experiments that shed light on certain questions about human cognition

  • watch Pietroski’s superb 2021 talk where he discusses a major dogma in semantics

I was honored and thrilled to interview Pietroski. The interview appears below; I edited it for flow and added hyperlinks.

Dr. Pietroski and I learned the sad news—while working on this piece—that Lila Gleitman had passed away. See this excellent autobiographical article that describes Gleitman’s work. See also these comments (from Chomsky and Wexler) about Gleitman.

Gleitman’s work profoundly influenced linguistics, and will continue to do so in the future.

1) What are the most exciting projects that you’re currently working on, and what are the main ideas in your 2018 book Conjoining Meanings?

For some time now, I’ve been working on a “big picture” project about the nature of linguistic meaning and a “nitty gritty” project that involves collaborative psycholinguistic studies of how people understand quantificational words like “most” and “every”. 

In the last couple of years, I’ve been trying to connect these projects explicitly. This has been exciting for me. 

The central idea is that linguistic meanings are mental instructions for how to access and assemble concepts of a special sort. On this view, the meaning of a phrase like “square window we looked through” is a psychologically encoded recipe for how to build a complex concept from ingredients that can be accessed via the component words. 

A recipe can—and this is important—leave room for variation in the specific ingredients that get used when the recipe is executed. For example, the classic recipe for a Negroni calls for equal parts of Campari, gin, and red vermouth. But that doesn’t tell you which gin or which red vermouth to use—and a well-stocked bar might provide several options, thereby allowing for subtle differences in the Negronis made at the bar in accordance with the recipe. 

Similarly, I think words correspond to storage bins for families of concepts. The noun “window” can be used to access a concept that applies to certain holes in walls, but not to framed panes of glass that can be bought at hardware stores. The same noun can be used to access a concept that applies to the framed panes of glass, but not to the holes. In my view, “window” can also be used to access other concepts—think of store windows where goods are displayed, windows for bank tellers, windows of opportunity, and so on. 

The meaning of the phrase “square window” can then be described as an instruction that calls for conjunction of a concept from the “window”-bin with a concept from the “square”-bin. This leaves room for variation in the concepts that can be built by executing the phrasal instruction. 
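
The recipe idea can be pictured with a small toy sketch in Python. Everything in it is an illustrative assumption rather than Pietroski’s own formalism: the names `BINS`, `conjoin`, and `execute_phrase` are hypothetical, "things" are modeled as plain dictionaries, and each bin holds only one or two stand-in concepts.

```python
from typing import Callable, Dict, List

# Simplifying assumption: a "concept" is just a yes/no test on a thing.
Concept = Callable[[dict], bool]

# Hypothetical lexical bins: each word points to a family of concepts.
BINS: Dict[str, List[Concept]] = {
    "window": [
        lambda t: t.get("kind") == "hole_in_wall",   # the hole sense
        lambda t: t.get("kind") == "framed_pane",    # the pane-of-glass sense
    ],
    "square": [
        lambda t: t.get("shape") == "square",
    ],
}

def conjoin(c1: Concept, c2: Concept) -> Concept:
    """Build a complex concept that applies only where both parts apply."""
    return lambda t: c1(t) and c2(t)

def execute_phrase(modifier: str, noun: str,
                   mod_choice: int = 0, noun_choice: int = 0) -> Concept:
    """Execute the phrasal instruction: fetch one concept per bin, then conjoin."""
    return conjoin(BINS[modifier][mod_choice], BINS[noun][noun_choice])

# One instruction ("square window"), two different executions.
square_hole = execute_phrase("square", "window", noun_choice=0)
square_pane = execute_phrase("square", "window", noun_choice=1)

hole = {"kind": "hole_in_wall", "shape": "square"}
pane = {"kind": "framed_pane", "shape": "square"}

print(square_hole(hole), square_hole(pane))  # True False
print(square_pane(hole), square_pane(pane))  # False True
```

The point of the sketch is only that the instruction stays fixed while the fetched ingredients vary, so the same phrase can yield concepts with different conditions of application.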

More complicated expressions call for combinatorial operations that go beyond simple conjunction. I tried to work out many details in Conjoining Meanings. But the larger goal was to provide an alternative to a more standard conception of linguistic meaning.

It’s often assumed that typical expressions are true of things that speakers use the expressions to talk about. The idea is that for any particular thing: “dog” is true of it if (and only if) it is a dog; “brown dog” is true of it if (and only if) it is a brown dog; and so on. Given this assumption, it can seem that an expression’s meaning at least determines—and perhaps just is—the set of things that the expression is true of. This alleged set is said to be the expression’s extension.

I think there are many good objections to this conception of linguistic meaning. But we’ve already seen one objection. Words like “window” (“book”, “line”, etc.) seem to be conceptually equivocal—or polysemous—in ways which suggest that words don’t have extensions. 

One might initially be tempted to say that “window” is true of both the holes and the framed panes of glass. But that overcounts the number of windows per room, at least when the windows are open. If a room was built with two windows that got filled by two windows that were bought, we don’t say that the room has four windows. 

Or think about “book”. If you’re carrying a single volume that contains two novels by Jane Austen (say, Emma and Persuasion), you can be correctly described as having one book, or as having two books, but not as having three books. 

Alternatively, you might be tempted to say that one word with the pronunciation of “window” is true of the holes, and that a distinct but homophonous word is true of the framed panes of glass. Compare “bank”, where two distinct words—one that we use to talk about financial institutions and one that we use to talk about edges of rivers—really do share a pronunciation. 

But there’s lots of evidence that tells against treating polysemy as homophony. To take just one example, consider the following sentence: “The window that you cut in the wall was nicer than the one that you bought.” The pronominal word “one” serves as a second occurrence of the word “window” that appears earlier in the sentence, and not as a device that somehow invokes a different word that doesn’t appear in the sentence.

Let me say a little about why other considerations can—nonetheless—make it seem that words like “most” and “every” really do have extensions, and that semanticists have to specify these extensions in order to specify what the words mean. Then I’ll say why the psycholinguistic experiments that I mentioned are relevant.

The sentence “Every dog ran” means roughly that the dogs are among the things that ran. We can represent this set-theoretically as shown below. 

{x: x is a dog} ⊆ {x: x ran}

So it might seem that “every” indicates the subset relation—with improper subsets allowed in case the set of dogs is identical to the set of things that ran. 
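
As a minimal sketch of the extensional picture being described (not an endorsement of it), the subset analysis can be written in Python with built-in sets. The function name `every` and the toy sets are assumptions made for illustration.

```python
def every(noun_set: set, verb_set: set) -> bool:
    """Model 'Every N V-ed' as: the N set is a (possibly improper) subset of the V set."""
    return noun_set <= verb_set

dogs = {"Fido", "Rex"}
ran = {"Fido", "Rex", "Tom"}

print(every(dogs, ran))   # True: all the dogs are among the things that ran
print(every(dogs, dogs))  # True: improper subsets are allowed
print(every(ran, dogs))   # False: Tom ran but is not a dog
```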

Put another way, it can seem that “every” is true of the corresponding ordered pairs of sets—namely, the ordered pairs <R, D> such that D is a subset of R. Likewise, “Most dogs ran” means roughly that more than half of the dogs ran. We can represent this set-theoretically as shown below, using “#{...}” to indicate the cardinality of the set in question.

#{x: x is a dog & x ran} > #{x: x is a dog}/2. 

So it can seem that “Most” is true of the pairs <R, D> such that the number of things in both sets is greater than half the number of things in set D. But there are still other mathematically equivalent ways of specifying these ordered pairs of sets, as indicated below.

#{x: x is a dog & x ran} > #{x: x is a dog & x did not run}

#{x: x is a dog & x ran} > #{x: x is a dog} − #{x: x is a dog & x ran}
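
The equivalence of these three specifications can be checked by brute force on a small domain. The sketch below is illustrative only; the function names (`most_half`, `most_negation`, `most_subtraction`, `all_subsets`) and the four-element domain are assumptions, and the check says nothing about which specification speakers actually use.

```python
from itertools import combinations

def most_half(dogs: set, ran: set) -> bool:
    # #{dog & ran} > #{dog} / 2
    return len(dogs & ran) > len(dogs) / 2

def most_negation(dogs: set, ran: set) -> bool:
    # #{dog & ran} > #{dog & did not run}
    return len(dogs & ran) > len(dogs - ran)

def most_subtraction(dogs: set, ran: set) -> bool:
    # #{dog & ran} > #{dog} - #{dog & ran}
    return len(dogs & ran) > len(dogs) - len(dogs & ran)

def all_subsets(items):
    """Return every subset of a small finite collection."""
    items = list(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

domain = {"a", "b", "c", "d"}
agree = all(
    most_half(d, r) == most_negation(d, r) == most_subtraction(d, r)
    for d in all_subsets(domain)
    for r in all_subsets(domain)
)
print(agree)  # True: the three specifications pick out the same pairs of sets
```

The three functions return the same answers on every input here, yet they compute those answers differently, which is the difference the experiments are designed to detect.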

Cardinality talk can also be replaced with talk of one-to-one correspondence. Given finitely many dogs, those that ran outnumber those that didn’t if and only if a proper subset of those that ran corresponds one-to-one with those that didn’t. And if one assumes that meanings determine extensions, one might suspect that different speakers mentally specify the extension of “most” in different ways, perhaps depending on (among other things) their experience with counting. 
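
A hedged sketch of the one-to-one correspondence idea: the function below decides “most” by pairing runners off against non-runners, without ever computing a number. The names `most_by_pairing` and `most_by_counting` are hypothetical, and the pairing loop is just one simple way to model the procedure.

```python
def most_by_pairing(dogs: set, ran: set) -> bool:
    """Decide 'most' by pairing off, with no appeal to cardinalities."""
    runners = list(dogs & ran)
    non_runners = list(dogs - ran)
    # Pair one runner with one non-runner until one group is used up.
    while runners and non_runners:
        runners.pop()
        non_runners.pop()
    # Leftover runners mean the non-runners matched only a proper subset of the runners.
    return bool(runners)

def most_by_counting(dogs: set, ran: set) -> bool:
    """Cardinality-based version, for comparison."""
    return len(dogs & ran) > len(dogs - ran)

dogs = {1, 2, 3, 4, 5}
print(most_by_pairing(dogs, {1, 2, 3}), most_by_counting(dogs, {1, 2, 3}))  # True True
print(most_by_pairing(dogs, {1, 2}), most_by_counting(dogs, {1, 2}))        # False False
```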

One might likewise suspect that different speakers represent the extension of “dog” in different ways. Maybe it’s typical for a word to be associated with a range of extensionally equivalent concepts across speakers, or even within a single person. This possibility was one important motivation, historically, for the idea that meanings are extensions as opposed to concepts. And if speakers of English actually associate “most” with the same extension via different specifications, that would be a point in favor of the idea that meanings at least determine extensions. 

The point wouldn’t be decisive, since one could still maintain that for each speaker, the meaning of “most” is an instruction that gets executed by fetching a concept from the lexical address associated with the relevant pronunciation. But if the fetchable concepts have a shared extension despite differing formally, that would be surprising if meanings are instructions that need not determine extensions.

On the other hand, it’s hard to see how “most” could have an extension if the nouns that “most” can combine with (e.g., “windows” and “books”) don’t have extensions—and likewise for other quantificational words. So while “most” and “every” may be importantly related to certain set-theoretic relations, it’s far from obvious that such words are true of ordered pairs of sets. 

And suppose it turns out that “most” is uniformly associated, across speakers, with an instruction for how to access a complex concept that has a specific representational format—say, one that involves representing cardinalities as opposed to one-to-one correspondence, and representing subtraction as opposed to negation. This would support the idea that meanings are instructions that don’t determine extensions—especially if there is also reason to think that “most” can be used to access distinct concepts that have the same general format but different conditions of application.

I think this is how things turn out. 

The evidence comes from the psycholinguistic studies that I worked on with Jeff Lidz, Justin Halberda, and a series of wonderful former students who did the really hard work: Tim Hunter, Darko Odic, Alexis Wellwood, and Tyler Knowlton. There are some references below.

On cardinalities vs. one-to-one correspondence, see Pietroski et al. 2009.

On subtraction vs. negation—and the related point that “Most dogs ran” calls for a mental representation of the total number of dogs—see Lidz et al. 2011 and Knowlton et al. 2021.

And see Odic et al. 2018 for evidence that the instruction associated with “most” is executed in a slightly different way when “most” combines with a mass noun (as in “most of the paint”) instead of a count noun (as in “most of the dogs”). 

The 2014 talk that you hyperlink—in your very kind introduction—presents some of this material, and it walks through some of the early experimental designs. 

More recent work (Knowlton et al. forthcoming) suggests that the issues play out the same way with “every” and “each”. There are many ways of specifying the alleged extension for universal quantifiers. But it seems that speakers associate “every” with a particular specification, while they associate “each” with a different specification. 

2) What are the most exciting projects that you know of that others are working on?

There are many, even if you’re asking only about projects in the same ballpark as mine. 

There has been a lot of innovative work that integrates traditional studies of linguistic expressions with various psycholinguistic methods, in attempts to make progress on large-scale questions about meaning. Let me offer three examples.

First, Lila Gleitman is a treasure. Her “collected papers” are terrific, and I think her work with John Trueswell on early lexical acquisition is deeply important. 

Second, Stephen Crain and his many collaborators have done lots of good work, over many years, on logical vocabulary in several languages. It’s really useful to think about how kids can acquire the relevant aspects of English, or the relevant aspects of Japanese, or both.

The third example concerns psychological vocabulary like “thinks” and “wants”, and how children acquire this vocabulary. There is a very interesting line of inquiry—closely related to the huge literature on “false-belief tasks”—that includes work by Jill de Villiers and a series of studies by a University of Maryland group that includes Valentine Hacquard, Shevaun Lewis, and Rachel Dudley. (See this nice overview paper by Hacquard and Jeff Lidz, which has references to more specific papers where Lewis and Dudley are authors.)

Farther afield but still related, you recently interviewed Randy Gallistel. So you know about his thoughts concerning computation, and in particular, the need to look for biological mechanisms of computation down near the scale of RNA (as opposed to associations at the level of neural activity). That’s a big idea. 

Gallistel has deeply influenced my thinking here. I certainly have no better idea about how biology could implement the instructions I talk about, or the analogs of lexical entries in scrub jays, who have prodigious memories for food caches.

A typical scrub jay can evidently form a mental database with more than ten thousand entries, each of which represents a location, the kind of food stored there (e.g., seed vs. grub), when it was stored (grubs spoil faster than seeds), and even whether another bird was watching when the goodie was hidden. I find this fascinating. 

And this also invites thoughts about the cognitive bases of polysemy. If jays could acquire pronounceable words, maybe they would acquire one that could be used to talk about cache-locations, or cache-contents, or cache-times, or even cache-spies. And maybe jays, along with other corvids, already have mental “pointers” that point to families of concepts that let them think about their caches in various ways. 

The space of exciting projects expands exponentially once we start talking about biology and studies of non-human animals. But to mention just one other example in the vicinity of language, Louis Herman did amazing work with dolphins.

3) Why do certain special words allow us to investigate where the human mind meets the part of the mind that humans share with animals?

I’m not sure if it’s the specialness of words like “most” and “every” (“or” and “not”, etc.), as opposed to the limitations of currently available experimental techniques. 

Following Jerry Fodor, I suspect that the concepts accessed via nouns, verbs, and adjectives/adverbs are—with very few exceptions—atomic in the sense of not having other concepts as parts. (These lexically accessed concepts may not be developmentally primitive in other senses that cognitive scientists often care about. But that’s another story.) 

But it seems that many of the concepts that humans access via logical vocabulary are not atomic in this sense. People know a lot about how the concepts accessed via this vocabulary—including words like “every”, “some”, “no”, “all”, “most”, “almost”, etc.—are related. For example, you know that if every dog barked, then at least most of the dogs barked. You also know it’s possible that most but not all of the dogs barked. It’s hard to see how we could know what we do know, in this domain, if the concepts in question were atomic. 

Our team wanted to study how people understand logical vocabulary. So we focused on various candidate analyses of the concepts that correspond to “most” and “every”. Then we developed methods for getting evidence concerning which candidate analyses are (and are not) psychologically real. Knowlton et al. (2021) offer some discussion on this point. 

Other methods can be used to study, for example, the polysemy of nouns and verbs. Jake Quilty-Dunn has a recent paper that helpfully reviews a wide-ranging literature and offers a conception of polysemy that differs a bit from mine. Elliot Murphy has been pushing things in new and interesting directions.

4) How do particular lexical items interface with aspects of cognition that we share with non-human primates? 

That’s a great question, but there may be no general answer. We may have to sort this out one case study at a time.

One obvious place to start was by thinking about how numerical and quantificational words interface with the “approximate number system” (ANS) that we share with many other animals. 

Stan Dehaene’s The Number Sense is a great introduction to what has been learned about the ANS. Susan Carey’s wonderful book The Origins of Concepts reviews a lot of other relevant work in psychology that influenced the studies I’ve been involved with. I regularly teach a course on this stuff, with the Dehaene and Carey books at the center.

But lots of other work in cognitive science (e.g., studies of infants and causal reasoning) invites connections with lexical meanings once you start looking around with your question in mind—so long as you take meanings to be mentalistic, as opposed to sets of mind-independent things. And for a good example of relevant comparative work, check out Michael Tomasello’s studies of human toddlers and chimps on sharing and cooperation.

5) What do we know about the interfaces where grammar meets known aspects of animal cognition?

Still very little, in my view. 

But a few good case studies can be enough to build on. And there has certainly been progress since the days when I was a graduate student.

6) Why are you so optimistic about research on these interfaces?

I don’t know if I’m optimistic. But I’m not pessimistic, because a few good case studies can be enough to build on. 

There’s still a lot we don’t know about the ANS. But it’s one of the best studied cognitive systems, both in humans and across species. Smart people have thought about quantification, number, and quantification/number-words for a long time—see, e.g., Aristotle. Semanticists have at least made a range of hypotheses about meaning fairly precise. And developmental psychologists have learned a lot about the stages of “numerical competence” that children go through. 

In my view, the “cutting edge” research questions in this area have also been getting more interesting over time. I think this is one of the places where good but inadequate initial ideas from several fields have converged in productive ways. In my academic neighborhood, I don’t see any places to work that seem more promising.

7) What is Chomsky’s idea about mastering one system in detail in order to use that system to learn about other systems? 

I think the idea, not specific to Chomsky, is pretty clear. 

When faced with a complicated system like the human mind, you initially try to understand whatever aspects of it you can (to whatever minimal degree you can). If all goes well, over generations, a few pockets of deeper understanding will emerge. If all goes really well, a reasonable—or at least not unreasonable—conception of various interacting subsystems (or “modules” or “faculties”) will emerge, even if the detailed interactions remain mysterious. 

At that point, an obvious strategy is to dig in and try to develop real theories of a few subsystems (that have begun to seem less mysterious) in order to get some leverage on the interactions. 

Our psycholinguistic studies were certainly based on this general strategy: use what’s known about visual perception and the ANS to learn something about the (less well understood) lexical meanings that interact with some language-independent systems; if all goes well with the first experiment, use what you learned to leverage the next experiment; repeat until things stop going well.

8) How can mastery over grammar help us learn about pragmatics, where do things look optimistic in pragmatics, and where do things look pessimistic in pragmatics? 

Studies of grammar can at least carve out something that pragmatics isn’t, even if these studies don’t tell us much about how linguistic expressions get used to express thoughts in particular contexts. 

But the previous question/response is relevant, especially since it has become pretty clear that some aspects of (what often gets called) pragmatics are more tightly connected to grammar than others. 

To take one example of a kind that has been extensively discussed, consider the following sentence: “If a dog or cat ran by, then you may have cookies or ice cream.” The occurrence of “or” in the consequent of this conditional exhibits exclusive implicature: cookies or ice cream, but not both. But the occurrence of “or” in the conditional’s antecedent doesn’t exhibit exclusive implicature—if both a dog and a cat ran by, you still get to have dessert.

This kind of fact invites detailed grammatical investigation, especially given that “If any dog ever ran by, then a cat ran by” sounds fine, but “If a dog ran by, then any cat ever ran by” sounds terrible. (For detailed and illuminating discussion, see, for example, work by Gennaro Chierchia and Danny Fox.)

That all seems very different from the pragmatic reasoning that is invited if a friend utters “The chairs were nice” when you ask them about their experience at a new restaurant. 

The pragmatics that seems more systematic and linguistically constrained is also being studied experimentally in fruitful ways—see, for example, the work by Emmanuel Chemla and colleagues. Here, one can hope for substantial connections with theories of grammar in a narrower sense.

9) What insights did medieval logicians have that you fear have been lost?

This gets back to my “big picture” project. 

The long story is in chapter two of Conjoining Meanings. The short story is that if you look at the medieval logicians as a group (and squint a bit while ignoring various distractions), they seem to have hit on the idea that the thoughts we humans express linguistically are governed by a “natural logic” (to which Aristotelian logic can be reduced), with conjunctive monadic predicates playing an important role. Here I draw on some work from the 1980s by Johan van Benthem and Victor Sanchez-Valencia, as well as a 2002 paper by Peter Ludlow, who explores these issues in a forthcoming book with Saso Zivanovic.

10) What accounts for the opposition to semantic internalism?

I see less opposition than there used to be, especially among the younger crowd. And it’s now discussed as an option in the “Theories of Meaning” essay for the Stanford Encyclopedia of Philosophy.

But if you work in an established field, it’s perfectly reasonable to be initially skeptical about any idea that runs contrary to the received wisdom in that field. 

That said, I think people often forget the intellectual history prior to Donald Davidson’s 1967 paper “Truth and Meaning”, even if people occasionally complain about Katz’s and Fodor’s earlier paper “The Structure of a Semantic Theory”. In my view, there was a nascent but plausible version of semantic internalism already available in 1967, and it got ignored rather than argued against. But that’s a very long story for a new book I’m working on.

11) Will semantic internalism catch on over time? 

I hope so. In my view, the arguments for externalism in this domain are extremely weak, and the arguments against are pretty powerful. 

But to connect back to your previous question, it can be hard to give up what one learned in graduate school. As a result, familiar views often get treated as “default views” that should be maintained absent “proof” that they are wrong. And proof is hard to come by outside of logic and mathematics.

So we’ll see.

12) What critique should people read of the idea that natural-language sentences have truth-conditions?

I think it’s better to start by asking why this idea was ever supposed to be plausible. (It’s worth reading the classic papers and trying to find the arguments.) Before 1967, it certainly wasn’t any part of received wisdom in philosophy. 

For critique, consider Frege’s remarks about natural language, the later Wittgenstein’s response to his earlier Tractatus, and Strawson’s reply to Russell. Charles Travis has done important work here, inspired by Wittgenstein and Austin. I think it’s useful to read Mark Wilson’s “Predicate Meets Property” in this light. 

And of course, there’s Chomsky’s work—especially New Horizons, Essays on Form and Interpretation, Reflections on Language, and (to go back to 1964) Current Issues in Linguistic Theory.

In my 2005 paper “Meaning Before Truth”, I tried to develop a line of thought that connects the central points that Chomsky and others had been making for a long time.

13) Does a theory of meaning have to be a semantics (in the technical sense of “semantics”)? 

The word “semantics” was introduced late in the 19th century, via an analog in French, as a quasi-technical term for talking about the prospects of a scientific approach to the study of linguistic meaning. Given this sense of the now polysemous word, it’s close to trivial that a theory of meaning for a natural language would be a semantics for that language. I certainly think of myself as doing semantics in this sense. 

In the 1930s, Tarski put his own technical spin on the word by offering his “semantic conception” of truth. (John Burgess has a lovely discussion of this in his paper “Tarski’s Tort”.) Tarski showed how to characterize truth for certain invented languages in a scientifically respectable way via his ingenious technique for stipulating truth-theoretic interpretations for these languages. 

A generation later, Davidson boldly conjectured—despite Tarski’s explicit and apparently justified claims to the contrary—that a suitably formulated Tarski-style theory of truth for a language like English could itself be true and also serve as the core of a correct theory of meaning for the language. 

Then just a few years later, David Lewis insisted (in “General Semantics”) that a semantics for a natural language had to specify truth conditions for that language’s declarative sentences. Given Tarski’s very technical notion of semantics, Lewis’s point was perhaps truistic. But it wasn’t an argument that any such theory would be correct, much less an argument against internalistic conceptions of linguistic meaning. 

Given the older notion of semantics, Lewis was simply insisting on Davidson’s bold conjecture. Gil Harman and a few others tried to note this. But they were largely ignored, and Lewis’s pronouncement was often taken as gospel. 

Confusion about terminology really can have bad effects.

14) What papers can people read to get up to speed on your work? 

In general, I think it’s hard to get up to speed by reading journal articles intended for specialists. 

But I have been fortunate in that a few people have wanted to do interviews (including this one) about my work. There have also been some discussions of Conjoining Meanings in the journals, sometimes with me involved. You can find these via my website, which also includes links to papers that readers can dip into.

Though if you want one paper on semantic internalism and one on psycholinguistic experiments, I’d suggest “Meaning Before Truth” and the recent jointly-authored paper “Linguistic Meanings as Cognitive Instructions”.

15) What are the main ideas in your 2005 book Events and Semantic Architecture?

I think of that book as a first attempt to lay out my central—and to echo your earlier question, “neo-medieval”—ideas about how semantic composition works. 

Back in 2005, it was less standard to invoke predicates of events and emphasize conjunctive rules of composition. So in the book, that was my focus, and I largely bracketed my growing skepticism about truth-theoretic conceptions of meaning. At the time, several friends offered “one battle at a time” advice. 

Over the years, I modified some of the details. So the technical presentation in Conjoining Meanings is a little different. But many of the main ideas carried over.

16) What would you change/add in a new version of the 2005 book?

I’d change it into Conjoining Meanings. But for many reasons, I couldn’t have written that book 20 years ago. 

The 2005 book was an important stepping stone for me, and maybe it was useful for others.

17) Why is it surprising/interesting that logical forms have ampersands all over the place that conjoin and that get ever stronger? And why does the idea that there are ampersands everywhere seem “crazy” to people?

It’s not surprising if you adopt a neo-medieval perspective on meaning and logical form. 

But it’s quite unexpected if the fundamental rule of semantic composition is what semanticists call “function application”. Given this (neo-Fregean) conception of semantic composition, there’s no reason to expect that lengthening expressions—e.g., from “Dogs barked” to “Brown dogs barked loudly”—will typically correspond to strengthening the expressions (e.g., by adding conjuncts).
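
To see why conjunctive composition makes lengthening look like strengthening, consider a toy sketch in which phrases correspond to conjunctions of predicates of events. This is a deliberately crude illustration: the predicate names, the dictionary encoding of events, and the helper `conjoin_all` are all assumptions, and plurality and existential closure are ignored.

```python
from typing import Callable

Pred = Callable[[dict], bool]

def conjoin_all(*preds: Pred) -> Pred:
    """Conjunctive composition: the result applies only where every conjunct applies."""
    return lambda e: all(p(e) for p in preds)

def agent_is_dog(e: dict) -> bool:
    return e["agent_kind"] == "dog"

def agent_is_brown(e: dict) -> bool:
    return e["agent_color"] == "brown"

def is_barking(e: dict) -> bool:
    return e["type"] == "bark"

def is_loud(e: dict) -> bool:
    return e["manner"] == "loud"

# Roughly "Dogs barked" versus "Brown dogs barked loudly".
short = conjoin_all(agent_is_dog, is_barking)
longer = conjoin_all(agent_is_brown, agent_is_dog, is_barking, is_loud)

events = [
    {"type": "bark", "agent_kind": "dog", "agent_color": "brown", "manner": "loud"},
    {"type": "bark", "agent_kind": "dog", "agent_color": "black", "manner": "quiet"},
    {"type": "meow", "agent_kind": "cat", "agent_color": "brown", "manner": "loud"},
]

# Adding conjuncts can only narrow: whatever satisfies the longer one satisfies the shorter.
print(all(short(e) for e in events if longer(e)))   # True
# The converse fails: some events satisfy the shorter predicate only.
print(any(short(e) and not longer(e) for e in events))  # True
```

Because `longer` contains every conjunct of `short`, the entailment holds by construction; nothing comparable is guaranteed when composition is arbitrary function application.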

David Lewis effectively admitted this when he discussed examples like “fake diamond” and “alleged criminal”. I think words like “fake” and “alleged” are interesting special cases, but that there is conjunction under the hood even for the cases Lewis focused on.

Why might the idea seem crazy? Partly because it can seem too simple to be true, and partly because it’s not what people learn when they take a standard class in semantics.

18) What do you mean when you say that the meaning of “brown cat” has the meanings of “brown”/“cat” as parts? Isn’t this expected? 

It’s totally expected if you think that meanings are either concepts or composable recipes for how to build concepts. 

But if the meanings (or semantic values) of “Fido” and “barked” are a dog and a function from entities to truth values—and if the meaning of “Fido barked” is the truth value determined by the function given the dog—then the sentence meaning does not have the word meanings as parts. (In that case, the sentence meaning is merely determined by the meanings of the words and how they are arranged.)
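
A small Python sketch can make the contrast concrete. The modeling choices here are assumptions for illustration: entities are strings, the functions `barked` and `purred` stand in for semantic values of verbs, and the sets of barkers and purrers are invented.

```python
Entity = str

BARKERS = {"Fido", "Rex"}
PURRERS = {"Tom"}

def barked(x: Entity) -> bool:
    """Stand-in semantic value of 'barked': a function from entities to truth values."""
    return x in BARKERS

def purred(x: Entity) -> bool:
    """Stand-in semantic value of 'purred'."""
    return x in PURRERS

# Function application: the sentence value is the function's output for the entity.
fido_barked = barked("Fido")
tom_purred = purred("Tom")

print(fido_barked, tom_purred)    # True True
# The two sentence values are the very same object, built from different words.
print(fido_barked is tom_purred)  # True
```

The truth value that results from `barked("Fido")` carries no trace of Fido or of barking, which is the sense in which it is determined by the word meanings without having them as parts.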

The same point holds if sentence meanings are sets of possible worlds and word meanings are mappings from possible worlds to extensions. If meanings are sets, sentence meanings don’t have word meanings as parts. 

19) What’s the empirical content of the compositionality thesis?

I don’t think there’s any such thing as “the” compositionality thesis. 

Some people spend a lot of time formulating determination theses that are weak enough to allow for truth-theoretic semantics, even given examples like this: “I think woodchucks are groundhogs, and George Orwell liked every window that Eric Blair liked, but not everyone knows that.” I find it hard to understand the point of such exercises. 

I’d have thought that given Chomsky’s work in the 1950s, a far more interesting question in the vicinity is this one: What’s the strongest notion of semantic composition that seems to be compatible with the facts about natural languages? 

Weaker notions may have some utility when we’re thinking about Universal Grammar and conceivable languages that children never acquire. But I can’t see why we should care about super-weak notions that are satisfied by certain invented languages—and perhaps satisfied by human languages if we ignore lots of apparent counterexamples.

20) In what substantive way are languages like spoken English/Japanese compositional? 

There’s no telling in advance. 

We need to consider various proposals in light of considerations regarding “descriptive adequacy” and “overgeneration”, as Chomsky suggested in the 1950s and early 1960s.

21) Is there non-linguistic thought? 

Yes, with one caveat related to the polysemy of the words “linguistic” and “thought”.

I think it’s obvious that many non-human animals think. Behaviorism was spectacularly implausible, and likewise for other theories according to which my horse doesn’t think. I see no reason to deny that a lot of human thought, infant and adult, is also non-linguistic in the same ways that my horse’s many thoughts are non-linguistic.

The caveat is that one might use “non-linguistic” to exclude thoughts that are formulated in an entirely mental/internal language that lets non-human animals combine concepts in systematic ways that have nothing to do with grammar or speech. And one might use “thought” to exclude any kinds of cognition that don’t involve systematically combinable concepts. But then it’s boringly definitional that there is no non-linguistic thought—and in my view, it’s quite likely that many non-human animals have linguistic thoughts in this very technical sense.

22) What’s at stake in defining whether the “mind” should include unconscious things? Why does it make sense to debate what “knowledge” means or what the “mind” is? Chomsky has argued for decades that the “mind” should not be restricted to consciousness and should include non-conscious activity. It seems to me like you can define “knowledge” or “mind”—or, in the political realm, “racism”—however you want, so it’s not clear to me what’s at stake in these debates. You can just make up a new word if there’s no word for something you want to refer to, but I don’t see why one should battle with people over how words are currently being used.

There’s a lot going on in this question. I can’t do justice to all of it, here or elsewhere. But I don’t think the real debates are just about words. 

In trying to figure out how the world works, we try to come up with theoretical terminology that marks explanatorily important distinctions. With regard to discussions of knowledge—what it is, why it matters, and whether anybody has any—one can wonder if it’s a good idea to press the English word “know” into theoretical service at all. But even if one continues to use the ordinary word, as Chomsky sometimes does, one can go on to ask whether or not it’s a good idea to adopt a technical notion of knowledge that allows for both conscious and unconscious variants (assuming one has some grip on the conscious-vs.-unconscious distinction). 

One can also wonder whether or not it’s a good idea to adopt a technical notion of knowledge that allows for both propositional (knowledge-that) and practical (knowledge-how) variants—see, e.g., Jason Stanley’s work on this topic. Like Chomsky, I think epistemologists would do well to think more about “knowledge of”, as in knowledge of language (or knowledge of astronomy, or a cab driver’s knowledge of London). But we’re in the domain of hunches about how best to press polysemous words into theoretical service.

These are deep waters.

Plato wrestled with closely related questions. How can you even search for knowledge when you’re ignorant? And how can you know what you’re talking about, when you ask questions, if you know nothing substantive about whatever it is that you’re trying to talk about? Sometimes, thinking about language makes contact with real philosophy. When this happens, my inclination is to proceed slowly and with caution. 

When we add normative questions about the “social world”—and how we want it to be, and how it ought to be—the issues get harder. At least for me, the issues also get harder to even think about. Here, words can and do matter, in various ways that are all too familiar. 

Suitably serious discussion would take a lot of time and a different format. But as a place to start, let me recommend work by Sally Haslanger and in particular, her paper “Gender and Race: (What) Are They? (What) Do We Want Them To Be?”. Haslanger gets the issues on the table in a useful way. 

23) What can people read on the idea that animals have contentful concepts—i.e., that animals’ mental representations have a content-relation to the world? 

There’s a vast literature. But earlier threads of this discussion are relevant.

There’s the ANS literature, going back at least to Church and Meck (discussed in Dehaene’s book that I mentioned above). 

Following leads from Randy Gallistel, I find it useful to think about the literatures on bee navigation (starting with Dyer and Dickinson) and on scrub jays’ memories for caches (starting with Clayton). 

There’s also Herman’s work with dolphins. 

In retrospect, it also seems pretty obvious that Skinner’s rats represented the bars they were pressing, the food they got, the walls of the Skinner boxes, and a lot more.

24) Why do you think that it’s crucial to keep “meanings”/“contents” separate? 

I think the meanings of human linguistic expressions differ in kind from the contents of pre-linguistic mental representations. 

If this is correct, it raises lots of questions about the nature and origins of linguistic meaning. It also suggests that we should be suspicious of theories that identify linguistic meanings with the contents of certain mental states or episodes of assertion.

Smart people repeatedly get led to such theories by starting with the assumption that languages are “for communication”. So maybe that’s not a good assumption to start with.

25) What’s special about human cognition? What separates human cognition from animal cognition? 

Big question. But I assume that many aspects of human cognition are not special, and that at least some aspects of human linguistic cognition are special. 

There are many differences of degree—including significant differences in social cognition, at least if you compare us with chimps. 

But human language involves some categorical difference. And other things equal, I’d rather not posit more than one deep respect in which a particular species of primates turns out to be special.

26) Why should we expect that animal cognition involves multiple languages of thought?

We should expect minds that evolved to be modular. And evidence suggests that they are. 

It’s presumably hard enough to biologically instantiate a mental vocabulary that is useful for one class of tasks (e.g., navigation or learning about causal/temporal dependencies). So why think there has ever been an “all-purpose” vocabulary? 

The history of science suggests that knowing stuff about diverse domains requires diverse modes of representation. 

Words are amazing in that they’re so systematically combinable. Given any two words, endlessly many phrases include both. But if I’m right, that’s in part because words are not tethered to the environment. Words can be polysemous instructions that are governed by a special syntax that is not tied to truth.

27) How will we be able to find that out? 

With lots of hard work, one case study at a time. If we’re lucky.

28) How will we be able to find out how these languages of thought do—and don’t—relate to the environment? 

Cognitive science is hard. But much has been learned about insect navigation, scrub jays’ memories for caches, and the ANS. 

As we sneak up on human language and thought, let’s not burden inquiry with the dogma that theories of meaning need to be theories of truth. Cognitive science is hard enough without that dogma.

29) Why is it important to you that a theory of meaning be able to explain why a given sentence can be understood in “one rather than two—or two rather than three—ways”?

At least for these purposes, let’s take sentences to be strings of words (or of morphemes) that meet certain grammatical conditions. Then we can say that a sentence like “the duck is ready to eat” can be understood in either of two ways that we might indicate as follows: the duck is prepared to be an eater (of something), or the duck is fit to be eaten (by others). 

By contrast, the first kind of meaning goes missing for “the duck is easy to eat”, which cannot be understood as meaning that it is easy for the duck to eat (something). And the second kind of meaning goes missing for “the duck is reluctant to eat”, which cannot be understood as meaning that the duck is reluctant to be eaten (by something). That’s pretty interesting. 
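The pattern with “ready”, “easy”, and “reluctant” can be written down as a small table. The Python sketch below is only an illustration of the facts just described: the table `AVAILABLE_ROLES` and the function `readings` are hypothetical names, and the code simply stipulates which readings exist rather than explaining why.

```python
# Hypothetical toy table (an illustrative assumption, not a linguistic analysis):
# for each adjective, the roles that the subject of "<subject> is <adj> to eat"
# can be understood as playing relative to the verb "eat".
AVAILABLE_ROLES = {
    "ready": ("eater", "thing eaten"),  # two readings
    "easy": ("thing eaten",),           # the "eater" reading goes missing
    "reluctant": ("eater",),            # the "thing eaten" reading goes missing
}


def readings(adj, subject="the duck"):
    """List the ways '<subject> is <adj> to eat' can be understood."""
    return [f"{subject} is understood as the {role}"
            for role in AVAILABLE_ROLES[adj]]


for adj in AVAILABLE_ROLES:
    print(f"{adj}: {len(readings(adj))} reading(s)")
```

A theory of meaning of the kind under discussion would have to derive a table like this one instead of listing it.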

Chomsky discussed the importance of such points before I was born. I stress them because it’s hard to see how a theory of meaning could be remotely adequate if it didn’t at least begin to explain why strings of words/morphemes have the meanings they have and not the logically possible but coherent meanings they don’t have.

30) Does any theory of meaning satisfy this criterion? 

Not completely, of course. Languages are complicated, and each one has very many sentences. 

But I think Chomsky pointed us in the right direction in the early 1960s. The trick is to combine his suggestions with a plausible conception of what meanings are.

31) What do you find most interesting about linguistics? 

It provides a promising route to discovering what’s distinctively human about human cognition. In my view, no other route is as promising.

32) What do you find most rewarding about linguistics? 

You can do it and get somewhere. 

You can then be led further in unexpected directions—e.g., to the kinds of experiments I’ve been involved with—that also lead somewhere. 

These are rare things in areas where you get to talk about philosophy and human nature.

33) In terms of humans’ computational power, where do humans sit on the Chomsky hierarchy?

Not sure that question is well-posed. The hierarchy concerns certain artificial languages that can be characterized as sets of atomic symbols. 

From my perspective, the hierarchy is interesting because it can help highlight and clarify a point that is easily forgotten. There are importantly different ways in which computational operations can generate endlessly many expressions from a lexicon of atomic expressions. 

More specifically but a little metaphorically: we can describe one “flavor of recursion” that yields “flat” expressions, which do not exhibit constituency structure of the sort exhibited by human linguistic expressions—and we can describe more powerful “flavors of recursion” that can “erase” constituency structure in ways that human expression-generators evidently cannot. 

Thinking about kinds of recursion in this way can help frame and motivate research questions about how human expression-generators generate what they generate. 

Likewise, I think semanticists should ask how meanings compose. In this context, it can be useful to think about various invented languages as exhibiting a kind of “semantic hierarchy” and then note that currently standard theories describe human linguistic meanings with invented systems that are extremely powerful.

34) What could semanticists do to shed light on this question? 

Try to figure out what is essential to the semantic phenomena exhibited by the languages that children actually acquire. Then look for computationally spare models of the phenomena, rather than merely showing that some alleged semantic properties of expressions are computable.

35) You can exploit enormous computational power if you invent languages like lambda-calculus and use those languages in various semantic theories. Relative to that reference-point, what computational resources does the human child naturally deploy in generating the “semantic instructions” you talk about? 

Another big question. And there’s no way to answer without getting a little technical. 

In Conjoining Meanings, I offer a specific proposal that may still invoke too much computational power. But it invokes less than Church’s lambda calculus, which was designed to describe the full power of a Turing machine.

It’s hard to say exactly how much less power I invoke. But Thomas Icard has some good thoughts here. (We have a joint paper that has been “in progress” for a while because I’ve been delinquent.) 

Given the kinds of facts that Chomsky discussed in Syntactic Structures and his papers on “the hierarchy”, it quickly became clear that humans generate linguistic expressions without employing the full resources available to a Turing machine, or even a Turing machine whose memory is “linearly bounded”. But maybe the resources of a “minimalist grammar”, like the one that Ed Stabler formalized, are adequate—or see the work by Aravind Joshi and others on “tree-adjoining grammars”.

With regard to parallel issues concerning meaning, I have been inspired by my teacher George Boolos and his discussions of second-order logic—more specifically, second-order logic with restrictions on the ways that relations (as opposed to monadic properties) and plurality can be expressed. 

In Conjoining Meanings, I explore the hunch that meanings are instructions for how to build concepts that correspond to the expressions of a very limited second-order predicate calculus. The finitely many “lexical” concepts are all monadic or dyadic, and the endlessly many “phrasal” concepts are all monadic. 

Here’s another way to put the hunch. Given a Chomsky-style syntax, a second-order monadic predicate calculus might provide just enough formal resources for natural language semantics—as long as you supplement those formal resources with a memorizable list of dyadic concepts that let you think about certain relations (e.g., thematic relations like being-the-agent-of that connect individuals to events the individuals participate in).
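To make the shape of this hunch concrete, here is a minimal Python sketch. It is an assumption-laden toy, not the formal system of Conjoining Meanings: the helper names `conjoin` and `close`, the dictionary encoding of events, and the Brutus/Caesar example are all hypothetical, and the second-order (plural) aspect is left out entirely. The only point it shows is that combining monadic concepts with a short list of dyadic thematic concepts always yields another monadic concept.

```python
from typing import Any, Callable, Iterable

Monadic = Callable[[Any], bool]      # a concept that applies to one thing
Dyadic = Callable[[Any, Any], bool]  # a concept that relates two things


def conjoin(p: Monadic, q: Monadic) -> Monadic:
    """Build a monadic concept that applies to whatever both p and q apply to."""
    return lambda x: p(x) and q(x)


def close(relation: Dyadic, restrictor: Monadic, domain: Iterable[Any]) -> Monadic:
    """Turn a dyadic concept into a monadic one: 'bears relation to some restrictor'."""
    items = list(domain)
    return lambda e: any(relation(e, x) and restrictor(x) for x in items)


# Toy domain (hypothetical): two individuals and two events encoded as dicts.
DOMAIN = ["Brutus", "Caesar"]
event_1 = {"kind": "stabbing", "agent": "Brutus", "patient": "Caesar"}
event_2 = {"kind": "stabbing", "agent": "Caesar", "patient": "Brutus"}


def is_stabbing(e): return e["kind"] == "stabbing"       # monadic, lexical
def is_brutus(x): return x == "Brutus"                   # monadic, lexical
def is_caesar(x): return x == "Caesar"                   # monadic, lexical
def agent_of(e, x): return e["agent"] == x               # dyadic, thematic
def patient_of(e, x): return e["patient"] == x           # dyadic, thematic


# Phrasal concepts: every result below is again monadic (it applies to events).
stab_caesar = conjoin(is_stabbing, close(patient_of, is_caesar, DOMAIN))
brutus_stab_caesar = conjoin(close(agent_of, is_brutus, DOMAIN), stab_caesar)

print(brutus_stab_caesar(event_1), brutus_stab_caesar(event_2))
```

Nothing in the sketch ever builds a concept of higher adicity than two, which is the restriction the hunch is about.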

That’s a conjecture, which may need to be relaxed in light of facts about human languages and how children acquire them. But I was trying to offer a specific alternative to the more familiar endeavor of using the lambda calculus to specify alleged truth conditions of sentences. 

I can’t see the point of that endeavor if the goal is to describe human linguistic competence, as opposed to describing truth conditions of idealized utterances in some recursively specifiable way. Of course, semanticists are free to pursue various projects. But my project is explicitly psychological, and I intend it to be fully continuous with Chomsky’s work.

36) Hypothetically, what “cascading consequences” would occur if you added universal grammar to the conceptual system of a non-human primate—e.g., a chimpanzee? 

Hard to know. 

But if a chimp could acquire systematically combinable lexical items, after some exposure to ordinary speech, then the chimp might often “talk to itself” in novel ways—and end up entertaining many thoughts that it wouldn’t have entertained otherwise. 

And if the lexical items could become polysemous, that might lead to cross-module cognition that wouldn’t have emerged otherwise.

37) How does universal grammar “push” a non-human primate to eventually be able to acquire a concept like “carburetor”?

I doubt that UG itself does any pushing here. But once a human has enough words and concepts, acquiring a concept of carburetors may require little more than a suitable encounter with a carburetor or two. 

At least for members of our species, it isn’t that hard to acquire a concept of a new kind of animal or a new kind of astronomical phenomenon. An example or two is often sufficient. 

Of course, it’s much harder to acquire substantial knowledge.

38) Regarding natural languages, how tight is the connection between meaning and syntactic form? 

I think it’s pretty tight. 

But that’s in part because I think the connection between meaning and truth isn’t tight.

39) What empirical evidence do we have about this connection? 

For starters, there are the Chomsky-style examples of ambiguity and non-ambiguity. (Recall “the duck is ready/easy/reluctant to eat”.) 

These examples suggest that it’s one meaning for each way of generating a string of morphemes.

40) Why do theories of meaning start to break down once you move away from this connection?

If sentence meanings correspond to ways of generating sentences, we shouldn’t be surprised if theories of meaning turn out to be (partial) theories of the relevant generative systems. 

And if we try to make theories of meaning be (partial) theories of how generable expressions get used, or theories of how these expressions are related to alleged extensions, we shouldn’t be surprised when things don’t go well. Lexical items are polysemous, and truth seems to depend on context in ways that meaning doesn’t. 

So I think you’re in for trouble if you try to make your theory of meaning be a theory of truth. 

41) Why don’t you foresee any theories about how humans actually use expressions? 

Two reasons. 

First, nobody has ever had a good idea about how to even start providing such a theory. 

Second and relatedly, use seems to be free. See Chomsky’s review of Skinner’s Verbal Behavior. I can, right now, use “Unicorns sneeze a lot” as an example if I want to. Good luck coming up with a theory of that.

But I think we can have theories of the constraints that meanings impose on use.

42) Why do you have nothing to say about speech-acts that go beyond (e.g.) assertions about the weather? 

In Conjoining Meanings, I was talking about the meanings of expressions, not free acts of using meaningful expressions in conversation. 

I have nothing new to say about speech acts. At best, I can help bolster the many reasons for not trying to characterize meanings in terms of speech acts.

43) What would a non-human primate be able to do in terms of speech-acts if you gave them universal grammar and let them loose to start thinking? 

I assume that we’ve abstracted away from any problems with vocalization, since human vocal tracts are somewhat distinctive.

But it’s still hard to know, given differences in social cognition. I don’t know what norms a group of chimps—or bonobos or orangutans—would attach to episodes that humans might describe as episodes of assertion or promising. 

I think the more interesting but even harder question is your question 36.

44) What are the biggest problems/puzzles/mysteries in linguistics that you want to know the answer to? 

It would be nice to know how lexical items and the relevant combinatorial operations are biologically implemented.

45) What research/experiments might move the ball forward on these problems/puzzles/mysteries?

Crawl before walking. Marathons later. 

It would be nice to know how any memories or computational operations are biologically implemented. See your interview with Randy Gallistel.

46) Is there a set of 10 carefully-selected sentences that perfectly puts on display the biggest and most fascinating puzzles/mysteries in linguistics? It might be an interesting popularization to see these sentences, since it makes more concrete (for the layperson) a very abstract/technical field. 

I doubt it. 

I find it hard enough to illustrate the central questions in compositional semantics with 10 short sentences. Though I think that can be done with effort.

47) Why is the word “if” a “nightmare”?

Try saying which concept(s) “if” gets used to access, after you’ve looked at a reasonable sample of ordinary uses. 

It turns out that “and” is already hard. Barry Schein has a 1000-page book on it.

And “or” is harder. And “if” seems to be still harder. 

Luckily, it’s not my job to say which concepts “if” gets used to access. And it’s certainly not my job to say what “if” contributes to the alleged truth conditions of sentences in which “if” appears.

48) What do you think is the solution to the “if” problem? 

I don’t think there is one problem here, much less one solution. 

Lexical items are, in many ways, far more complicated than modes of composition. Even “rabbit” and “water” are complicated cases once you consider the diversity of rabbits and what counts as water.

By comparison, “every” and “most” seem simple, which is why our group focused on them.

49) Why exactly do we say “the shooting of the hunters is a crime” as opposed to “the shooting of the hunters are a crime”? In this sentence, agreement ignores the closest noun-phrase. I don’t know if there are any mysteries to this “hunters” issue. 

I don’t think there are any mysteries here. But the facts are interesting and revelatory.

In “the shooting was a crime”, the auxiliary verb “was” obviously agrees with the singular (gerundive) noun “shooting”. But the relevant principle is not that “was” agrees with the linearly nearest noun. It’s that the auxiliary verb agrees with the head noun in the subject of the sentence, regardless of how far away that noun is. 

So in “the shooting of the hunters was a crime”, the agreement is still with “shooting” and not “hunters”. And in “the shootings of the hunter were crimes”, the agreement is with “shootings” and not “hunter”. 

In general, grammatical rules concern structure and constituency, not linear order. And young children somehow know this, even though it’s very hard to see how anyone could learn this generalization about grammars from the evidence available to children. 
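The difference between a structure-based rule and a linear rule can be sketched in a few lines of Python. This is a toy under stated assumptions: the class `NounPhrase` and both functions are hypothetical, and the representation covers only a head noun with one embedded noun phrase, as in the examples above.

```python
class NounPhrase:
    """A head noun, its number, and an optional embedded noun phrase
    (e.g. the phrase inside 'of the hunters')."""

    def __init__(self, head, plural, complement=None):
        self.head = head
        self.plural = plural
        self.complement = complement


def nouns_in_linear_order(np):
    """Flatten the phrase into (noun, plural) pairs, left to right."""
    nouns = [(np.head, np.plural)]
    if np.complement is not None:
        nouns.extend(nouns_in_linear_order(np.complement))
    return nouns


def aux_by_structure(np):
    """Agree with the head noun of the subject, however far away it is."""
    return "were" if np.plural else "was"


def aux_by_linear_proximity(np):
    """The rule speakers do NOT follow: agree with the nearest noun."""
    _, plural = nouns_in_linear_order(np)[-1]
    return "were" if plural else "was"


# "the shooting of the hunters ___ a crime"
shooting = NounPhrase("shooting", False, NounPhrase("hunters", True))
# "the shootings of the hunter ___ crimes"
shootings = NounPhrase("shootings", True, NounPhrase("hunter", False))

print(aux_by_structure(shooting), aux_by_linear_proximity(shooting))
print(aux_by_structure(shootings), aux_by_linear_proximity(shootings))
```

The two rules come apart on exactly these examples, and only the structural rule matches what speakers say.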

This phenomenon can also be illustrated with questions like the following: “Was the hiker who lost kept walking in circles?” This string of words can only be understood as the yes/no question that corresponds to the unexpected declarative sentence “The hiker who lost was kept walking in circles”, according to which the hiker who lost some contest was made to keep walking in circles. 

But there’s nothing wrong with this declarative sentence: “The hiker who was lost kept walking in circles.” In fact, that’s the sentence that you’d expect given the following word list: the, who, was, in, hiker, lost, kept, walking, circles. 

Nonetheless, the question “Was the hiker who lost kept walking in circles?” has to be understood as if “was” is connected to “kept walking”—and not to the linearly closer verb “lost” that is part of the relative clause (“who lost”) that lies within the larger clause “the hiker who lost”. 
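The hiker example can be sketched the same way. In the Python toy below, `front_main_aux` and `front_first_aux` are hypothetical names, and sentences are crudely split into a subject, a main-clause auxiliary, and a predicate. The sketch shows why the facts are striking: a linear rule applied to the sensible declarative would produce the very same string of words, yet that is not how the question is understood.

```python
def front_main_aux(subject, main_aux, predicate):
    """Structure-dependent rule: front the auxiliary of the main clause."""
    return [main_aux] + subject + predicate


def front_first_aux(words, aux="was"):
    """Linear rule (not the one speakers use): front the first 'was' in the string."""
    i = words.index(aux)
    return [aux] + words[:i] + words[i + 1:]


# Unexpected declarative: "the hiker who lost was kept walking in circles"
question_structural = front_main_aux(
    subject=["the", "hiker", "who", "lost"],
    main_aux="was",
    predicate=["kept", "walking", "in", "circles"],
)

# Sensible declarative: "the hiker who was lost kept walking in circles"
sensible = "the hiker who was lost kept walking in circles".split()
question_linear = front_first_aux(sensible)

print(" ".join(question_structural))
print(question_structural == question_linear)
```

Both routes yield the same word string, so the string alone cannot tell a child which rule is in play; the interpretation speakers actually assign corresponds only to the structural route.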

Young children invariably acquire grammars that respect this constraint. And this constraint can trump coherence. 

Consider “Was the guest who fed waffles fed the parking meter?”, which has only the crazy interpretation that corresponds to “The guest who fed waffles was fed the parking meter”. 

Even if you initially hear the sensible declarative sentence “The guest who was fed waffles fed the parking meter”, and you then hear “Was the guest who fed waffles fed the parking meter”, the question still only has the crazy interpretation. You somehow know that regardless of what coherence suggests, “Was” has to be understood as related to “fed the parking meter” and not to the embedded clause “fed waffles”.

These points, which go back to Haj Ross’s dissertation in the 1960s, have been discussed in many places in connection with language acquisition. And they highlight a more general point that came up earlier: given that ambiguity is ubiquitous in natural language, one wants to know how kids come to know that unambiguous strings of words are unambiguous. 

What lets humans know how novel expressions can’t be understood? 

This 2011 paper revisits some of Chomsky’s arguments from the 1960s, provides additional examples, and connects with several themes that have emerged in our conversation. See especially the references in that paper to Dyer’s and Dickinson’s work on bees, and to the work by Crain and colleagues on language acquisition.

50) Suppose I gave you a choice. You either get to know everything about the whole generative procedure/system, how it works, how it assembles atoms of meaning into thoughts, and how it evolved—or you get to know everything about those atoms of meaning, what they are, how they arise in the brain, and how they evolved. Isn’t the second one the more interesting thing? 


But the first might be our best potential route to the second.