Dawson Margin Notes On Green
Chapter 7
The Structure Of Sentences
By Hans van de Koot
Relating The Chapter To The Lecture
The primary goal of the chapter, as far as the course goes, is to introduce you to the "rule governedness" of language. I want you to get a feel for the complex tokens thought to underlie language (i.e., the tree structures), as well as the rules that are used to manipulate them (in particular, MOVE, MERGE, and SPELL-OUT).
In my view, the later sections of the chapter are pretty hard to follow. I don't want you to spend a lot of time memorizing the details of the arguments, etc. -- just get as good a feel for the rule-governed nature of language as possible. And marvel at the complexity that appears to be hidden underneath such a natural human act as speaking, hearing, or reading language.
Finally, pay careful attention to the differences between language as illustrated in the lecture and language as illustrated in the chapter. In particular, the Wh-question rule developed in the lecture is based on a much older version of syntax, the transformational grammar notions of the 60s and 70s. The book is much more closely aligned with the modern minimalist program. One of the key differences between the two approaches is that there is currently much less reliance on a variety of transformational rules -- the minimalist program basically uses one, "Move". This is done so that the same transformational rules are present in all human languages. In the older version of syntactic theory, there were many more transformations, and they varied from language to language. A related key difference is the notion of the lexicon. In modern theory, lexical items carry a lot of the grammatical force -- for instance, in the theta-theory material outlined in the chapter. In a very real sense, modern theories of syntax assume that a great deal of structure is imposed by the lexicon.
Margin Notes On The Chapter
Language is so easily mastered by us that it is hard to imagine that it is based on complex mental processes. Yet judging the structural correctness of sentences, or understanding their meaning, turns out to rely on very complex processes.
Some sentence meanings depend on knowledge of the world -- but some do not. "You do not need to consider your knowledge of the world to determine that the sentence women are male is false. It seems that such judgements depend on linguistic knowledge only." The focus of this chapter is on such linguistic knowledge -- i.e., this is a chapter about syntax.
"Every normal human being acquires a natural language and eventually produces and comprehends that language with astonishing ease and speed." What is acquired? A certain system of knowledge called a grammar. Chief focus of this chapter is on a Chomskyan view of syntax, generative grammar. Those who study generative grammar try to answer four fundamental questions:
To answer these questions, the tri-level approach is taken. The computational account that specifies a grammar will answer question 1. The algorithmic account of how grammar is used will answer question 3. The implementational account of how the brain instantiates language will answer question 4. (NB: The text claims that question 2 does not fit into the tri-level approach. I strongly disagree!!)
Learning is a key requirement or constraint on a theory of generative grammar. "The requirement that grammars be learnable offers a challenging perspective on the study of language and has had a profound influence on how generative linguists characterize linguistic knowledge." The remainder of the chapter focuses on questions 1 and 2. (NB: in our course, the focus is really on the rule-governed nature of language!)
Language As A Mentally Represented System Of Rules
Language is a black box that maps speech onto meaning and vice versa. What is inside this black box? One hypothesis is that it contains mentally represented linguistic rules. Why? One reason is that this allows us to create (in principle) an infinite variety of sentences from finite resources. Another reason comes from the study of language acquisition -- errors during learning seem to be best explained by appealing to the overgeneralization of rules (i.e., applying a rule when it is not needed or appropriate).
Another aspect of our linguistic knowledge is the sense that different words in a sentence group together (phrases). "When two words join together to form a phrase, the grammatical properties of the resulting phrase are always determined by one of the two words." The word that does this is called the head of the phrase, and the phrase that adopts these grammatical properties is a projection of the head. So, a verb phrase has a verb as its head, and the verb phrase projects the grammatical properties of this verb (which means that the VP can be treated as if it were just a V).
One major structural rule in generative grammar is that every head projects a phrase.
For instance, the text describes a proform substitution rule, in which a proform can be substituted for a phrase of the same category type. What this means is that there are restrictions on what can be substituted for what! Such structure must be appealed to when we account for the grammaticality judgements of native speakers.
For example, the c-command account of the relationship between an anaphor (e.g., "himself") and its antecedent (e.g., "John") requires an appeal to structure -- importantly, an appeal to the hierarchical structure of sentences. Conclusions: Language is not a long list, it is rule based, and the rules operate on tree structure tokens (in the course we will call these tokens phrase markers).
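The chapter's c-command argument is presented in prose. The sketch below is a minimal illustration of my own (not the chapter's formalism), assuming binary-branching trees encoded as nested Python tuples and a standard definition of c-command: a constituent A c-commands a constituent B if neither dominates the other and A's sister (the other daughter of the first branching node above A) dominates B.

    # A toy check of c-command over binary-branching trees built from nested
    # tuples whose leaves are words. Illustrative sketch only.

    def dominates(node, target):
        """True if target occurs somewhere inside node (or is node itself)."""
        if node == target:
            return True
        if isinstance(node, tuple):
            return any(dominates(child, target) for child in node)
        return False

    def c_commands(tree, a, b):
        """True if constituent a c-commands constituent b in tree."""
        if dominates(a, b) or dominates(b, a):
            return False
        def sister_of(node):
            # Find the other daughter of the first branching node above a.
            if isinstance(node, tuple):
                left, right = node
                if left == a:
                    return right
                if right == a:
                    return left
                return sister_of(left) or sister_of(right)
            return None
        sister = sister_of(tree)
        return sister is not None and dominates(sister, b)

    # "John admires himself": the antecedent c-commands the anaphor,
    # but not vice versa.
    tree = ("John", ("admires", "himself"))
    print(c_commands(tree, "John", "himself"))   # True
    print(c_commands(tree, "himself", "John"))   # False

The point of the sketch is simply that the anaphor/antecedent facts cannot be stated over a flat word list: the check has to walk a hierarchical structure.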
The set of rules that specify a speaker's knowledge of language is called a grammar. "Grammar can be divided into a set of rules about the structural well-formedness of words (morphology) and a set of rules about the structural well-formedness of sentences (syntax)." This knowledge is tacit -- we have it, we use it, but we are not consciously aware of it. Tacit knowledge reflects competence. How it is used reflects performance.
We want our theory of grammar to be descriptively adequate. This means that it captures the IO properties of a language, and captures important generalizations about language. One such possible theory is universal grammar. This is an ambitious goal because: it must be universal, it must be explanatorily adequate, and it must be learnable. Learnability is especially problematic given what is called Gold's paradox: languages can be proven to be impossible to learn (e.g., by a UTM) on the basis of positive evidence only. But children learn languages from positive evidence alone. Because of this, there must be an innate language faculty that can solve this logical problem of language acquisition. What are the properties of this faculty?
Properties Of The Language Faculty
The first property is the presence of interfaces. The structure of language must provide an interface between represented sound and represented meaning. The syntactic aspects of represented meaning (linguistic knowledge) are represented at the logical form (LF) interface. The phonetic form (PF) interface represents the sound of language.
One important property found in the LF interface is the theta criterion (theta for thematic). In the lexicon, items have theta roles that must be filled. These roles specify the kinds of terms that must accompany a lexical item in order for a grammatical structure to emerge.
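As a rough illustration of what it means for theta roles to "be filled", here is a minimal sketch of my own (the lexical entries and role labels are hypothetical, not taken from the chapter): a verb's entry lists the roles it assigns, and a derivation only succeeds if every role is matched with exactly one argument.

    # A toy theta-criterion check. The lexicon entries below are illustrative.

    LEXICON = {
        "likes":  {"theta_roles": ["agent", "theme"]},
        "sleeps": {"theta_roles": ["agent"]},
    }

    def assign_theta_roles(verb, arguments):
        """Return a role assignment if the theta criterion is satisfied, else None."""
        roles = LEXICON[verb]["theta_roles"]
        if len(arguments) != len(roles):
            return None   # an unfilled role, or an argument with no role: crash
        return dict(zip(roles, arguments))

    print(assign_theta_roles("likes", ["Zara", "coco-pops"]))
    # {'agent': 'Zara', 'theme': 'coco-pops'}
    print(assign_theta_roles("likes", ["Zara"]))
    # None -- the theme role of "likes" is left unfilled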
The language faculty should construct one representation to serve as the basis for LF and PF. The lexicon will do this. The lexicon is "a store of words which have attached to them a representation of their meaning and of their sound." The language faculty will also have a general operation, called Spell-Out, which will map this representation onto LF or PF. This is done by removing parts that are irrelevant to the target interface.
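To make the Spell-Out idea concrete, here is a minimal sketch of my own (the feature labels are illustrative, not the chapter's notation): a lexical item bundles phonological, semantic, and syntactic features, and Spell-Out produces an interface representation by deleting whatever that interface cannot interpret.

    # A toy Spell-Out: strip phonological features for LF, semantic ones for PF.

    word = {
        "sound":   "/laɪks/",                         # phonological features
        "meaning": "LIKE(x, y)",                      # semantic features
        "syntax":  {"category": "V", "agr": "3sg"},   # syntactic features
    }

    def spell_out(item, interface):
        keep = {"LF": ("meaning", "syntax"), "PF": ("sound", "syntax")}[interface]
        return {k: v for k, v in item.items() if k in keep}

    print(spell_out(word, "LF"))   # no phonological features survive
    print(spell_out(word, "PF"))   # no semantic features survive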
There must also be some sort of Merge operator, which takes two syntactic objects and combines them into a phrase marker by projecting one of the two. A derivation is the sequence of events involved in the construction of a tree structure.
"We can think of words, then, as containing three kinds of specifications: semantic, phonological, and syntactic. Each of these specifications can be captured in terms of features." (NB: these are all parts of the lexicon.) The basic idea is that these feature-based properties of lexical items places constraints on Merge, limiting what can be combined with what when derivation is occurring.
The syntactic sense of some sentences may require the use of empty categories.
Consider the sentence "It seems that Zara likes coco-pops." "Zara" in this sentence is an argument of "likes"; that is, "Zara" gets a theta-role assigned by "likes". Now consider the sentence "Zara seems to like coco-pops." Here, "Zara" is again an argument of "like", but it does not appear in the position in the sentence where that theta-role is assigned. So, in this sentence "Zara" must have been moved from its original position in the tree structure. The Move operation does this, and is an example of a transformational rule (it takes one tree structure and converts it into another). Move is responsible for "Zara" being in a different position than we would expect.
Move leaves a silent copy -- a trace -- of what it repositioned. "The moved constituent and the trace together form a chain." The idea here is that the grammatical operations applying at the trace end of the chain (e.g., theta-role assignment) get transferred up the chain to apply to the moved constituent. The grammaticality of the moved constituent is enforced by properties acting on this chain. Importantly, Move cannot move things just anywhere -- the resulting chain must be grammatical.
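Here is a minimal sketch of my own of the trace-and-chain idea (real Move operates on tree structures, not flat word strings; the flat list below is only to keep the illustration short): moving a constituent leaves a silent copy behind, and the pair of positions is recorded as a chain along which properties like theta-role assignment are shared.

    # A toy Move that leaves a trace and records a chain. Illustrative only.

    def move(positions, source, target):
        """Move the item at source to target, leaving a trace behind."""
        positions = positions.copy()
        moved = positions[source]
        positions[source] = f"t({moved})"     # the silent copy (trace)
        positions[target] = moved
        chain = (target, source)              # moved constituent + its trace
        return positions, chain

    # "Zara seems to like coco-pops": "Zara" gets its theta role from "like",
    # then moves to the subject position of "seems" ("__" marks the landing site).
    deep = ["__", "seems", "Zara", "to", "like", "coco-pops"]
    surface, chain = move(deep, source=2, target=0)
    print(surface)   # ['Zara', 'seems', 't(Zara)', 'to', 'like', 'coco-pops']
    print(chain)     # (0, 2)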
Move is a property of natural languages. It is constrained, and its application varies from language to language. Why does Move exist anyway? According to the minimalist program, language representations and computations are geared towards economy. What is meant by economy? "The language faculty specifies two interface representations: an LF- and a PF-representation. Representational economy requires that each of these representations only contain elements that are interpretable at the relevant interface. For instance, an LF-representation must not contain any phonological features, because these only have an interpretation at the PF-interface." Spell-Out extracts only those parts of the underlying representation that are appropriate for the LF or PF interface (whichever is the target of Spell-Out).
"With respect to Move, economy requires that Move apply as little as possible. What this means in concrete terms is that in the course of a derivation, Move may only apply if failure to do so would cause the derivation to crash." So, Move is required in some cases to prevent such crashing -- which is why it exists in the first place.
How does the language faculty enforce agreement between a subject and a verb? This agreement is not established inside the V-projection (i.e., not by theta-role assignment). Indeed, lots of evidence suggests that agreement is always established outside of where theta-roles are assigned. The assumption is that the VP is part of an inflectional phrase (IP), and that agreement involves matching features from the inflection with the component features of the VP. So, these features must first be checked. How does this checking proceed? Move is used to bring both subject and verb inside the IP. Once this is done, feature checking can be done locally. Again, we see why Move is part of the language faculty.
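As a rough picture of local feature checking (the feature bundles below are hypothetical, not the chapter's analysis), once Move has brought the subject and verb together inside the IP, agreement amounts to comparing their features against those of the inflection:

    # A toy agreement check inside IP. Illustrative only.

    def check_agreement(subject, infl, verb):
        """All three feature bundles must match on person and number."""
        for feature in ("person", "number"):
            if not (subject[feature] == infl[feature] == verb[feature]):
                return False   # feature mismatch: the derivation crashes
        return True

    zara  = {"person": 3, "number": "sg"}
    infl  = {"person": 3, "number": "sg"}    # features carried by I
    likes = {"person": 3, "number": "sg"}
    like  = {"person": 3, "number": "pl"}    # a non-agreeing verb form, simplified

    print(check_agreement(zara, infl, likes))   # True:  "Zara likes coco-pops"
    print(check_agreement(zara, infl, like))    # False: "*Zara like coco-pops"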
We can make a distinction between covert syntax and overt syntax. Overt syntax is the part of the derivation that occurs before Spell-Out is performed. Covert syntax is the part of the derivation that occurs after Spell-Out. In other words, Move can sometimes apply to a structure after Spell-Out has already applied to it. Evidence suggests that languages can be differentiated by their distribution of overt and covert syntax.
Can Merge apply after Spell-Out? No! "If Merge gets a word out of the lexicon after Spell-Out, then this word will contain phonological features and therefore cause the derivation to crash at LF."
Why might languages differ with respect to overt/covert use of Move? This requires a distinction between strong and weak features. "A strong feature has the property that it causes the derivation to crash at PF." So, strong features must be eliminated by overt syntax. But weak features may pass through Spell-Out, and later be removed by covert syntax. Languages differ with respect to strong and weak features. "For instance, in English the specifier-features of I are strong and therefore trigger overt movement, but the head-features of I are weak and, therefore, V moves to I covertly." The reverse is true of Welsh.
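Here is a minimal sketch of my own of how the strong/weak distinction plays out (the toy "verb fronting" movement and the language labels are illustrative simplifications): a strong feature forces its movement to apply before Spell-Out, so the pronounced order changes; a weak feature lets the movement wait until after Spell-Out, so only the LF side is affected.

    # A toy derivation showing overt (pre-Spell-Out) vs. covert (post-Spell-Out)
    # application of the same movement. Illustrative only.

    def derive(words, movement, feature_strength):
        structure = list(words)
        if feature_strength == "strong":
            movement(structure)            # overt: applies before Spell-Out
        pf = list(structure)               # Spell-Out: what gets pronounced
        lf = list(structure)
        if feature_strength == "weak":
            movement(lf)                   # covert: applies after Spell-Out
        return pf, lf

    def front_verb(seq):
        """A toy verb-raising movement: put the verb in front of the subject."""
        seq.insert(0, seq.pop(seq.index("likes")))

    pf, lf = derive(["Zara", "likes", "coco-pops"], front_verb, "weak")
    print(pf)   # ['Zara', 'likes', 'coco-pops'] -- order unchanged (English-like)
    pf, lf = derive(["Zara", "likes", "coco-pops"], front_verb, "strong")
    print(pf)   # ['likes', 'Zara', 'coco-pops'] -- verb raised overtly (Welsh-like)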
Language Acquisition In The Minimalist Program
How is knowledge of language acquired? To solve the logical problem of language acquisition, much is innate -- including lots of lexical info (e.g., features that can be associated with nouns or verbs). "Putting aside arbitrary lexical variation, the Minimalist Program restricts the possibilities for language variation to simple choices, called parameters, which are associated with items in the lexicon."