Chapter 4: Phrase Structure Grammar (II)

4.1. Introduction


The PSR's proposed in the previous chapter, together with the Lexicon, represent our hypothesis about the content and structure of one's knowledge of English. Instead of random observations, we now have a theory which is relatively satisfying in at least the following ways.

First, our Grammar enables us to express patterns of grammaticality, as we have observed, in some degree of generality and in a way that captures the infinite and creative aspect of our linguistic competence. (Given the recursive devices we built into it, our Grammar is capable of generating an infinite number of grammatical P-markers and sentences.) Without hypothesizing a grammar, we would simply be enumerating the grammatical sentences. But then we would fail to capture two important aspects of our knowledge: (a) that our observations about grammaticality are clearly systematic and not random--an enumeration of these observations would obscure this fact; (b) our linguistic competence is clearly unlimited and creative, but a list of enumerated facts, no matter how long it is, is always finite.

Secondly, our Grammar provides a way to formally and explicitly characterize the notion of grammaticality. A sentence is grammatical iff it can receive a structural description from the Grammar, and ungrammatical otherwise. Thus the sentence (1) is grammatical because it is admissible under our Grammar, which characterizes it by one of the possible trees it generates:


(1) The King of France had an unhappy life.




But a string like (3) is ungrammatical, because it cannot receive a (proper) structural description from our Grammar:


(3) *The of France King joke a told yesterday.
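The idea that a sentence is grammatical iff the Grammar can assign it a structural description can be made concrete with a small recognizer. The sketch below is only an illustration under stated assumptions: the rules and lexicon in RULES and LEXICON are a hypothetical toy fragment invented for this example (they are not the PSRs or the Lexicon of Chapter 3), and the function name grammatical is likewise our own.

```python
from functools import lru_cache

# Hypothetical toy rules and lexicon, for illustration only.
# They are NOT the PSRs or the Lexicon proposed in Chapter 3.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "N", "PP"), ("Det", "A", "N"), ("N",)],
    "VP": [("V", "NP"), ("V", "NP", "Adv")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "Det": {"the", "a", "an"},
    "N": {"king", "france", "life", "joke"},
    "A": {"unhappy"},
    "V": {"had", "told"},
    "P": {"of"},
    "Adv": {"yesterday"},
}


def grammatical(sentence, rules=RULES, lexicon=LEXICON, root="S"):
    """Return True iff the toy grammar can derive the whole word string from root."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def derives(cat, i, j):
        # Can category `cat` exhaustively dominate words[i:j]?
        if cat in lexicon:
            return j == i + 1 and words[i] in lexicon[cat]
        return any(seq(rhs, i, j) for rhs in rules.get(cat, ()))

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Can the daughter sequence `rhs` cover words[i:j], in order?
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        # The first daughter covers words[i:k]; leave at least one word
        # for each of the remaining daughters.
        return any(
            derives(rhs[0], i, k) and seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return len(words) > 0 and derives(root, 0, len(words))


print(grammatical("The King of France had an unhappy life."))    # True
print(grammatical("The of France King joke a told yesterday."))  # False
```

Under this toy fragment, (1) is admitted because some sequence of rule applications covers the whole string, while (3) is rejected because no rule lets an NP begin with the followed by of.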


Thirdly, the Grammar provides a way to capture our intuitions about the constituent structure of sentences. For any string of words that we may consider to be a constituent (as evidenced by movement, coordination, pronominalization, ellipsis, etc.), the Grammar characterizes it as a constituent just in case such a string may occur under the exhaustive domination of one single node in a tree that the Grammar can generate.
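The notion of exhaustive domination can be illustrated with a short sketch. Here a tree is written as a nested tuple (label, daughter, daughter, ...), and a string counts as a constituent just in case it is exactly the word string under some single node. The particular tree below (a flat NP for the King of France) and the helper names are assumptions made for this illustration, not the analysis given by our PSRs.

```python
def leaves(tree):
    """Return the list of words dominated by the root of `tree`."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]          # a lexical node such as ("N", "King")
    return [w for child in children for w in leaves(child)]


def constituents(tree):
    """Return the set of (label, word string) pairs, one per node of `tree`."""
    label, *children = tree
    found = {(label, " ".join(leaves(tree)))}
    for child in children:
        if not isinstance(child, str):
            found |= constituents(child)
    return found


def is_constituent(string, tree):
    """True iff `string` is exhaustively dominated by some single node."""
    return any(words == string for _, words in constituents(tree))


# Assumed structure, for illustration only.
NP = ("NP",
      ("Det", "the"),
      ("N", "King"),
      ("PP", ("P", "of"), ("NP", ("N", "France"))))

print(is_constituent("of France", NP))   # True: the PP node dominates exactly this string
print(is_constituent("the King", NP))    # False: no single node dominates just these words
```

Note that the answer depends entirely on the tree assumed: a different structural description for the same words would characterize different strings as constituents.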

Fourth, the Grammar also provides a way to explain, or account for, our perceived ambiguity of certain sentences. For example, we observed earlier that utterances like (4) are ambiguous:


(4) Mistrust wounds.


We can now say, quite explicitly, that this sentence has two meanings because it can receive two different structural descriptions from our Grammar (i.e., it can be generated in two different ways by our Grammar):


(5) a. b.


In short, the hypothesis of a mental Grammar enables us to explicitly characterize our linguistic knowledge (including knowledge about grammaticality, structure, ambiguity, etc.).

In spite of this level of achievement, however, our hypothesis qualifies as a good start at best. An adequate Grammar of English should be capable of generating all and only the (actual and potential) grammatical sentences and phrases in English. It is easy to see that our Grammar is both too narrow and too broad. For one thing, it under-generates: there are still numerous grammatical English sentences that it fails to generate. For example, none of the following sentences can be generated by our Grammar, though they are perfectly grammatical:

(6) a. John and his brother got into big trouble.
b. He burst into a loud cry and left the room.
c. I wonder if you could help me with this homework.
d. That they arrived on time surprised me.
e. The woman has been singing at the karaoke place.
f. Colorless green ideas sleep furiously.
g. Did you ever see this movie?
h. What you see is what you get.
i. Think globally, act locally.
j. Cars can be expensive nowadays.
k. Flying planes can be dangerous.

[You are encouraged to go through each of these sentences and see for yourself that none of them can be generated by our Grammar.]

For another, our Grammar is also too broad: it over-generates. It blindly admits as grammatical many strings which are in fact ungrammatical:

(7) a. *The boy died Bill.
b. *The men would put the book.
c. *John explained Bill the theory.
d. *The man elapsed.
e. *The man from Ohio met.

[Again, you can see for yourself that each of these ungrammatical strings can be generated--assigned some structural description--by our PS Grammar.]

Furthermore, not all cases of ambiguity (which is part of our knowledge) are accounted for, as yet, by our Grammar. For example, the fact that (6k) is ambiguous is not accounted for. In fact, it cannot even be generated by our Grammar.

Also, our Grammar does not adequately characterize our intuition concerning the structure of certain strings. For example, our PSR2 characterizes noun phrases like a little boy by assigning them the structure (8):


But this obscures the fact that little and boy appear to form a unit to the exclusion of a (e.g., under the test of coordination, as you can see for yourself). That is, it seems that little boy constitutes an intermediate noun phrase, which is larger in size than a word (boy), but smaller than the entire phrase (a little boy). This fact is not accounted for by the Grammar yet.

This then brings us to the heart of theoretical syntax as an empirical science. We make observations, and then postulate hypotheses to account for them. Then we examine those hypotheses against the observations to see if they are observationally or descriptively adequate or not. If not, then we must revise our hypotheses to account for them. And then we can examine the revised hypotheses again as we proceed along to examine a wider range of data. In the rest of the course, we shall have to revise, supplement, or refute some of the assumptions we have made that are incorporated in our Grammar. We won't be able to arrive at a perfect grammar, of course, but we will proceed with this as a start. The important point is how we proceed to do this, and what counts as a good theory and what counts as a bad theory.

In this chapter we address some of these inadequacies and amend our Grammar to accommodate them. We shall refer to the grammar proposed in the previous chapter as our PSG, Version 1.1.


4.2. Conjoining Structures

In Chapter 2, in discussing coordination as a test for constituency, we encountered sentences like the following:

(9) a. He bought a book and a pencil.
b. John ran up the mountain and down the river.
c. Are you crazy, or just plain stupid?
d. Your explanation is unsatisfactory but acceptable.
e. Did your father and mother come to the PTA?

(10) a. John went to the beach, but Bill stayed at home.
b. The player who scores high this year and who has never participated before will be voted Rookie of the Year.

We remarked earlier that if a string of words can be coordinated with another, then both these strings are constituents. In fact, the entire string containing these constituents and the conjunction (and, but, or etc.) is also itself a constituent. (We call the entire string a conjoined structure, and the two constituents it contains its conjuncts.) For example, it is easy to see that a book and a pencil is a constituent in (9a), under the tests of movement and pronominalization:

(11) a. What he bought was a book and a pencil.
b. He bought them.

Similarly, the following sentences provide evidence for the constituency of a conjoined PP, and a conjoined AP:

(12) a. Up the mountain and down the river, John ran.
b. John ran there.

(13) a. Are you crazy, or just plain stupid? He's both!
b. Unsatisfactory but acceptable, it sure is.

Although no similar evidence of the kind discussed in Chapter 2 is available, we can also reasonably assume that in (9e) we have a conjoined N, and in (10) we have instances of a conjoined S.

The problem posed by conjoined structures is, of course, that our PSG is, as it stands, incapable of generating these structures. What we need to do is to add to our PSG the relevant rules that will generate them. For the examples above, we could propose these rules:

(14) a. NP ----> NP conj. NP
b. AP ----> AP conj. AP
c. PP ----> PP conj. PP
d. S ----> S conj. S
e. N ----> N conj. N

Obviously, these rules are not the only ones we need. We shall also need a rule to generate a conjoined P (e.g., in and of itself), a conjoined A (e.g., a happy and innocent face), etc. In fact, it appears that all categories (including all phrasal and lexical categories) can have the form of a conjoined structure. To capture this generalization, let's adopt a rule schema like (15):

(15) X ----> X Conj. X

where X is a lexical (word-level) or phrasal category.
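A rule schema like (15) is a compact way of stating a whole family of rules at once. As a minimal sketch, the function below (the name instantiate_schema and the category list are our own, chosen for illustration) substitutes each category for the variable X and returns the corresponding individual rules, which include those listed in (14).

```python
def instantiate_schema(categories, conj="Conj"):
    """Return one rule X ----> X Conj X per category, as (lhs, rhs) pairs."""
    return [(x, (x, conj, x)) for x in categories]


# An illustrative, non-exhaustive list of phrasal and lexical categories.
CATEGORIES = ["NP", "AP", "PP", "S", "N", "P", "A"]

for lhs, rhs in instantiate_schema(CATEGORIES):
    print(f"{lhs} ----> {' '.join(rhs)}")
```

The point of the schema is that the Grammar need not list these rules separately; the generalization is stated once over the variable X.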

We also need to add to our Lexicon the following list:

(16) Conj. ----> and, or, but, . . . .

Given (15), our PSG can assign a correct structural description to all the coordinate structures above. (9b), for example, has the following structure:


Note that although a conjoined structure necessarily contains at least two conjuncts, there is no upper limit to the number of conjuncts within a conjoined structure:

(18) a. A government of the people, by the people, and for the people shall not perish from the earth.
b. I bought a book, a pencil, an eraser, a trapper keeper, . . . . and a file cabinet.
c. He came in, (and) looked around carefully, (and) opened the drawer, and ran away with the purse.

Furthermore, as shown in (18c), when there are more than two conjuncts in a conjoined structure, more than one conjunction may be used, though only the one immediately preceding the last conjunct is obligatory. We shall be content with the following revised rule schema:

(19) The Coordinate Structure Rule:

X ----> X ((Conj) X . . . . ) Conj X

where X is a lexical (word-level) or phrasal category.
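To see what the schema in (19) does and does not allow, it may help to state it as a check on a sequence of daughters. The sketch below reads (19) as follows: the first daughter is X, the last two are Conj X, and each medial conjunct may optionally be preceded by Conj. The function name matches_coordinate_rule is hypothetical, and this reading of the ellipsis in (19) is our own interpretation.

```python
def matches_coordinate_rule(daughters, x, conj="Conj"):
    """True iff `daughters` is a licit right-hand side of (19) for category x."""
    daughters = list(daughters)
    # The minimal conjoined structure is X Conj X.
    if len(daughters) < 3 or daughters[0] != x:
        return False
    # The conjunction before the last conjunct is obligatory.
    if daughters[-2:] != [conj, x]:
        return False
    middle = daughters[1:-2]
    i = 0
    while i < len(middle):
        if middle[i] == conj:
            i += 1                    # optional conjunction before a medial conjunct
        if i >= len(middle) or middle[i] != x:
            return False
        i += 1
    return True


# (9a): a book and a pencil
print(matches_coordinate_rule(["NP", "Conj", "NP"], "NP"))
# (18c) with only the final and: VP VP VP Conj VP
print(matches_coordinate_rule(["VP", "VP", "VP", "Conj", "VP"], "VP"))
# No conjunction before the last conjunct: not licensed by (19)
print(matches_coordinate_rule(["NP", "NP", "NP"], "NP"))
```

Because the number of medial conjuncts is unbounded, the check loops over them rather than listing a fixed set of patterns.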


4.3. Embedding Structures

In our discussion of phrase structure, we have seen examples that illustrate the following three kinds of sentences: simplex sentences, conjoined sentences, and complex sentences. Simplex sentences are those that contain a single constituent of the type S, the root node in all our previous examples. Conjoined sentences consist of two or more sentences in equal status, connected by a coordinating conjunction. Complex sentences are those in which a sentence is embedded within another as a constituent of the latter.1 We have seen examples of complex sentences before, among them the following:

(20) a. John told the little boy he won a prize.
b. The man who was mixing it fell into the cement he was mixing.

In (20a), the sub-string he won a prize has the form and content of a sentence (which can be generated by our PSRs), but it is not an independent sentence. Rather it is embedded in a larger sentence as an object of the verb told. In (20b), the strings who was mixing it and he was mixing are embedded as a modifier within a subject NP and object NP, respectively. All of these are subordinate clauses, in contrast to the sentences within which they are embedded, which are the main clauses. The former are also called dependent clauses, and the latter, independent clauses. In relational terms, these clause types have also been termed constituent clauses and matrix clauses; or subordinate and superordinate clauses.

So far, the examples in (20) present no problem for our PSG, which can assign each of them a reasonable P-marker that captures our intuition about them. For example, the PSG can generate the following tree for (20a):


However, there are many embedding constructions that our current PSG cannot generate. Thus, in the same embedded object position we find embedded clauses preceded by an element like that, if, whether, and for:

(22) a. I know that she won the grand prize.
b. I wonder if she won the grand prize.
c. Jack and Jill asked whether we would meet them.
d. I prefer for you to tell me the truth, the whole truth, and nothing but the truth.

Each of these extra elements seems to serve as an interface between an embedded clause and its matrix, i.e., to introduce a sentential object or sentential complement. Each is a complementizer. The use of a complementizer is also seen with a clause embedded within NP:

(23) a. We believe the claim that the chicken existed before the egg.
b. The discovery that he was at the murder scene speeded up the investigation.
c. The theory that you have developed won't hold much water.
d. I like the book which she bought at the newsstand.
e. I understand the reason why she left.

In (23a-b) a clause is introduced by that into an NP as a noun phrase complement. In (23a) the clause the chicken existed before the egg complements the noun claim in the same sense that it helps to complement the verb claimed in I claimed that the chicken existed before the egg (in this case it serves as a verb phrase complement, without which the VP would be incomplete). Similarly, in (23b), he was at the murder scene complements the noun discovery by spelling out the content of the discovery, in the same sense that the clause complements the verb discovered in The police discovered that he was at the murder scene by spelling out what was discovered. The embedded clauses in (23c-e) are not noun phrase complements, but relative clauses. They do not complement the nouns they occur in construction with, but modify them: in (23c) the relative clause serves to identify the theory as the one you have developed, but not the one, say, she just read about in a book, and in (23d-e) the relative clauses modify the identity of the book and the reason under consideration.

How should we amend our Grammar to accommodate sentential complements to verbs and nouns, and relative clauses? To determine what to do, let's first find out the relation of a complementizer with the rest of the sentence. There is evidence that the complementizer forms a constituent with the embedded S in each case, since the string Complementizer + S passes at least one test of constituency:

(24) a. That she won the grand prize, I know.
b. What I wonder about is if she won the grand prize.
c. What Jack and Jill asked was whether we would meet them.
d. I prefer for you to tell the truth, ... and for Bill to be the liar.

(25) a. We believe the claims that the chicken existed before the egg and that God created the chicken.
b. The discovery that he was at the murder scene and that they had a fight speeded up the investigation.
c. The theory that you have developed and that your friends believe won't hold much water.
d. The book which she bought at the newsstand and which she loaned to you is a bestseller.
e. I understand the reason why she left and why she came back again.

Given that the embedded S is also itself a constituent, we can accommodate the complementizers by adding to our Grammar the following PSR:

(26) S' ----> (Comp) S

We shall use the category symbol S' (called "S-bar") to represent this constituent, a clausal category that contains a complementizer and a "bare" S. Note that although some embedded clauses must take a complementizer (as in (22b, c)), others may optionally do without it, as in (20a) and the following:

(27) a. The theory you developed won't hold water.
b. I understand the reason she left.

We capture this optionality by parenthesizing the Comp constituent in (26). Of course, we need to add to our Lexicon the following list:

(28) Comp ----> that, whether, if, for, which, why, . . . .
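The parenthesis notation in (26) abbreviates two rules: one with Comp and one without. As a minimal sketch of this convention, the function below takes a right-hand side written as (symbol, optional) pairs and spells out every concrete daughter sequence it abbreviates. The representation and the name expansions are assumptions made for illustration.

```python
from itertools import product


def expansions(rhs):
    """Spell out every daughter sequence abbreviated by a rule with optional symbols.

    `rhs` is a list of (symbol, optional) pairs, e.g. [("Comp", True), ("S", False)].
    """
    # An optional symbol contributes either itself or nothing.
    choices = [((sym,), ()) if optional else ((sym,),) for sym, optional in rhs]
    return [sum(combo, ()) for combo in product(*choices)]


# Rule (26): S' ----> (Comp) S
S_BAR = [("Comp", True), ("S", False)]
for daughters in expansions(S_BAR):
    print("S' ----> " + " ".join(daughters))
```

For (26) this yields two expansions, Comp S and bare S, corresponding to embedded clauses with and without an overt complementizer.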

We also need to revise the two recursive rules that introduce embedded clauses, namely PSR2 and PSR3, by replacing the S with the category S':

(29) The NP Rule (revised)


(30) The VP Rule (revised)

In fact, we also have reason to want to revise our PSR1, the Sentence Rule, given the fact that the subject of a sentence can be an S' (in addition to being an NP):

(31) a. That he won the grand prize surprises me.
b. Whether he won the grand prize needs clarification.
c. For you to tell the truth would please me greatly.
d. Why he left puzzles me.

We can accommodate this fact by directly building the choice into PSR1:


(32) The Sentence Rule (revised)

Given these revisions, we can generate sentences involving sentential complementation or noun phrase modification. The following three sentences, for example, can be assigned the structural descriptions in (34). (Henceforth we shall use "C" to designate the category Comp.)

(33) a. I wonder whether the rain in Spain falls in the plain.
b. For you to tell the truth would please me greatly.
c. The man who you met at the party married the woman he loved.








As shown in (34b), we assume that the infinitival marker to is an instance of the category Aux. Note also that we have now assumed that all embedded clauses are S', not S, whether or not they actually contain a Comp. Like other phrases, an S' can be bare, containing only an S.


4.4. The English Auxiliary System


We have seen two areas of English syntax which indicate the need to expand our PSG to accommodate previously unnoticed sentences. Another area where our PSG needs expansion is the Auxiliary system. The first thing to do is to make sure that to is added to the list of Aux in our Lexicon:

(35) Aux ----> can, could, may, will, to, . . .

The auxiliary to differs from the other auxiliaries in that it can be used only in infinitival clauses, which in turn occur only in embedded contexts. A clause is either finite or infinitival. An infinitival clause is one which is not marked for (present or past) Tense, and it requires the auxiliary to. A clause that contains any of the other auxiliaries, or no auxiliary at all, is necessarily marked for Tense. For example, in the examples below, each auxiliary is inherently marked for the Present or the Past Tense:

(36) a. Mary will sing.
b. Mary would sing.
c. Mary may sing.
d. Mary could sing.

When no auxiliary is present, Tense is marked on the verb:

(37) a. Mary sings.
b. I know that Mary sang.
c. You understand what I meant.

(Even though the verbs know and understand are not overtly marked for Tense, we know that they are each necessarily understood as being in the present tense. The sentences (37b) and (37c) do not refer to my or your mental state at a time in the past. We assume that the Present Tense marker (morpheme) is realized in "zero-form" (the zero-allomorph). This is to be distinguished from the infinitival do in John preferred for him to do this job. From this sentence we cannot tell whether the event do this job would take place in the past or at the present time.)

The auxiliaries we have seen up to now belong to either one of two types: the infinitival auxiliary to and the modal auxiliary (all others we have seen). Another kind of auxiliary expresses the perfective aspect of a sentence:

(38) a. Mary has written a nice book.
b. Mary had written a nice book [e.g., by the time you met her].

(39) a. They have destroyed those ships.
b. They had destroyed those ships [e.g., before the enemies arrived].

The perfective aspect takes the form of the auxiliary have and a requirement that the main verb appear in the form of Past Participle (with the ending -en or -ed, among other irregular forms). The auxiliary itself may be conjugated for Tense.

Still another type of auxiliary expresses the progressive aspect, which involves the use of the auxiliary be and the requirement that the verb appear in the form of the Present Participle (with the ending -ing). Again, the auxiliary be itself may be conjugated for Tense.

(40) a. Mary is singing in the bathroom.
b. Mary was singing in the bathroom [e.g., when I arrived].

(41) a. They are pulling each other's legs.
b. They were pulling each other's legs [e.g., when I arrived].

Note that unlike the first two types of auxiliaries, which are mutually exclusive (in complementary distribution), these latter two types can co-occur with each other, and either or both can co-occur with one of the first two types of auxiliaries. When the Perfective and the Progressive co-occur, the former must precede the latter:

(42) a. Mary has been singing in the bathroom.
b. Mary had been singing in the bathroom.

When either co-occurs with a modal or with the infinitival to, it must follow the latter:

(43) a. Mary must have sung in the bathroom.
b. Mary will have sung in the bathroom.
c. Mary might have sung in the bathroom.
d. Mary should have sung in the bathroom.

(44) a. Mary must be singing in the bathroom.
b. Mary will be singing in the bathroom.
c. Mary might be singing in the bathroom.
d. Mary should be singing in the bathroom.

(45) a. For Mary to have sung in the bathroom would be interesting.
b. For Mary to be singing in the bathroom would be interesting.

And when both co-occur with a modal or with to, the order must be Modal + Perfective + Progressive, or to + Perfective + Progressive:

(46) a. Mary must have been singing in the bathroom.
b. Mary will have been singing in the bathroom.
c. Mary might have been singing in the bathroom.
d. Mary should have been singing in the bathroom.

(47) For Mary to have been singing in the bathroom would be interesting.

All these observations apparently represent some systematic aspect of our knowledge of the English auxiliary system. These observations are not sufficiently revealed by the single rule that we have that introduces the Aux (PSR1). To accommodate these observations, we can add to our PSG a rule that gives the internal structure of the Auxiliary constituent:

(48) The Auxiliary Rule

We need to revise our Lexicon accordingly with the following new lists:

(49) a. Inf. ----> to
b. Modal ----> can, may, must, will, shall, could, might, ....
c. Perf. ----> have
d. Prog. ----> be

Together with PSR1, our PSG now says that a sentence may or may not contain an Aux. If it contains an Aux, the Aux may take the form of an infinitival or modal auxiliary, a perfective auxiliary, a progressive auxiliary, or a combination of two or three of these, in the order given. (Even though all the constituents of Aux are indicated to be optional in (48), by convention we shall say that if the category Aux is picked from PSR1, then at least one of the options in (48) must be taken.)
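The ordering facts just described can be checked mechanically. The sketch below assumes, on the basis of the prose description alone, that the Auxiliary Rule has the shape Aux ----> ({Inf., Modal}) (Perf.) (Prog.), with at least one option taken; the statement of rule (48) itself is not reproduced here, so this shape is a reconstruction. The inflected forms listed under Perf and Prog (has, had, is, been, etc.) are also our own addition for the sake of the examples, and participle morphology on the following verb is not checked.

```python
from itertools import product

# Categories from (49); the inflected forms are added here as an assumption.
FORMS = {
    "Inf": {"to"},
    "Modal": {"can", "may", "must", "will", "shall", "could", "might"},
    "Perf": {"have", "has", "had"},
    "Prog": {"be", "is", "are", "was", "were", "been"},
}


def aux_patterns():
    """All non-empty category sequences allowed by the assumed rule shape."""
    first = [("Inf",), ("Modal",), ()]   # Inf and Modal are mutually exclusive
    perf = [("Perf",), ()]
    prog = [("Prog",), ()]
    return [a + b + c for a, b, c in product(first, perf, prog) if a + b + c]


def categorize(word):
    """Return the auxiliary category of `word`, or None if it is not listed."""
    for cat, forms in FORMS.items():
        if word in forms:
            return cat
    return None


def licit_aux(words):
    """True iff the word sequence instantiates one of the allowed patterns."""
    cats = tuple(categorize(w) for w in words)
    return cats in aux_patterns()


print(licit_aux(["must", "have", "been"]))   # as in (46a)
print(licit_aux(["to", "have", "been"]))     # as in (47)
print(licit_aux(["been", "have"]))           # Progressive before Perfective
print(licit_aux(["will", "to"]))             # Modal together with to
```

Under these assumptions there are eleven licit patterns in all, which is a small enough space to enumerate rather than parse.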

Before we leave this topic, a brief note on have and be. What we have seen in this section about these two items is their use as auxiliaries. In addition to this use, these items may also be used as main verbs.

(50) a. John is a big liar.
b. Mr. Thomason is the principal of the school.
c. That son of a gun is aggressive.
d. This is totally unbelievable.

(51) a. John has a good guitar.
b. Dr. King had a dream.
c. Mary and her boss had a candid conversation.

These instances of have and be do not convey the meaning of the Perfective or the Progressive aspect. Furthermore, they may each be modified by one or more auxiliaries, including the Perfective and the Progressive auxiliaries:

(52) a. That son of a gun has been aggressive.
b. Mr. Thomason must be the principal of the school.
c. John is being a sweet angel.
d. Mary must have been being coy for some time [when I saw her].

(53) a. John has had a good guitar for a long time.
b. Dr. King was having a dream [when I walked in].
c. Mary and her boss must have been having a candid conversation [when I walked in].

Each of (52) contains a main verb be, and each of (53) a main verb have, in addition to one or more auxiliary elements. We assume that these main verbs are introduced by the VP rule (PSR3), not by the Auxiliary expansion rule (48).


4.5. Summary

In this chapter we examined our PSG, as proposed in Chapter 3, against a wider body of data of English syntax, and expanded our Grammar somewhat to solve some problems of undergeneration--to account for sentences that are grammatical but are otherwise not admitted by the first version of our Grammar (version 1.1). We proposed a rule schema to generate coordinate structures, added a rule of sentence embedding involving complementizers, revised PSR1-3 accordingly to account for structures of sentential and nominal complementation and of relative clauses, and provided a substantial account of the English auxiliary system. Our revised PSG, version 1.2, thus includes the following revisions and additions, plus an expanded Lexicon:


(19) The Coordinate Structure Rule:

X ----> X ((Conj) X . . . . ) Conj X

where X is a lexical (word-level) or phrasal category.


(26) S' ----> (Comp) S


(32) The Sentence Rule (revised)

(29) The NP Rule (revised)

(30) The VP Rule (revised)

(48) The Auxiliary Rule

Although this version goes a long way toward improving on version 1.1, it should not be difficult to see that even this version is still far from being an adequate description of our tacit knowledge of English Grammar. Among the many sentence patterns that are still unaccounted for, for example, is the existence of sentences with predicative adjectival phrases, such as (50c-d). Furthermore, the internal structure of AP's needs to be further explored. There are also a host of problems of overgeneration that have not been dealt with, since the Grammar also allows numerous strings to be generated which are, nevertheless, unacceptable in the judgment of competent speakers.

It will be recalled that we will only be able to account for a small portion of English speakers' competence about their language. What I have tried to do in this chapter is to give you at least the flavor of grammar construction as part of a process of scientific inquiry. We shall have occasion to revise or expand on some of the rules provided here as we move on. But it is the process of theory building that is of utmost importance in this endeavor.

In this chapter we have only dealt with some problems of undergeneration. In the next chapter we turn to a case of overgeneration and consider a preliminary theory of the organization of the Lexicon.



Homework 4


1. The following sentences cannot be generated by our PSG, version 1.2, as revised in this chapter.

a. That son of a gun is aggressive.
b. This is totally unbelievable.
c. John must be very happy.
d. He looks stupid.
e. Mary felt very sad.

Can you revise the grammar further so as to accommodate such sentences? In particular, what can be done about the VP rule (PSR3) in view of these sentences?


2. Consider the following sentences:

a. She is afraid.
b. She is very afraid.
c. She is aware of the problem.
d. She is certain that the problems will arise.
e. John is proud of his sister.
f. I am extremely fascinated with your idea.
g. *I am afraid of Bill that he will bring problems to us.
h. *I am afraid him.
i. *Linda is fully clear the situation.


In the grammatical sentences, there is good reason to believe that each underlined string is an adjectival phrase (AP). Now, our rule for expanding an AP, given in Chapter 3, takes the simple form:


AP ----> (Deg.) A.


This is, of course, not sufficient to account for the full range of sentences (a-i).

(i) Can you revise the AP rule so as to accommodate these sentences?

(ii) Based on your answer to (i), indicate what the differences are between an AP and a VP with respect to their internal structures.


3. Draw a tree diagram for each of the following sentences. Note that some of these sentences are identical to those given in Exercise 3, but you should now assign them a tree diagram each on the basis of the revised PSG, version 1.2:

a. I pledge allegiance to the flag of America and to the republic which it represents.
b. A government of the people, by the people, and for the people won't perish from the earth.
c. I prefer for you to tell me the truth and nothing but the truth.

(Hint: treat but as a preposition, not a conjunction.)

d. The students wonder if a storm will be coming.
e. You are being polite, but it could have been worse.

(Also make use of the revised VP rule you proposed for question (1) above.)

f. The man put the book he bought on the table.
g. What you see is what you get.
h. That she might have been waiting for us worries me.


1 In traditional literature, these three sentence types are termed simple, compound and complex sentences, respectively.