Syntactical complexity of speech as measured by the Index of Language Complexity (ILC)
correlates with working memory capacity

 

Summary by Bruce G Charlton of PhD thesis “Language complexity, working memory and social intelligence” (University of Newcastle upon Tyne, UK; 2002) by Christina Susan Fry, supervised by BG Charlton [ contact: bruce.charlton@ncl.ac.uk ]

 

This thesis presents a new objective measure of linguistic complexity based on measuring the frequency of ‘optional’ syntactical elements of spoken human language, and reports that linguistic complexity correlates significantly and positively with working memory capacity. This is consistent with human linguistic performance being a sexually-selected trait ‘advertising’ the genetic fitness of the speaker.

Previous methods of measuring linguistic complexity lack objectivity, reliability or a convincing theoretical rationale. We devised the Index of Language Complexity (ILC) as an objective, linguistically-principled and plausible measure of syntactic linguistic complexity. 

Language complexity was conceptualized in terms of ‘optional’ syntactical elements, which serve to modify the obligatory syntactical elements, potentially enriching basic ‘factual’ communications with socially-inflected interpretations. The ILC score was defined as the combined frequency of occurrence per 100 intelligible transcribed words of 1. optional complementiser phrases, 2. adverb phrases, 3. modifier phrases and 4. adverbials.
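As a rough illustration of the definition above, the ILC score can be sketched in Python as the combined count of the four optional element types per 100 intelligible transcribed words. The function name, the argument names and the example counts are hypothetical and are not taken from the thesis. Identifying the phrases themselves is a linguistic judgement that this sketch does not attempt.

```python
def ilc_score(complementiser_phrases, adverb_phrases, modifier_phrases,
              adverbials, intelligible_words):
    """Combined frequency of the four optional elements per 100 words."""
    if intelligible_words <= 0:
        raise ValueError("transcript must contain at least one intelligible word")
    optional_total = (complementiser_phrases + adverb_phrases
                      + modifier_phrases + adverbials)
    return 100.0 * optional_total / intelligible_words


# Made-up counts for one speaker's transcript (illustration only).
example = ilc_score(12, 30, 25, 18, intelligible_words=850)
print(round(example, 2))
```

Normalising by the number of intelligible words keeps scores comparable between speakers who produce transcripts of different lengths.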

Fifty native British-English speakers were studied (25 male, 25 female; age 22-66). Spoken language was elicited using standard stimuli, such as asking for explanations of factual information, telling a story in response to pictures, and encouraging speculations on general topics. Speech was recorded and transcribed.

Working memory capacity was measured using a Combined Memory Score (CMS) which averaged the standardized scores of the Adult Memory and Information Processing Battery (AMIPB) subtest (recall of a short story) and a modified Working Memory Span test (recall of specific words from increasing numbers of sentences, followed by a judgement task to interfere with rehearsal).
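The averaging of standardized scores described above can be sketched as follows. This summary does not state how the scores were standardized, so the sketch assumes z-scores computed across the sample (mean 0, sample standard deviation 1). The function names and the raw scores are hypothetical.

```python
from statistics import mean, stdev


def z_scores(raw):
    """Standardise raw scores to mean 0 and sample standard deviation 1."""
    m, s = mean(raw), stdev(raw)
    return [(x - m) / s for x in raw]


def combined_memory_scores(story_recall, span_scores):
    """Average the two standardised test scores for each participant."""
    if len(story_recall) != len(span_scores):
        raise ValueError("both tests need one score per participant")
    return [(a + b) / 2
            for a, b in zip(z_scores(story_recall), z_scores(span_scores))]


# Made-up raw scores for four participants (illustration only).
story = [20, 30, 40, 50]
span = [2, 3, 4, 5]
print(combined_memory_scores(story, span))
```

Standardising before averaging stops the test with the larger raw-score range from dominating the combined score.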

Results showed a significant positive correlation between language complexity (ILC) and working memory capacity (CMS) (see Figure). There was also a significant positive correlation between CMS and each of the four components of the ILC, consistent with internal validity of the ILC.

 

 


Figure (image not reproduced): ILC score and CMS. Spearman's rank order correlation: 0.820, p = 0.001
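For readers unfamiliar with the statistic, a minimal pure-Python sketch of Spearman's rank order correlation is given here: both variables are converted to ranks (tied values share the mean of their ranks) and the Pearson correlation of the ranks is taken. The function names and the example data are hypothetical, the data are not from the thesis, and the p-value is not computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores for five speakers (illustration only).
print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Because only ranks are used, the coefficient measures how consistently one score rises with the other, without assuming a straight-line relationship.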

The correlation between CMS and ILC score is consistent with syntactical complexity being constrained by working memory capacity. This may imply an evolutionary expansion of working memory as the crucial element enabling the evolution of human language. For instance, the ability of bonobos and chimpanzees to communicate with humans is enhanced by visual symbol boards, which may function to expand working memory capacity.

It seems that people spontaneously tend to produce language at approximately the maximum syntactic complexity allowed by their working memory capacity.

Working memory is a vital substrate of ‘creative intelligence’ which Geoffrey Miller sees as the basis for human sexual selection for ‘good genes’, because creative intelligence requires a well-functioning brain, and most deleterious genetic mutations will damage brain function.

Syntactically sophisticated language may therefore function as an honest advertisement of biological fitness, potentially explaining why articulate, eloquent speech is so attractive.

 

 

Language complexity, working memory and social intelligence

by Christina Susan Fry, supervised by BG Charlton

(PhD thesis; University of Newcastle upon Tyne, UK; 2002)

1. General introduction

Many studies have shown that ability varies from one individual to another in the comprehension of complexity in language, and that this spectrum of ability is correlated with working memory capacity. This study was intended, firstly, to devise an objective and valid instrument to measure complexity in language production; secondly, to refine a measure of working memory; and thirdly, to determine the relationship between the ability to produce complexity in spontaneous speech and working memory ability.

In order to do this, it was necessary first to devise a complexity metric, for the purpose of delineating and quantifying those elements that constitute complexity in production. This complexity metric, the Index of Language Complexity, was formulated on the basis of evolutionary theory, syntactic criteria, and evidence from language development and disorders. Existing assessments of complexity were examined but rejected, since many of the existing measures show a lack of objectivity, and none takes into account the contribution of what is proposed as the major reason for language to have evolved, namely, the transfer of social intelligence information.

The measure of complexity used in this study (the Index of Language Complexity) centres on the optionality of the elements expressed. In expressing a proposition, none of the obligatory elements may be omitted without causing the utterance to become ungrammatical. Optional elements may, by definition, be freely omitted, but it is these elements that help to express the speaker’s attitude, and it is by means of them that social intelligence information is signalled. Their very optionality makes such elements vulnerable to loss in language disorder, or lower ability levels.

It is hypothesised that people speak in as complex a manner as they are able, and that lower levels of complexity in spontaneous speech are due to the constraints imposed by working memory limitations. Complex language therefore advertises working memory ability, as well as demonstrating social intelligence.

A pilot study (study 1), which comprises the bulk of this thesis, was conducted on 12 subjects, firstly to determine which stimuli would be suited to the task of measuring working memory and eliciting sufficient spontaneous speech data; secondly to establish a workable method for transcription, and categorisation of the data; and thirdly to provide data on the basis of which theoretical expectations of complexity could be tested, and complexity criteria refined. Study 1 established instruments for the measurement of complexity and working memory, and a standard methodology for eliciting, transcribing, and categorising data.

The replication study (study 2) investigated spontaneous speech data collected from 50 normal adults (aged 22 to 66 years), using a test interview as the stimulus. The test interview included two tests of working memory: one an existing neuropsychological test, the other a newly-created version of the working memory span paradigm. This Aural Working Memory Span test took into account word class, word length, word frequency, and age of acquisition.

Two tests of working memory were amalgamated to create the Combined Memory Score, and the Index of Language Complexity provided a quantitative measure of complexity in language production.

This thesis is organised in three parts. The first consists of introductory literature reviews of the three areas of interest: social intelligence, working memory, and language. The second part, which comprises the greater part of the thesis, deals with the pilot study (study 1): the considerations involved in the formulation of the test interview, the development of the complexity metric, and the method for transcription and categorisation. The third part sets out the details of the replication study (study 2), its findings, and their implications. The test stimuli and transcriptions are given in the appendices.

1.     Social intelligence

Social intelligence concerns how individuals perceive, recall, think about and interpret information about the actions of themselves and others (Reber & Reber 2001:687), and it is of vital importance to the reproductive success of social animals, such as humans. A socially skilled person is able to affect others positively, with the effect that he intended, and is capable in turn of being affected by others (Ylvisaker et al. 1998:271). An individual who is unable to perceive the dispositions and intentions of others will not be able to produce adaptive social behaviour, since he will be unable to take into account the complexities of social life and contrive suitable responses (Byrne & Whiten 1997:2). Impaired social intelligence involves a loss of social concepts and rules, difficulty in drawing inferences about causes of behaviour, and impaired perception of social cues (Ylvisaker et al. 1998:273). The consequences of impairments in this field can be catastrophic, outweighing those of other cognitive disabilities (Broks 1997:100). The cumulative emotional effect of social failure leads to anger, hostility, depression, and withdrawal, and the individual has no goals, no activities and no friends (Ylvisaker et al. 1998:275). In evolutionary terms, this spells “reproductive death”, since, once social success is equated to good biological fitness, social failure means lack of fitness (Humphrey 1988:21).

Clearly, possessing good social intelligence skills brings personal benefits to the individual, and living in a group all of whose members possess such skills is equally advantageous. A study of people in rural Zambia (Serpell 1977, cited by Kagitcibasi 2000:337) showed that the children thought intelligent by the adults were not those who performed well on (culturally appropriate) intelligence tests, but rather those who were socially responsible and attuned to others’ needs. The highest value was placed on group well-being and interdependence, in preference to individual independence and self-reliance (Kagitcibasi 2000:337). The individual benefits from simultaneously preserving the overall structure of the group and yet out-manoeuvring others within it, but such social gamesmanship entails calculating the consequences of both one’s own behaviour, and the likely behaviour of others, and the ensuing balance of advantage and loss (Humphrey 1988:19). An individual needs considerable skill in social manipulation to achieve personal advantage to himself at the expense of others in the group, but without causing so much disruption that he is no longer accepted as a group member (Byrne & Whiten 1997:3).

1.1     Theory of Mind

In order to guess correctly how another person (B) will respond to an action of his, it is necessary for an individual (A) to be able to make inferences about B’s mental states (his beliefs and desires), and for this A needs Theory of Mind (ToM). Having ToM means that A can recognise firstly that B can have a particular belief, secondly that B’s belief may be different from A’s own belief, and thirdly that B’s belief may be mistaken. By about the age of four, most children can realise that a second person can have a different belief to their own, and that it can be a mistaken belief, but this realisation never develops in people with autism, who suffer from “mindblindness” (Baron-Cohen 1995).

It is through ToM that intentionality is interpreted: most people can cope with three levels of intentionality, some with four e.g. [A hopes [that B believes [that C thinks [that D knows X]]]] but by five levels of intentionality almost everyone is confused (Dunbar 1998:103). Although intentionality is expressed here through language, language is not necessary for access to knowledge of ToM in adults (Siegal et al. 2001:297). Two people have been reported (Varley et al. 2001) who both have severe aphasia and are unable to access language propositions in any modality, but who nonetheless pass tests of ToM. The researchers point out (Varley et al. 2001:492) that reasoning about others’ belief therefore does not take place through language propositions.
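The bracket notation used in the example above makes the number of levels of intentionality easy to count: each opening bracket embeds one further mental state. A minimal sketch, treating the level count simply as the deepest bracket nesting (the function name is hypothetical, and this reading of the notation is an assumption rather than a method from the cited sources):

```python
def intentionality_levels(expression):
    """Return the deepest nesting of square brackets in the expression."""
    depth = deepest = 0
    for ch in expression:
        if ch == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced brackets")
    if depth != 0:
        raise ValueError("unbalanced brackets")
    return deepest


example = "[A hopes [that B believes [that C thinks [that D knows X]]]]"
print(intentionality_levels(example))  # prints 4
```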

It seems probable that language is necessary for the full development of ToM. Japanese culture places great value on an indirect style of communication, avoiding the overt statement of that which could be inferred (Bishop 1997:207). Because it is of such importance in the culture to be able to anticipate the need of others (so they are not forced to make a direct request), mothers tell their children overtly what people are thinking and feeling in various situations: this has been shown in transcripts of mother-child interactions (Clancy 1986, cited by Bishop 1997:207). Deaf children born into hearing families are not usually exposed to native use of Sign until they reach school age. Communication with their family is generally limited, and does not include the normal focus on mental states that gives the basic grounding in shared beliefs (Siegal et al. 2001:298). Deaf children of hearing families perform on ToM tasks at a level comparable to children with autism (Siegal et al. 2001:298). Their difficulties do not generalise to causal reasoning in other domains, and are specific to reasoning about false belief (Siegal et al. 2001:298). The late signers’ problems with ToM reasoning can persist into late adolescence (Siegal et al. 2001:298).

It has been proposed (Cosmides & Tooby 2000) that a specialist, expert cognitive system exists specifically for reasoning about co-operation for mutual benefit and the detection of cheating. This sort of cognitive specialisation would have resulted from some of the most important adaptive problems faced by our ancestors: the need to optimise social interactions, and to detect and understand the consequences of behaviours both reliably and economically (Cosmides & Tooby 2000:1259-60).

If social intelligence constitutes a cognitive module, in the Fodorian sense, a specific neural architecture dedicated to the operation of that module would be expected. The central element in social intelligence is its intimate connections to emotion and the “irreducible richness of the spectrum of affects” (Brothers 1990:39). The individual receives powerful signals through the emotional coloration of his social experiences (Brothers 1990:41). These links with emotion are reflected in the brain areas subserving social intelligence.

1.2     Neural architecture of social intelligence

There is growing evidence that social intelligence is represented in the brain by the ventromedial prefrontal cortex, the amygdala, and the right somatosensory cortex and insula (Adolphs 1999:470). In a typical real-life situation the component structures of the system work in parallel (Adolphs 1999:477). The amygdala is involved in the fast and automatic evaluation of stimuli with emotional or social importance, and in allocating resources to process stimuli that are ambiguous but potentially important (Adolphs 1999:474,477). The right hemisphere is predominant in attentional systems which select which external stimuli should be focused upon, and hence are essential to survival (Geschwind & Galaburda 1987:45). The ventromedial prefrontal cortex is involved in associating perceptual representations of current stimuli with elements of previously encountered situations, triggering re-enactments of the corresponding emotional state; while the right somatosensory cortex provides the detailed representation of the body state (Adolphs 1999:474,477). The right hemisphere is important in both the subjective experience and external expression of emotion, as well as the recognition of emotion manifested by others (Geschwind & Galaburda 1987:45).

Damage to the frontal areas of the brain can cause impairments in social behaviour, while leaving other cognitive functioning intact (Broks 1997:113), and published case studies describe people who score in the high or superior IQ range, yet are unable to hold down a job or sustain marriages or friendships. People who have suffered damage to their ventromedial prefrontal cortex are unable to plan future activity, or to respond to punishment, and show inappropriate social behaviour, with a lack of concern or empathy for others (Adolphs 1999:474). People with orbitofrontal damage show a dissociation between fully intact abstract knowledge about social situations and badly impaired ability to evaluate real-life situations or draw conclusions about motivations, because they have lost access to the internal cues that should be generated by the actions of others (Brothers 1990:36-7). The somatic marker mechanism (SMM) is the means whereby the internal cue of the emotional value of an action is acquired, represented, and retrieved (Adolphs 1999:475).

1.3     Somatic marker mechanism

Making a good decision means selecting a response that will be ultimately advantageous in terms both of reproductive success, and of the quality of continued existence (Damasio 1994:169). The thought of a bad outcome brings about a transitory unpleasant “gut feeling”, which leads to the rejection of the response that triggered it: this is the somatic marker at work (Damasio 1994:173). The “gut feeling” is the somatic (body) state, the current state of all the bodily systems: the muscles, joints, skin, nerves, viscera, blood chemistry, etc. (Charlton 2000:160).

The neural substrates of most of the structures important to social intelligence reasoning are also important to normal emotional functioning (Adolphs 1999:477). Damasio (1995:20) describes an emotion, which is expressive, as being a collection of changes to the state of the body and brain responding to the content of thoughts about a particular entity or event. By contrast, a feeling is experiential, being the awareness of the changes induced by the emotion, juxtaposed with the mental image that triggered those changes (Damasio 1995:20-1).

The simultaneous holding of both the changes wrought by the emotion and the triggering mental image is performed in working memory, over a time scale ranging from hundreds to thousands of milliseconds (Damasio 1994:197). Working memory constitutes the arena where perceptual feedback about body states marks possible outcomes as positive or negative, and this evaluative process influences the continued operation of attention and working memory (Damasio 1994:197-8).

Associations between a particular class of situation and a particular body state are made in the ventromedial prefrontal area (Damasio 1996:1414). These links are dispositional (implicit) representations, potential patterns of activity in small assemblies of neurons, which hold the potential to reactivate an emotion (Damasio 1994:102). When there arises a situation some aspect of which has previously been encountered, related dispositions are activated, leading to the recall of pertinent information and emotional marking (Damasio 1996:1415).

The somatic marker increases the accuracy and efficiency of decision making, because it leads to an immediate rejection of an unpleasantly-marked possibility, and thereby lessens the number of alternatives left to be chosen between (Damasio 1994:173). At a conscious level, the somatic marker mechanism marks outcomes as positive or negative, and thereby leads to the deliberate avoidance or pursuit of a particular response (Damasio 1991:406). The somatic marker mechanism also works covertly, exciting or inhibiting subcortical neurotransmitter systems, by which means it provides subtle markers that suffice to interrupt an ongoing thought or action, switching attention to one set of representations rather than another (Damasio 1991:406). By biasing cognitive processing in this way, somatic markers steer decision making towards those outcomes that are advantageous to the individual (Adolphs 1999:475).

This kind of emotional processing not only guides the individual’s own behaviour, but can also be used to create models of other people through simulation (Adolphs 1999:477), resulting effectively in Theory of Mind. What another person is likely to do can be predicted by running in one’s brain a simulation of the same processes that the other person is running in his (Adolphs 1999:476). In other words, the somatic marker mechanism is used in the mental modelling and evaluation of past and future events: it is therefore vital for planning.

1.4     Summary

This chapter has established that social intelligence is necessary to humans for optimal biological functioning, and has discussed some of its probable neural mechanisms.

The next chapter looks at working memory, which is essential for the manipulation of social intelligence representations.

2.     Working memory

The focus now moves to working memory, because social intelligence, in the guise of evaluation and planning, requires working memory for the temporary storage and manipulation of information pertinent to social intelligence.

This introductory section outlines what is meant by working memory. Section 2.2 deals with the established model of working memory, including discussion of the recently-proposed episodic buffer in section 2.2.1, and of long-term working memory in section 2.2.2. Working memory capacity is the topic of section 2.3, followed by chunking (in section 2.3.1) and individual variation (in section 2.3.2). Section 2.4 goes into working memory and ageing, and section 2.5 discusses working memory and social intelligence.

2.1     Working memory

There is general agreement (Gordon 1997:306-7) that information (including facts about language) is stored cortically where it is used, in that region or combination of regions responsible for the underlying functions. When access to stored information is needed, activation is required not only in the relevant storage area but also in the multiple prefrontal regions deemed to be involved in memory search and co-ordination, as well as in the temporary storage of information and the intermediate products of processing (Gordon 1997:308). This is the function of working memory (WM), acting as a scratchpad that allows both old and new information to be briefly maintained in an active and manipulable form (Gordon 1997:307). The contents of long term memory (LTM) would be worthless, were it not for WM, which enables the stored long term memories to be brought together with ongoing sensory input in order to meet current demands (Bradshaw & Mattingley 1995:209). People with dorsolateral prefrontal brain damage exemplify such a deficit, as, despite having relatively intact long term memory, they are unable to integrate past events and immediate requirements on a moment to moment basis (Bradshaw & Mattingley 1995:209).

The defining quality of WM is its transient, on-line nature, providing a temporal bridge between both internally and externally generated events (Goldman-Rakic 1997:559). Its purpose is to bring representations to mind, and to keep them activated while cognitive processes operate on them. There is evidence at the cellular level for the role of prefrontal neurons in the maintenance of representational information in the absence of the original stimulus (Goldman-Rakic 1996:1448). It appears that WM hinges on a network of brain areas, depending on the task, stimuli and strategy involved, with the prefrontal regions playing an executive, supervisory role (Bradshaw & Mattingley 1995:210). Encoding and retrieval of semantic material, as well as other verbal processes, engage inferior lateral and/or anterior prefrontal regions, in addition to the insula (Goldman-Rakic 1996:1450). It is postulated that the increase in WM in humans has not so much added to the length of activation, but has rather allowed simultaneous access to more, and more complex, representations (Charlton 2000:181).

2.2     Working memory model

The now standard view is that WM is a tripartite system (Baddeley 1996:13469), with a central executive controlling attention, and two slave systems: a phonological loop to hold and manipulate speech-based information, and a visuospatial sketchpad functioning in the same way for visual images. Baddeley remarks (1996:13469) that, although it is far from complete, the tripartite model has been remarkably successful, both in accounting for experimental data, and in providing a framework for investigation.

Figure 3-1 Working memory tripartite model

The central executive is seen as a limited-capacity attentional system that controls the phonological loop and visuospatial sketchpad, and relates them to long-term memory (Baddeley 1999:66). The central executive has the capacity to focus attention, and to switch attention from one focus to another, as is needed to co-ordinate social behaviour (Baddeley 1996:13471). It is also a fractionable system that is involved more in encoding than in retrieval (Baddeley 2001:117). Cappa notes (2000:74) that executive functions (including action planning, reasoning, and problem solving) cannot be modular in the Fodorian sense of being computationally autonomous and informationally encapsulated, because they require access to unrestricted information, in order to function adequately. Moreover, functional imaging studies (some using functional magnetic resonance imaging (fMRI), and some positron emission tomography (PET)) suggest a common network of regions in the mid-dorsolateral, mid-ventrolateral, anterior insular, and anterior cingulate regions, which are recruited to solve diverse cognitive problems such as response selection, WM maintenance, and stimulus recognition (Duncan & Owen 2000:476,480).

An alternative view (based on the functional architecture of non-human primates) is that the central executive may be seen as an emergent property of the interactive operation of multiple domain-specific processors each connected to domain-relevant storage sites in posterior regions and to motor pathways (Goldman-Rakic 1996:1445,1450-1). This conception seems plausible, in view of evidence about frontal-subcortical circuits, sharing a prototypic structure, which are contiguous while remaining anatomically segregated, and about frontal lobe syndromes which are recapitulated by the similarities in performance deficits caused by damage at various levels of each circuit (Cummings 1993:873-7). The multiple domain model is also compatible with the idea of convergence zones holding a record of temporal conjunctions of activities in other structures (Damasio 1996:1416).

The phonological loop holds auditory information for some 1½-2 seconds (crosslinguistically constant) before the traces decay, although they may be maintained for about 10 seconds by articulatory rehearsal (Fabbro 1999:94). In a study of regional cerebral blood flow (rCBF), the two components of the phonological loop were localised in different areas of the brain: the phonological store in the left supramarginal gyrus (BA40), and the subvocal rehearsal system in the left Broca’s area (BA44) (Paulesu et al. 1993:344).

The phonological loop is thought to have evolved as a system to mediate language learning, with the primary purpose of storing novel speech input while more permanent memory records are constructed (Baddeley et al. 1998:170,158-9). A visuospatial “phonological loop” is claimed to exist in prelingually deaf signers. The internal structure is said to be strikingly similar to the phonological loop for speech, although American Sign Language does not appear to support as long a memory span as does the auditory phonological loop (Wilson & Emmorey 1997:317,319).

Tasks using the visuospatial sketchpad are thought to place heavier demands on the central executive, as many uses of visual imagery are less automatic than is phonological coding (Baddeley 1996:13470). The more visual aspects of imagery depend on the occipital lobes, while the more spatial aspects reflect activity in the parietal lobes, although the frontal lobes may also be involved in an imagery controlling function (Baddeley 1999:64-5). Imaging studies using PET found spatial WM to be mediated by a network of predominantly right hemisphere regions: the premotor and superior parietal areas mediate spatial rehearsal, while the inferior posterior parietal and anterior occipital areas mediate the storage function (Smith & Jonides 1998:12065).

There is evidence for other WM buffers: Jonides et al. (1996:82) report a number of studies that suggest the existence of a motoric WM, an auditory memory that does not store a phonological code, a semantic or propositional code and a dissociation between spatial and visual-object information.

2.2.1     The episodic buffer

Baddeley (2000a) has recently proposed a fourth WM component, the episodic buffer. The episodic buffer uses a multimodal code to provide temporary storage of information from the subsidiary systems and from LTM, binding such information into a unitary episodic representation (Baddeley 2000a:417). The buffer holds episodes by which information is integrated across space and possibly across time (Baddeley 2000a:421), hence it is called episodic.

Figure 3-2 Episodic buffer (Baddeley 2000a:421)

The shaded areas in the lower box represent crystallised cognitive systems capable of accumulating long term knowledge, while the unshaded areas represent fluid capacities such as attention and temporary storage (Baddeley 2000a:421).

Baddeley (2000a:420) reports evidence for a store capable of operating beyond the timescale assumed for the slave systems, that can temporarily hold and manipulate information such as that involved in the comprehension of a prose passage, which involves the activation of existing structures in LTM. Information such as a schema from LTM may be used to organise new material into chunks, but this raises the question of how this information is integrated and where the newly-formed chunks are stored (Baddeley 2000a:419).

Evidence for the episodic buffer as an integrated store of information from different modalities and systems is adduced (Baddeley 2000a:421) from the effect of visual similarity on verbal recall and from the impact of meaning on immediate recall of sentences and prose. The episodic buffer is assumed to have a limited capacity, to be controlled by the central executive, and to play an important role as a conduit for information passing into episodic LTM, and for retrieving such information (Baddeley 2000a:421). As well as storing a limited number of chunks of material, the episodic buffer is a modelling space for the combination and manipulation of information, to plan future actions or interpret recollected experience (Baddeley 2001:118).

The episodic buffer emphasises the integration of information, and is conceived of as using a common multi-dimensional code, so that it can serve as an interface between a range of systems that each use different codes (Baddeley 2000a:422,421). It is anticipated that the episodic buffer will have a limited capacity because of the computational demands brought about by the binding problem caused by simultaneously accessing a wide range of codes (Baddeley 2000a:421). The binding problem concerns how to bind together all the aspects of a complex object or representation, so they are perceived as pertaining to the same entity. Binding may be either static when a representational unit stands for a specific conjunction of properties, or dynamic when representational units are tagged to indicate whether they are bound together, so bindings of units in the representation stand for conjunctions of properties (Hummel 1999:85). Although dynamic binding is more flexible than is static binding, one of its disadvantages is that it requires much more attention and WM, such that there are likely to be firm limits as to the number of distinct tags available for dynamic binding (Hummel 1999:85). Perception of illusory conjunctions of properties by people with neurological deficits suggests that the parietal cortex plays a role in the binding problem (Treisman 1996:174).

Baddeley suggests (2000a:421) that the central executive can retrieve information from the store in the form of conscious awareness, and that the central executive can reflect on, manipulate and modify that information. The episodic buffer constitutes a mechanism for creating new cognitive representations, since the central executive can influence the content of the buffer by attending to a given source of information, whether it be perceptual, from another WM component, or from LTM (Baddeley 2000a:421).

2.2.2     Long term and working memory

The relationship between WM and LTM is not clear-cut, and a number of differing conceptions of the relationship exist (Collette et al. 2000:49). Baddeley’s conception (1996:13472) of WM is as a gateway, providing an interface between perception, attention, memory and action: he specifically rejects (2000a:422) the idea that WM might be simply the activated portion of LTM. For Logie (1996:41) WM is seen as a workspace, a set of cognitive functions to temporarily store and process information, with the slave systems acting as working buffers for information that has yet to be processed or is about to be recalled overtly, since sensory input passes through LTM to reach WM (Logie 1996:55,41). Cowan envisages WM as consisting of a limited-capacity focus of attention, plus a temporarily-activated portion of permanent memory information, including some automatically activated information (Cowan 1998:77).

Evidence from highly skilled performance by experts led Ericsson and Kintsch to propose (1995:211-3) that WM includes a mechanism (long-term working memory (LT-WM)) based on storage in LTM, which is kept accessible by retrieval cues. Expert skill in particular domains and activities (e.g. mental calculation, chess, medical diagnosis, or remembering dinner orders) allows an individual to acquire LT-WM and hence to extend his WM for that particular activity (Ericsson & Kintsch 1995: 234-8,213-4). The increase in WM capacity seen in experts is specific to their domain of expertise, and is related to their level of skill (Ericsson & Kintsch 1995:238).

Text comprehension is claimed to be an acquired skill, so, rather than maintaining temporary information in WM, skilled readers have the ability to access LTM from retrieval cues held in the active portion of WM (Ericsson & Kintsch 1995:228-9). Text comprehension has, of course, only been possible as a skill base since the invention of writing, and prior to that all comprehension would have been of verbal material, the complexity of which would be restricted by performance limitations on the speaker. There must be a certain element of acquired skill in language production ability, such that people whose livelihood relies on their ability to communicate effectively (such as lecturers or barristers) will necessarily have had many hours of rehearsal and will no doubt have built up routinised elements. However, it seems likely that much of their expertise in communication would be restricted to imparting their specific field of knowledge, leaving them at no particular advantage in normal social situations.

2.3     Working memory capacity

Miyake and Shah (1999:464) raise the question of the functional or evolutionary significance of WM limitation, asking why WM should be limited, since individuals with large WM capacities have an advantage over those with smaller capacities.

There are two elements to consider with regard to WM limitation: the first is developmental, in that WM capacity will necessarily be constrained and limited by the volume taken up by the neurons dedicated to WM. It therefore seems likely that childhood limitations on WM are the result of the brain’s physical size and lack of myelination. WM is known to increase during childhood, and a child’s brain undergoes huge growth in infancy, doubling in weight in the first year of life, and attaining three-quarters of adult size by the age of three (Smith 1970:342).

Verbal WM span increases dramatically between infancy and adulthood: a four year old child has a digit span of two or three items, whereas a fourteen year old has a span of about seven (Gathercole & Baddeley 1993:25). Large individual differences in capacity are found in childhood: 10% of a group of three year olds had a digit span of four, whereas 36% of the group only achieved this span two years later (Baddeley et al. 1998:159). Listening span ability is reported (Siegel 1994, cited by Gathercole 1999:411) to increase steeply until the age of sixteen, in contrast to other memory abilities in which developmental increases flatten off at about 11-12 years (Gathercole 1999:411).

The second element to consider about WM limitation is that of inter-individual variation between adults. As Miyake and Shah point out (1999:464), large WM capacity is adaptive, being positively related to factors such as intelligence and status. In regarding limitation as something to be selected for in evolutionary terms, they seem to have disregarded the deleterious effects that disease, adverse environments, and random impairments necessarily have on WM. There is no selective advantage for low WM: instead, the variation in human WM (like the variation in height or symmetry) is due to disease or adverse environment, which result in individuals being dragged down, to a greater or lesser degree, from the optimum (Bruce Charlton, pers. comm., 28th August 2002). It is also probable that variations in working memory are genetic in origin. It has been reported that more than half of the individual differences in adult IQ test performance are due to genetic factors (de Geus et al. 2001:489), and research on twins suggests that individual differences in working memory and general cognitive ability arise from individual variations in frontal lobe functioning, with a significant part of the variance in working memory being due to genetic factors (Wright et al. 2001:54). It is also suggested that the genetic contribution to cognition may not be fixed, as new genes appear to be expressed in the course of brain maturation (de Geus et al. 2001:493).

Nevertheless, a number of possible explanations for WM limitation are advanced by Miyake and Shah (1999:464): firstly, to prevent excessive brain activity (that might create positive feedback loops) and to promote focused and coherent processes; secondly, as a result of limitations due to synchronous oscillations (one of the suggestions put forward for tackling the binding problem); or thirdly, to facilitate certain kinds of learning (Miyake & Shah 1999:464-5). They claim (Miyake & Shah 1999:465) that there is evidence that severely restricted WM may be useful in detecting subtle statistical regularities in the environment, and that this ability is crucial to language acquisition. Kareev remarks (1995:268) that, in an environment where some order exists, small samples mean that people (and especially young children) are likely to encounter examples suggesting the presence of that order. Cowan (2001:108) gives the example of a correlation between height and voice pitch being more likely to be noticed in a sample of 4-8 individuals than across a larger sample, so a smaller sample increases the chance that a moderate correlation would be noticed at all. In this way, a limited WM capacity helps to avoid missing a correlation, but gives a higher likelihood of false alarms, although these are refuted by subsequent data (Kareev 1995:267-8). This is widely acknowledged to be the situation in child language acquisition, where the child starts from a subset of the adult language and moves towards the superset (Haegeman 1991:419).
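
The small-sample argument attributed to Kareev and Cowan can be illustrated with a short simulation. The sketch below is not taken from any of the cited studies: the population correlation of 0.4, the "noticing" threshold of 0.6, the sample sizes of 6 and 50, and the function names are all assumptions chosen for illustration. It estimates how often a sample correlation looks strong when few or many individuals are observed, both when a moderate correlation really exists and when none does (the false-alarm case).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def exceed_rate(rho, n, threshold=0.6, trials=2000, seed=1):
    """Share of samples of size n whose correlation exceeds the threshold.

    rho is the assumed population correlation; threshold is a hypothetical
    level at which an observer would 'notice' the relationship.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs, ys = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            noise = rng.gauss(0.0, 1.0)
            xs.append(x)
            # Construct y so that corr(x, y) equals rho in the population.
            ys.append(rho * x + math.sqrt(1.0 - rho * rho) * noise)
        if pearson_r(xs, ys) > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (6, 50):
        print(n, "noticed:", exceed_rate(0.4, n), "false alarms:", exceed_rate(0.0, n))
```

Under these assumptions, the small sample crosses the threshold more often in both conditions, which matches the trade-off described above: fewer missed correlations at the cost of more false alarms.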

Making an inference places heavy demands on WM, as it requires storing information from previous sentences, while concurrently processing new information, so people with a lower WM capacity not only take longer to process syntactically complex information, but they also have considerably lower accuracy in comprehension (Just & Carpenter 1992:129). People with low WM spans may be doing fundamentally different things from those with high spans, when reading (Daneman & Carpenter 1980:464). There is undeniable variability among individual brains as to the size and location of cortical areas (Brown & Hagoort 1999:8) and individual differences in cognitive performance must be expected as an inevitable concomitant of this. Evidence from electroencephalograms (EEGs) suggests that people who scored highly on the WAIS-R test of general cognitive ability were better able to focus and sustain attention during a WM task (Gevins & Smith 2000). EEG results indicated that subjects with high ability developed strategies that made relatively greater use of parietal regions, whereas those with low ability relied more exclusively on frontal regions (Gevins & Smith 2000). Activation of different brain areas has been reported (Raichle 1993:584) for people holding lists of nonwords, depending on whether people’s performances on the task were good (premotor and cingulate) or bad (occipital and cerebellum).

2.3.1     Chunking

Chunking is the process whereby memory is increased by gathering together bits of cognitive or perceptual information into larger units, known as chunks, which are then processed as single units. What constitutes a chunk is fairly elastically defined: as Simon puts it (1974:484) “a chunk of any kind of stimulus material is the quantity that short-term memory will hold five of”. A chunk functions as a single entity, so it is not possible to access relations between items within a chunk, although relations between the chunk and other chunks, or other items, can be accessed (Halford 1998:145).

The mean memory capacity among adults is three to five chunks, with a maximum range of two to six chunks in individuals (Cowan 2001:91,114). By building larger and more enriched chunks, with each chunk holding more information, the amount of information held can be increased, although the number of chunks remains the same. As larger numbers of concepts need to be organised into a single chunk, WM is involved to a greater extent, because all of those concepts must be held simultaneously within WM in order to be grouped together into one chunk (Daneman & Carpenter 1980:464). However, as Daneman and Carpenter point out (1980:464), although the actual process of forming rich chunks imposes a temporary strain on WM, it nevertheless brings a benefit in that having a quantity of concepts recoded as one chunk then reduces the load on WM and releases functional capacity for subsequent processing. An undergraduate subject was reported to have increased his digit span (presumably forward-span, rather than reverse-span) from under 10 digits to 80 digits, by chunking the numbers into meaningful units (representing foot race times, ages, or dates) and then organising these chunks into a hierarchy (Ericsson et al. 1980, cited by Bock 1987:341). The limit on the number of chunks was still seemingly observed, however, as the subject gathered the digits into groups of three or four digits, and then generally used three groups in his supergroups (Ericsson et al. 1980, cited by Cowan 2001:104).
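
The hierarchical grouping described above can be sketched in a few lines of code. This is an illustration only, not a reconstruction of the subject's actual strategy: the fixed group size of four digits, the fan-out of three groups per supergroup, the top-level limit of four units, and the function names are assumptions, and the meaningful recoding of each group (as race times, ages, or dates) is not modelled.

```python
def group(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_hierarchy(digits, group_size=4, fanout=3, limit=4):
    """Group digits into chunks, then chunks into supergroups, and so on,
    until the number of top-level units is no greater than `limit`.

    Returns the list of levels, lowest level first.
    """
    level = group(list(digits), group_size)
    levels = [level]
    while len(level) > limit:
        level = group(level, fanout)
        levels.append(level)
    return levels


if __name__ == "__main__":
    digits = "0123456789" * 8  # 80 digits, as in the reported span
    for depth, level in enumerate(build_hierarchy(digits)):
        print("level", depth, "units:", len(level))
```

With these assumed parameters, 80 digits form 20 groups, then 7 supergroups, then 3 top-level units, so the number of units held at any one level of the hierarchy stays small even though the total number of digits is large.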

Daneman and Carpenter propose that the chunks formed by subjects with higher spans will be qualitatively different from, and richer than, those formed by lower span subjects, and that the difference between good and poor performers lies in the efficiency of their processing (Daneman & Carpenter 1980:456, 461, 464-5). Differences in processing efficiency may be attributable to a greater proportion of the available WM capacity being absorbed by slower and less efficient processes (Daneman 1984:368). The time devoted to lower level processes, such as word retrieval, could not then be used for other, higher level, processes (Daneman 1984:371).

2.3.2     Individual variation

There seems to be general agreement (Kintsch et al. 1999:420) that no single all-encompassing factor exists that is responsible for WM capacity limitations. Although computational and architectural limitations (assumed to be universal) may differ across individuals, it appears that individual differences are based on the characteristics of individuals, and may be related to knowledge and skill (Kintsch et al. 1999:421). This proposal ignores disease-related effects, which bring about decreased levels of attention and concentration, as well as effects from brain damage or mental handicap.

Differences in individual WM capacity are thought (Engle, Kane & Tuholski 1999:104, 103) to reflect differential ability in controlled processing, required to maintain goals in the face of interference or distraction. Controlled processing therefore pertains to the functioning of the central executive, rather than the WM system as a whole (Engle, Kane & Tuholski 1999:104).

Individuals are thought to differ in the functioning of the prefrontal cortex, especially the dorsolateral region (BAs 9, 10, and 46), which is the area that is critical to both WM and controlled attention abilities (Engle, Kane & Tuholski 1999:105,116-7). Frontal lobe injury leads to impaired executive control over other cognitive activity, which results, inter alia, in poor abstract thought, reduced skill in problem solving, and a failure to plan ahead or monitor behaviour (McDonald 1998:492). The cognitive deficits associated with frontal lobe damage show up particularly in everyday activities, as carelessness, unreliable judgement, poor adaptability to new situations, and blunted social sensibility (Lezak 1995:91). There is also some evidence that the prefrontal cortex is involved in performance on tasks that reflect gF, general fluid intelligence (Engle, Kane & Tuholski 1999:122). The prefrontal cortex is considered to be critical to both the functioning of, and individual differences in, WM, controlled attention, and fluid intelligence (Engle, Kane & Tuholski 1999:122).

2.4     Working memory and ageing

Cognitive function declines progressively across the life-span, and this decline is both regular and of considerable magnitude (Park 2000:6). The decline has been shown in tests of speed of processing, WM, and both free- and cued-recall (Park 2000:6). Four mechanisms have been proposed to account for age-related decrements in cognitive functioning, namely speed of processing, WM, inhibitory function, and sensory function (Park 2000:8). Cognitive slowing means that, in a complex cognitive task, older adults may no longer have available to them the products of the earlier stages of processing; while the selection of the most recent among multiple-choice answers is increased by aural presentation in place of written presentation of answers (Park 2000:10-11). Deficits in inhibition have been cited as the reason why older people are more likely to maintain information that is subsequently disconfirmed (Park 2000:15), although Tompkins et al. (1994) found that their subjects without brain damage had no difficulty in revising inferences. Lindenberger and Baltes (1994, cited by Park 2000:16-7) found that nearly all the variance in a wide range of tests of cognitive ability was accounted for by sensory functioning, as measured by simple tests of visual and auditory acuity, in their study of a large sample of older adults (aged 70-103 years).

Age is reckoned to have a greater effect on nonverbal than verbal intelligence, as exemplified by scores on the Wechsler Adult Intelligence Scale (WAIS), where performance IQ (PIQ) begins to decline around age 50, whereas verbal IQ (VIQ) does not decline until about 60 years (Reuter-Lorenz 2000:97). A possible confounding factor is that PIQ tests demanded inhibition of irrelevant elements in the stimuli, whereas VIQ tests had minimal selective attention requirements, and that both inhibitory processes and selective attention are deemed to depend on prefrontal cortex, which is affected disproportionately by age (Reuter-Lorenz 2000:97). Decreased activation in the left dorsolateral frontal region has been observed in normal ageing (Grady et al. 1995, cited by Cappa 2000:71), and, since this region is implicated in encoding semantic information, it is suggested (Cappa 2000:71) that this could be the neural correlate of defective encoding, and hence age-related memory impairments.

A study by Wingfield et al. measured performance in groups of younger and older adults on a spoken version of the Daneman and Carpenter WM span test, and found that whereas the younger group’s average WM span was 4, the average WM span was only 2.5 in the older group (Wingfield 2000:183). Although it is generally agreed (Grady & Craik 2000:224) that memory performance declines with age, some areas of memory show a greater decline than others. Recognition memory, and short-term memory (tested by Digits Forwards) suffer slight age-related decrements; whereas losses on free- or cued-recall, and WM tasks are substantial (Grady & Craik 2000:224-6). PET imaging studies show that, whereas younger adults have left lateralised prefrontal cortex activity during VIQ tasks, frontal cortex activity in older adults is bilateral (Reuter-Lorenz et al. 2000, cited by Grady & Craik 2000:226). It is suggested (Grady & Craik 2000:226) that this recruitment of frontal cortex in older adults could be compensatory.

2.5     Working memory and social intelligence

Having established some features of WM, the next step is to discuss its relationship with social intelligence.

Baddeley and Logie (1999:28-9) give considerable importance to aspects relevant to social intelligence, in their definition of WM as

“those functional components of cognition that allow humans to comprehend and mentally represent their immediate environment, to retain information about their immediate past experience, to support the acquisition of new knowledge, to solve problems, and to formulate, relate, and act on current goals.”

WM must necessarily play an integral part in manipulating social intelligence information, by permitting the creation and orchestration of complex representations of other individuals and social scenarios. It is argued (Nelson 1990, cited by Naito & Komatsu 1993) that the basic function of memory is to provide guidance for action and to predict what will happen. The definition of intelligence has, similarly, been proposed to be the ability to guess correctly, and the ability to discover unexpected orderliness (Barlow 1983:208). In view of the congruence between these functions, it is to be expected that there will be a relationship between memory and intelligence.

WM, particularly the central executive component, is considered to be highly connected with general fluid intelligence (gF), the ability to solve novel problems and adapt to new situations (Engle, Laughlin, Tuholski & Conway 1999:310, 313). A frequently-cited earlier study (Kyllonen & Christal 1990:426) claims that general reasoning ability and WM capacity are very highly correlated. It should be noted, however, that some of their WM tasks were extremely similar to their reasoning tasks. For example, AB Grammatical Reasoning (a reasoning task) and ABCD Grammatical Reasoning (a WM task) both required subjects to process sentences of the form A precedes B; while both Mathematics Knowledge (a reasoning task) and ABC Numerical Assignment (a WM task) required subjects to solve equations. It should not, therefore, be surprising that the overall WM and reasoning tasks correlated so well, as they seem to have been testing largely the same abilities. Although they claim (Kyllonen & Christal 1990:392) that their WM tasks test both storage and processing, at least one of the tasks (Digit Span) was a recognition, not a recall task; whereas other tasks required subjects first to store information, then to process that stored information. The tasks were therefore successive, rather than simultaneous.

The ability to guess correctly requires the efficient use of all the available information (Barlow 1983:208), and such information will presumably be held and manipulated in WM. A problem arises where the amount of available information is overwhelmingly large: it is at this point that Barlow’s second aspect of intelligence, that of discovering unexpected orderliness, comes into play. The intelligent individual is someone capable of finding meaningful associations in an enormous quantity of data, since this requires knowledge of the associative structure of a body of information (Barlow 1983:209). The task of guessing correctly in the face of insufficient information or completely novel circumstances similarly requires intelligence (Barlow 1983:208-9), although it is to be assumed that any such situations will be internally represented as social intelligence information.

2.6     Summary

This chapter has surveyed what is meant by working memory, and the current model of its functioning. It has discussed long-term working memory, working memory capacity, chunking, and individual variation in ability. The effects of ageing on working memory were reviewed, as was the relationship between working memory and social intelligence.

The theme of social intelligence is pursued in its relationship with language, the topic of the next chapter.

3.     Language

This chapter examines firstly the interaction between language and social intelligence, in section 3.1; then that between language and working memory, in section 3.2. Section 3.3 deals with what constitutes complexity in language production. The measurement of complexity is discussed in section 3.4, and this is followed by the argument for individual variation in language, in section 3.5.

3.1     Language and social intelligence

Predication, the sharing of information, has been described as the “core business” of language (Levelt 2000:152). Much, however, hangs on the kind of information that is to be shared. Although, as Levelt points out (2000:151-2), language can be used for exchanging experiences, transmitting skills, and planning joint actions, a more likely scenario is that proposed by Dunbar (1996:123), namely that language evolved as an aspect of social intelligence, for the promulgation of gossip, the exchange of socially relevant information, and the management of reputations.

Language is dependent on two component systems: a social cognition network responsible for lexical acquisition, and a grammatical system responsible for utterance analysis and computation (Locke 1999:380). Locke (1998:191) distinguishes between speaking which conveys information encoded in spoken language, and talking which is sound-making to maintain social cohesion with others. Talking is socially-oriented, and is heavily reliant on support from non-verbal communication (Locke 1998:192). He points out that, although propositional speaking often does occur during talking, it is optional in many circumstances (Locke 1998:192). Children are involved in the social interaction of talking, long before they develop speaking to exchange information, but through talking, they become aware that the activity can be used to communicate thoughts (Locke 1998:192; Locke 1999:378).

From the earliest stage, infants respond to vocal affect, and in this way come to recognise and predict caregivers’ behaviour (Locke 1999:382). There is a clear consequence for survival in the child’s ability to monitor the affect of others who are capable of judging danger, as he becomes increasingly mobile (Locke 1996:256). Later, when he names things, the child demonstrates to others that he knows and can say these names, thereby signalling his claim to personhood and membership of the social group (Locke 1999:383).

Under the Chomskyan paradigm, the principles of language are assumed to be innate and invariant, with the functional category options that instantiate a given language being fixed during the process of language acquisition (Radford 1997:12). Functional categories represent such concepts as definiteness, perfectivity, passivity, habituality, and relationships between elements, all of which may be considered necessary for conveying social intelligence information. It is claimed (Cinque 1999:106-7) that functional categories are represented in adult grammars in a universally invariant order (although any particular language may instantiate only a subset of them). During the process of acquisition, the child has access to only those functional category options relevant to his current stage of development.

The emergence of behavioural patterns is related to the functional maturation of the brain and cycles of myelination (Lecours 1975:121). If the information conveyed by functional categories depends on the representation of body state feedback, the timing of the availability of the functional categories could depend on the myelination of the areas concerned with its cortical representation. This could provide an explanation for the child’s sequential awareness of particular elements of the triggering data during language acquisition.

A young child, even a child with impaired language, needs only one exposure to a new word to acquire its meaning (Dollaghan 1987:220). A young child acquires amazing numbers of new words every day: a two year old knows some two hundred words (Locke 1997:277), but by six, he knows some ten thousand, and by eighteen some sixty thousand (Bloom & Markson 1998:68). Since it is assumed that exposure to a new word evokes an emotional response, subsequently generating a feeling, and dispositional representation, rapid access to emotional responses is required. It is notable that, by three years, the myelination of the subcortex is complete (Thatcher et al. 1987), which would speed up emotional responses. As the somatic marker mechanism is presumed to be associated with the meaning of words and propositions, it is tempting to speculate that somatic markers may constitute Logical Form, the hypothesised interface between the language faculty and the conceptual-intentional system of cognition (Chomsky 1995:2).

When a word is learned, activations pass between the word-formation system (Wernicke’s area) and the motor-control system (Broca’s area), via both the cortical route (arcuate fasciculus), and the subcortical route through the basal ganglia and thalamus (Damasio & Damasio 1992:67). Language processing is thought to involve the parallel operation of both the cortical “associative” and subcortical “habit” systems (Damasio & Damasio 1992:67). This involvement of the basal ganglia implies that word learning is mediated by the SMM, as the connection is formed between the word, the concept, and the body state representation.

The medial temporal circuit, connected with the temporal and parietal lobes, subserves declarative memory (learning and storage of information about facts and events), and probably words as well, since they are also arbitrary. Circuits connecting the basal ganglia and frontal cortex subserve procedural memory (learning and processing of motor, perceptual and cognitive skills) and probably also grammatical rules (Ullman et al. 1997:267). The cross-linguistically identical theta roles associated with a given verb are almost like a mini schema (or cognitive framework), and attest the verb’s original nature in motor activity. When the verb is first enacted as a body state representation, with the relevant agent, theme, goal, etc., the expectation of their presence will be encoded with the representation of the verb itself, and will also be accessed when the verb is accessed. It has been noted (Tomasello 2000:156-7) that a child’s earliest words are item-based, organised around a concrete schema, and that semantically similar verbs are used in only one type of sentence frame.

The earliest verbs a child acquires are generally concrete activities (Tomasello 2000:156), which may be assumed to bring about specific body state representations. Verbs that depict mental states are not acquired until appreciably later in development, around 2;6 to 2;10 (Limber 1973:172). Presumably these mental state verbs are interacting with the nascent ToM mechanism, since the child must appreciate that people have mental states before he can speak about those states; it has been noted (Hoff-Ginsberg 1993:567) that children do not truly attempt to communicate ideas until the age of about three, when they have developed an understanding of the mental states of others.

One class of learning disabilities is in integration, and consists of a deficit in acquiring meaning and symbolic significance (Johnson & Myklebust 1967:21). This class of problems is exemplified in echolalia (when the speaker repeats what he hears) and word-calling (when the word-caller identifies the word he sees in print), yet in neither case is any meaning associated with the words (Johnson & Myklebust 1967:21). It is possible that these individuals have a diminished, or damaged, somatic marker mechanism, and are consequently unable to form the connection between a word and their own body state feedback which would allow them to impute meaning to the word.

The relation between word and meaning is also lost or damaged in people with Wernicke’s aphasia, or transcortical sensory aphasia, who have damage to the left parietal lobe. Such people produce many paraphasias (real words, perhaps related in meaning to the target word, but perhaps apparently randomly selected) and neologisms (possible, but non-occurring, “words”), as well as an increased number of indefinite terms (something, this, here). There is also a frequent occurrence of semantic paraphasias after subcortical damage (Lesser 1990:406). People with semantic dementia (associated with focal temporal lobe atrophy) have a profound, progressive, and often precipitous, loss of semantic knowledge, affecting not only language but also object recognition and factual knowledge (Hodges et al. 1992:1798,1803).

3.2     Language and working memory

Working memory is a necessary prerequisite for processing syntax, in both comprehension and production (Hagoort et al. 1999:277). Parsing principles in language comprehension suggest (Kimball 1973:40) that, although semantically the unit of perception is the sentence, syntactically the unit of perception is the phrase. What was then called short-term memory (STM) holds a chunk (defined as a node and all its immediate constituents (Kimball 1973:38)) until it has been parsed syntactically; whereupon the chunk is removed from STM, and is available only to semantic processing.

As the length of working memory is presumed to be approximately 1 to 2 seconds (Baddeley 1986:93), and speech to be delivered at the rate of two to three words per second (Levelt 1999:112), it may be seen that the number of words that would be expected to be held in working memory corresponds closely to Miller’s magical number of seven plus or minus two (Miller 1956). The number of unrelated words that can be remembered is, indeed, approximately six (Baddeley & Hitch 2000:134).
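
The arithmetic behind this estimate can be made explicit. The sketch below simply multiplies the two ranges quoted above (a store lasting 1 to 2 seconds and a speech rate of two to three words per second) and compares the result with Miller's range of five to nine items; the function names are illustrative assumptions, and treating the bounds as simple products is a simplification.

```python
def word_span_range(duration_s, rate_wps):
    """Lowest and highest number of words that fit in the store.

    duration_s: (min, max) duration of the store in seconds.
    rate_wps:   (min, max) speech rate in words per second.
    """
    low = duration_s[0] * rate_wps[0]
    high = duration_s[1] * rate_wps[1]
    return low, high


def overlap(a, b):
    """Intersection of two closed ranges, or None if they do not meet."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


MILLER_RANGE = (5, 9)  # seven plus or minus two

if __name__ == "__main__":
    estimate = word_span_range((1, 2), (2, 3))
    print("estimated span:", estimate)
    print("overlap with Miller's range:", overlap(estimate, MILLER_RANGE))
```

On these figures the estimate runs from two to six words, so it is the upper end of the estimate that falls within Miller's range and agrees with the figure of approximately six unrelated words.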

However, it is well-known that speech is packaged into tone units (also known as intonation units, or information units) that indicate which elements belong together in an utterance (Leech & Svartvik 1994:18,194). Each tone unit averages some four or five words, contains a stress nucleus, and represents a separate piece of information, e.g. |the man told us |we could park |at the railway station | (Leech & Svartvik 1994:18,194). It is common for the speaker to lengthen the word immediately before a clause boundary, and to pause for a beat (perhaps about 250 msec) between clauses (Wingfield & Stine-Morrow 2000:363). Evidence from event-related potential (ERP) recordings suggests that the detection of intonational boundaries is very important in speech perception, and that listeners adjust their syntactic strategies according to prosodic cues (Van Petten & Bloom 1999:104). It must be assumed that this form of chunking allows the listener to process a number of tone units consecutively.

Meaning plays a large part in determining how much can be remembered, and the average adult is able to recall sentences of 24 or 25 syllables correctly (Lezak 1995:364). Indeed, memory span for sentences is approximately 16 words (Baddeley & Hitch 2000:134), and the final sentence in the Sentence Repetition subtest in the Multilingual Aphasia Examination (Benton & Hamsher 1989) consists of 18 words comprising 24 syllables, viz: The members of the committee have agreed to hold their meeting on the first Tuesday of every month. Cases have been reported where a person has a very short span for unrelated words, but a relatively well preserved recall of meaningful sentences (Lezak 1995:364). The implication of this is that the presence of meaning seems to mobilise additional memory mechanisms in support of the phonological loop.

3.2.1     Memory in language comprehension

Working memory has long been thought to play a role in reading comprehension, influencing in particular the retrieval of facts, and the computation of anaphoric pronominal reference (Daneman & Carpenter 1980:450). Subjects’ performances on the reading span test and listening span test were shown to correlate significantly with their ability to answer factual questions about a short passage of text, and to compute the referent of an anaphoric pronoun in the passage (Daneman & Carpenter 1980:455-6, 459). It should, however, be noted that this is an epiphenomenon of text and reading, in that, in almost all spoken interactions, the hearer would be able to question the speaker about the identity of the referent, were it not clear.

Working memory is implicated in the reading of “garden path” sentences, where the initial interpretation [baʊ] has to be revised in the light of following material (e.g. The violinist took a bow. … It had been propped on the music stand) (Daneman 1984:375). It should be noted, firstly, that such sentences are extremely unlikely in speech, because either pronunciation or prosody will disambiguate them; and secondly, that they may be peculiar to English, since a more inflected language would not have the necessary homographs.

The creation of inferences is also sensitive to working memory differences. Daneman (1984:376-7) found that high span subjects were significantly more able than low span subjects to integrate clues spread throughout a 25-page detective story, and to name the perpetrator correctly. The necessity to store information, and then use it in order to parse, disambiguate, and integrate subsequent text, taxes both the storage and processing functions of working memory, which compete for limited capacity resources (Daneman & Carpenter 1980:450-1). In spoken interaction, however, the hearer can simply ask for information if he fails to understand the speaker.

Processing embeddings makes demands on working memory, the classic example being an object-trace relative clause (often referred to as a centre-embedded relative) such as The reporter_i that the senator attacked t_i admitted the error (Just & Carpenter 1992:128). The greater difficulty of an object-trace over a subject-trace relative (e.g. The reporter_i that_i attacked the senator admitted the error) has been attributed to the fact that the same element (reporter) functions as both subject and object (Just & Carpenter 1992:129). In linguistic terms, under the Chomskyan paradigm, the head of the chain in an object-trace relative is further from the foot, and must cross more nodes, than is the case in a subject-trace relative. Reading time experiments show that comprehension of object-trace relatives is slower in subjects with low spans than in those with high spans (Just & Carpenter 1992:130).

3.2.2     Memory in language production

There is general agreement on a broad outline of the production process (Bock & Levelt 1994:945). This proceeds in a top-down fashion, from the speaker’s intended meaning at the message level; through the functional level, where lexical selection and the assignment of syntactic functions occur; to the positional level, where the constituents are assembled in an ordered set of word slots and morphological slots; and down to the phonological level, where phonological segments and prosody are encoded, ready for the output systems (Bock & Levelt 1994:945-6). Language production is assumed to be incremental, allowing limited parallel processing to occur across stages, with higher levels delivering information concerning only part of the element under construction piecemeal to levels lower in the hierarchy, before the whole representation of that element is complete at the higher level (Berndt 2001:379).

In order to plan and organise output, information must be retrieved from long term memory, and integrated in real-time with other information passing through working memory (Olson 1973:156). Message generation is therefore dependent on working memory function, and is thought (Barch & Berenbaum 1997:409) to demand more capacity than other aspects of language production. Among the long term memory items that must be held activated are general world knowledge, and knowledge about lemmas, which are representations of semantic and syntactic information. Limited capacity buffers, specific to each level of processing, maintain representations of knowledge activated from the long term store (Martin & Freedman 2001:264). There is an obvious conflict between the top-down manner in which processing from message to output (described above) is assumed to occur, and the bottom-up approach assumed as the syntactic tree is constructed by successive applications of the operation Merge, in the Minimalist Program paradigm. It is proposed (Martin & Freedman 2001:278) that syntactic planning is incremental, with a buffer to retain clause fragments as they are planned, so that they can be integrated with the structure of earlier fragments to create a syntactically coherent whole.

Evidence from speech errors indicates that, although the words involved in most word exchange errors originate in the same clause, some 20% come from adjoining clauses, and hence it is assumed that no more than two clauses can be planned at once (Garrett 1980, cited in Bock & Levelt 1994:967). However, as Bock and Levelt point out (1994:971) speakers rarely know precisely how their sentences will end before they begin them. Indeed, there is evidence from reaction time studies (Ford & Holmes 1978:42,47) that speakers plan a subsequent clause during the end of the previous clause, and that each clause is independently formulated into its surface form as the sentence is being produced. Clearly there is a difference between a matrix clause and an embedded clause, in the length of activation required. A matrix clause, by definition, is that in which other clauses are embedded, and consequently it must be held in working memory until the end of the utterance, whereas, in many cases, the representation of an embedded clause can be terminated as soon as the clause is uttered.

In producing an utterance, a speaker must undertake a considerable amount of parallel processing, simultaneously formulating several elements at different levels. As Levelt points out (1999:112) there is no more complex cognitive-motor activity than speaking, since it requires the speaker to co-ordinate his semantic, syntactic, and phonological systems, while at the same time he must also monitor the content, grammaticality, and articulation of what he has produced. Not only must the speaker correct his articulatory and grammatical errors, but he must also take into account the needs of the listener.

Conforming to the Gricean conversational maxims of quantity, quality, relation, and manner (Grice 1975) requires that the speaker should supply any necessary background information, and monitor the listener’s comprehension, making repairs when they are needed. The generalised requirement of relevance necessitates that the speaker should keep activated in working memory the topic he is addressing, and adhere to it. The maintenance of cohesion and coherence across a conversational turn requires shifts of the speaker’s attention between the ongoing string and previous utterances (Thomas & Fraser 1994:589). Clearly the speaker must also obey the discourse requirements of his culture, observing such things as politeness formulae, which will be observed consciously, and therefore demand attentional resources, although it is likely that the production of variants that are sociolinguistic markers will be below the level of consciousness (Wardhaugh 2002:206), and should therefore not create additional demands on attention.

3.3     Complexity in language production

This section surveys those elements which are acquired very late in childhood or adolescence, which are particularly susceptible to damage in cases of aphasia, and which create particular problems for people with known language disorders or disabilities.

Whereas some groups, such as those with Specific Language Impairment (SLI), have no particular problems with working memory (Fletcher 1999:350), other groups have lower than normal adult working memory capacity. This is the case in childhood, for example, where WM capacity typically increases two- or three-fold between the ages of 4 and 14 (Gathercole 1999:410), as span on a Digits Forward task increases from 2 or 3 at four years, to about 7 at fourteen years (Gathercole & Baddeley 1993:25).

Working memory deficiencies are also evident in people with mental retardation, who have problems in developing strategies for chunking information, so they have to recall unrelated bits of information, which quickly overloads their memory capacity (Owens 1989:119-120). As IQ falls, information processing becomes slower, and problems with language, especially in production, increase (Hulme & Mackenzie 1992:13-14). The language development of people with Down’s syndrome often ceases at the age of 12, with a Mean Length of Utterance of about 3 (Hoff 2001:343), and they rarely progress beyond the simple phrase structures of a typical two year old (Pennington & Bennetto 1998:87). Most people with Down’s syndrome fail to acquire knowledge of sentential embedding, or of how to use complex questions (Tager-Flusberg 1999:319). There is, however, tremendous variability in linguistic function within and across subgroups of mental retardation: people whose morphosyntax is comparatively spared have relatively intact verbal working memory, and only those who have a digit span of four or more achieve complex syntax (Fowler 1998:311,314-5).

It is assumed that the elements which are late-acquired, easily damaged, or problematic for these groups represent loci of conceptual and/or computational complexity in language production. The following sections (3.3.1 to 3.3.3) discuss the evidence concerning those elements (optional Complementiser Phrases, adverbs and adverbials, and modifier phrases in the form of attributive adjectives) which are considered to exemplify both syntactic complexity and relevance to social intelligence.
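The summary defines the ILC score as the combined frequency of occurrence, per 100 intelligible transcribed words, of optional complementiser phrases, adverb phrases, modifier phrases and adverbials. A minimal sketch of that calculation follows. It assumes the elements have already been identified and counted by hand; the function name, the category labels and all the figures are hypothetical and are not data from the study.

```python
def ilc_score(counts, intelligible_words):
    """Combined frequency of the counted elements per 100 intelligible words."""
    if intelligible_words <= 0:
        raise ValueError("transcript must contain at least one intelligible word")
    return 100 * sum(counts.values()) / intelligible_words

# Hypothetical hand counts for one speaker's transcript.
counts = {
    "optional_cp": 6,       # relative clauses and adjunct subordinate clauses
    "adverb_phrase": 9,
    "modifier_phrase": 7,   # e.g. attributive adjectives
    "adverbial": 8,
}

score = ilc_score(counts, intelligible_words=400)
print(f"ILC = {score:.1f} elements per 100 words")
```

Because the denominator is the number of intelligible words rather than the number of sentences or utterances, a score of this kind does not depend on how the transcript is divided into units.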

3.3.1     Optional Complementiser Phrases

The term optional Complementiser Phrase (hereafter CP) is used to refer both to relative clauses, and to those clauses introduced by a subordinating conjunction which function as adjuncts but not as complements of the verb. These clauses are represented in a syntactic tree as being headed by a CP element.

Relative clauses require syntactic movement, and chain formation, consequently incurring computational costs, and the difficulties that nested embeddings resulting from relative clauses present in both production and comprehension have long been noted (Limber 1973:183). Even in adult speech, complete and grammatical utterances containing nested embeddings are much less frequent than might be expected, and a variety of devices is used instead, including recapitulation of elements, insertion of a coreferent pronoun in the relative clause, and anacoluthon (the breaking off of one clause, to start another) (Limber 1973:183).

The ability to produce relative clauses develops throughout childhood, and both the expansion of relatives to include modification of objects rather than subjects, and the use of centre-embedded clauses, are signs of mature written varieties (Scott 1988a:54-5).

In child language acquisition, the order in which the subordinating conjunctions emerge appears to be partly related to the difficulty of the concepts they encode (Bowerman 1979:287), and, in adult language, syntactic complexity may, to at least some extent, represent the complexity of the relations between the concepts expressed (Barch & Berenbaum 1997:408). It is inherently low-frequency structures that indicate growth of complexity (Scott 1988a:58). High-frequency subordinating conjunctions are when and because, which together account for some 75% of all adverbial clauses produced by 9 to 19 year olds; mid-frequency subordinating conjunctions are if and so (that); while although, as, even if, provided that, and unless are low-frequency subordinating conjunctions (Scott 1988b:70-1). The low-frequency items are regarded as being sensitive indicators of syntactic development in adolescence (Scott 1988b:71), and an 11 year old whose subordinating conjunctions are still limited to because, if, and when, would have a subtle linguistic impairment, as he is unable to exploit the full range of language (Scott 1988a:58).
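The frequency bands described above can be represented as a simple lookup table. The sketch below assigns each subordinating conjunction named by Scott to its band and tallies the bands in a stretch of text. It is a naive string matcher, offered only as an illustration: it cannot distinguish conjunction uses of as, so, if or when from their other uses, so a real analysis would require syntactic annotation. The names BANDS and band_counts, and the sample sentence, are hypothetical.

```python
import re
from collections import Counter

# Bands as listed in the text (Scott 1988b:70-1); "so" stands in for "so (that)".
BANDS = {
    "high": ["when", "because"],
    "mid": ["if", "so that", "so"],
    "low": ["although", "as", "even if", "provided that", "unless"],
}

def band_counts(text):
    """Tally frequency bands of the listed conjunctions found in text."""
    lookup = {form: band for band, forms in BANDS.items() for form in forms}
    # Try longer forms first so that 'even if' is not also counted as 'if'.
    forms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b")
    return Counter(lookup[m.group(1)] for m in pattern.finditer(text.lower()))

sample = ("I stayed because it rained, although I would have left "
          "if the bus had come, even if it was late.")
tally = band_counts(sample)
print(dict(tally))
```

A tally of this kind makes it straightforward to ask what proportion of a speaker's subordinating conjunctions fall in the low-frequency band, which is the band the text identifies as the sensitive indicator of development.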

People with language learning disabilities (LLD) have problems in receptive language in dealing with terms expressing spatial or temporal relations e.g. before, after, and first (Montgomery 1992:518). Relational terms such as before and after also present great problems to people with various forms of mental retardation (Fowler 1998:302).

Subordinating conjunctions such as although, unless, until, because have been reported to be difficult for people with LLD to comprehend, possibly because of the subtlety of the relations they encode (Montgomery 1992:518). They may possibly also present difficulties because of the additional memory and attention load of maintaining both the main and the subordinate clauses in working memory until the end of the utterance.

People with schizophrenia have been reported to use fewer clausal and sentential connectives (Thomas et al. 1990:207) and to use fewer embedded clauses than do control subjects (Morice & Ingram 1982:15).

It therefore seems plausible to include optional CPs among those elements that constitute complexity and represent social information.

3.3.2     Adverbs and adverbials

Adverbs and adverbials may occur as adjuncts giving additional information (e.g. seldom, yesterday, in the rain), disjuncts providing a comment (e.g. fortunately, perhaps), or conjuncts connecting to the context (e.g. therefore, on the contrary). People with language learning disabilities have problems with the productive use of such terms as yet, after all, nevertheless (Montgomery 1992:523), which exemplify each of these three kinds.

Only a few conjunct adverbs (anyway, now, so, then, though) occur in the speech of children up to 12 years, but more seemingly develop during adolescence, as adults use three times as many conjuncts (Scott 1988a:55-6).

Children with SLI have problems in producing adverbials in the form of Prepositional Phrases (PPs) (Gavin et al. 1993:200,204), avoiding them with both transitive and intransitive verbs (Fletcher 1999:361), and adverbials expressing time are particularly difficult for such children (Fletcher 1990:448). It has been noted that children with SLI produce significantly fewer adverbial predicates than do normal children, and that they are less likely to give information indicative of time, place, manner, or quantity (Johnston & Kamhi 1984:75,78).

Adverbs and adverbials are also considered to be instances of complexity, and to give socially related information.

3.3.3     Modifier phrases

Attributive adjectives are the form of modifier phrase that appears to give most trouble. People with reduced memory spans have been shown to have difficulty in producing phrases containing attributive adjectives, whether adjective-noun (AN, e.g. green leaf) or adjective-adjective-noun (AAN, e.g. small green leaf), whereas they can produce the same content predicatively (e.g. the leaf is green, the leaf is small and green) (Martin & Freedman 2001:269). It is noteworthy that the control subjects did not perform at ceiling levels, producing only 90% correct AN phrases and 70% correct AAN phrases (Martin & Freedman 2001:270).

The lack of internal elaboration in noun phrases, by way of attributive adjectives and prepositional phrases, has been noted in the speech produced by people with and without agrammatism, and with both fluent and nonfluent aphasias (Berndt 2001:390). A cross-linguistic study of agrammatic speakers of Swedish, French, German, Polish and English found that subjects could produce under 40% of AN structures and only 25% of AAN structures, and that there was a tendency to produce attributive adjectives postnominally, regardless of whether that was legal in the speaker’s language (Ahlsen et al. 1996:549,553-4,557).

Children with SLI also have difficulty in producing noun phrases (NPs) containing one or two attributive adjectives (Gavin et al. 1993:200), and the investigator has personally witnessed a class of 8 and 9 year olds with SLI struggling to produce an utterance of the form there’s a red frog on your hand.

Attributive adjectives are also plausible examples of syntactic complexity and social intelligence information.

3.4     Measuring complexity

A major difficulty in investigating language complexity in production lies not only in defining precisely what the term complexity means, but also in deciding how it should be measured. This is taken up in sections 3.4.2 to 3.4.13, where measures of complexity used in a number of earlier studies are discussed.

The preponderance of research into language complexity has been in the field of comprehension, where testing is methodologically simpler. Typically the investigator provides a stimulus containing the phenomenon under investigation, and then asks the subject a question about that stimulus (e.g. Baddeley et al. 1985; Daneman & Carpenter 1980).

The computation of anaphoric reference over varying distances has been shown to correspond to performance on a reading span test (Daneman & Carpenter 1980:456). This was tested by having the subject read a story and then answer a question about the referent. An example of this is a passage (Daneman & Carpenter 1980:455) about a meeting of jungle animals, which concludes …The proceedings were delayed because the leopard had not shown up yet. There was much speculation as to the reasons for the midnight alarm. Finally he arrived and the meeting could commence. The probe question tested the subject’s ability to name the referent of the pronoun in the final sentence by asking Who finally arrived? The number of sentences between the pronoun and its referent varied in the different passages used. The advantage of the stimulus-question sort of test is that it produces a limited number of simple answers that are either right or wrong, and are consequently easy to score.

Another sort of stimulus used in comprehension tests is a sentence containing a centre-embedded relative clause, which may be either subject-trace e.g. The reporter that attacked the senator admitted the error, or object-trace e.g. The reporter that the senator attacked admitted the error (examples from Just & Carpenter 1992:130). A number of different studies have shown that object-trace relative clauses are more difficult to process, for example by requiring increased reading times (Just & Carpenter 1992:129-130). In comprehension studies, where a limited number of possibilities are presented to the subject, the scoring will necessarily be more straightforward than where the subject could present an almost infinite variety of possible responses, as is the case in studies of production.

The elements claimed to instantiate complexity in language comprehension (anaphoric reference, ambiguous or garden path sentences, and subject- or object-trace relative clauses) are essentially syntactic phenomena, and hence it is at the level of syntax that a correlation between complexity and working memory has been demonstrated by previous researchers (e.g. Baddeley et al. 1985; Daneman & Carpenter 1980). For this reason, this study was confined to complexity instantiated in syntactic elements, and no analysis at semantic or pragmatic levels was attempted.

3.4.1     Existing analyses of complex language

Several methods already exist for describing and analysing language complexity at a syntactic level, and are reported below: however, none was considered suitable for use in the present study. Some of these analyses were intended to be purely theoretical (and/or solely to describe complexity in comprehension), and many were intended for use only with a circumscribed section of the population (young children, people with aphasia, or people with learning disabilities).

In all the analyses outlined below, the data are divided into sentences, utterances, or Text Units: these latter are described as “minimal domains of utterance organisation” (Edwards et al. 1993:218). It should be noted that the construal of what constitutes any of these units must necessarily be at least partly subjective, where the criterion is prosody, grammaticality, or completeness of a thought. Analyses based on such units may be dependent either on the length of a unit (for counts of the number of Xs per unit) or on underlying assumptions about what makes unit A more complex than unit B (Cheung & Kemper 1992:56). This latter may again introduce subjectivity.

Where an analysis is dependent on the length of unit (be it sentence, utterance, or text unit) much will hinge on the precise rules for what counts as a unit: for example, how is co-ordination within IP to be treated, as opposed to co-ordination of IPs? Clearly, longer units are more likely than shorter units to contain more of the elements counted as complex. Where an analysis makes assumptions that some constituents are more complex than others, this hierarchy may be motivated by developmental chronology (as is the case in Developmental Sentence Scoring (section 3.4.5), Index of Productive Syntax (3.4.6), and Developmental Level (3.4.8)) or by purely theoretical considerations (as with Yngve depth (3.4.2) and Frazier depth (3.4.3)). The former are reliant on the validity of the acquisition data and the analysis imposed upon that data: the latter, if not validated empirically, are dependent on the validity of the theoretical framework.
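The dependence of per-unit counts on segmentation rules can be shown with a small worked example. The sketch below takes one hypothetical stretch of speech (40 words containing 6 counted elements) and divides it into units in two different ways. All figures and names are invented for illustration and do not come from any of the analyses discussed.

```python
def mean_per_unit(units):
    """Mean number of counted elements per unit."""
    return sum(elements for _, elements in units) / len(units)

def per_100_words(units):
    """Counted elements per 100 words, ignoring unit boundaries."""
    words = sum(words for words, _ in units)
    elements = sum(elements for _, elements in units)
    return 100 * elements / words

# Each pair is (words in unit, counted elements in unit).
# The same 40 words and 6 elements under two segmentation rules.
fine = [(10, 1), (10, 2), (10, 1), (10, 2)]   # co-ordinated clauses split
coarse = [(20, 3), (20, 3)]                    # co-ordinated clauses joined

for label, units in (("fine", fine), ("coarse", coarse)):
    print(label, mean_per_unit(units), per_100_words(units))
```

Joining co-ordinated clauses doubles the mean count per unit (from 1.5 to 3.0) although the speech itself is unchanged, whereas the rate per 100 words is 15.0 under both segmentations.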

Another problem with breaking down the data into units for analysis is that, given the nature of spontaneous speech, many units will contain mazes (exact, amended, or elaborated repetitions), or will be ungrammatical or incomplete. If an analysis looks only at complete and grammatical units, much of the data will necessarily be discarded. It has been shown that stuttering occurs more frequently with verbs of higher valency, and in utterances of greater length and/or complexity (Yaruss 1999:338,343), and it is likely that longer and/or more complex utterances present more of a challenge to working memory, and hence will be more likely to contain mazes, or to be ungrammatical or abandoned when incomplete. If these units are discarded, much useful and relevant data will be lost. The listener, after all, does still hear and process the entirety of the speaker’s output: not simply those parts that are deemed complete and grammatical. What, then, is the motivation for discarding large portions of the data from the analysis?

Analyses intended for use on language from young children will concentrate on the elements relevant to acquisition (e.g. the presence of functional categories, or absence of agreement errors): these analyses cannot be expected to be suitable for describing data from normal adults, where such elements are assumed. Similarly, an analysis (such as Developmental Level (3.4.8)) intended to describe the language of people with learning difficulties, where