Jonathan Marks offers a helpful short glossary of terminology for teaching pronunciation: homophones, rhotic and rhythm.

homophones


Homophones are words with different spellings and different meanings but the same pronunciation. 'Knows' and 'nose' are homophones, for example – so are:

  • 'reed' and 'read' (infinitive)
  • 'red' and 'read' (past tense)
  • 'key' and 'quay' 
  • 'I', 'eye' and 'aye'
  • 'so', 'sew' and 'sow'
  • 'pair' and 'pear'

and so on.

Some pairs of words are homophones in some accents but not others. In many non-rhotic accents, pairs of words such as 'stalk' and 'stork', 'caught' and 'court' are homophones, but in rhotic accents they aren't.

Pairs such as 'some' and 'sum', 'you' and 'ewe' are homophones when 'some' and 'you' are in their strong forms /sʌm/ and /juː/, but not when they're reduced to the weak forms /səm/ and /jʊ/ or /jə/.
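The accent-dependence of homophones can be made concrete with a small lookup exercise. The sketch below is only an illustration: the two dictionaries are hand-written, simplified transcriptions (an assumption, not taken from any pronouncing dictionary), and `homophone_sets` is a hypothetical helper that groups words sharing a transcription.

```python
from collections import defaultdict

# Toy transcriptions, hand-written for illustration only (assumed, simplified).
NON_RHOTIC = {
    "knows": "nəʊz", "nose": "nəʊz",
    "key": "kiː", "quay": "kiː",
    "stalk": "stɔːk", "stork": "stɔːk",
    "caught": "kɔːt", "court": "kɔːt",
}
# In the rhotic variant the 'r' in 'stork' and 'court' is pronounced.
RHOTIC = {**NON_RHOTIC, "stork": "stɔːrk", "court": "kɔːrt"}


def homophone_sets(lexicon):
    """Group words that share a transcription; keep groups of two or more."""
    by_sound = defaultdict(list)
    for word, sound in lexicon.items():
        by_sound[sound].append(word)
    return sorted(sorted(words) for words in by_sound.values() if len(words) > 1)


print(homophone_sets(NON_RHOTIC))  # includes ['caught', 'court'] and ['stalk', 'stork']
print(homophone_sets(RHOTIC))      # only ['key', 'quay'] and ['knows', 'nose'] remain
```

The same approach would separate 'some' and 'sum' if the lexicon stored the weak form /səm/ for 'some' instead of the strong form.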


rhotic /rəʊtɪk/

(from the name of the Greek letter 'rho')

A rhotic accent is one in which the letter 'r' is pronounced wherever it appears in the spelling. Rhotic accents are characteristic of most regions of North America, Ireland, Scotland and south-west England. In non-rhotic accents, on the other hand, 'r' is only pronounced immediately before a vowel sound – which can be in the same word or in the following word. ('iron' /aɪən/ is an exception.) Regions with non-rhotic accents include Australia, South Africa, Wales and most of England. RP (Received Pronunciation) is a non-rhotic accent.

Compare these examples – the 'r's that are pronounced are shown in capitals:

 rhotic                 non-rhotic
 staR                   star
 staRt                  start
 staRing                staRing
 Add some wateR.        Add some water.
 Put some wateR in.     Put some wateR in.
 WheRe?                 Where?
 WheRe is it?           WheRe is it?
 AnotheR day.           Another day.
 AnotheR one.           Another one.
 AnotheR houR.          AnotheR hour.
 AnotheR houR oR so.    AnotheR houR or so.
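
The non-rhotic rule described above (an 'r' is pronounced only immediately before a vowel sound, in the same word or the next) is mechanical enough to sketch in a few lines. This is a rough illustration under stated assumptions: the input is a list of simplified transcriptions, written by hand for this example, in which an 'r' appears wherever the spelling has one; `VOWELS` and `realise` are hypothetical names; and exceptions such as 'iron' are not handled.

```python
# Characters treated as vowel sounds in these simplified transcriptions (assumed set).
VOWELS = set("aeiouəɪʊɛɔʌɑæɜɒ")


def realise(words, rhotic):
    """Return the words with 'r' kept only where the accent pronounces it."""
    if rhotic:
        return list(words)  # every 'r' in the spelling is pronounced
    out = []
    for i, word in enumerate(words):
        following = words[i + 1] if i + 1 < len(words) else ""
        kept = []
        for j, ch in enumerate(word):
            if ch == "r":
                # Next sound: the rest of this word, else the start of the next word.
                rest = word[j + 1:] + following
                if not rest or rest[0] not in VOWELS:
                    continue  # not before a vowel sound, so the 'r' is silent
            kept.append(ch)
        out.append("".join(kept))
    return out


# 'Another hour or so': the 'r' of 'or' is dropped before the consonant /s/.
print(realise(["ənʌðər", "aʊər", "ɔːr", "səʊ"], rhotic=False))
# ['ənʌðər', 'aʊər', 'ɔː', 'səʊ']
```

Because the check looks at sounds rather than letters, 'another one' loses its 'r' (the next sound is /w/) while 'another hour' keeps it (the next sound is a vowel).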

rhythm

The characteristic rhythm of English and certain other languages has been described as 'stress-timed', meaning that stressed syllables tend to occur at roughly equal intervals and that intervening unstressed syllables, depending on how many there are, are therefore either compressed or extended to fit the available time interval between stresses. 'Stress-timed' languages are contrasted with 'syllable-timed' ones (French is the most frequently cited) in which all syllables are said to occupy roughly equal lengths of time.
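
The arithmetic behind the two idealised models can be sketched as follows. This is a deliberately crude illustration, not a claim about measured speech: the 600 ms beat and the 200 ms syllable are arbitrary assumed values, each stretch from one stress up to the next (a 'foot') is represented only by its syllable count, and syllables within a foot are assumed to share the time equally.

```python
def stress_timed(feet, beat_ms=600):
    """Per-syllable duration (ms) in each foot when every foot lasts beat_ms."""
    return [beat_ms / n for n in feet]


def syllable_timed(feet, syllable_ms=200):
    """Duration (ms) of each foot when every syllable lasts syllable_ms."""
    return [n * syllable_ms for n in feet]


# Feet of one, two, three and four syllables.
feet = [1, 2, 3, 4]
print(stress_timed(feet))    # [600.0, 300.0, 200.0, 150.0]
print(syllable_timed(feet))  # [200, 400, 600, 800]
```

Under the stress-timed model the beat stays fixed and syllables are squeezed as feet grow; under the syllable-timed model the syllables stay fixed and the interval between stresses grows.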

An effect of stress-timing can easily be contrived through the step-by-step expansion of a phrase such as this:

one              two              three              four

one and          two and          three and          four

one and a        two and a        three and a        four

one and then a   two and then a   three and then a   four

The regular beat of the stressed syllables one – two – three – four can be maintained while more and more unstressed syllables are slotted in between them. Stress-timing is also often illustrated in the recitation of verse such as:

        The       owl and the    puss-y cat    went to       sea

        In a      beau-tiful     pea green     boat          x

        They      took some      hon-ey and    plen-ty of    mon-ey

        Wrapped   up in a        five pound    note          x

(x represents a silent beat.)

In spontaneous, non-scripted speech a similar impression can arise over durations of a few seconds, and indeed in animated, emphatic speech the regular beat is sometimes conveyed visually through rhythmic gestures such as head-nodding or movements of the hand or arm. But clearly most speech is not stress-timed to the same extent as the kind of rhyme recitation referred to above, and there are good reasons for this.

From a speaker's point of view, long stretches of stress-timed speech would be immensely difficult to produce. They would require impossibly detailed planning, before and during speaking, of the exact words a speaker was going to use, and of how to map these words onto the framework of stress and unstress. (Significantly, an impression of stress-timing is often associated with the occurrence of partially or wholly prefabricated 'chunks' of language, which do not require much planning, if any.) And they would require equally demanding planning of articulation and the timing of breath intake. Even then, strict stress-timing would become increasingly hard to preserve as sequences of unstressed words became longer. And from a listener's point of view, continuous stress-timed speech would also be problematic; there would be no space in which to take stock of, or catch up with, the speaker's message, and no way of assessing the relative significance of different elements in the message, beyond the binary stressed/unstressed distinction.

And indeed, attempts to identify significant differences between 'stress-timed' and 'syllable-timed' languages using objective, instrumental measurement have not met with success. Yet the notion of stress-timing has been, and continues to be, regularly invoked in pedagogic descriptions of English pronunciation, and used as a basis for practice exercises. So how did it arise, and why does it seem so plausible?

Perhaps part of the explanation is simply that when the impression of stress-timing was first reported (more than two centuries ago) it may have been based on rather stylised, perhaps rehearsed, types of speaking, and having established itself as authoritative, it was simply handed down from generation to generation, like so many other dubious claims about language.

There is plenty of evidence from other areas of life that people are predisposed to identify a regular rhythm where the objectively measurable evidence for it is in fact only slight, or ambiguous. Perhaps this is why we can be so easily persuaded that speech tends to have a regular beat.

More speculatively, various theories of the origins of language trace speech back to a form of expression more akin to singing, or relate it to the regular tread of walking. Perhaps even the flimsiest of evidence is sufficient to evoke an atavistic association with the regular rhythm of music, or of the human gait. And perhaps this is particularly likely to happen in the case of English, which is certainly unusual among languages in the extent to which it maximises the distinction between the characteristics of stressed and unstressed syllables.

A more practical concern, for language teachers, is whether there is any value in using pronunciation practice material with a regular beat. If such material is presented as representative of a target for learners' production, then the answer is no, because this would be asking them to do something which even native speakers would be incapable of. But I would say that such material, whether tailor-made for language learning or in the form of the recitation of pre-existing rhymes and verse, can be valuable as a pedagogic artifice, since it provides a framework for directed, focused practice of certain aspects of English pronunciation which learners often find challenging, including the stress/unstress distinction, intonation, compression of unstressed syllables, vowel reduction, weak forms, elision and assimilation.