The Musical Origins of Language

Deep History

Without music life would be a mistake.

– Nietzsche

The psychoanalyst Ehrenzweig commented:

It is not unreasonable to speculate that speech and music have descended from a common origin in a primitive language . . . . Later this primeval language would have split into different branches: music would have retained the articulation mainly by pitch (scale) and duration (rhythm), while language chose the articulation mainly by tone colour (vowels and consonants) . . . . Music has become a symbolic language of the unconscious mind. (in Nagel, 2013, p. 19)

Ehrenzweig speculates in a fruitful direction. This “common origin” was later theorised and described as “musilanguage” or “protolanguage” (McGilchrist, 2010; Mithen, 2006; Tomlinson, 2015; Spitzer, 2021). It was thought that language developed out of this precursor, which was distinctly more musical than linguistic. This view rests on the idea that specific characteristics evolve out of more general ones. For example, the limited sound patterns of language are a refined form of more general sounds. Language, on this account, is a highly specialised form of sound production (Grassi, 2021) or, as I like to phrase it, language is a subgenre of music. Some say that music is like a language; on this account, it is language that is like music.

Of course, this is open to debate. Spitzer says “we must take with a pinch of salt the cliché that language and music grew out of a single root of emotional expression, one branch splitting off into words and concepts, the other into notes and feelings” (2021, p. 316). The history is more likely a weave of diverse and mutually influential capacities, not just one thing leading to another, and not just a neat split in which music is emotional and language is rational. An alternative is that this common origin was neither music nor language, and that the two slowly emerged alongside one another as intermingling steps in human evolution (Tomlinson, 2015). On this view, language and music would both be subgenres of sound more generally, and not entirely distinct from one another.

Hints of this protolanguage may still be with us. We still have many forms of vocal communication that are not language: we sigh, grunt, hum, whistle, and so forth. Laughter is another non-semantic form of communication that can promote social bonding (Mithen, 2006, p. 81). These may provide insight into what pre-linguistic communication sounded like. Unlike language, many of these sounds are not learnt but reflexive (Spitzer, 2021, p. 320). A nice example of their use comes from therapeutic practice, where “hmm” or “huh” and their various inflections can show receptivity without affirming client statements with a “yes” (Fink, 2007, p. 9).

There are some common arguments as to the evolutionary advantage of human music-making and linguistic musicality. I will delineate just a few here. Infants are born in a relatively helpless state and, unlike many other animals, remain helpless for quite a long time. This extended period of dependence would require a greater degree of attunement by the mother to the infant, to mind those states that the infant still cannot manage independently. One way for mothers to connect to the infant’s distress is through “the body, face, and voice of the other” (Mithen, 2006, p. 197), allowing the mother to enter the “feeling state of the other” (Dissanayake, in Mithen, 2006, p. 197). As for the specifically vocal qualities, vocalising also allows one to attend to the other when not directly in their presence. If I go to a different room and can hear you singing, I know you’re okay; there’s still a connection. Infant Directed Speech (IDS), which I will explore later in more detail, provides the infant with a “disembodied extension of the mother’s cradling arms” (Falk, in Mithen, 2006, p. 201).