The evolutionary origins of music are a puzzle, since music lacks any obvious adaptive function (Darwin, 1871, Wallin et al., 2000). Some theorists have speculated that it actually has no adaptive function: on this view, music was invented as a pure pleasure stimulant, and all components of human musicality originally evolved for non-musical purposes such as language, fine motor control or emotional communication (James, 1890, Pinker, 1997). This by-product hypothesis provides a plausible theoretical rationale for the initial step in music evolution, the point at which human ancestors produced the first music-like behaviors, but it does not preclude the possibility that music later acquired some adaptive function(s), either biological or cultural.
As music is an omnipresent behavior across all cultures (Merriam, 1964), with deep roots in human ontogeny (Trehub, 2001), an ancient history of at least 40,000 years (Conard, Malina, & Munzel, 2009) and powerful psychological effects on mood and emotions (Sloboda & Juslin, 2001), other theorists have proposed various adaptive functions of music-like behaviors — at least at some stage of human biological and cultural evolution. Such adaptive theories are not necessarily mutually exclusive, since each may account for certain aspects of human music today which could have evolved at different evolutionary periods. In addition, it is important to be clear about whether music in its biological or cultural dimension is at issue: particular forms of music are products of cultural evolution, whereas the innate and universal components of human musicality are products of biological evolution.
Darwin (1871) proposed that musical behaviors once had an adaptive function that they no longer have (i.e., music is an evolutionary vestige). In this view, the major components of human musicality originated in an ancient, pre-linguistic, songlike communication system comprising learned and complex acoustic signals (Brown, 2000b, Mithen, 2005, Richman, 1993). At a later stage in human evolution, this communication system was superseded by a more efficient one, human language, leaving our species with the innate predisposition to create today's music. Darwin suggested that, analogous to bird song during courtship, music-like behaviors first evolved by means of sexual selection, as individuals advertised for mates (for an extended argument see Miller, 2000).
The idea that music originally evolved as a display signal was also put forward by Merker (2000), who proposed that synchronous chorusing by hominid males served as an indicator of coalition strength, helping to defend territory and at the same time attract migrating females. Similarly, Hagen and Bryant (2003) suggested that group music making and dancing evolved as between-group displays signaling internal stability and the group's ability to act collectively, thereby establishing meaningful relationships — whether cooperative or hostile — between groups. These group display theories, however, have a hard time explaining how such a signaling system could be invented and stabilized in the first place within large groups of often non-related individuals, since it appears vulnerable to cheating. For example, individuals might participate in the musical group performance, but only pretend to share the group's coalition agreement, later taking personal advantage of the others' commitment. Furthermore, these signal theories are supported by rather sparse ethnomusicological evidence and do not account for the majority of musical encounters observed across cultures today where music is part of peaceful within-group ceremonies outside any sexual or competitive context (Clayton, 2009).
Another group of theories treats music not as a signal but as a tool, thereby circumventing the problem of cheating. For example, Dissanayake (2000) and Falk (2004) have advocated a kin-selected function for an ancient musical communication system in mother–infant bonding: prosodic utterances might have served to keep mothers and their infants in psychological contact when they were physically separated, e.g., while mothers prepared food or manufactured tools. Indeed, the use of lullabies to soothe infants is considered a human universal (Trehub, 2001) and when it comes to communicating emotion through infant-directed speech, “the melody is the message” (Fernald, 1989).
A related hypothesis is that music and dance, once invented, turned out to be effective tools to establish and maintain social bonds and prosocial commitment among the members of social groups, ultimately increasing cooperation and prosocial in-group behavior (Huron, 2001, McNeill, 1995, Roederer, 1984). Unlike the many modern Western examples of individual music consumption (e.g., via iPod or car radio), in traditional small-scale societies music is typically performed for pragmatic reasons (Bohlman, 2000), integrated into ritual ceremonies such as worship, weddings, funerals or preparations for hunt or combat (Clayton, 2009, Dissanayake, 2006). These ceremonies are usually considered to be essential for the maintenance of the group's identity, with the music being an indispensable part of them. On the proximate level, several universal features of human music (Fitch, 2006, Stevens and Byron, 2009), such as its ritualized context, periodic pulse (beat), discrete pitches and a highly repetitive repertoire, may contribute to solving the proposed adaptive problem of maintaining group cohesion. Specifically, they all make music more predictable than, for example, language and thus enable coordination between multiple individuals at once via synchronization of body movements and blending of voices.
This hypothesis of music as a tool for supporting group cohesion predicts that joint music making ultimately increases prosocial commitment and fosters subsequent cooperation among the performers. Indeed, Anshel and Kipper (1988) found that adult Israeli males cooperated better in a prisoner's dilemma game and scored higher on a questionnaire on trust after a group singing lesson, compared to passive music listening, active poetry reading or just watching a film together. Likewise, Wiltermuth and Heath (2009) showed that US students scored higher on a weak-link coordination exercise and a public-goods game after jointly singing along with a song played over headphones, compared to no singing or forced “asynchronous” singing (via playing the same stimulus at individual tempi). Adding synchronous movement (by moving plastic cups from side to side on a table) to the synchronous singing condition did not improve the scores in the subsequent economic games.
However, for the evolutionary argument, much stronger evidence would be provided if similar prosocial effects could be shown in young children. Kindergarten children are presumably not engaged in sexual advertising, nor do they have to form coalitions in fear of encountering rival neighboring groups. In terms of the group cohesion hypothesis, kindergarten children are a better test than adults because children this young, especially in Western cultures, have had few experiences of institutionalized music occurring for external pragmatic reasons. Therefore, we can probably neglect normative knowledge as a source for their interpretation of the manipulation phase and for their decision making during the dependent measures (Olson & Spelke, 2008). But since all human children have musical predispositions and skills (Trehub and Hannon, 2006, Zentner and Eerola, 2010), it would be very telling if involvement in joint music making and dancing somehow influenced children's spontaneous altruistic and cooperative tendencies.
In the current study, therefore, we had pairs of 4-year-old children participate in a 3-min episode of interactive play. Using the same setup, procedure and cover story, children either interacted with one another (and an adult) in the context of traditional music, that is, with dancing, singing and playing percussion instruments to a novel but easy-to-learn children's song (Musical condition), or they interacted with one another (and an adult) during basically the same joint activity but without singing, dancing or playing instruments (Non-musical condition). Immediately after this manipulation phase, each pair participated in two social interactions designed to test their willingness to (1) help their partner and (2) cooperate on a problem-solving task. We predicted that prior engagement in joint music making would make children behave more prosocially, i.e., spontaneously help each other more and solve a task jointly rather than alone.