
Abstract

Human language is unique among communication systems since many elements are learned and transmitted across generations. Previous research suggests that this process is best predicted by infant-directed communication, i.e., a mode of communication directed by caregivers to children. Despite its importance for language, whether infant-directed communication is unique to humans or rooted more deeply in the primate lineage remains unclear. To assess this, we investigated directed and surrounding vocal communication in human infants and infants of wild nonhuman great apes. Our findings reveal that human infants receive dramatically more infant-directed communication than nonhuman great ape infants. These data suggest that the earliest hominins likely relied more on surrounding communication to become communicatively competent, while infant-directed vocal communication became considerably more prominent with human language.

INTRODUCTION

Human language is unparalleled in its diversity, complexity, and informational potential. Through the production and combination of linguistic units, we can communicate about the past and the future, the real and the imagined. The acquisition of this unique communication system is highly dependent on linguistic input. One pivotal input type, infant-directed communication, is a crucial source of language learning and a key predictor of acquisition [e.g., (1–3)]. Infant-directed communication has been documented in many cultures and languages (4, 5) and involves modifications to spoken language (6, 7), sign language (8), as well as gestures (9) used when directly addressing a child or infant (hereafter infant). In the vocal domain, infant-directed communication is typically characterized by a number of acoustic (5, 10) and structural (11–13) features and has been demonstrated to attract the infants’ attention more than adult-directed speech (14, 15). These features have been shown to support language acquisition, both in comprehension (3, 16) and production (2, 17). A considerable body of research has further indicated that infant-directed communication plays a role in the transmission of cultural knowledge, a process commonly referred to as natural pedagogy (18). However, there is also substantial cross-cultural variation in infants’ exposure to infant-directed communication, with no obvious effect on language learning (19–24). As a result, recent studies have also begun to emphasize the potential augmentative role of infant-surrounding communication in language acquisition (21, 25–27).
Despite the central role that infant-directed communication plays in language acquisition and cultural transmission, its evolutionary origins remain largely unknown (28). The few studies investigating the topic in our closest-living relatives, the nonhuman great apes, have suggested minimal (29) or no (30) infant-directed vocal communication, although targeted and systematic empirical studies of great ape individuals in their natural habitat are currently lacking. To reconstruct the evolutionary emergence of infant-directed vocal communication, we investigated the extent to which vocal communication is directed at human infants from different cultures (Chintang, Qaqet, Shipibo-Konibo, and Tuatschin) and infants from at least one species from each genus of all nonhuman great apes [Bornean orangutans (Pongo pygmaeus wurmbii), western gorillas (Gorilla gorilla), chimpanzees (Pan troglodytes schweinfurthii), and bonobos (Pan paniscus)] using comparable methods. On the basis of earlier work (28), we expected high levels of infant-directed vocal communication in humans and low levels in nonhuman great apes. On the other hand, we expected human and most nonhuman great ape infants to be exposed to a similar amount of surrounding communication (28).
To investigate differences in vocal input across species, we first compared the absolute amount of infant-directed and surrounding vocal communication (see Table 1 for definitions) between great ape species. In a second step, given underlying differences in the volubility of each species (see fig. S1), we compared the proportion of infant-directed and surrounding vocal communication in relation to the general vocal activity of each species. In all models, we also accounted for the infant’s age to control for age-related changes in infant-directed or surrounding communication. Our findings suggest that humans produce infant-directed vocal communication at levels that drastically exceed any other great ape species while the input from the surrounding environment across Pan species (chimpanzees and bonobos) and humans is broadly equivalent.
Table 1. Definition of vocal input types and vocal activity and summary of the data available per species.
We counted the number of 2-min intervals in which a human observer recorded this type of vocal behavior. Infant-directed and surrounding vocal communication are independent of each other but can occur within the same 2-min interval. Abbreviations: H, humans; B, bonobos; C, chimpanzees; G, gorillas; OU, orangutans. See the Supplementary Materials for further details on the data availability for each species.
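The interval-based counting described in this note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the event format (seconds since the start of the focal sample paired with an input-type label), the function name, and the example data are all assumptions made for the sketch.

```python
from collections import defaultdict

INTERVAL_SECONDS = 120  # 2-min intervals, as in the text


def count_intervals_by_type(events):
    """Count the 2-min intervals that contain each vocal input type.

    `events` is an iterable of (seconds_since_focal_start, input_type) pairs
    (a hypothetical format). An interval is counted once per type, however
    many calls it contains, and the same interval can count toward several
    types.
    """
    intervals_by_type = defaultdict(set)
    for seconds, input_type in events:
        intervals_by_type[input_type].add(int(seconds // INTERVAL_SECONDS))
    return {input_type: len(ids) for input_type, ids in intervals_by_type.items()}


# Hypothetical focal sample: two directed calls and one surrounding call in
# the first interval, plus one surrounding call in the third interval.
example_events = [
    (5.0, "directed"),
    (70.0, "directed"),
    (110.0, "surrounding"),
    (250.0, "surrounding"),
]
example_counts = count_intervals_by_type(example_events)
print(example_counts)
```

Because the unit is the interval rather than the call, the two directed calls in the first interval contribute a single count, and that interval also counts toward surrounding communication.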

RESULTS

To compare the total amount of (i) infant-directed and (ii) infant-surrounding vocal communication across species, we first fitted Bayesian generalized linear mixed-effects models (GLMMs) with a Poisson distribution, where the response was the number of 2-min intervals containing infant-directed or surrounding vocal communication per focal sample (see figs. S2 to S4 for visualizations of the raw data). Focal samples were obtained by continuously following an immature individual (hereafter focal individual) for a set time period (31). To test whether a larger proportion of vocal communication is (i) infant-directed or (ii) surrounding (for visualizations of the raw data on vocal activity across species, see fig. S1), we then fitted a second set of Bayesian GLMMs, this time specifying a binomial distribution. All analyses controlled for differences in sampling effort [i.e., included the number of 2-min intervals per focal sample (on average 39.1 ± 62.8 2-min intervals) as an offset term in the GLMM] and accounted for overdispersion by incorporating an observation-level random effect.
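Written out, a Poisson GLMM of this kind could take the form below. This is a sketch under stated assumptions: the text specifies only the Poisson distribution, the offset for the number of 2-min intervals, the age covariate, and the observation-level random effect. The symbols, the individual-level random intercept $u_{j[i]}$, and the exact fixed-effect structure are illustrative and not taken from the source.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\lambda_i), \\
\log \lambda_i &= \log(n_i) + \beta_{\mathrm{species}[i]} + \beta_{\mathrm{age}}\,\mathrm{age}_i + u_{j[i]} + \varepsilon_i, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
```

Here $y_i$ is the number of 2-min intervals with the relevant input type in focal sample $i$, $n_i$ is the total number of 2-min intervals in that sample (the offset), and $\varepsilon_i$ is the observation-level random effect that absorbs overdispersion.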

Species differences in infant-directed vocal communication

Our first model yielded compelling support for the hypothesis that the incidence of infant-directed vocal communication is much higher in humans than in other great ape species (Fig. 1A). Specifically, the number of 2-min intervals (per focal sample) in which infant-directed vocal communication occurred in humans [mean = 19.42, 95% highest posterior density interval (HPD interval) = 14.62 to 24.66] was 399 times higher than in bonobos (mean = 0.05, 95% HPD interval = 0.01 to 0.11), 69 times higher than in chimpanzees (mean = 0.28, 95% HPD interval = 0.15 to 0.42), and 219 times higher than in orangutans (mean = 0.09, 95% HPD interval = 0.04 to 0.16). In addition, we found evidence for differences among the nonhuman great ape species. The number of 2-min intervals in which infant-directed vocal communication occurred in chimpanzees was three times higher compared to orangutans (95% HPD interval = 1.02 to 6.21) and five times higher compared to bonobos (95% HPD interval = 0.03 to 0.44). Moreover, a marginal positive effect of the focal individual’s age was identified, suggesting that in all species tested, as infants mature, they are incrementally exposed to more directed vocal communication [estimate: 0.02, 95% confidence interval (CI) = 0.01 to 0.04].
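For readers unfamiliar with HPD intervals and posterior ratios, the following Python sketch shows one generic way to compute both from posterior draws. It is not the authors' code, and the draws are simulated stand-ins rather than the study's posterior. The function names and the lognormal parameters are assumptions chosen only for illustration. The sketch also shows that a ratio averaged over draws need not equal the quotient of two posterior means.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best_start = min(
        range(n - window + 1),
        key=lambda start: ordered[start + window - 1] - ordered[start],
    )
    return ordered[best_start], ordered[best_start + window - 1]


def posterior_ratio(numerator_draws, denominator_draws):
    """Draw-by-draw ratio of two posterior quantities (paired draws)."""
    return [a / b for a, b in zip(numerator_draws, denominator_draws)]


# Simulated stand-ins for posterior draws; NOT the study's posterior.
rng = random.Random(1)
human_draws = [rng.lognormvariate(math.log(19.0), 0.15) for _ in range(4000)]
bonobo_draws = [rng.lognormvariate(math.log(0.05), 0.60) for _ in range(4000)]

ratio_draws = posterior_ratio(human_draws, bonobo_draws)
mean_ratio = sum(ratio_draws) / len(ratio_draws)
ratio_of_means = (sum(human_draws) / len(human_draws)) / (
    sum(bonobo_draws) / len(bonobo_draws)
)
print(round(mean_ratio), round(ratio_of_means), hpd_interval(ratio_draws))
```

The HPD interval is found by sliding a window that covers 95% of the sorted draws and keeping the narrowest one, which is why it can be asymmetric around the mean for skewed quantities such as ratios.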
Fig. 1. Species differences in infant-directed vocal communication.
Model predictions showing the absolute amount of infant-directed vocal communication (A) and the proportion of infant-directed vocal communication relative to the general vocal activity (B) per species. The absolute amount of infant-directed vocal communication was calculated by the number of 2-min intervals (per focal observation bout) in which infant-directed vocal communication occurred. The relative amount of infant-directed vocal communication was calculated by dividing the number of 2-min intervals in which infant-directed vocal communication was reported, by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.
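The relative amount defined in this caption is a simple quotient per focal sample. A minimal Python sketch follows. The function name and the handling of focal samples without any vocalization are assumptions, since the source does not state how such samples were treated.

```python
def relative_amount(intervals_with_type, intervals_with_any_vocalization):
    """Proportion of vocally active 2-min intervals that contain a given input type."""
    if intervals_with_any_vocalization == 0:
        return None  # assumption: undefined when the focal sample has no vocalizations
    if intervals_with_type > intervals_with_any_vocalization:
        raise ValueError("typed intervals cannot exceed vocally active intervals")
    return intervals_with_type / intervals_with_any_vocalization


# Hypothetical focal sample: 12 intervals with any vocalization, 3 of them
# containing infant-directed vocal communication.
example_proportion = relative_amount(3, 12)
print(example_proportion)
```

The denominator is the number of intervals with any kind of vocalization, not the total number of observed intervals, so the measure is independent of how voluble a species is overall.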
The second model then revealed that humans not only have higher absolute levels of infant-directed vocal communication but that infant-directed vocal communication forms a larger proportion of the vocal activity of humans (Fig. 1B). Specifically, we found that 94.8% of the vocal activity of humans (in the presence of an infant) consisted of infant-directed communication (95% HPD interval = 0.92 to 0.97), which was 144 times higher than in bonobos (prob. = 0.007, 95% HPD interval = 0.001 to 0.016), 66 times higher than in chimpanzees (prob. = 0.014, 95% HPD interval = 0.006 to 0.024), and 259 times higher than in orangutans (prob. = 0.004, 95% HPD interval = 0.001 to 0.008). We did not find any differences in the proportion of directed vocal communication between the nonhuman great ape species but we did again identify a small positive effect of the age of the focal individual, suggesting that older infants receive more directed vocal communication across development (estimate: 0.04, 95% CI = 0.02 to 0.07).
When examining directed vocal communication from mothers only, our first model again yielded strong support that the incidence of infant-directed vocal communication is much higher in humans than in any of the other great ape species (Fig. 2A). Specifically, the number of 2-min intervals (per focal sample) in which infant-directed vocal communication from mothers occurred in humans (mean = 13.22, 95% HPD interval = 8.80 to 18.14) was 414 times higher than in bonobos (mean = 0.03, 95% HPD interval = 0.001 to 0.102), 143 times higher than in chimpanzees (mean = 0.09, 95% HPD interval = 0.037 to 0.166), 31 times higher than in gorillas (mean = 0.42, 95% HPD interval = 0.171 to 0.791), and 92 times higher than in orangutans (mean = 0.14, 95% HPD interval = 0.047 to 0.291). In addition, our model predicted marginal differences between the nonhuman great ape species. The number of 2-min intervals in which infant-directed vocal communication from mothers occurred in gorillas was three times higher compared to orangutans (95% HPD interval = 0.61 to 7.13), 13 times higher when contrasted to bonobos (95% HPD interval = 0.001 to 0.274), and five times higher when compared to chimpanzees (95% HPD interval = 0.05 to 0.509). For orangutans, the number of 2-min intervals in which infant-directed vocal communication from mothers occurred was five times higher compared to bonobos. We found no effect of the age of the focal individual on the amount of infant-directed vocal communication infants received from the mother.
Fig. 2. Species differences in infant-directed vocal communication from the mother.
Model predictions showing the absolute amount of infant-directed vocal communication from the mother (A) and the proportion of infant-directed vocal communication from the mother relative to the general vocal activity (B) per species. The absolute amount of infant-directed vocal communication from the mother was calculated by the number of 2-min intervals (per focal observation bout) in which infant-directed vocal communication from the mother occurred. The relative amount of infant-directed vocal communication from the mother was calculated by dividing the number of 2-min intervals in which infant-directed vocal communication was reported, by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.
The second model again indicated that, in humans, a greater proportion of the overall vocal activity is made up of directed communication from the mother compared to all other great ape species (Fig. 2B). Specifically, we found that 72% of the overall vocal activity of humans contains infant-directed vocal communication from the mother (95% HPD interval = 0.57 to 0.83), which was 134 times higher than for bonobos (prob. = 0.005, 95% HPD interval = 0.001 to 0.019), 170 times higher than for chimpanzees (prob. = 0.004, 95% HPD interval = 0.001 to 0.01), 86 times higher than for gorillas (prob. = 0.008, 95% HPD interval = 0.002 to 0.021), and 70 times higher compared to orangutans (prob. = 0.004, 95% HPD interval = 0.001 to 0.008). We did not find any differences in the proportion of directed vocal communication from the mother between the nonhuman great ape species.

Species differences in surrounding vocal communication

Our initial set of analyses demonstrated that humans engaged in more infant-directed communication, both absolutely and relatively when accounting for differences in vocal activity across species. Our next question was to understand how input from the surrounding vocal environment (in the presence of an infant) differed between species. Our first model yielded some support for the hypothesis that the incidence of infant-surrounding vocal communication is more similar between great ape species (Fig. 3A). Specifically, the number of 2-min intervals (per focal sample) in which infant-surrounding vocal communication occurred was highest in humans (mean = 18.67, 95% HPD interval = 14.84 to 22.86). More precisely, this was two times higher compared to bonobos (mean = 9.41, 95% HPD interval = 7.16 to 12.03), three times higher than for chimpanzees (mean = 6.24, 95% HPD interval = 4.82 to 7.80), and 27 times higher than the surrounding vocal input orangutan infants received (mean = 0.69, 95% HPD interval = 0.43 to 1.03). In addition, we found evidence for differences between the nonhuman great apes. The number of 2-min intervals in which infant-surrounding vocal communication occurred in bonobos was 1.51 times higher than in chimpanzees (95% HPD interval = 1.01 to 2.08) and 13.69 times higher than in orangutans (95% HPD interval = 5.16 to 14.42). Also, the number of 2-min intervals in which surrounding vocal communication occurred was 9.08 times higher in chimpanzees when compared to orangutans (95% HPD interval = 5.16 to 14.42).
Fig. 3. Species differences in infant-surrounding vocal communication.
Model predictions showing the absolute amount of surrounding vocal communication (A) and the proportion of surrounding vocal communication relative to the general vocal activity (B) per species. The absolute amount of surrounding vocal communication was calculated by the number of 2-min intervals (per focal observation bout) in which surrounding vocal communication occurred. The relative amount of surrounding vocal communication was calculated by dividing the number of 2-min intervals in which infant-surrounding vocal communication was reported, by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.
Our second model predicted that infant-surrounding vocal communication occurred at proportionally similar levels in humans (mean = 97.5%; 95% HPD interval = 0.952 to 0.991) as for chimpanzee infants (mean = 98.2%; 95% HPD interval = 0.959 to 0.995). Bonobo infants were found to have a slightly higher (1.02) probability of receiving surrounding input than human infants (95% HPD interval = 0.04 to 0.74). In contrast, orangutan infants received five times less surrounding vocal communication (mean = 19%; 95% HPD interval = 0.04 to 0.46) compared to the other three great ape species (Fig. 3B). We found no effect of the age of the focal individual on the proportion of infant-surrounding vocal communication.

DISCUSSION

By comparing vocal input received by infants of all great ape species, we demonstrated notable differences in the amount of directed communication between human and nonhuman great apes. Humans engaged in infant-directed communication at levels orders of magnitude higher than any other great ape species. In contrast, we found less marked differences between species in terms of surrounding vocal input, with most nonhuman great apes displaying proportions similar to humans. Critically, our analyses showed that the vocal activity across species (i.e., how voluble a species is) was not sufficient to explain the differences between humans and all nonhuman great apes.
A key implication of our data is that there must have been a massive expansion in the amount of infant-directed communication within the hominin lineage. What might explain this difference between humans and other great apes? Insights into the drivers of this expansion of infant-directed communication could be gleaned from our understanding of its function. One dominant hypothesis for the function of infant-directed vocal communication in humans is that these vocal interactions with children play a key role in scaffolding the transmission and learning of language. Our data provide compelling comparative support for this since nonhuman great ape vocal systems are generally considered to be far more fixed and genetically determined than humans’ (32), and, accordingly, we see much lower levels of infant-directed vocal input. However, humans not only direct vocalizations at infants but also adopt an idiosyncratic vocal register when doing so (e.g., repeating words and using higher pitch) [e.g., (6, 33)]. Given the very low levels of infant-directed vocal communication in nonhuman great apes, it simply was not possible to also examine vocalizations for equivalent structural features known to characterize human infant–directed speech. To shed further light on any potential acoustic variation, in addition to better understanding the precise function of the rare infant-directed vocalizations in nonhuman great apes, behavioral data (e.g., the contexts in which these vocalizations are produced) compiled over longer study periods are critical. Previous research indicates that infant-directed gestures in great apes are, like infant-directed communication in humans, characterized by enhanced repetition (34, 35) and might even be more frequent in nonhuman great apes in contrast to the low rates of infant-directed vocal communication (29, 36, 37). 
The gestural modality might therefore represent an additional fruitful avenue for future work investigating the evolutionary origins of infant-directed communication in humans.
Our focus here has been on the occurrence of infant-directed communication in our closest-living great ape relatives. Parallel research over the past 20 years has, however, also identified analogous or convergent cases in species more distantly related to humans. While informative, these cases of infant-directed vocal communication seem to serve qualitatively different functions than infant-directed vocal communication in humans. Functions range from infant retrieval [e.g., domestic cats, Felis catus: (38)] and mother recognition [e.g., Mexican free-tailed bats, Tadarida brasiliensis: (39)] to fine-tuning vocal production using vocal accommodation [e.g., orcas, Orcinus orca: (40)]. In marmoset monkeys (Callithrix jacchus), vocal input from caregivers has been shown to bootstrap infant vocal development. Specifically, contingent parental vocal feedback (within turn-taking events), but not the overall amount of surrounding parental vocalizations, had a positive effect on the development of adult-like vocalizations in immatures (41–43). Possible cases where the features of infant-directed vocal communication have parallels to human infant–directed vocal communication are found in greater sac-winged bats (Saccopteryx bilineata), as infant-directed vocal communication differs in pitch and timbre in comparison to adult-directed vocal communication (44). In addition, bottlenose dolphins (Tursiops truncatus) have also been shown to modulate the acoustic features of their signature whistles when their infant is present (45). Critically, both dolphins and greater sac-winged bats, in addition to humans, are considered vocal learners (46), highlighting a potential link between vocal learning and the presence of infant-directed vocal communication. Future studies investigating the presence and function of infant-directed vocal communication in vocal learning and nonvocal learning animals are required to support the generality of this relationship.
To better understand the overall vocal input infants are exposed to, we also captured the surrounding vocal communication of humans and other great apes. Our data indicate that infant-surrounding vocal communication is the major source of input in all nonhuman great ape species we tested. Our results also showed that orangutans received less surrounding vocal input compared to all the other great apes, including humans, a finding that can be explained by the fairly solitary nature of Bornean orangutans (47). Secondly, we found an additional, albeit much smaller, difference whereby bonobo infants received a marginally higher proportion of surrounding vocal input compared to humans (see Fig. 3). This difference can likely be explained by the greater amount of infant-directed communication in humans. An emerging picture from our data is that learning during vocal development in great apes must be nearly exclusively based on the surrounding (as opposed to directed) vocal input.
In some human cultures, infant-surrounding vocal input is also more prevalent than infant-directed vocal input (21–23), suggesting that surrounding vocal communication could also play a more important role for language acquisition than previously assumed. Such a conclusion is supported by growing evidence from more experimentally driven studies, demonstrating that children are not only able to learn language from surrounding interactions not explicitly directed toward them (25, 48, 49), but that the precise nature of the surrounding input can provide differential learning opportunities. For example, a recent study has indicated that, across cultures, surrounding speech from children captures the infants’ attention more effectively compared to surrounding speech from adults, suggesting that surrounding speech from other children might provide more learnable input compared to more complex adult speech (50). Following from this, a promising direction for further research would be to investigate the precise nature of infant-surrounding input for nonhuman great apes in greater detail, focusing specifically on the callers’ identities, age classes, relationship to the infant, and nature of the vocal input (whether, for example, it consists of calls or call combinations) and the potential influence this has on the infant’s vocal output.
In conclusion, our findings suggest that the tendency to direct vocalizations at infants, a key feature of human communication, has been massively expanded in the human lineage. These data provide support for the hypothesis that infant-directed vocal communication played a critical role in the emergence of human language through scaffolding the learning and acquisition of such a complex communication system. The presence of broadly equivalent levels of surrounding vocal communication in Pan (chimpanzees and bonobos) and humans suggests that early hominins probably relied on surrounding vocal communication for any learned component of their vocal system until infant-directed vocal communication became more prominent.

MATERIALS AND METHODS

Data

Vocal data on all species (Bornean orangutans, western gorillas, eastern chimpanzees, bonobos, and humans) were collected in a maximally comparable way. We collected data in wild-living populations for the nonhuman great apes and under naturalistic conditions for humans. We continuously focal-followed (31) immatures (except for gorillas, where mothers were followed; see Materials and Methods subsection “Gorillas”) of the same age range (10 to 60 months). This age range is adequate for a comparative study as all great apes have a similar life history with dependency on the mother lasting at least 4 years [chimpanzees and humans (51), western gorillas (52), bonobos (53) and orangutans (54–56)]. For all nonhuman great apes, we recorded all vocalizations uttered by the focal individual and any other individual that was audible (for exceptions and specifics for each species, see the methods description per species below). We noted if a vocalization was (a) directed toward an infant, and, if so, if it was directed by the mother or another individual, (b) part of surrounding vocal communication or (c) given by the focal individual (see Table 1 for detailed definitions). Vocalizations from unknown individuals were included in the analysis, provided that we could confidently exclude the mother or immature as the source and that we could determine whether the vocalization was directed or surrounding. If this was not the case, then for gorillas, chimpanzees (C.F. dataset; see section “Chimpanzees”), and orangutans, for which we conducted full-day focal follows, only the 2-min interval in which the unknown vocalization occurred was excluded. The human datasets did not include any 2-min intervals with uncertainty because of the high quality of both the video data and the linguistic transcriptions that provided sufficient information regarding the context of the interactions. The chimpanzee dataset (M.L.) also did not include any 2-min intervals with uncertainty because only focals with excellent visibility were used. For bonobos, if a vocalization was categorized as uncertain, the focal individual was recorded as out of view and we discarded the focal sample from that time onward. This difference in methodology is due to not conducting full-day focals for bonobos but focals of up to 1 hour. Our rationale for excluding all 2-min intervals following a 2-min interval with uncertainty was that uncertainty only arose when a vocalization occurred. If no vocalization occurred, the likelihood of uncertainty was zero. To avoid biasing our dataset through only excluding 2-min intervals with vocalizations, we decided to exclude all subsequent 2-min intervals from that focal observation. Although this might sound extreme, since we did not conduct full-day focal sampling for bonobos, this approach typically resulted in the exclusion of only a few minutes of data. While we appreciate that the approach for the bonobos is overly conservative and different to the other species’ data collection protocols, we were still keen to remove bias where we could, and that was only feasible with the bonobo dataset given the lengths of focals. Furthermore, not excluding all subsequent 2-min intervals might have introduced some bias to the gorilla, orangutan, and chimpanzee (C.F.) data. However, since full-day focals were collected in these species, it simply did not make sense to exclude the rest of the day. Recording distances for all nonhuman great apes were between 7 and 15 m and for humans, approximately 2 to 10 m. To ensure that there was no intrinsic bias in how we coded our data, we conducted interobserver reliability testing using a blind coder for all human, chimpanzee (C.F. data), and bonobo datasets. Results suggest high levels of reliability in the assignment of vocalizations as either directed or surrounding (all Cohen's kappa values >0.8; see the Supplementary Materials).
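The two exclusion rules described above differ only in what happens after an uncertain 2-min interval. The Python sketch below illustrates that difference under assumptions: the boolean-flag representation, the function name, and the protocol labels `"drop_interval"` and `"truncate"` are hypothetical and do not come from the source.

```python
def usable_intervals(uncertain_flags, protocol):
    """Return the indices of 2-min intervals kept for analysis.

    `uncertain_flags[i]` is True when interval i contains a vocalization whose
    caller or direction could not be determined.
    protocol="drop_interval": discard only the uncertain intervals
        (full-day follows: gorillas, orangutans, chimpanzee C.F. dataset).
    protocol="truncate": discard the first uncertain interval and everything
        after it (short bonobo focals).
    """
    if protocol == "drop_interval":
        return [i for i, uncertain in enumerate(uncertain_flags) if not uncertain]
    if protocol == "truncate":
        kept = []
        for i, uncertain in enumerate(uncertain_flags):
            if uncertain:
                break
            kept.append(i)
        return kept
    raise ValueError(f"unknown protocol: {protocol!r}")


# Hypothetical focal sample of five intervals with uncertainty in the third.
flags = [False, False, True, False, False]
print(usable_intervals(flags, "drop_interval"))
print(usable_intervals(flags, "truncate"))
```

Under the truncation rule the remaining intervals are dropped whether or not they contain vocalizations, which is how it avoids removing only vocally active intervals, at the cost of discarding more data.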
It is possible that recordings from humans tend to represent slightly higher rates of interactions compared to fully naturalistic scenarios since non-daylong recordings typically only include situations where the infant is awake and surrounded by other individuals rather than being alone. Conversely, in nonhuman great ape species, there might be an underestimation of infant-directed vocal input due to the distance between observers and the individuals (minimum 7 m), which was increased further for chimpanzees and gorillas during the COVID-19 pandemic. Some great apes also tended to be in high and dense canopies, making it challenging to see the individuals and difficult to hear their calls. Hence, any data recorded when the individual was above 15 m and/or in such dense vegetation were discarded. All data for this study were collected noninvasively and were purely observational. Informed consent was given by all caregivers of human infants. The data collection on nonhuman great apes adhered to the guidelines of the American Society of Primatologists for the ethical treatment of nonhuman primates.

Bonobos

Study site and study groups

F.W. collected data from three wild bonobo communities at the Kokolopori Bonobo Reserve (N0.41716°, E22.97552°) (57) in the Democratic Republic of the Congo (DRC). At the time of the study, the Kokoalongo community consisted of 38 individuals (18 adults, 9 juveniles, and 11 infants), the Ekalakala community consisted of 21 individuals (12 adults, 3 juveniles, and 9 infants), and the Fekako community consisted of 9 individuals (6 adults, 1 juvenile, and 2 infants). Ethical permission to conduct the data collection was granted by the Institutional Animal Care and Use Committee at the Faculty of Arts and Sciences at Harvard University, the Institut Congolais pour la Conservation de la Nature (ICCN; 420/ICCN/DG-INT/03/12), the Ministry of Science and Technology of the DRC (MIN RST/SG/180/21; MIN RST/SG/180/23; MIN RST/SG/180/24) and is in line with the ethical guidelines of the former Department of Primatology at the Max Planck Institute for Evolutionary Anthropology.

Data collection

Data were collected from May to October 2022 from 05:30 a.m. to 05:00 p.m. using continuous focal follows of up to 60 min. Focal samples in which the focal individual went out of sight before 60 min were included in this study only if they lasted at least 6 min. Focal follows were conducted with a Sennheiser directional microphone (K6 power module, ME66 recording head and Rycote-Softie windscreen) attached to a solid-state recorder (Marantz PMD 660). Vocalizations were recorded with a 44.1-kHz sampling frequency and a 16-bit amplitude resolution.

Chimpanzees

Study site and study groups

C.F. and M.L. collected data in the Sonso community at the Budongo Conservation Field Station (BCFS), Uganda (between 1°35′ and 1°55′ N and 31°08′ and 31°42′ E). The Sonso community (58) was composed of approximately 70 individuals (51 adults, 7 juveniles, and 17 infants in 2008 and 41 adults, 11 juveniles, and 11 infants in 2022). Ethics assessment was conducted, and permission for data collection was granted by the Uganda Wildlife Authority and the Uganda National Council for Science and Technology (UNCST-CF project registration number: NS272ES).

Data collection

Data were collected by M.L. from January to December 2008 and by C.F. from February to September 2022. Data were collected from 07:00 a.m. to 04:30 p.m. except on Sundays, when data were collected from 07:00 a.m. to 01:30 p.m. During the two study periods, 7 and 11 focal infants were followed, respectively. From the 1st to the 20th of February 2022, additional COVID-19 protocols were in place and data could only be collected until 12:00 p.m. on any given day. From the 21st of February to the 6th of March, data could only be taken until 02:30 p.m. From the 7th of March onward, normal sampling days resumed. M.L. sampled continuously during 30- to 60-min (average of 51 ± 30 min) focal follows of an individual when visibility was not obstructed and then switched focal individuals throughout the day. M.L. sampled vocalizations in field notes and recorded vocalizations with an M66 Sennheiser directional microphone with the factory windshield and K6P power module. The microphone was attached to a solid-state recorder Marantz PMD660. Vocalizations were recorded with a 48-kHz sampling frequency and a 16-bit amplitude resolution. M.L. later merged and digitized the data. C.F. sampled continuously during full-day follows and only switched focal individuals after 30 min of the focal individual being out of sight. C.F. recorded vocalizations with an M66 Sennheiser directional microphone with the factory windshield and K6P power module. The microphone was attached to a solid-state recorder Marantz PMD 661 MKII. Vocalizations were recorded with a 48-kHz sampling frequency and a 16-bit amplitude resolution. We compared the input rates between the two datasets and found them to be overall very similar. We merged the two datasets for chimpanzees for our analyses (for more detail, see the chimpanzee data comparison subsection in the Supplementary Materials).

Gorillas

Study site and study groups

L.N. collected data from three groups of wild western gorillas in the Dzanga-Ndoki National Park of the Dzanga-Sangha Protected Areas in south-western Central African Republic (CAR). The CAR1 and CAR2 groups were studied in Bai Hokou (2°50′N, 16°28′E) and the CAR3 group in Mongambe (2°55′N, 16°23′E). The CAR1 group consisted of 8 individuals (4 adults, 2 juveniles and 2 infants); the CAR2 group consisted of 9 individuals (5 adults, 2 juveniles and 2 infants); and the CAR3 group consisted of 10 individuals (3 adults, 1 juvenile and 2 infants). Ethical evaluation and approval of the data collection were conducted by the Ethics Committee for Animal Experimentation - Cuvier Committee at the National Natural History Museum in France and the Ministre de la Recherche Scientifique et de l’Innovation Technologique of the Central African Republic (permit numbers: No. 020/MRSIT/DIRCAB/ CB.21 and No. 1PFGS21).

Data collection

Full-day follows of mothers with offspring were conducted between 06:30 a.m. and 05:00 p.m., from April 2021 to March 2022. The 2-min intervals in which the mother was in sight were summed up per day. This methodological difference (i.e., focusing on the mother and not the infant) arose because the gorilla data were collected for a separate study. Data on vocalizations by all individuals, except laughter and play vocalizations, were recorded with a handheld Runbo device using CyberTracker (see www.cybertracker.org). Although not recording laughter may have led to an underestimate of infant-directed vocalizations within the play context for gorillas, we did not record infant-directed laughter from the mother in any nonhuman great ape species, and for humans, no 2-min interval that was coded as “directed” included only directed laughter. Hence, despite these subtle differences, we are confident that our approach and results are still robust.

Orangutans

Study site and study groups

C.F. collected data from February to June 2018 at the long-term field site of Tuanan, Mawas Reserve, Central Kalimantan, Indonesia (02°15′S, 114°44′E). During this study period, 10 adult females and their dependent offspring and 15 adult males were regularly seen in the study area (59). The Indonesian State Ministry for Research and Technology (RISTEK, permit number: 02/EXT/SIP/FRP/E5/Dit.KI/I/2018), the Directorate General of Natural Resources and Ecosystem Conservation-Ministry of Environment and Forestry of Indonesia (KSDAE-KLHK), the Ministry of Internal Affairs, and the Nature Conservation Agency of Central Kalimantan (BKSDA KalTeng) gave their approval for this study.

Data collection

Data were collected from February to June 2018 from around 5 a.m. until the individuals built their evening nests. All vocalizations (including their call type and context) given and heard by the focal individual were noted continuously ad libitum throughout the focal sample. Mothers of each focal individual were simultaneously followed, and the simultaneous data collected from these follows were used to complete the vocal dataset, e.g., regarding the directedness of a vocalization. Duplicates of vocalizations were identified and removed from the dataset after data collection. Data were directly collected electronically on Excel with an iPad.

Humans

All data from human infants were collected in their naturalistic language environments. Recording schemes differed between the different datasets from different languages and are described below. The minimum amount of data extracted from the different corpora per child per month was 60 min. Recording samples across languages were matched in age and gender as much as possible. Caregivers from all families gave their written (Qaqet, Shipibo-Konibo, and Tuatschin) or oral (Chintang) consent to participate in the data collection.

Shipibo-Konibo

J.S. collected the Shipibo-Konibo data between July 2021 and October 2022 in Callería, a small Shipibo-Konibo village in the Ucayali river valley. Shipibo-Konibo is a Panoan language spoken by approximately 20,000 people in the Ucayali river valley in the central eastern part of Peru (60). Data consist of naturalistic daylong video recordings in and around the homes of 14 children (11 to 54 months). One hour from each daylong recording was manually transcribed using the following sampling scheme: three 10-min segments at 10:00 a.m., 01:00 p.m., and 04:00 p.m., plus four randomly selected 7.5-min segments before, between, and after these periodically selected clips. For each child, a total of 60 min of transcribed recordings from the same day were extracted from the corpus. The four 7.5-min segments of each day were summed up to one 30-min segment for data processing. Ethical permission to conduct the data collection was granted by the University of Zurich and local representatives of the Shipibo-Konibo community.

Qaqet

B.H. collected the data for the Qaqet corpus between 2015 and 2016 in Raunsepna. Qaqet is a Papuan language spoken by approximately 15,000 people in Papua New Guinea’s East New Britain Province (61). Three children (age: 25, 32, and 41 months) were video-recorded within their environment by their parents or other adults for approximately 1 hour/week. For this study, data from three consecutive weeks of transcribed recording sessions from each child were extracted from the corpus, resulting in a total of 2 hours of recordings per child on average. Ethical permission to conduct the data collection was granted by La Trobe University, the National Research Institute of Papua New Guinea, the local NGO Bainings Environmental Heritage Conservation Foundation Inc., and the elders and representatives of the communities.

Chintang

The Chintang Language Research Program collected the Chintang corpus data used here in 2004 in Chintang, a village located in the lower foothills of the Himalayas (Dhankuṭā district) in Nepal. Chintang is a Sino-Tibetan language (Kiranti group) spoken by approximately 5000 speakers in the Himalayas (62). Children were video-recorded for a total of 4 hours during several sessions within a single week. For this study, we extracted 1 hour of transcribed recordings per child from the corpus for a total of four children (age: 26, 32, 41, and 46 months).

Tuatschin

G. Walther and J. Mazara collected the data for the Tuatschin corpus between 2016 and 2019 in the Val Tujetsch as well as with Sursilvan-Tuatschin–speaking families in the Swiss diaspora. Tuatschin is a dialect of the Romansh Sursilvan variety spoken by approximately 1500 speakers in the Val Tujetsch area (Switzerland). Children were video-recorded for a total of at least 4.5 hours, several times a week, without a researcher being present. For this study, 100 min of transcribed recording sessions per child were extracted from the corpus for a total of six children (age: 26, 28, 32, 37, 42, and 46 months). Ethical permission to conduct the data collection was granted by the University of Zurich.

Data annotation

We manually annotated data from all five species to estimate the rates of directed and surrounding communicative input infants receive. All recordings were analyzed using zero-one sampling (31) in 2-min intervals. For the bonobo, chimpanzee, orangutan, and human data, every 2-min interval was coded for the presence or absence of infant-directed vocal communication, infant-directed vocal communication from the mother only, infant-surrounding communication, or vocal activity (see Table 1 for definitions). There could be overlap in inputs with a given 2-min interval containing multiple types of input. If that was the case, the interval was taken into account for each separate category. For the gorilla data, only infant-directed vocal communication and vocal activity could be annotated. General vocal activity was computed by adding together all 2-min intervals within one recording (or, in the case of gorillas: 1 day) in which either a surrounding, directed, or an infant vocalization occurred. For humans, prelinguistic vocalizations and laughing were included in the annotations given that they also have valuable communicative functions.
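The zero-one coding described above can be illustrated with a minimal sketch. It is not the annotation pipeline used in this study: the event labels ("directed", "surrounding", "infant"), the `from_mother` flag, and the toy event times are assumptions introduced purely for illustration. Each 2-min interval is scored once per input type, however many vocalizations it contains, and an interval with several input types counts toward each of them.

```python
import math

INTERVAL_S = 120  # length of one sampling interval (2 min)
KEYS = ("directed", "directed_mother", "surrounding", "vocal_activity")


def zero_one_sample(events, duration_s, interval_s=INTERVAL_S):
    """Score each interval for presence/absence of every input type.

    events: (time_s, kind, from_mother) tuples; kind is "directed",
    "surrounding", or "infant" (hypothetical labels for this sketch).
    """
    n_intervals = math.ceil(duration_s / interval_s)
    grid = [dict.fromkeys(KEYS, False) for _ in range(n_intervals)]
    for time_s, kind, from_mother in events:
        if kind not in ("directed", "surrounding", "infant"):
            raise ValueError(f"unknown event kind: {kind!r}")
        if not 0 <= time_s < duration_s:
            continue  # ignore events outside the recording
        cell = grid[int(time_s // interval_s)]
        cell["vocal_activity"] = True  # any vocalization counts as activity
        if kind == "directed":
            cell["directed"] = True
            cell["directed_mother"] = cell["directed_mother"] or bool(from_mother)
        elif kind == "surrounding":
            cell["surrounding"] = True
    return grid


def count_intervals(grid):
    """Number of intervals in which each input type occurred at least once."""
    return {key: sum(cell[key] for cell in grid) for key in KEYS}


# Toy 10-min recording (five 2-min intervals) with made-up events.
events = [
    (10, "directed", True),
    (50, "directed", False),
    (130, "surrounding", False),
    (135, "directed", False),
    (400, "infant", False),
]
counts = count_intervals(zero_one_sample(events, duration_s=600))
proportion_directed = counts["directed"] / counts["vocal_activity"]
print(counts, round(proportion_directed, 2))
```

In this toy recording, the two directed vocalizations in the first interval are scored only once, and the second interval counts toward both the directed and the surrounding totals. The final ratio corresponds to the relative measure, i.e., intervals with directed input divided by intervals with any vocal activity.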

Statistical analysis

To investigate species (bonobo, chimpanzee, gorilla, orangutan, and human) differences in infant-directed and infant-surrounding (no data available for gorillas) vocal communication, in a first step, we formulated three Bayesian GLMMs, specifying a Poisson distribution. The respective outcome variables of these models were the number of 2-min intervals (per focal sample) with any (i) infant-directed vocal communication, (ii) infant-directed vocal communication by the mother, and (iii) infant-surrounding vocal communication. To assess whether species differed in relative terms, in a second step, we investigated differences in the proportion of infant-directed and infant-surrounding vocal communication relative to vocal activity. Specifically, we explored potential differences across species by dividing the number of 2-min intervals in which infant-directed vocal communication was reported by the total number of 2-min intervals with any kind of vocalization. We used Bayesian binomial GLMMs to control for the vocal activity of each species. The respective outcome variables of these models were (i) the proportion of all vocal inputs scored as infant-directed vocal communication, (ii) the proportion of all vocal communication scored as infant-directed vocal communication produced by the mother, and (iii) the proportion of all vocal communication scored as surrounding.
Each model incorporated species and infant age as potential explanatory variables while accounting for differences in sampling effort by including an offset term (the natural logarithm of the number of 2-min intervals that made up each focal sample). Random intercepts were estimated to account for multiple observations on each focal individual (infant ID), as well as the observed over-dispersion [i.e., an Observation Level Random Effect (63)].
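As an illustrative formalization of the Poisson models (the notation is ours and is not taken from the analysis code; priors and the exact coding of species are omitted), let y_ij be the number of 2-min intervals with the respective input in focal sample j of infant i, and n_ij the number of 2-min intervals that made up that sample:

```latex
\[
\begin{aligned}
y_{ij} &\sim \mathrm{Poisson}(\lambda_{ij}),\\
\log \lambda_{ij} &= \log(n_{ij}) + \beta_0 + \beta_{\mathrm{species}[i]}
  + \beta_{\mathrm{age}}\,\mathrm{age}_{ij} + u_i + \varepsilon_{ij},\\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
\]
```

Here log(n_ij) is the offset (its coefficient is fixed at 1), u_i is the random intercept for infant ID, and ε_ij is the observation-level random effect that absorbs over-dispersion. Because of the offset, the remaining terms describe the expected number of intervals with input per sampled interval, which makes samples of different length comparable.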
We fitted our models in R (64) (version 4.3.2), using the “brms” (65) interface to Stan (66). Parameters were estimated by running four independent Markov chain Monte Carlo chains for 8000 iterations each (6000 for warm-up and 2000 to sample the posterior distribution). To facilitate model convergence, we specified weakly regularizing priors. Chain convergence, mixing, and stationarity were confirmed by visual inspection of trace plots and by ensuring that all R̂ = 1.00. To achieve this, the adapt_delta argument was increased to 0.98.
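To make the convergence criterion concrete, the following sketch computes the classic potential scale reduction factor (R̂) for a set of chains. It is an illustration only: it uses simulated draws rather than output from the fitted models, and Stan-based tools typically report a refined (split, rank-normalized) variant of this statistic rather than the basic version shown here. The constants mirror the sampling settings reported above; the function name and the simulated chains are assumptions.

```python
import math
import random
from statistics import mean, variance

CHAINS, ITERATIONS, WARMUP = 4, 8000, 6000
DRAWS_PER_CHAIN = ITERATIONS - WARMUP    # 2000 retained draws per chain
TOTAL_DRAWS = CHAINS * DRAWS_PER_CHAIN   # 8000 posterior draws overall


def rhat(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two equal-length chains of length >= 2")
    between = n * variance([mean(c) for c in chains])  # between-chain variance B
    within = mean([variance(c) for c in chains])       # within-chain variance W
    pooled = (n - 1) / n * within + between / n        # pooled variance estimate
    return math.sqrt(pooled / within)


rng = random.Random(1)
# Chains that all sample the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(DRAWS_PER_CHAIN)] for _ in range(CHAINS)]
# Chains stuck around different values: R-hat should be clearly above 1.
stuck = [[x + shift for x in chain] for shift, chain in enumerate(mixed)]
print(round(rhat(mixed), 2), round(rhat(stuck), 2))
```

Values close to 1 indicate that between-chain and within-chain variability agree, whereas chains exploring different regions inflate the statistic.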
Overall model performance was assessed by performing visual posterior predictive checks and by calculating a Bayesian R² statistic (67). Post hoc contrasts were calculated to quantify pairwise species differences. The outcome of all analyses can be found in the Supplementary Materials (see tables S3 to S8).
The respective human language could not be included as a variable in the main model, since this would require lumping all nonhuman species into a single “nonlinguistic” category. We therefore conducted additional within-species analyses of the four human datasets and of the two chimpanzee datasets that were merged for the main analyses. Results of both comparisons can be found in the Supplementary Materials.

Acknowledgments

We thank the research staff of all field sites for invaluable help with data collection. We thank the Institut Congolais pour la Conservation de la Nature and the Ministry of Scientific Research and Technology in the DRC for permission to work in the Kokolopori Bonobo Reserve and the Bonobo Conservation Initiative and Vie Sauvage for support. For support and permission to collect data on chimpanzees at the Budongo Conservation Field Station, we thank UWA, UNCST, and RZSS. We thank the government and the Ministre de la Recherche Scientifique et de l’Innovation Technologique of the CAR and the WWF CAR for permission and support to collect data on gorillas in the Dzanga-Sangha Protected Areas. In addition, for permission and support to collect data on orangutans at Tuanan, we thank RISTEK, BKSDA, KLHK, and BOSF. We are very grateful to all the families who participated in the child language data collection. We thank G. You for help with data preparation, N. Lahiff for blind coding data, E. Ringen for statistical advice, and C. Schuppli and L. Fornof for helpful discussions. We also thank A. Russell, N. Lahiff, and M. Townsend for helpful comments on earlier drafts of this manuscript.
Funding: This research was funded by the NCCR Evolving Language, SwCSS NSF agreement Nr.51NF40_180888 (F.W., C.F., J.S., K.Z., C.P.v.S., S.S., and S.W.T.), the SNSF grant PP00P3_198912 (S.W.T.), the SNSF grant 310030_185324 (L.N. and K.Z.), the Leverhulme Trust Research Leadership Award F/00 268/AP (M.L. and K.Z.), and the Transversal Action of Muséum National d’Histoire Naturelle 2020-2021 and 2021-2022 (S.M. and L.N.).
Author contributions: Conceptualization: F.W., C.F., J.S., K.Z., C.P.v.S., S.S., S.W.T., and E.P.W. Data curation: F.W., C.F., J.S., L.N., M.L., and M.A.v.N. Formal analysis: F.W., C.F., J.S., and E.P.W. Funding acquisition: K.Z., C.P.v.S., S.S., S.W.T., S.M., J.S., C.F., F.W., and M.L. Investigation: F.W., C.F., J.S., L.N., and M.L. Methodology: F.W., C.F., J.S., K.Z., M.L., C.P.v.S., S.S., S.W.T., E.P.W., M.S., and S.M. Project administration: F.W., C.F., J.S., L.N., M.L., M.S., M.A.v.N., S.M., B.H., K.Z., C.P.v.S., S.S., and S.W.T. Software: F.W., C.F., J.S., L.N., and E.P.W. Resources: M.S., S.M., K.Z., C.P.v.S., S.S., and S.W.T. Supervision: K.Z., C.P.v.S., S.S., S.W.T., M.S., M.L., M.A.v.N., and S.M. Validation: F.W., C.F., J.S., E.P.W., K.Z., S.W.T., and S.M. Visualization: J.S., C.F., and F.W. Writing—original draft: F.W., C.F., J.S., S.W.T., S.S., and C.P.v.S. Writing—review and editing: F.W., C.F., J.S., L.N., M.L., M.A.v.N., M.S., S.M., B.H., E.P.W., K.Z., C.P.v.S., S.S., and S.W.T.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Data and code used in the analyses are available in the Zenodo repository: https://doi.org/10.5281/zenodo.15261663.

Supplementary Materials

This PDF file includes:

Supplementary Text
Figs. S1 to S9
Tables S1 to S8
References

Published in Science Advances, Volume 11, Issue 26, June 2025

Received: 10 October 2024
Accepted: 20 May 2025

Notes

*Corresponding author. Email: franziska.wegdell@iea.uzh.ch (F.W.); caroline.fryns@unine.ch (C.F.); johanna.schick@uzh.ch (J.S.)
These authors contributed equally to this work.

Figures

Fig. 1. Species differences in infant-directed vocal communication.
Model predictions showing the absolute amount of infant-directed vocal communication (A) and the proportion of infant-directed vocal communication relative to the general vocal activity (B) per species. The absolute amount of infant-directed vocal communication was calculated by the number of 2-min intervals (per focal observation bout) in which infant-directed vocal communication occurred. The relative amount of infant-directed vocal communication was calculated by dividing the number of 2-min intervals in which infant-directed vocal communication was reported, by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.
Fig. 2. Species differences in infant-directed vocal communication from the mother.
Model predictions showing the absolute amount of infant-directed vocal communication from the mother (A) and the proportion of infant-directed vocal communication from the mother relative to the general vocal activity (B) per species. The absolute amount of infant-directed vocal communication from the mother was calculated by the number of 2-min intervals (per focal observation bout) in which infant-directed vocal communication from the mother occurred. The relative amount of infant-directed vocal communication from the mother was calculated by dividing the number of 2-min intervals in which infant-directed vocal communication from the mother was reported by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.
Fig. 3. Species differences in infant-surrounding vocal communication.
Model predictions showing the absolute amount of surrounding vocal communication (A) and the proportion of surrounding vocal communication relative to the general vocal activity (B) per species. The absolute amount of surrounding vocal communication was calculated by the number of 2-min intervals (per focal observation bout) in which surrounding vocal communication occurred. The relative amount of surrounding vocal communication was calculated by dividing the number of 2-min intervals in which infant-surrounding vocal communication was reported by the total number of 2-min intervals with any kind of vocalization. Model predictions from Bayesian GLMMs are calculated for each focal sample while controlling for differences in sampling effort. Error bars indicate 95% credible intervals.

Tables

Table 1. Definition of vocal input types and vocal activity and summary of the data available per species.

References

1
J. Huttenlocher, H. Waterfall, M. Vasilyeva, J. Vevea, L. V. Hedges, Sources of variability in children’s language growth. Cogn. Psychol. 61, 343–365 (2010).
2
M. L. Rowe, A longitudinal investigation of the role of quantity and quality of child-directed speech in vocabulary development. Child Dev. 83, 1762–1774 (2012).
3
A. Weisleder, A. Fernald, Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychol. Sci. 24, 2143–2152 (2013).
4
A. Fernald, T. Taeschner, J. Dunn, M. Papousek, B. De Boysson-Bardies, I. Fukui, A cross-language study of prosodic modifications in mothers’ and fathers’ speech to preverbal infants. J. Child Lang. 16, 477–501 (1989).
5
C. Cox, C. Bergmann, E. Fowler, T. Keren-Portnoy, A. Roepstorff, G. Bryant, R. Fusaroli, A systematic review and Bayesian meta-analysis of the acoustic features of infant-directed speech. Nat. Hum. Behav. 7, 114–133 (2023).
6
M. Soderstrom, Beyond babytalk: Re-evaluating the nature and content of speech input to preverbal infants. Dev. Rev. 27, 501–532 (2007).
7
R. M. Golinkoff, D. D. Can, M. Soderstrom, K. Hirsh-Pasek, Baby talk to me: The social context of infant-directed speech and its effects on early language acquisition. Curr. Dir. Psychol. Sci. 24, 339–344 (2015).
8
A. S. Holzrichter, R. P. Meier, Child-directed signing in American sign language, in Language Acquisition by Eye (Psychology Press), pp. 25–40 (1999).
9
J. M. Iverson, O. Capirci, E. Longobardi, M. C. Caselli, Gesturing in mother-child interactions. Cogn. Dev. 14, 57–75 (1999).
10
M. Spinelli, M. Fasolo, J. Mesman, Does prosody make the difference? A meta-analysis on relations between prosodic aspects of infant-directed speech and infant outcomes. Dev. Rev. 44, 1–18 (2017).
11
A. Henning, T. Striano, E. V. M. Lieven, Maternal speech to infants at 1 and 3 months of age. Infant Behav. Dev. 28, 519–536 (2005).
12
B. Ambridge, E. Kidd, C. F. Rowland, A. L. Theakston, The ubiquity of frequency effects in first language acquisition. J. Child Lang. 42, 239–273 (2015).
13
A. Martin, Y. Igarashi, N. Jincho, R. Mazuka, Utterances in infant-directed speech are shorter, not slower. Cognition 156, 52–59 (2016).
14
A. Fernald, Four-month-old infants prefer to listen to motherese. Infant Behav. Dev. 8, 181–195 (1985).
15
The ManyBabies Consortium, Quantifying sources of variability in infancy research using the infant-directed-speech preference. Adv. Methods Pract. Psychol. Sci. 3, 24–52 (2020).
16
J. Y. Song, K. Demuth, J. Morgan, Effects of the acoustic properties of infant-directed speech on infant word recognition. J. Acoust. Soc. Am. 128, 389–400 (2010).
17
Z. O. Weizman, C. E. Snow, Lexical output as related to children’s vocabulary acquisition: Effects of sophisticated exposure and support for meaning. Dev. Psychol. 37, 265–279 (2001).
18
G. Csibra, G. Gergely, Natural pedagogy. Trends Cogn. Sci. 13, 148–153 (2009).
19
C. Pye, Quiché Mayan speech to children. J. Child Lang. 13, 85–100 (1986).
20
P. Brown, The Cultural Organization of Attention in The Handbook of Language Socialization, A. Duranti, E. Ochs, B. B. Schieffelin, Eds. (John Benjamins B.V., 2011), pp. 29–55.
21
L. A. Shneidman, S. Goldin-Meadow, Language input and acquisition in a Mayan village: How important is directed speech? Dev. Sci. 15, 659–673 (2012).
22
A. Cristia, M. Gurven, J. Stieglitz, Child-directed speech is infrequent in a forager-farmer population: A time allocation study. Child Dev. 90, 759–773 (2019).
23
M. Casillas, P. Brown, S. C. Levinson, Early language experience in a Tseltal Mayan village. Child Dev. 91, 1819–1835 (2020).
24
E. K. McClay, S. Cebioglu, T. Broesch, H. H. Yeung, Rethinking the phonetics of baby-talk: Differences across Canada and Vanuatu in the articulation of mothers’ speech to infants. Dev. Sci. 25, e13180 (2022).
25
R. Foushee, M. Srinivasan, Infants who are rarely spoken to nevertheless understand many words. Proc. Natl. Acad. Sci. U.S.A. 121, e2311425121 (2024).
26
M. Casillas, P. Brown, S. C. Levinson, Early language experience in a Papuan community. J. Child Lang. 48, 792–814 (2020).
27
G. Loukatou, C. Scaff, K. Demuth, A. Cristia, N. Havron, Child-directed and overheard input from different speakers in two distinct cultures. J. Child Lang. 49, 1173–1192 (2022).
28
J. Schick, C. Fryns, F. Wegdell, M. Laporte, K. Zuberbühler, C. P. van Schaik, S. W. Townsend, S. Stoll, The function and evolution of child-directed communication. PLOS Biol. 20, e3001630 (2022).
29
M. Fröhlich, R. M. Wittig, S. Pika, Should I stay or should I go? Initiation of joint travel in mother–infant dyads of two chimpanzee communities in the wild. Anim. Cogn. 19, 483–500 (2016).
30
D. K. Oller, U. Griebel, S. N. Iyer, Y. Jhang, A. S. Warlaumont, R. Dale, J. Call, Language Origins Viewed in Spontaneous and Interactive Vocal Rates of Human and Bonobo Infants. Front. Psychol. 10, 729 (2019).
32
R. M. Seyfarth, D. L. Cheney, Production, usage, and comprehension in animal vocalizations. Brain Lang. 115, 92–100 (2010).
33
A. Fernald, T. Simon, Expanded intonation contours in mothers’ speech to newborns. Dev. Psychol. 20, 104–113 (1984).
34
E. M. Luef, K. Liebal, Infant-directed communication in lowland gorillas (Gorilla gorilla): Do older animals scaffold communicative competence in infants? Am. J. Primatol. 74, 841–852 (2012).
35
M. Fröhlich, G. Müller, C. Zeiträg, R. M. Wittig, S. Pika, Gestural development of chimpanzees in the wild: The impact of interactional experience. Anim. Behav. 134, 271–282 (2017).
36
A. Knox, J. Markx, E. How, A. Azis, C. Hobaiter, F. J. van Veen, H. Morrogh-Bernard, Gesture use in communication between mothers and offspring in wild orang-utans (Pongo pygmaeus wurmbii) from the Sabangau peat-swamp forest, Borneo. Intl. J. Primatol. 40, 393–416 (2019).
37
M. Halina, F. Rossano, M. Tomasello, The ontogenetic ritualization of bonobo gestures. Anim. Cogn. 16, 653–666 (2013).
38
P. Szenczi, O. Bánszegi, A. Urrutia, T. Faragó, R. Hudson, Mother–offspring recognition in the domestic cat: Kittens recognize their own mother’s call. Dev. Psychobiol. 58, 568–577 (2016).
39
J. P. Balcombe, G. F. McCracken, Vocal recognition in Mexican free-tailed bats: do pups recognize mothers? Anim. Behav. 43, 79–87 (1992).
40
B. M. Weiss, F. Ladich, P. Spong, H. Symonds, Vocal behavior of resident killer whale matrilines with newborn calves: The role of family signatures. J. Acoust. Soc. Am. 119, 627–635 (2006).
41
D. Y. Takahashi, A. R. Fenley, Y. Teramoto, D. Z. Narayanan, J. I. Borjon, P. Holmes, A. A. Ghazanfar, The developmental dynamics of marmoset monkey vocal production. Science 349, 734–738 (2015).
42
D. Y. Takahashi, D. A. Liao, A. A. Ghazanfar, Vocal learning via social reinforcement by infant marmoset monkeys. Curr. Biol. 27, 1844–1852.e6 (2017).
43
D. Y. Takahashi, A. R. Fenley, A. A. Ghazanfar, Early development of turn-taking with parents shapes vocal acoustics in infant marmoset monkeys. Philos. Trans. R. Soc. B. Biol. Sci. 371, 20150370 (2016).
44
A. A. Fernandez, M. Knörnschild, Pup directed vocalizations of adult females and males in a vocal learning bat. Front. Ecol. Evol. 8, 265 (2020).
45
L. S. Sayigh, N. El Haddad, P. L. Tyack, V. M. Janik, R. S. Wells, F. H. Jensen, Bottlenose dolphin mothers modify signature whistles in the presence of their own calves. Proc. Natl. Acad. Sci. U.S.A. 120, e2300262120 (2023).
46
V. M. Janik, M. Knörnschild, Vocal production learning in mammals revisited. Philos. Trans. R. Soc. B Biol. Sci. 376, 20200244 (2021).
47
A. M. Ashbury, E. P. Willems, S. S. Utami Atmoko, F. Saputra, C. P. van Schaik, M. A. van Noordwijk, Home range establishment and the mechanisms of philopatry among female Bornean orangutans (Pongo pygmaeus wurmbii) at Tuanan. Behav. Ecol. Sociobiol. 74, 42 (2020).
48
P. Floor, N. Akhtar, Can 18-month-old infants learn words by listening in on conversations? Infancy 9, 327–339 (2006).
49
A. Fitch, A. M. Lieberman, R. J. Luyster, S. Arunachalam, Toddlers’ word learning through overhearing: Others’ attention matters. J. Exp. Child Psychol. 193, 104793 (2020).
50
J. Schick, M. M. Daum, S. Stoll, Input to the language learning infant: The impact of other children (2024). 10.31219/osf.io/e547z.
51
C. J. Charvet, Cutting across structural and transcriptomic scales translates time across the lifespan in humans and chimpanzees. Proc. Biol. Sci. 288, 20202987 (2021).
52
T. Breuer, M. B.-N. Hockemba, C. Olejniczak, R. J. Parnell, E. J. Stokes, Physical maturation, life-history classes and age estimates of free-ranging Western gorillas - Insights from Mbeli Bai, Republic of Congo. Am. J. Primatol. 71, 106–119 (2009).
53
S. M. Lee, C. M. Murray, E. V. Lonsdorf, B. Fruth, M. A. Stanton, J. Nichols, G. Hohmann, Wild bonobo and chimpanzee females exhibit broadly similar patterns of behavioral maturation but some evidence for divergence. Am. J. Phys. Anthropol. 171, 100–109 (2020).
54
M. A. van Noordwijk, C. P. van Schaik, Development of ecological competence in Sumatran orangutans. Am. J. Phys. Anthropol. 127, 79–94 (2005).
55
R. S. Mendonça, T. Kanamori, N. Kuze, M. Hayashi, H. Bernard, T. Matsuzawa, Development and behavior of wild infant-juvenile East Bornean orangutans (Pongo pygmaeus morio) in Danum Valley. Primates 58, 211–224 (2017).
56
M. A. van Noordwijk, S. S. U. Atmoko, C. D. Knott, N. Kuze, H. C. Morrogh-Bernard, F. Oram, C. Schuppli, C. P. van Schaik, E. P. Willems, The slow ape: High infant survival and long interbirth intervals in wild orangutans. J. Hum. Evol. 125, 38–49 (2018).
57
M. Surbeck, S. Coxe, A. L. Lokasola, Lonoa: The establishment of a permanent field site for behavioural research on Bonobos in the Kokolopori Bonobo Reserve. Pan Africa News 24, 13–15 (2017).
58
V. Reynolds, The Chimpanzees of the Budongo Forest: Ecology, Behaviour and Conservation (OUP Oxford, 2005).
59
M. A. van Noordwijk, L. R. LaBarge, J. A. Kunz, A. M. Marzec, B. Spillmann, C. Ackermann, P. Rianti, E. R. Vogel, S. S. U. Atmoko, M. Kruetzen, C. van Schaik, Reproductive success of Bornean orangutan males: Scattered in time but clustered in space. Behav. Ecol. Sociobiol. 77, 134 (2023).
60
P. Valenzuela, Transitivity in Shipibo-Konibo grammar: A typologically oriented study (dissertation, University of Oregon, Eugene, OR) (2003).
61
B. Hellwig, D. Jung, Events of caused accompanied motion in Qaqet and Dene Suline child language corpora. Caused Accompanied Motion p. 397 (2022).
62
S. Stoll, E. Lieven, G. Banjade, T. N. Bhatta, M. Gaenszle, N. P. Paudyal, M. Rai, N. K. Rai, I. P. Rai, T. Zakharko, Audiovisual corpus on the acquisition of Chintang by six children (2015).
63
X. A. Harrison, A comparison of observation-level random effect and Beta-binomial models for modelling overdispersion in binomial data in ecology & evolution. PeerJ 3, e1114 (2015).
64
R Core Team, R: A language and environment for statistical computing (Computer software manual) (2022), https://R-project.org.
65
P.-C. Bürkner, Bayesian item response modeling in R with brms and Stan. J. Stat. Softw. 100, 1–54 (2021).
66
B. Carpenter, A. Gelman, M. D. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. A. Brubaker, J. Guo, P. Li, A. Riddell, Stan: A probabilistic programming language. J. Stat. Softw. 76, 1 (2017).
67
A. Gelman, B. Goodrich, J. Gabry, A. Vehtari, R-squared for Bayesian regression models. Am. Stat. 73, 307–309 (2019).
68
C. Hobaiter, R. W. Byrne, The meanings of chimpanzee gestures. Curr. Biol. 24, 1596–1600 (2014).
69
M. Davila-Ross, B. Allcock, C. Thomas, K. A. Bard, Aping expressions? Chimpanzees produce distinct laugh types when responding to laughter of others. Emotion 11, 1013–1020 (2011).
70
S. L. Winkler, G. A. Bryant, Play vocalisations and human laughter: A comparative review. Bioacoustics 30, 499–526 (2021).
71
S. A. Wich, M. Krützen, A. R. Lameira, A. Nater, N. Arora, M. L. Bastian, E. Meulman, H. C. Morrogh-Bernard, S. S. U. Atmoko, J. Pamungkas, D. Perwitasari-Farajallah, M. E. Hardus, M. van Noordwijk, C. P. van Schaik, Call cultures in orang-utans? PLOS ONE 7, e36180 (2012).
72
M. Gamer, J. Lemon, I. F. P. Singh, irr: Various Coefficients of Interrater Reliability and Agreement (2019), https://CRAN.R-project.org/package=irr, R package version 0.84.1.
73
A. B. Kaufman, R. Rosenthal, Can you believe my eyes? The importance of interobserver reliability statistics in observations of animal behaviour. Anim. Behav. 78, 1487–1491 (2009).