Abstract
Retracting academic papers is a fundamental tool of quality control, but it may have far-reaching consequences for retracted authors and their careers. Previous studies have highlighted the adverse effects of retractions on citation counts and the citations of co-authors; however, the broader impacts beyond these have not been fully explored. Here we address this gap by leveraging Retraction Watch, the most extensive dataset on retractions, and linking it to Microsoft Academic Graph and Altmetric. Retracted authors, particularly those with less experience, often leave scientific publishing in the aftermath of a retraction, especially if their retractions attract widespread attention. However, retracted authors who remain active in publishing maintain and establish more collaborations compared with their similar non-retracted counterparts. Nevertheless, retracted authors generally retain less senior and less productive co-authors, but gain more impactful co-authors post-retraction. Our findings suggest that retractions may impose a disproportionate impact on early career authors.
Main
Reputation is a crucial factor in building status, particularly when quality is uncertain or unobservable1 and when it is produced through highly technical and complex processes. This characterizes creative fields, medicine and science alike. Therefore, when a scientist’s reputation is challenged, the consequences can be severe2,3,4,5, with long-lasting effects on their career outcomes. The credibility of a scientist, a crucial currency of their reputation, is established over the course of their career based on the quality of their publications6, among other factors. Therefore, when the quality of one’s work is called into question, the stakes are high, and the consequences can be more substantial than the outcome of a single project. While positive signals, such as citations and grants, have been linked to reputation building6,7, less is understood about the relationship between a scientist’s challenged reputation and their career progression in future collaborations. Hence, further research is needed to fully comprehend the impact of such challenges on scientific collaborations and career trajectories. Retractions of scientific papers give us a window through which to study this question.
When the integrity of a scientific paper is disputed, editors and authors may choose to remove the work, either together or in isolation. While the article may still be accessible, it will be accompanied by a retraction notice that explains the reason(s) behind its removal, such as misconduct, plagiarism, mistake or other considerations. This creates a clear and visible signal associated with the authors of the paper that the quality of their work has come under scrutiny. Retractions have been used for this purpose since 17568, and contemporary journals have formal procedures to execute when the authors or readers highlight problematic content.
Prior work has investigated the impact of retractions on scientific careers, examining productivity3, citations for retracted papers4, citations for papers published before retraction2 and post-retraction citations of pre-retraction collaborators5, generally finding negative effects. However, some work also demonstrates that these effects are heterogeneous and might vary based on the reason for retraction and/or the prominence of the retracted author2,4,9, for instance, measured by author order. While recent research is breaking new ground on quantitative analysis of the impact of retractions, it often focuses on specific fields or compares different retracted authors to one another (for example, those who have experienced a single versus multiple retractions). This existing work used various strategies to construct a comparison group for retracted authors: some studies examine others publishing in the same journal at the same time, while others examine the co-authors of retracted scientists on non-retracted papers or on retracted papers without assigning them blame. None use a more comprehensive strategy to create matched pairs on the basis of a series of author-level characteristics and, with few exceptions, they do not examine the impact of retraction on post-retraction collaboration networks10,11. Therefore, a comprehensive analysis of retractions across fields and over time, in which retracted authors are compared with otherwise similar non-retracted authors, has yet to be undertaken. Beyond documenting the impact of retraction on careers, it is essential to examine the mechanisms that might bring about these effects. Therefore, here we focus both on the continuity of post-retraction careers and on the development of the collaboration networks that retracted authors need to succeed in publishing careers12,13,14.
Retractions can attract significant attention, particularly when they expose egregious misconduct. Such instances not only question the authors’ reputation, but also undermine the public’s trust in science, scientific findings and institutions of science, such as universities, internal review boards, journals and the peer-review process. Some retractions, therefore, cast a long shadow that extends far beyond the scrutinized work. For instance, the retraction of a study on political persuasion and gay marriage in Science in 201515, which was probably based on fabricated data, led to questions about the impartiality of reviewers16. Similarly, when a paper in Nature was retracted due to falsified images17, criticism extended beyond the conduct of the first author to the male-dominated Japanese academy and its culture of fierce pressure and competition18. In both cases, the first authors left scientific publishing careers after receiving extreme levels of attention (Altmetric scores above 1,000). However, how systematic the impact of this attention is remains to be fully understood. This question, of course, is tied closely to how retracted scientists might rebuild their collaboration network, as future collaborators may or may not learn about past events, depending on the level of attention they received.
The idea that social relationships carry value and that the resources encapsulated in them may be leveraged is longstanding19. The assumption, specifically that larger collaboration networks are beneficial, is rooted in prior work that documents the benefits of research collaborations and that of larger collaboration networks. Qualitative evidence suggests that researchers collaborate for both instrumental and strategic reasons, such as access to specialized expertise, equipment and other resources, visibility for professional advancement and enhanced research productivity, as well as emotional reasons, since many regard collaborative work as energizing and fun20. These self-reports are reflected in empirical evidence, such as the association between the size of collaboration networks and citations21, in addition to future productivity12,22. Importantly, Ductor and colleagues suggest that the quality of one’s co-author network signals important information about researchers’ quality, and that these signals are crucial to assess one’s research potential, especially at the beginning of the career22. Additionally, prior work reveals that co-author networks show higher levels of triadic closure than expected by chance, that is, authors of scientific papers tend to work with former co-authors of their co-authors in the future23,24. Such regularity is based on similarity, but also on strategic considerations, where a scientist brokers relationships among their unconnected co-authors, thereby communicating information about the qualities of those they connect that are challenging to observe otherwise25, such as their skill or integrity in the context of scientific publishing. Extending these arguments to authors who experience a retraction, the collaboration networks they maintain or build could be crucial to recover from a negative signal about the quality of their work and some processes, such as triadic closure, might help them in particular to do so.
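To make the triadic-closure notion concrete, the sketch below identifies an author’s open triads (co-authors of co-authors with whom the author has not yet collaborated) in a collaboration network and measures what share of them are closed by later collaborations. The data layout (a dict mapping authors to sets of co-authors) and the function names are illustrative assumptions, not the paper’s implementation.

```python
def open_triad_candidates(coauthors, ego):
    """Authors two steps from `ego`: co-authors of ego's co-authors
    who have not yet worked with ego (i.e. open triads around ego)."""
    direct = coauthors.get(ego, set())
    candidates = set()
    for alter in direct:
        candidates |= coauthors.get(alter, set())
    # Exclude existing collaborators and the ego itself.
    return candidates - direct - {ego}


def share_triads_closed(coauthors_pre, new_collaborators, ego):
    """Share of ego's open triads that later collaborations close.

    coauthors_pre: collaboration network before the focal event;
    new_collaborators: set of co-authors gained afterwards.
    """
    candidates = open_triad_candidates(coauthors_pre, ego)
    if not candidates:
        return 0.0
    return len(candidates & new_collaborators) / len(candidates)
```

For example, if A has worked with B, and B with C, then C is an open-triad candidate for A; co-authoring with C afterwards closes that triad.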
Drawing on retractions as a (potentially stigmatizing) signal that challenges authors’ reputations, we offer three key empirical observations. First, we find that the extent of attention received by a retraction is positively associated with the likelihood of retracted authors leaving publishing careers. That is, the more public the retraction, the more profound its consequences appear to be for authors’ careers. This finding is especially significant since most attention received by papers extends beyond the content of the science and involves discussions of great societal importance about the context within which scientific findings are produced. Second, perhaps counterintuitively, we demonstrate that conditional on staying in scientific publishing, retracted authors retain and gain more collaborators compared with otherwise similar authors without retractions. Third, while these larger collaboration networks may benefit retracted authors, retracted authors generally build qualitatively different and weaker networks compared with their similar counterparts in terms of their collaborators’ seniority and productivity post-retraction.
Results
The consequences of retractions on authors’ careers can be severe, sometimes resulting in them leaving scientific publishing entirely. Analysing the timing of an author’s departure from publishing following a retraction can yield valuable insights. To facilitate this analysis, we utilized two main datasets: Retraction Watch (RW)26, the most extensive publicly available database of retracted papers with over 26,000 publications in around 5,800 venues, and Microsoft Academic Graph (MAG)27,28, which provides comprehensive records and citation networks for over 263 million scientific publications and collaboration networks for over 271 million authors. From RW, we excluded bulk retractions (for example, when all papers are retracted from a conference proceeding as a result of questionable peer review29), as well as authors with multiple retractions to centre on a singular event in our analysis. Furthermore, we focused on papers retracted between 1990 and 2015 to allow us to examine post-retraction outcomes. After filtering RW, we merged it with MAG, identifying 4,578 retracted papers involving 14,579 authors (the ‘filtered’ sample). Linking these two datasets allows us to characterize these retracted authors and describe their pre-retraction and post-retraction careers. Full details of our preprocessing steps, further justifications for exclusion criteria and merging of datasets can be found in Supplementary Note 1 and ‘Merging RW and MAG’ section.
To study the relationship between when an author leaves scientific publishing and having faced a retraction, it is necessary to define ‘leaving publishing’ or ‘attrition’. Our data, however, are right-censored, meaning that, especially for authors who started publishing recently, we may not observe their entire careers. This makes it difficult to accurately determine their true attrition year or whether they have indeed left scientific publishing. For those whose attrition year can be calculated, we define it by first identifying either the last observed publication year or the start of the first prolonged gap in an author’s career, recognizing that such gaps indicate a significant interruption. For those whose attrition year cannot be determined due to the right-censored nature of the data, we assume that they are still active and highlight how this is handled in each analysis.
To identify attrition, we analyse the distribution of the longest gaps in authors’ publishing careers in science, engineering, technology and mathematics (STEM) fields, which make up the majority of our data in RW, across the entire MAG dataset. For each cohort, the length of this gap is selected to be the 95th percentile of all gaps, indicating that 95% of authors have maximum gaps this long or shorter (Supplementary Fig. 1). The variability in gap lengths across cohorts can be attributed to two main factors: first, MAG relies on digitized publication records, which may result in missing publications, particularly for earlier years; second, over time, the frequency of scientific publishing has increased, thereby reducing the typical gap size. Thus, we determine an author’s attrition year either by identifying the onset of such a gap in their career or, if no such gap exists, by their final year of publishing activity. Authors who have not experienced a gap relevant for their cohort by 2020, when our observation window ends, are presumed to be active.
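Under this definition, an author’s attrition year can be computed from their publication years and the cohort-specific gap threshold (the 95th percentile of maximum gaps for that cohort). The sketch below is a minimal, illustrative version; the function signature and the threshold values passed in are assumptions, not the authors’ code.

```python
def attrition_year(pub_years, max_gap, window_end=2020):
    """Infer an author's attrition year, or None if presumed active.

    pub_years: the author's publication years (any order, duplicates OK);
    max_gap: cohort-specific 95th-percentile gap length (assumed given);
    window_end: end of the observation window (2020 in the paper).
    """
    years = sorted(set(pub_years))
    # Attrition starts at the onset of the first gap longer than the
    # cohort threshold.
    for a, b in zip(years, years[1:]):
        if b - a > max_gap:
            return a
    # No qualifying gap within the career: if a disqualifying gap could
    # not yet have been observed by the window's end, presume active.
    if window_end - years[-1] <= max_gap:
        return None
    # Otherwise the career ends at the final publication year.
    return years[-1]
```

For instance, with a cohort threshold of 3 years, an author publishing in 2000, 2001, 2008 and 2009 would be assigned attrition year 2001, the onset of the 7-year gap.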
Next, we categorize retracted authors into three groups based on the relationship between their retraction year and their departure from scientific publishing: those who (1) left publishing around the time of retraction (years 0 and −1; Fig. 1a, blue), (2) left after retraction (year 1 onwards; Fig. 1a, pink) or continue to have ongoing careers, and (3) left before retraction (years −2 and earlier; Fig. 1a, grey). A notable trend is revealed, showing that approximately 45.9% of authors who have left their publishing careers do so around the time of retraction (Fig. 1a, note that authors with ongoing publishing careers in 2020 are not included). Specifically, 29% leave in the year of retraction (year 0) and 16.9% depart shortly before (year −1). In addition to exploring this aggregate pattern, we further investigate the probability of authors remaining in scientific publishing across different academic ages (Supplementary Fig. 2), replicating Fig. 1a but disaggregated by academic age. Our analysis reveals that early career authors, specifically those whose retraction falls within 0–3 years from their first publication, are much more likely to leave publishing when experiencing a retraction. Furthermore, we explore these patterns by affiliation rank, author order and retraction reason (Supplementary Figs. 3–5). We find, descriptively, that authors whose papers were retracted due to misconduct and plagiarism are more likely to leave in the retraction window compared with those retracted for a mistake.
a, The percentage of retracted authors who left (blue) versus those who did not (red). b, Comparisons across different author-level characteristics among retracted authors who left scientific publishing at the time of retraction and those who have not, with N = 12,742. c, Similar to b but for paper-level characteristics, with N = 4,267. The boxes extend from the lower to upper quartile values of the data, with a line at the median; whiskers extend to the minimum and maximum values within 1.5 times the interquartile range of the lower and upper quartiles, respectively; means are shown as additional square markers. Normality and homogeneity of variances were tested and not met, thus P values were calculated using a non-parametric two-sided Mann–Whitney U test. Outliers are removed from box plots for presentation purposes. ***P < 0.001. NA, not available.
Further descriptive statistics based on the sample of 12,742 retracted authors who either left publishing during the retraction window or later, or maintained ongoing careers in 2020 (that is, excluding those in grey in Fig. 1a who left well before retraction occurred) reveal that an overwhelming majority belong to STEM fields such as biology, medicine, chemistry and physics, with less than 1% originating from non-STEM fields (Fig. 1b). Additionally, the share of women in the group who leave around the time of retraction is greater than the share of women in the group who leave after or have continued careers (30% versus 25.5%, respectively; chi-squared (1, N = 12,742) of 20.07, P < 0.001). Moreover, our analysis indicates that authors who leave academic publishing around the time of retraction are significantly less experienced—as measured by academic age (Welch’s t(10,465.17) = −82.459, P < 0.001), number of papers (Welch’s t(11,153.30) = −57.529, P < 0.001), number of citations (Welch’s t(10,713.92) = −37.005, P < 0.001) and number of collaborators (Welch’s t(10,724.73) = −30.594, P < 0.001)—in comparison with authors who stay.
Looking at the characteristics of the 4,267 retracted papers authored by the sample above, we created the following two groups: papers with authors who left around the time of retraction (Fig. 1c, blue) and papers with authors who left later or had ongoing publishing careers in 2020 (Fig. 1c, pink). Note that these two groups overlap: 26% of the retracted papers in this sample had authors from both groups. Examining the distributions of these two groups of papers, we find that the majority of them were published in journals rather than conferences and were retracted after 2010 (Fig. 1c). Most of these papers fall within the top quartile in terms of journal ranking (information on journal ranking is unavailable for approximately 31% of the papers). The reasons for retractions vary, with approximately 23% attributed to misconduct, 33% to plagiarism, 24% to mistakes and an additional 20% to other reasons. For further details on establishing author and paper-level characteristics, see ‘Creating author- and paper-level features’ section and for the standardized mean differences and statistical comparisons between the two groups, see Supplementary Table 1.
These descriptive observations suggest that, despite some variability based on author-level characteristics, the careers of authors with retracted publications tend to be cut short, with an author’s exit from publishing often occurring around the time of retraction. To confirm this, we created a comparison group by matching each retracted author with a non-retracted author, as leaving scientific publishing can result from many reasons other than retraction. We matched non-retracted authors exactly on gender, affiliation rank, discipline, number of publications and number of collaborators at the start of their publishing careers (that is, in the first year of publishing). We also ensured that both the retracted author and their match had similar careers up to the time of the publication of their retracted paper. This was achieved by confirming that both authors published a paper in the year the retracted paper was released and that their affiliation rank and discipline were the same at that time. Additionally, we ensured that during that period, the matches were similar in terms of the number of collaborators and publications. This process results in a sample of 2,743 retracted authors with suitable matches. Figure 2a visualizes the difference in publishing career length of the retracted authors and their matched counterparts, confirming that retracted authors do leave publishing earlier. For the standardized mean differences and statistical comparisons between our matched sample and filtered sample, see Supplementary Table 2.
a, Cumulative visualization of the difference in publishing career length between retracted authors with known attrition years and their matched non-retracted counterparts. b, A raincloud plot54 showing the distribution of logged Altmetric score 6 months pre- and post-retraction. The x axis represents monthly windows between the retraction and attention, with 0 being the day of retraction (not displayed), −1 the month right before and 1 the month after. The y axis shows the logged Altmetric score for a paper in the given month. Note that Altmetric scores between 0 and 1 are frequent; for example, one tweet results in a score of 0.25. The boxes extend from the lower to upper quartile values of the data, with a line at the median; whiskers extend to the minimum and maximum values within 1.5 times the interquartile range of the lower and upper quartiles, respectively. The black trend line represents the average logged Altmetric score. The total number of papers being plotted is 6,507. Papers with no attention within the 12-month window are excluded. The comparison across months shows retracted papers receive the most attention within 1 month of retraction. c, The gap between the cumulative proportion of high-attention retracted authors who left publishing and their matched counterparts, shown at different cut-offs of attention as measured by the Altmetric score.
Next, we investigated the relationship between attrition and the amount of attention received by the retraction. In particular, heightened levels of attention may bring authors into the spotlight, potentially influencing how the broader scientific community, including individuals who may not have been previously familiar with their scholarly work, perceives them. To probe this relationship, we used a third dataset, Altmetric, a database of online mentions of publications, containing a record of more than 191 million mentions for over 35 million research outputs that we merge with RW (see ‘Merging RW with Altmetric’ section). We measured attention using the Altmetric score, tracking it in the 6-month period before and after the retraction event as per refs. 30,31 (see ‘Calculating the Altmetric score’ section for the calculation of the Altmetric score). In Fig. 2b, we show the distribution of the logged average attention received by the retracted papers in the filtered sample during this time window, highlighting that attention peaks during the month of retraction. For a breakdown by social media, news media, blogs and knowledge repositories, see Supplementary Fig. 6. We find that, while most retracted papers receive no attention, some gain worldwide publicity. More specifically, 64% of retracted papers receive no attention during their life course, which increases to 75.4% when considering our time window (papers without attention are not shown in Fig. 2b). Furthermore, in Fig. 2c, we display the gap between the proportion of authors who left publishing among the retracted authors and their matches whom we identified when creating the baseline in Fig. 2a. We present this gap at different cut-offs of attention as measured by the Altmetric score. This gap illustrates the possible impact of attention on the likelihood of authors leaving publishing. 
We find that this gap increases with attention in this sample, suggesting that retracted authors leave even earlier when receiving more attention compared with their matched counterparts.
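The gap shown in Fig. 2c can be sketched from matched pairs annotated with the retraction’s Altmetric score and whether each author left publishing. The tuple layout below is a hypothetical encoding for illustration, not the study’s data format.

```python
def attrition_gap(pairs, cutoff):
    """Difference in the share who left publishing, among pairs whose
    retraction drew attention above `cutoff`.

    pairs: iterable of (altmetric_score, retracted_left, match_left),
    where the last two are 1 if the author left publishing, else 0.
    """
    sub = [(r, m) for score, r, m in pairs if score > cutoff]
    if not sub:
        return 0.0
    p_retracted = sum(r for r, _ in sub) / len(sub)
    p_matched = sum(m for _, m in sub) / len(sub)
    return p_retracted - p_matched
```

Evaluating this function at increasing cut-offs traces out the attention gradient: a gap that widens with the cut-off indicates that higher-attention retractions are associated with a larger excess of departures among retracted authors.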
We corroborate the result that retractions are a key factor in attrition from publishing using a Cox proportional hazard model estimated on all retracted authors, including those who have not been matched in the previous analysis. In this model, authors leave their publication career as defined above or are censored in 2020. We include yearly observations for each author from the start of their publishing careers. We control for several time-invariant factors, including gender and cohort, as well as time-varying factors, such as affiliation rank and discipline, which are updated annually based on meta-data from authors’ publications in MAG. Experience is represented as the cumulative counts of publications, citations and collaborators, also calculated from MAG. Importantly, we introduce a binary variable ‘retracted’, which remains zero until the year of retraction and switches to one thereafter. Table 1 presents the results from this model and confirms that experiencing a retraction does indeed precipitate authors’ exit from scientific publishing. We perform several robustness analyses using this model. First, to complement literature such as ref. 32, which underscores distinctions in author contributions to research articles based on authorship order, we show that the association between retraction and leaving publishing is, descriptively, slightly stronger for first and last authors than for all authors (Supplementary Table 3). Second, we include authors with multiple retractions whose retractions cluster in a single year, so that the model’s assumption of treating retraction as a single event is most likely to be met, and find substantively similar results (Supplementary Table 4). Most authors with multiple retractions have them clustered within a single year (Supplementary Fig. 7), that is, the majority of authors with multiple retractions are included here.
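The person-year panel underlying such a model can be sketched as follows: one row per author per year, with a ‘retracted’ indicator that switches from zero to one in the retraction year and stays on thereafter. The field names are illustrative assumptions; the actual model additionally carries the time-varying controls described above.

```python
def person_years(first_year, last_year, retraction_year=None):
    """Build person-year rows for one author with a time-varying
    'retracted' flag (0 before the retraction year, 1 from it on)."""
    rows = []
    for year in range(first_year, last_year + 1):
        rows.append({
            "year": year,
            "retracted": int(retraction_year is not None
                             and year >= retraction_year),
        })
    return rows
```

A survival model fitted on this long-format panel (for example, a Cox model with start/stop intervals) then estimates the hazard of leaving publishing as a function of the flag and the controls.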
While retraction may not necessarily result in an author leaving scientific publishing, it can still impact career progression by affecting an author’s reputation. Therefore, we extended our analysis to examine how the retraction of a scientific publication influences an author’s career, specifically for those who remain in scientific publishing post-retraction. We focused on three main outcomes: (1) the number of collaborators retained post-retraction, (2) the number of new collaborators gained post-retraction and (3) the share of open triads closed (that is, when an author co-authors with someone who previously worked with their co-author23). We performed another matching experiment, similar to the one used to create the baseline, pairing retracted authors who have ongoing post-retraction publishing careers with comparable non-retracted authors who are not part of their collaboration network. We explicitly excluded all collaborators, not just past ones, to eliminate any negative spillover effects that retractions may cause, as studied in ref. 4 (note that this is a different approach to ref. 11). In the matching process, we ensured that each non-retracted author identified as a match shares the following exact characteristics with the retracted author: gender, academic age, affiliation rank at the start of their career and affiliation rank and scientific discipline at the time of retraction. Additionally, matches were similar (that is, within 30% of the values of the retracted author) in terms of the number of publications, citations and collaborators at the time of retraction, with the closest matches selected based on a theoretically calibrated distance function (see ‘Analytical sample for the matching experiment’ section). To focus on post-retraction career trajectories, we ensured that both retracted authors and their matches have published at least one paper post-retraction, and we evaluated outcomes in the 5 years after the retraction.
This approach aligns the careers of retracted authors with their non-retracted matches, avoids survival bias33 and ensures that all authors have the same amount of time to accumulate and retain collaborators. The matching experiment yields 2,348 matched retracted authors, each paired with an average of 1.73 non-retracted authors. For an evaluation of the matched sample, see ‘Analytical sample for the matching experiments’ section, for more details on calculations and binning decisions, refer to Supplementary Note 2 and for the standardized mean differences and statistical comparisons between authors who stay in academic publishing and those matched, see Supplementary Table 5.
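A minimal sketch of the matching criteria: exact agreement on categorical characteristics and agreement within 30% on pre-retraction counts. The field names are assumptions for illustration; the study additionally ranks qualifying candidates with a theoretically calibrated distance function, which is omitted here.

```python
def is_match(retracted, candidate, tol=0.30):
    """Check whether a non-retracted candidate qualifies as a match.

    Both records are dicts with illustrative keys: the `exact` fields
    must agree, and each `count` field must lie within `tol` (30% in
    the main analysis; 20% and 10% in robustness checks) of the
    retracted author's value at the time of retraction.
    """
    exact = ("gender", "academic_age", "affil_rank_start",
             "affil_rank_at_retraction", "discipline")
    counts = ("n_papers", "n_citations", "n_collaborators")
    if any(retracted[k] != candidate[k] for k in exact):
        return False
    for k in counts:
        reference = retracted[k]
        if abs(candidate[k] - reference) > tol * reference:
            return False
    return True
```

Tightening `tol` trades sample size for match quality, which is the trade-off probed in the 20% and 10% robustness checks.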
Figure 3 summarizes our findings from the matching experiment. We compared retracted authors with their non-retracted counterparts by averaging across all closest matches when multiple matches are present. We find that authors who have experienced a retraction in their careers tend to gain a significantly higher number of new collaborators and retain significantly more of their previous collaborators, as illustrated in Fig. 3a,b. These results are based on two-sided Welch’s t-tests to account for unequal variances, and are corroborated using Kolmogorov–Smirnov and Wilcoxon signed-rank tests, as detailed in Supplementary Tables 6–8. We also find that these results largely and consistently hold across various factors, including gender, year of retraction, author order on the retracted paper, reason for retraction (mistake, plagiarism or misconduct), type of retraction (author led versus journal led), attention (high versus low), discipline, affiliation rank and the time between the retracted paper’s publication and retraction. We find substantively similar differences between the retracted authors and their matches when restricting matches to a 20% difference in terms of the number of papers, collaborators and citations before retraction, and the same direction using an even stricter restriction (a 10% difference) (Supplementary Figs. 8 and 9). Furthermore, when looking at only the first and last authors and their matches, and only at retracted authors and matches whose affiliations are in the same country, the results hold (Supplementary Figs. 10 and 11). We do not find credible evidence that retracted and non-retracted authors close a different proportion of open triads with authors who had previously co-authored with their past collaborators (Fig. 3c).
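For reference, the Welch’s t-statistic used throughout these comparisons can be computed directly from two samples without assuming equal variances; the sketch below also returns the Welch–Satterthwaite degrees of freedom (the P value would then come from the t distribution, which is omitted here to keep the example dependency-free).

```python
import math


def welch_t(x, y):
    """Two-sample Welch's t-statistic and approximate degrees of
    freedom (Welch-Satterthwaite), without the equal-variance
    assumption of Student's t-test."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Unbiased sample variances.
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    t = (mx - my) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

In practice a library routine (for example, a two-sided unequal-variance t-test) would be used; the hand-rolled version simply makes the statistic explicit.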
a–c, The difference between the numbers of collaborators retained (a), the numbers of collaborators gained (b) and the proportions of triads closed 5 years post-retraction (c) for the authors (N = 2,348) who were retracted (red circles) and their matched non-retracted pairs (green squares). These are further stratified by gender, year of retraction, academic age, author order, reason of retraction, type of retraction and discipline. Data are presented as mean values. The solid line represents a statistically significant difference using a two-sided Welch’s t-test (assuming unequal variances). Supplementary Tables 6–8 present the 95% confidence intervals and the results from additional non-parametric tests.
It is evident that authors who survive a retraction tend to maintain, on average, a greater number of previous collaborators and gain a higher number of new collaborators. However, it is essential to examine the characteristics of these retained and newly formed relationships among retracted authors in comparison to their matched counterparts. Therefore, in our next analysis (Fig. 4), we focused on three key characteristics: the academic age (seniority), number of papers (productivity) and number of citations (impact) of the collaborators of both retracted authors and their matched non-retracted counterparts. Additionally, we analysed these characteristics across different groups. Specifically, we considered early career (0–3 years of experience at the time of retraction), mid-career (4–9 years of experience) and senior (10 or more years of experience) authors separately. We also examined different reasons for retraction—misconduct, plagiarism and mistake—as well as the level of attention the retraction received (high versus low).
a–c, The comparison based on academic age (a), number of papers (b) and number of citations (c) for retained collaborators of retracted and non-retracted authors of different age groups. d–f, The comparison based on academic age (d), number of papers (e) and number of citations (f) for collaborators gained by retracted and non-retracted authors of different age groups. g–i, The results of the difference-in-difference analysis based on academic age (g), number of papers (h) and number of citations (i) comparing the difference of collaborators retained and collaborators lost for different age groups. j–l, The comparison based on academic age (j), number of papers (k) and number of citations (l) for retained collaborators of retracted and non-retracted authors stratified by reasons of retraction. m–o, The comparison based on academic age (m), number of papers (n) and number of citations (o) for collaborators gained by retracted and non-retracted authors stratified by reasons of retraction. p–r, The results of the difference-in-difference analysis based on academic age (p), number of papers (q) and number of citations (r) comparing the difference of collaborators retained and collaborators lost stratified by reasons of retraction.
Overall, our findings indicate that, although retracted authors have larger collaboration networks, these networks are qualitatively weaker in terms of the seniority and productivity of their retained collaborators. Importantly, these differences do not result from differences in the quality of collaboration networks before retraction. While these characteristics are not ones we matched on, as this would further decrease the sample of retracted authors with suitable matches, we demonstrate that their pre-retraction distributions are the same in terms of seniority and productivity, and closely similar in terms of impact (Supplementary Fig. 12). As for newly gained collaborators, we do not find credible evidence for differences in seniority or productivity, but retracted authors gain more impactful collaborators overall.
The direction of these differences generally persists across career stages, but not all differences are significant at the usual levels. We find that early career retracted authors develop qualitatively different collaboration networks post-retraction compared with their matched counterparts, experiencing a significant loss. Namely, although the collaborators they retain are not less impactful (no credible evidence of a difference), they do retain less senior and less productive collaborators compared with their matched counterparts (Fig. 4a–c and Supplementary Table 9). We also find that even senior authors retain less senior collaborators post-retraction compared with their matched pairs (Fig. 4a). In terms of collaborators gained, we find that senior retracted authors are not affected when it comes to the age of their collaborators (no credible evidence of a difference), but gain significantly more productive and more impactful collaborators (Fig. 4d–f and Supplementary Table 10). All these results are based on two-sided Welch’s t-tests to account for unequal variances.
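The group comparisons above rely on two-sided Welch's t-tests, which do not assume equal variances. As an illustration of the statistic being computed (a minimal sketch in pure Python, not the paper's analysis code), the t value and the Welch–Satterthwaite degrees of freedom can be obtained as follows:

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Two-sample Welch's t statistic and Welch-Satterthwaite
    degrees of freedom (no equal-variance assumption)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se2_a, se2_b = va / na, vb / nb                  # squared standard errors
    t = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df
```

The p value is then obtained from the t distribution with `df` degrees of freedom (for example, via `scipy.stats.t.sf`).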
We observe little heterogeneity in the direction of the differences across retraction reasons (Fig. 4j–o and Supplementary Tables 12 and 13). Authors retracted due to mistakes do not develop qualitatively different collaboration networks compared with their matched counterparts in terms of the seniority, productivity and impact of their retained and new collaborators. This lack of credible evidence for these differences suggests that mistakes might be easier to overcome than misconduct or plagiarism (as in those cases the differences in terms of seniority are statistically significant), keeping sample size constraints in mind. In cases of misconduct, we find statistically significant differences (based on two-sided Welch’s t-tests) in the productivity and impact of new collaborators, indicating that those retracted for misconduct build stronger, not weaker, new networks in this regard compared with their matched pairs. Finally, to complement our previous analysis on attention, we observe that authors whose retracted papers receive a high level of attention (Altmetric score >10) build significantly worse networks in terms of seniority and impact of their retained collaborators. In other words, these retracted authors retain collaborators who are less senior and less impactful compared with their matched counterparts (Supplementary Fig. 13a–f and Supplementary Tables 15 and 16).
We also performed a difference-in-difference analysis to contrast retracted authors and their matches, examining whether the retained collaborators are qualitatively different from those who were lost (Fig. 4g–i,p–r, Supplementary Fig. 13g–i and Supplementary Tables 11, 14 and 17). This analysis helps determine whether, for example, although retracted authors retain more collaborators, those retained are less senior, less productive and less impactful compared with those they lose, relative to their matched counterparts. It is important to note that the initial distributions of all pre-retraction collaborators are identical or very similar (Supplementary Fig. 12). Authors who did not retain or lose any collaborators are excluded from this analysis, as meaningful comparisons cannot be made. For each retracted and matched author, we calculated the average of the variable of interest for both retained and lost collaborators and then computed the difference by subtracting the average for lost collaborators from the average for retained collaborators. Positive differences indicate that retained collaborators were more senior, more productive and more impactful than the ones lost. Overall, this analysis reveals a difference in terms of impact: retracted authors experience a greater relative loss in the impact of their collaborators compared with their matched counterparts. No credible evidence is found for the difference in differences in terms of seniority and productivity (although the differences are in the same direction as impact). The direction of the difference in differences we observe is consistent across career stages as well as for retraction reasons. However, as previously discussed, these differences are generally not statistically significant. Finally, as a robustness check, we repeated the analysis focusing only on retracted authors who were first or last authors.
Although most differences lost statistical significance with the reduced sample size, their direction remained largely consistent (Supplementary Fig. 14).
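The per-author quantities entering this difference-in-difference comparison can be sketched as follows; the function and variable names are ours, not the paper's:

```python
from statistics import mean

def retained_minus_lost(retained_values, lost_values):
    """Average of a collaborator attribute (e.g. academic age) over an
    author's retained collaborators minus the average over lost ones.
    Authors with no retained or no lost collaborators are excluded."""
    if not retained_values or not lost_values:
        return None  # no meaningful comparison possible
    return mean(retained_values) - mean(lost_values)

def did(retracted_diff, matched_diff):
    """Difference in differences: retracted author vs matched author."""
    return retracted_diff - matched_diff
```

A positive `retained_minus_lost` value means retained collaborators score higher on the attribute than lost ones; a negative `did` value means the retracted author fared worse than their match in this respect.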
In addition to examining these qualitative differences, we explored whether retracted authors develop collaboration networks with collaborators who are more physically distant post-retraction compared with pre-retraction, and how this compares with their matched counterparts. We also analysed whether this shift is evident in the fields they publish in post-retraction compared with pre-retraction (see Supplementary Note 3 for details on the distance calculation). Our findings, as illustrated in Supplementary Fig. 16, indicate that both groups expand their collaboration networks with more distant collaborators over time. This trend is probably influenced by authors and their collaborators moving between institutions. Specifically, we observe that, on average, post-retraction collaborators of retracted authors are approximately 30 km more distant compared with those of their matches. However, we do not consider this difference to be particularly meaningful given the global scale of science production. In terms of disciplinary focus, our analysis reveals that both retracted authors and their matched counterparts undergo a divergence in the fields they publish in over time. Interestingly, this shift is strikingly similar for both retracted and non-retracted authors, suggesting that retracted authors do not move from their original fields in a manner different from their non-retracted counterparts post-retraction.
In summary, the career impact of retractions may not be fully understood without considering attrition, that is, leaving scientific careers where publishing is integral to success. We find that experiencing a retraction is associated with an earlier exit from publishing. However, the results of the matching experiment suggest that retracted authors who continue to publish do not suffer a reduction in the size of their collaboration networks. On average, these authors actually build larger collaboration networks compared with their counterparts by retaining more collaborators and establishing more new collaborations. This increase in size, however, is accompanied by qualitative differences: retracted authors tend to retain collaborators who are less senior, less productive and less impactful, which is balanced by gaining more impactful collaborators. We do not find significant and consistent heterogeneity by career stage and reason for retraction, though the relatively smaller and statistically insignificant difference in cases of mistakes may suggest that mistakes are easier to overcome compared with misconduct or plagiarism.
Discussion
Retractions can have significant consequences for authors’ careers, leading to their departure from scientific publishing. In this study, we conducted an analysis utilizing data from RW, MAG and Altmetric, identifying and examining an analytical sample of around 4,500 retracted papers involving over 14,500 authors. Our findings reveal that (1) around 45.9% of authors left their publishing careers around the time of retraction, (2) authors who left exhibited shorter pre-retraction careers and had fewer citations, collaborators and publications compared with those who stayed, (3) higher attention precipitated retracted authors’ exit compared with their similar counterparts, (4) retracted authors who stayed post-retraction formed larger collaboration networks, retaining more collaborators and gaining more new ones, and (5) overall they built qualitatively weaker collaboration networks in terms of their co-authors’ seniority, and the productivity of those they retained, but gained more impactful new collaborators compared with their similar counterparts.
It is important to acknowledge that our study is not without limitations. First, online attention, as measured by the Altmetric score, captures the volume rather than the quality of attention and lacks a nuanced description of its specific sources beyond categories of platforms. Additionally, the score fails to reveal how relevant the coverage may be for retracted authors, which, if explored, could yield further insights into our findings. Moreover, retained and new collaborators may be qualitatively different beyond the aspects we explore. Second, our matching analysis reveals that retracted authors matched to similar non-retracted counterparts were, on average, more junior compared with the average retracted author, both when constructing matches for the baseline and for retracted authors who published after retraction. Therefore, it is possible that our estimates represent a lower bound of a difference, assuming that more senior authors possess greater resources to further develop their networks. Conversely, they may also represent an upper bound, assuming that more established authors receive less benefit of the doubt when assessing their culpability compared with their early career counterparts. Third, throughout we document average effects, which might mask tremendous author-level heterogeneity that we are unable to explore, including their workplace (academic and/or research institution or industry) and other factors, for example, their teaching load and service obligations. Fourth, in addition to these characteristics, retracted authors also vary in their awareness of the issues with their paper that led to the retraction and in the actions that colleagues and mentors might have taken to shield (or chastise) retracted authors. Future work may incorporate additional author-level detail that allows authors on retracted papers to be treated differently depending on their involvement in the reason for retraction.
Some prior work has leveraged retraction notices in smaller samples in this way, revealing such heterogeneity most of the time in cases of misconduct, but not in cases of mistakes10. Fifth, while we incorporated some authors with multiple retractions in robustness analyses, our modelling assumptions for this study treated retraction as a single event. Future work should better integrate multiple retractions to determine whether they have compounding linear or nonlinear effects.
Some considerations fall outside of the scope of the paper. For instance, self-retraction might signal integrity, which could be a factor contributing to why some retracted authors develop larger collaboration networks. It is possible that these authors become more cautious in their future endeavours to avoid a second retraction, making them desirable collaborators. It is also plausible that they change how agentically they search for new collaborators and cultivate already existing relationships to compensate for the assumed (and empirically documented) negative impact of retractions. The role of the scientific community is similarly underexplored. Specifically, some retracted authors might receive support from their colleagues, presumably in cases when their papers are retracted for mistakes, or instances when retractions are viewed as ‘honourable’ or the ‘right thing to do’. Bringing these authors on as collaborators may be one way to show them support. These assertions, we hope, would form the starting point of future work.
Our study serves as an initial step in documenting how important institutions of science, such as retractions that serve a key role in policing the content of the canon, impact the careers of scientists. Future research should complement our work by exploring how authors navigate retractions and the micro-mechanisms underlying the strategies employed by retracted and non-retracted authors when seeking collaborators. It would also be valuable to investigate whether retracted and non-retracted scientists are sought after for similar opportunities, particularly when retracted authors’ work was not retracted due to misconduct. Furthermore, the role of online attention in these matters also deserves further exploration, as it becomes intertwined with the names of authors whose work is discussed, extending beyond the scientific content of papers and encompassing a broader set of issues.
Methods
Data sources
The analyses presented in this paper rely on three datasets:
(1) RW26 is the largest publicly available dataset of retracted articles, obtained on 18 May 2021. At the time, it contained 26,504 retracted papers published in 5,844 journals and conferences. The earliest publication record dates back to the year 1753, whereas the latest record is from 2021. The dataset consists of articles classified by a combination of 104 reasons for retraction.
(2) MAG34 is one of the largest datasets of scientific publication records. We collected this dataset on 30 July 2021. At the time, it contained approximately 263 million publications, authored by approximately 271.5 million authors, with the earliest publication record in 1800.
(3) Altmetric35 is a database of online mentions of publications. It contains a record of more than 191 million mentions for over 35 million research outputs. It uses unique identifiers (for example, Digital Object Identifier or DOI, and PubMedID) to match attention to research across several social media platforms, blogs, news sites and knowledge repositories.
Merging RW and MAG
To merge MAG and RW, we used a two-step approach. The merge took place after filtering RW to eliminate bulk retractions as well as retractions before 1990 and after 2015. As both MAG and RW provide the DOI of each publication record, we started with these identifiers, since a DOI is a persistent identifier unique to each document on the web. However, as not all records in MAG and RW have a DOI, we also identified papers in RW and MAG with the exact same title. Out of 6,704 papers in RW after filtering, we merged 2,646 papers to MAG. To increase the size of our dataset for analysis, we merged the remaining publication titles in RW using fuzzymatcher36, which employs probabilistic record linkage37 to find similar titles based on Levenshtein distance. We validated the robustness of our fuzzy matching by randomly sampling 100 retracted papers and manually checking the accuracy of the merge. Out of the 100 sampled papers, 99 in RW were linked to the correct entry in MAG. As a result of this second step, we merged an additional 3,542 retractions, resulting in a total of 6,188 (92%) retracted papers in RW linked to their corresponding entries in MAG. Last, we filtered out papers whose authors had multiple retractions and records with missing fields (Supplementary Note 1). This resulted in the final ‘filtered’ sample.
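As a rough illustration of the similarity measure underlying the fuzzy merge (fuzzymatcher's actual scoring combines probabilistic record-linkage weights across multiple fields, so this sketch covers only the Levenshtein component):

```python
def levenshtein(s, t):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (cs != ct)))   # substitution
        prev = curr
    return prev[-1]

def title_similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical titles
    after lowercasing and trimming whitespace."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b)) or 1
    return 1 - levenshtein(a, b) / longest
```

Candidate pairs above a chosen similarity threshold would then be accepted as matches; the threshold itself is a tuning choice not specified here.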
Creating author- and paper-level features
We used RW and MAG to create features for authors and papers. Here we discuss the features that required additional data collection and calculations, such as gender, scientific discipline, type of retraction and venue ranking.
Gender
To identify the perceived gender of authors, we used Genderize.io to map author first names to gender. Genderize.io returns a probability indicating the certainty of the assigned gender. We excluded all authors whose gender could not be identified with >0.5 probability. We validated the name-based gender identified by Genderize.io by comparing the agreement (or concordance) of its labels against another classifier, Ethnea. Ethnea is a name-based gender and ethnicity classifier specifically designed for bibliographic records. We compared the labels of 31,907 authors in RW for whom ‘male’ or ‘female’ labels were available using both Genderize.io and Ethnea. We found that the assignment of these labels agreed for 31,028 (that is, 97%) retracted authors, with a Cohen’s κ score of 0.93, indicating an almost perfect level of agreement38. Our approach is in line with prior research that uses similar name-based gender classifiers39,40,41,42,43,44; however, automated classifiers, such as the ones used here, have significant shortcomings. They do not rely on self-identification and therefore could misgender authors. Annotations are performed on the basis of historical name–gender associations to assign male or female to an author, recognizing that there are expansive identities beyond this limiting binary that our approach cannot explore.
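For reference, the Cohen's κ agreement statistic used above can be computed from two classifiers' labels as follows (a minimal sketch; the labels in the example are invented):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items:
    observed agreement corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

Values above roughly 0.8 are conventionally read as almost perfect agreement38, which is how the 0.93 reported above is interpreted.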
Scientific discipline
To assign a scientific discipline to every author, we utilized the fields in MAG that span more than 520,000 hierarchically structured fields. For every paper p and field f, MAG specifies a confidence score, denoted by score(p, f) ∈ [0, 1], which indicates the level of confidence that p is associated with f. The aforementioned hierarchy contains 19 top-level fields, which we refer to as ‘disciplines’. These fields are art, biology, business, chemistry, computer science, economics, engineering, environmental science, geography, geology, history, materials science, mathematics, medicine, philosophy, physics, political science, psychology and sociology. Almost every field, f, has at least one ancestor that is a discipline. Let D(f) denote the set of all disciplines that are ancestors of f. For any given paper, p, the set of disciplines associated with p is denoted by D(p), and is computed as

$$D(p) = \bigcup_{f \,:\, \mathrm{score}(p,\, f) > 0} D(f).$$
For any given author, a, let P(a) denote the set of papers authored by a. We compute the discipline(s) of a as

$$\mathrm{discipline}(a) = \operatorname{arg\,max}_{d}\, \big|\{\, p \in P(a) : d \in D(p) \,\}\big|,$$
where $|\cdot|$ denotes the set cardinality operator. In other words, a is associated with the most frequent discipline(s) among all papers authored by a up to and including the retraction year.
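A minimal sketch of this most-frequent-discipline rule, assuming the per-paper discipline sets D(p) have already been computed from the field hierarchy:

```python
from collections import Counter

def author_disciplines(papers_disciplines):
    """Given the discipline sets D(p) for an author's papers (up to and
    including the retraction year), return the most frequent
    discipline(s) -- ties yield multiple disciplines."""
    counts = Counter(d for dp in papers_disciplines for d in set(dp))
    top = max(counts.values())
    return {d for d, c in counts.items() if c == top}
```

Note that an author can legitimately carry more than one discipline when the counts tie, matching the "discipline(s)" wording above.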
Retraction reasons
To identify the reason for retraction, we manually extracted the retraction notes of 1,250 retracted papers. The reasons for retraction can be classified into four broad categories: (1) misconduct, (2) plagiarism (note that some prior research considers plagiarism as misconduct, for example, ref. 45), (3) mistake and (4) other. Every retraction note was evaluated by multiple annotators. We started with two annotators and assigned additional annotators, up to five, until a majority reason was reached. If a majority reason was not reached, the reason was classified as ‘ambiguous’. Finally, if no reason was provided for the retraction, then it was classified as ‘unknown’. The final distribution of the reasons for retraction of the annotated papers was 251 (20%) misconduct, 311 (25%) plagiarism, 347 (28%) mistake, 170 (14%) other, 121 (10%) unknown and 50 (4%) ambiguous. Note that the reason ‘plagiarism’ includes plagiarising the work of others as well as prior work by the authors. On the basis of a random sample of 100 retraction notes, 50 referred to taking someone else’s work without proper reference, 30 referred to lacking citations or quotes from the authors’ own work and 20 did not include information about whose work had been plagiarized. Therefore, 30–50% of this category is self-plagiarism.
We used the manually annotated papers to automatically code the reasons for retraction for the rest of the papers in RW using a label propagation algorithm. There are 104 unique reasons for retraction provided in RW. Each retracted paper is associated with one or more of these reasons. We mapped each of the 104 reasons to one of the coarser classes of plagiarism, misconduct, mistake and other. Then, we used this mapping to annotate the rest of the papers without labels using the majority class (see Supplementary Fig. 17 for a more detailed visualization of the label propagation algorithm). The final distribution of the reasons for retraction after label propagation for the filtered sample was as follows: 1,106 (24%) misconduct, 1,513 (33%) plagiarism, 1,078 (24%) mistake, 498 (11%) unknown, 334 (7%) other and 49 (1%) ambiguous. These numbers are comparable to those reported about a decade ago in a sample of papers indexed in PubMed46. In our analysis, we merged the other, unknown and ambiguous categories into a single ‘other’ category.
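The majority-class step can be sketched as follows; the fine-to-coarse mapping entries shown are hypothetical stand-ins for the actual 104-reason mapping:

```python
from collections import Counter

# Hypothetical fragment of the fine-grained RW reason -> coarse class mapping
COARSE = {
    "Falsification of Data": "misconduct",
    "Duplication of Image": "misconduct",
    "Plagiarism of Article": "plagiarism",
    "Error in Analyses": "mistake",
}

def propagate_label(fine_reasons):
    """Assign the majority coarse class among a paper's RW reasons;
    ties become 'ambiguous', unmapped/empty become 'unknown'."""
    votes = Counter(COARSE[r] for r in fine_reasons if r in COARSE)
    if not votes:
        return "unknown"
    (top, top_count), *rest = votes.most_common()
    if rest and rest[0][1] == top_count:
        return "ambiguous"
    return top
```

This mirrors the majority rule used for the manual annotations, applied mechanically through the reason-to-class mapping.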
Type of retraction
Using the manually extracted retraction notes, we also identified whether the retraction was author led or journal led. The breakdown of the different types of retractions is as follows: 604 (48%) author led, 499 (40%) journal led, 119 (10%) unknown and 28 (2%) ambiguous. These data are only available for the manually annotated papers.
Journal and conference ranking
To identify the ranking for the venue (journal or conference) of retracted papers, we utilized the database of SCImago journal rankings (SJR)47. For a given journal or conference and year, the SJR score is computed as the average number of weighted citations received by the articles published in the venue during the past 3 years48. Based on this score, for each subject area, a quartile is also assigned to each journal. SJR provides rankings from 1999 to 2020. We used the year of publication of the retracted article to identify the SJR score and quartile ranking of the venue. Out of the 4,578 papers and 14,579 authors in the filtered sample, we were able to identify the rankings for 3,129 (68.3%) papers and 10,379 (71.2%) authors. Note that papers published before 1999 do not have this information, nor do papers whose venues were not featured in SJR.
Merging RW and Altmetric
For each paper in RW, we used the associated DOI or the PubMedID to query the Altmetric application programming interface (API). Out of the 26,504 retracted papers in RW, we were able to identify 11,265 (42.5%) papers with an online presence based on their unique identifiers. There were 15,239 papers for which an Altmetric entry could not be located; however, these papers and their respective authors are also part of our analysis.
Calculating the Altmetric score
The Altmetric score is a weighted count of the attention a research output receives from different online sources (for example, Twitter (now X), news and so on). The Altmetric API, however, only provides the current cumulative Altmetric score for a given record and does not give the breakdown or a customized score for a given time window. Since we focused on the 6 months before and after retraction to isolate the attention that the retraction probably garnered, we computed this score using the methodology detailed by Altmetric on their webpage49. While the algorithm Altmetric uses to compute its score is proprietary50, the description allows us to closely estimate it. We computed the Spearman correlation of the available cumulative Altmetric score against our computed score for the complete time window using our methodology. For the 11,265 retracted papers for which an Altmetric record (and score) is available, the Spearman correlation is high (ρ = 0.96). We computed the Altmetric score for both the retracted paper and its respective retraction note (most retraction notes and papers have separate DOIs) and aggregated these by taking their sum. We assigned papers that are not indexed in Altmetric an attention score of zero. We validated this choice by randomly sampling 100 of these papers and manually searching for mentions of them on Google and on Twitter using the title and the DOI of the paper. Out of these 100, 96 had not received any attention. The remaining papers garnered only one mention on average. For this reason, we treated the papers that do not have mentions on Altmetric as having an attention score of zero.
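A minimal sketch of such a weighted, time-windowed attention count follows; the source weights shown follow Altmetric's public description of its default weights, while the mention representation (a list of source/timestamp pairs) is our assumption:

```python
# Illustrative subset of Altmetric's published default source weights
WEIGHTS = {"news": 8, "blog": 5, "twitter": 1, "facebook": 0.25}

def attention_score(mentions, start, end):
    """Weighted count of mentions whose timestamp falls inside
    [start, end]; timestamps here are plain integers (e.g. days
    relative to retraction). Unknown sources contribute zero."""
    return sum(WEIGHTS.get(source, 0)
               for source, when in mentions
               if start <= when <= end)
```

In the paper's setting, the window would cover the 6 months before and after retraction, and the scores of the paper and its retraction note would be summed.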
Analytical sample for the matching experiments
In this Article, we report the results of two matching experiments. First, we created a baseline of attrition for authors comparable to retracted authors. Second, we created a comparable set of authors to retracted authors with post-retraction careers. The procedures we employed in both cases are similar, except that in the second case we created matched sets using more confounders. For this reason, and to reduce redundancy, we describe our process in detail only for the creation of this second sample.
We used our filtered sample of 4,578 papers and 14,579 authors to generate the analytical sample for the matching experiment. Of the 14,579 authors, we found suitable matches for 2,348 authors (16.1%). These matches were established on the basis of a three-step process. First, we performed exact matching on gender, academic age and affiliation rank at the start of the career, as well as affiliation rank and scientific discipline at the time of retraction. Second, we picked a threshold (30%) within which we accepted matches on the remaining characteristics: number of papers, number of collaborators and number of citations pre-retraction; that is, these characteristics of the match ought to be within 30% of the same for the retracted author. Third, we used a calibrated distance function to achieve balance on the three latter characteristics, giving more weight to the number of pre-retraction collaborators, our most important confounder. Specifically, we identified the closest match for each author using the lowest-weighted Euclidean distance that minimizes the standardized mean difference for the number of papers, the number of citations and the number of collaborators between the author and the match, calculated over the set of potential matches identified in the second step. We repeated these steps using a 20% and a 10% threshold, and present robustness analyses with this threshold in Supplementary Figs. 8 and 9.
For each author a, let $p_a$, $c_a$ and $o_a$ denote the standardized number of papers, number of citations and number of collaborators of a by the year of retraction. We choose the closest match m for each author by minimizing the following distance function

$$d(a, m) = \sqrt{\, w_{\mathrm{papers}}\,(p_a - p_m)^2 + w_{\mathrm{citations}}\,(c_a - c_m)^2 + w_{\mathrm{collaborators}}\,(o_a - o_m)^2 \,},$$
where $w_{\mathrm{papers}} = 0.1$, $w_{\mathrm{citations}} = 0.1$ and $w_{\mathrm{collaborators}} = 0.8$ denote the weights, determined empirically by minimizing the standardized mean differences.
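A sketch of this third matching step, choosing the candidate that minimizes the weighted Euclidean distance over the three standardized features (the square-root form is our reading of "weighted Euclidean distance", and candidate filtering from the first two steps is assumed to have happened already):

```python
# Empirically determined weights from the text; collaborators dominate
WEIGHTS = {"papers": 0.1, "citations": 0.1, "collaborators": 0.8}

def distance(author, candidate):
    """Weighted Euclidean distance over standardized features."""
    return sum(w * (author[k] - candidate[k]) ** 2
               for k, w in WEIGHTS.items()) ** 0.5

def closest_match(author, candidates):
    """Pick the candidate minimizing the weighted distance."""
    return min(candidates, key=lambda c: distance(author, c))
```

Because the collaborator weight is 0.8, a candidate who differs mainly in papers or citations will beat one who differs in collaborator count, reflecting that pre-retraction collaborators are the most important confounder.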
If the collaboration year of a collaborator was missing, we could not place them on the author’s career timeline. In other words, we could not identify whether that collaboration occurred pre- or post-retraction. In such cases, we removed the author and their corresponding matches from our analysis altogether. We carried out our analysis on retracted authors who authored at least one paper in the 5-year window following their retraction year. All matched authors met the same criteria.
The ‘matched sample’ of authors who stayed in scientific publishing differs from the filtered sample in the following important ways: the matched sample is younger, has fewer papers, fewer citations and fewer collaborators on average, is slightly more likely to consist of middle authors, and is more likely to be affiliated with institutions ranked 101–1,000 (Supplementary Fig. 5). In sum, the matched authors are lower status, on average, compared with the non-matched filtered sample. These differences are essential to consider when evaluating our inferences. We calculated standardized mean differences51 for our matched sample and found that these values are 0.036, 0.017 and 0.053, respectively, for the number of papers, citations and collaborators. For the rest of the characteristics, the standardized mean differences are 0, as the matches are exact. These statistics give us confidence that retracted authors are matched to non-retracted authors with similar career trajectories up to the time of retraction.
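The standardized mean difference used to assess balance can be computed as below; we assume the common pooled-standard-deviation formulation, as the exact variant is not specified in the text:

```python
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """Absolute mean difference scaled by the pooled standard
    deviation; values near zero indicate good covariate balance."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return abs(mean(treated) - mean(control)) / pooled_sd
```

Values below roughly 0.1 are conventionally taken to indicate adequate balance, which is how the reported 0.036, 0.017 and 0.053 are interpreted.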
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The MAG dataset can be downloaded from https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/. The RW database can be accessed from https://retractionwatch.com/. Access to the Altmetric API can be requested from https://www.altmetric.com/solutions/altmetric-api/. The processed data necessary to reproduce main plots and statistical analyses are freely available via GitHub at https://github.com/samemon/retraction_effects_on_academic_careers.
Code availability
The code necessary to reproduce the main plots and tables is available for download via GitHub at https://github.com/samemon/retraction_effects_on_academic_careers.
References
Lynn, F. B., Podolny, J. M. & Tao, L. A sociological (de)construction of the relationship between status and quality. Am. J. Sociol. 115, 755–804 (2009).
Azoulay, P., Bonatti, A. & Krieger, J. L. The career effects of scandal: evidence from scientific retractions. Res. Policy 46, 1552–1569 (2017).
Mistry, V., Grey, A. & Bolland, M. J. Publication rates after the first retraction for biomedical researchers with multiple retracted publications. Account. Res. 26, 277–287 (2019).
Jin, G. Z., Jones, B., Lu, S. F. & Uzzi, B. The reverse Matthew effect: consequences of retraction in scientific teams. Rev. Econ. Stat. 101, 492–506 (2019).
Hussinger, K. & Pellens, M. Guilt by association: how scientific misconduct harms prior collaborators. Res. Policy 48, 516–530 (2019).
Petersen, A. M. et al. Reputation and impact in academic careers. Proc. Natl Acad. Sci. USA 111, 15316–15321 (2014).
Aksnes, D. W. & Rip, A. Researchers’ perceptions of citations. Res. Policy 38, 895–905 (2009).
Vuong, Q.-H., La, V.-P., Ho, M.-T., Vuong, T.-T. & Ho, M.-T. Characteristics of retracted articles based on retraction data from online sources through February 2019. Sci. Ed. 7, 34–44 (2020).
Bishop, D. V. M. Fallibility in science: responding to errors in the work of oneself and others. Adv. Methods Pract. Psychol. Sci. 1, 432–438 (2018).
Mongeon, P. & Larivière, V. Costly collaborations: the impact of scientific fraud on co-authors’ careers. J. Assoc. Inf. Sci. Technol. 67, 535–542 (2016).
Sharma, K. & Mukherjee, S. The ripple effect of retraction on an author’s collaboration network. J. Comput. Soc. Sci. 7, 1519–1531 (2024).
Lee, S. & Bozeman, B. The impact of research collaboration on scientific productivity. Soc. Stud. Sci. 35, 673–702 (2005).
He, Z.-L., Geng, X.-S. & Campbell-Hunt, C. Research collaboration and research output: a longitudinal study of 65 biomedical scientists in a New Zealand university. Res. Policy 38, 306–317 (2009).
Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039 (2007).
LaCour, M. J. & Green, D. P. When contact changes minds: an experiment on transmission of support for gay equality. Science 346, 1366–1369 (2014).
Konnikova, M. How a gay marriage study went wrong. The New Yorker (22 May 2015); https://www.newyorker.com/science/maria-konnikova/how-a-gay-marriage-study-went-wrong
Obokata, H. et al. Stimulus-triggered fate conversion of somatic cells into pluripotency. Nature 505, 641–647 (2014).
McNeill, D. Academic scandal shakes Japan. New York Times (6 July 2014); https://www.nytimes.com/2014/07/07/world/asia/academic-scandal-shakes-japan.html
Coleman, J. Social capital in the creation of human capital. Am. J. Sociol. 94, S95–S120 (1988).
Beaver, D. D. Reflections on scientific collaboration (and its study): past, present, and future. Scientometrics 52, 365–377 (2001).
Bosquet, C. & Combes, P.-P. Are academics who publish more also more cited? Individual determinants of publication and citation records. Scientometrics 97, 831–857 (2013).
Ductor, L., Fafchamps, M., Goyal, S. & van der Leij, M. J. Social networks and research output. Rev. Econ. Stat. 96, 936–948 (2014).
Shi, F., Foster, J. G. & Evans, J. A. Weaving the fabric of science: dynamic network models of science’s unfolding structure. Soc. Netw. 43, 73–85 (2015).
Kim, J. & Diesner, J. Over-time measurement of triadic closure in coauthorship networks. Soc. Netw. Anal. Min. 7, 9 (2017).
Zhelyazkov, P. I. Interactions and interests: collaboration outcomes, competitive concerns, and the limits to triadic closure. Admin. Sci. Q. 63, 210–247 (2018).
The retraction watch database. The Center for Scientific Integrity http://retractiondatabase.org/ (2023).
Sinha, A. et al. An overview of Microsoft Academic Service (MAS) and applications. In Proc. 24th International Conference on the World Wide Web (WWW '15 Companion) 243–246 (Association for Computing Machinery, 2015).
Wang, K. et al. A review of Microsoft Academic services for science of science studies. Front. Big Data 2, 45 (2019).
McCook, A. One publisher, more than 7000 retractions. Science 362, 393 (2018).
Peng, H., Romero, D. M. & Horvát, E.-Á. Dynamics of cross-platform attention to retracted papers. Proc. Natl Acad. Sci. USA 119, e2119086119 (2022).
Abhari, R. et al. Twitter engagement with retracted articles: who, when, and how? (v2). Preprint at https://arxiv.org/abs/2203.04228 (2022).
Lissoni, F., Montobbio, F. & Zirulia, L. Inventorship and authorship as attribution rights: an enquiry into the economics of scientific credit. J. Econ. Behav. Organ. 95, 49–69 (2013).
Brown, S. J., Goetzmann, W., Ibbotson, R. G. & Ross, S. A. Survivorship bias in performance studies. Rev. Financ. Stud. 5, 553–580 (1992).
Wang, K. et al. Microsoft Academic Graph: when experts are not enough. Quant. Sci. Stud. 1, 396–413 (2020).
Adie, E. & Roe, W. Altmetric: enriching scholarly content with article-level discussion and metrics. Learn. Publ. 26, 11–17 (2013).
Linacre, R. fuzzymatcher. GitHub https://github.com/RobinL/fuzzymatcher (2025).
Sayers, A., Ben-Shlomo, Y., Blom, A. W. & Steele, F. Probabilistic record linkage. Int. J. Epidemiol. 45, 954–964 (2016).
McHugh, M. L. Interrater reliability: the kappa statistic. Biochem. Med. 22, 276–282 (2012).
AlShebli, B. K., Rahwan, T. & Woon, W. L. The preeminence of ethnic diversity in scientific collaboration. Nat. Commun. 9, 5163 (2018).
Lee, E. et al. Homophily and minority-group size explain perception biases in social networks. Nat. Hum. Behav. 3, 1078–1087 (2019).
Squazzoni, F. et al. Peer review and gender bias: a study on 145 scholarly journals. Sci. Adv. 7, eabd0299 (2021).
Morgan, A. C. et al. The unequal impact of parenthood in academia. Sci. Adv. 7, eabd1996 (2021).
Liu, F. et al. Gender inequality and self-publication are common among academic editors. Nat. Hum. Behav. https://doi.org/10.1038/s41562-022-01498-1 (2023).
Peng, H., Teplitskiy, M. & Jurgens, D. Author mentions in science news reveal widespread disparities across name-inferred ethnicities. Quant. Sci. Stud. https://doi.org/10.1162/qss_a_00297 (2024).
Gaudino, M. et al. Trends and characteristics of retracted articles in the biomedical literature, 1971 to 2020. JAMA Intern. Med. 181, 1118–1121 (2021).
Fang, F. C., Steen, R. G. & Casadevall, A. Misconduct accounts for the majority of retracted scientific publications. Proc. Natl Acad. Sci. USA 109, 17028–17033 (2012).
Scimago journal rank. SJR https://www.scimagojr.com/journalrank.php (2025).
Guerrero-Bote, V. P. & Moya-Anegón, F. A further step forward in measuring journals’ scientific prestige: the SJR2 indicator. J. Informetr. 6, 674–688 (2012).
How is the altmetric attention score calculated? Altmetric https://help.altmetric.com/support/solutions/articles/6000233311-how-is-the-altmetric-attention-score-calculated (2025).
Trueger, N. S. et al. The altmetric score: a new measure for article-level dissemination and impact. Ann. Emerg. Med. 66, 549–553 (2015).
Flury, B. K. & Riedwyl, H. Standard distance in univariate and multivariate analysis. Am. Stat. 40, 249–251 (1986).
Oransky, I. & Marcus, A. Late resveratrol researcher Dipak Das up to 20 retractions. Retraction Watch https://retractionwatch.com/ (2012).
Cox, D. Regression models and life-tables. J. R. Stat. Soc. B 34, 187–220 (1972).
Allen, M. et al. Raincloud plots: a multi-platform tool for robust data visualization. Wellcome Open Res. 4, 63 (2019).
Acknowledgements
We gratefully acknowledge support and resources from the High Performance Computing Center at New York University Abu Dhabi. We thank I. Oransky and A. Marcus, co-founders of Retraction Watch (RW)52, as well as The Center for Scientific Integrity, the parent organization of RW, for diligently maintaining a curated list of scientific retractions and making it freely available to researchers. We also thank Altmetric.com and Microsoft Academic Graph for providing the data used in this study. We thank S. Adhikari, A. Chae, A. Dutt, R. Kall, D. Khan, R. Kukreja, R. Malhotra and Z. Shahzad for finding and annotating the retraction notices. We thank P. Bearman, S. Goyal, B. Lee, F. (M.) Liu, M. Park, N. Weber and the participants of the Workshop on the Frontiers of Network Science 2023 for thoughtful comments and suggestions.
Author information
Authors and Affiliations
Contributions
S.A.M., K.M. and B.A. conceived and designed the study. S.A.M., K.M. and B.A. were responsible for the methodology. S.A.M. and B.A. collected and processed the data. S.A.M., K.M. and B.A. conducted the analysis. S.A.M., K.M. and B.A. did the visualization. K.M. and B.A. supervised the study. S.A.M., K.M. and B.A. wrote the methods section. S.A.M., K.M. and B.A. created the Supplementary Information. K.M. and B.A. wrote the main text of the paper.
Corresponding authors
Ethics declarations
Competing interests
B.A. and K.M. acknowledge that this study was inspired by a personal experience: the retraction of one of their own papers, which motivated the central research question of this work, namely how retractions influence the careers of authors of scientific papers. S.A.M. declares no competing interests.
Peer review
Peer review information
Nature Human Behaviour thanks Ho Fai Chan, Rachel Hall, Wu Youyou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–17, Tables 1–17 and Notes 1–3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Memon, S.A., Makovi, K. & AlShebli, B. Characterizing the effect of retractions on publishing careers. Nat Hum Behav 9, 1134–1146 (2025). https://doi.org/10.1038/s41562-025-02154-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41562-025-02154-0