Original Article
Open Access

‘As of my last knowledge update’: How is content generated by ChatGPT infiltrating scientific papers published in premier journals?

Artur Strzelecki (Corresponding Author)

Department of Informatics, University of Economics in Katowice, Katowice, Poland
First published: 24 December 2024

Abstract

The aim of this paper is to highlight the situation whereby content generated by the large language model ChatGPT is appearing in peer-reviewed papers in journals by recognized publishers. The paper demonstrates how to identify sections that indicate that a text fragment was generated, that is, entirely created, by ChatGPT. To prepare an illustrative compilation of papers that appear in journals indexed in the Web of Science and Scopus databases and possessing Impact Factor and CiteScore indicators, the SPAR4SLR method was used, which is mainly applied in systematic literature reviews. Three main findings are presented: (1) in highly regarded premier journals, articles appear that bear the hallmarks of content generated by AI large language models, whose use was not declared by the authors; (2) many of these identified papers are already receiving citations from other scientific works, also placed in journals found in scientific databases; and (3) most of the identified papers belong to the disciplines of medicine and computer science, but there are also articles belonging to disciplines such as environmental science, engineering, sociology, education, economics and management. This paper aims to continue and add to the recently initiated discussion on the use of large language models like ChatGPT in the creation of scholarly works.

Key points

  • Articles in high-impact journals may include content generated by AI without this use having been declared.
  • Many papers with AI-generated sections are being cited in subsequent scholarly works, impacting the trust in published research.
  • Papers featuring AI-generated content were found predominantly in medicine and computer science; however, other fields such as engineering and environmental science are also affected.
  • The current review and publication processes are not fully equipped to catch AI-generated content, raising concerns about the integrity of scientific publications.
  • The issue requires ongoing monitoring and re-evaluation of publication practices to maintain the credibility of scientific literature.

INTRODUCTION

ChatGPT is a well-known large language model (LLM) that launched publicly in November 2022. Since then, it has quickly gained popularity worldwide, becoming the fastest-adopted service in the history of the internet (Haque & Li, 2024). ChatGPT has sparked a revolution in text development and language processing and is helping with content and data analysis. A sophisticated AI tool developed by OpenAI, it was the first of its kind to become widely popular and is now used extensively in schools, universities, and businesses (Deike, 2024; Gołąb-Andrzejak, 2023; Strzelecki & ElArabawy, 2024). Today, it is employed in a wide range of scenarios as a helpful tool. Although this paper discusses ChatGPT usage, it is important to emphasize that there are several other publicly available LLMs, including Google Gemini, Meta Llama, Mixtral, OpenChat, and others, which have similar operational capabilities (Menz et al., 2024).

Ongoing discussion among researchers highlights the many ways ChatGPT is being put to use. A major area is education, where students are using it to write essays and complete assignments (Karnalim et al., 2024; Salifu et al., 2024). In the medical field, researchers are using it to analyze data (Divito et al., 2024). In other areas, it can help with correcting language and acts as a talking partner for exploring new ideas and trends that people may be interested in studying (Punar Özçelik & Yangın Ekşi, 2024; Steiss et al., 2024). However, it is important to recognize that ChatGPT, despite its utilities, has its limitations and can be something of a double-edged sword. Being essentially a statistical machine, it calculates the frequency of word occurrences, proximities, and the likelihood of the next word's appearance, taking into account content architecture and the context of the text. Therefore, the creators of this tool have pointed out its susceptibility to inaccuracies and its occasional tendency to provide misleading information (Fang et al., 2024; Polyportis & Pahos, 2024). Another significant limitation is its knowledge base, which is only updated to a certain point in time. Consequently, the information it accesses might not be current.

The literature to date has paid attention to bibliographic citations and the fact that ChatGPT does not provide references to real sources (Giray, 2024). Although it is possible to prompt the model in a follow-up query to provide actually existing bibliographic citations, it typically limits its responses to sources that have hundreds or thousands of citations, are well established in the literature, and can be easily recognized by ChatGPT as valid (Alyasiri et al., 2024). However, when a topic requires more detailed knowledge or a specific approach, the tool is incapable of providing any citation sources that are genuine (Buchanan et al., 2024).

Additionally, the literature has addressed whether ChatGPT can be considered a co-author of a scholarly article. The OpenAI sharing and publication policy states that content co-authored with ChatGPT should be attributed to the author's name or company (OpenAI, 2022). However, there have been attempts to list ChatGPT as a co-author, and such papers have been published (Stokel-Walker, 2023). Subsequently, several editorials by editors of reputable scientific journals, such as Science, highlighted that ChatGPT does not meet the criteria for authorship (Thorp, 2023). ChatGPT does not qualify as an author because it cannot engage in illocutionary acts, such as making promises or assertions; it does not possess the necessary mental states, such as knowledge, belief, or intention; and it cannot be held accountable for the texts it generates (van Woudenberg et al., 2024). This position has also been established in the literature (Lund & Naheem, 2024).

Subsequent discussions have considered the possibility of using this tool in the preparation of scientific articles. Here, major scientific publishers have reached a consensus, and some have decided to allow the use of this tool solely for language correction (Inam et al., 2024). This adheres to the same principle previously applied to similar types of tools that support spelling and grammar, especially when it comes to writing in English by non-native speakers of the language (Gatrell et al., 2024). However, this is conditional upon such language editing being indicated in the declarations section of the article, and upon authors taking full responsibility for how the text has been translated, corrected, and published. There is consensus here because major publishers like Elsevier, Springer, Taylor & Francis, Emerald, and others now apply this policy (Ganjavi et al., 2024). An ongoing discussion relates to the use of LLMs to support researchers in creating peer review reports (Mollaki, 2024; Oviedo-García, 2024; Piniewski et al., 2024), but policies for writing research papers are being established more quickly (Garcia, 2024). These publisher policies are still evolving and becoming more nuanced over time. Some uses must be disclosed in an acknowledgements section, others in a methods section, and the policies increasingly distinguish between use for improving readability and more substantive use.

These publishers, however, do not allow content generated by ChatGPT or any other LLM to appear in scientific articles. Generated means that ChatGPT produced content in response to a user's prompt. Despite these declarations, this paper highlights an emerging problem. It is being increasingly observed that content generated by ChatGPT is going undeclared and undetected, resulting in its appearance in articles published in scholarly journals (Baronchelli, 2024). Typically, this situation pertains to journals that may not be indexed in databases like Web of Science or Scopus, and are sometimes considered lower-quality. It is important to note that the lack of indexing does not necessarily imply low quality, nor does indexing in these databases automatically guarantee high-quality content. However, it is increasingly noticed that even in high-quality journals and publications, content generated by ChatGPT has appeared, indicating a growing problem with the publishing, peer review, editing and acceptance process in scientific publications.

Examples of such publications include papers where authors left in sections of text generated by ChatGPT. One of the early examples is a paper published in the journal Resources Policy. This publication was noticed by journalists who raised the issue in Wired magazine, and it was subsequently commented on in academic publications (Engle & Nedelec, 2024). The authors submitted a correction to Resources Policy, explaining that the mistake was unintentional and resulted from language correction (Yang et al., 2023). Similarly, pieces of text generated by ChatGPT were recently discovered in other peer-reviewed papers published in Surfaces and Interfaces (Zhang et al., 2024) and in Radiology Case Reports (Bader et al., 2024). These papers have since been retracted and removed, respectively.

These findings were made through text analysis and the recognition that introductory or summary sections had been produced by an AI tool. Such cases come to light when a reader carefully reads the paper. By this point, however, it is often too late, as the primary obligation of the authors is to produce original content derived from their research. Following this, it falls to reviewers to evaluate the scientific merit of the work and to editors to decide whether to accept the manuscript for publication.

Given what has been discussed so far, a newly identified research gap is how to find and recognize scientific articles that have been partially generated by ChatGPT or any other LLM. The focus is not on linguistic corrections but on content creation by the AI tool. Therefore, the aim of this paper is to demonstrate how one can identify that a scientific article bears signs of having been partially written by ChatGPT. The following research questions are proposed:

Research question 1. How can papers that have been partially generated by ChatGPT be identified and recognized?

Research question 2. How are papers containing AI-generated content cited by other works, and to which disciplines do they belong?

Research question 3. How do publishers respond to papers containing AI-generated content?

The study provides a discussion on what can be done to improve publishing policies and enhance the quality of articles published in scientific journals.

THEORETICAL FRAMEWORK

The research method applied in this study is based on the SPAR4SLR theoretical framework proposed by Paul et al. (2021) for conducting systematic reviews and literature surveys. Choosing the SPAR4SLR method for a systematic literature review provides a structured, transparent and replicable approach that ensures comprehensive coverage and rigorous analysis of the literature. While Paul et al. (2021) originally designed the framework to review articles within a single topic from one discipline and to perform systematic literature reviews, this protocol can also be adapted to identify, describe and analyze papers that contain AI-generated content. The framework divides the study development process into three phases: assembling, arranging and assessing. Each of these phases is further divided into two sub-phases: identification and acquisition, organization and purification, and evaluation and reporting.

Assembling

Identification

The identification method used to find content generated by AI LLMs primarily relies on databases that index scientific articles and analyze content. The discovery that content was generated by an AI LLM is usually made by readers (rather than by the author, reviewer, or editor) once the scientific publication is available to be downloaded, read and indexed by academic databases.

Google Scholar is a popular scientific database that quickly indexes content and has access to full texts of publications that are available open access as well as via a subscription model (Pereira & Mugnaini, 2023). As a search engine for scholarly content, Google Scholar gains access to content published by academic publishers, and Google's search indexing bots are allowed to access and review the content under the subscription model. Subsequently, the indexed content is made available to Google Scholar users. With the use of Campus Activated Subscriber Access, individuals on the campus of an institution that collaborates with publishing companies like Springer, Elsevier, or Emerald can access the content of articles subscribed to by the institution. Thus, Google Scholar has access to the full text of articles from all major publishers and also indexes any other available scientific articles in journals accessible to its crawler.

Acquisition

In March 2024, on the social network X (formerly Twitter), several people published screenshots from the Google Scholar search engine (Moore, 2024; Saboo, 2024; Saçan, 2024). These screenshots displayed example queries that, upon review of the retrieved works, turned out to contain fragments generated by ChatGPT. One of the recurring fragments was the information presented by ChatGPT about its last update. The search for articles meeting the criterion of content generation by an AI model therefore began with one of the most common phrases generated by ChatGPT: ‘as of my last knowledge update’, which indicates that the model's knowledge was last updated at a specific point in time. This attempt was successful and yielded many results, indicating that scientific articles contain this fragment of content. These results needed to be filtered to exclude publications where authors purposefully used ChatGPT and published conversations with it that included mentions of ‘as of my last knowledge update’. Filtering was achieved using the excluding term ‘-chatgpt’; the minus character is an operator that excludes results containing the word. The obtained results were analyzed, and the ‘snowball’ method was then employed to find other possible content fragments that would indicate text generation by ChatGPT. Snowballing in this case means that if a paper contains already identifiable parts generated by ChatGPT, there may be other text fragments that also originate from ChatGPT and can be used to search other papers.
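To make the acquisition step concrete, the following minimal Python sketch (purely illustrative, not the script used in this study) composes the exact-match Google Scholar queries with the ‘-chatgpt’ exclusion operator for a subset of the phrases from Table 1:

from urllib.parse import quote_plus

# Tell-tale phrases left by ChatGPT (a subset of the queries in Table 1).
PHRASES = [
    "as of my last knowledge update",
    "as of my knowledge cutoff",
    "as of my last update",
    "regenerate response",
]

def scholar_query(phrase: str) -> str:
    """Quote the phrase for exact matching and exclude papers mentioning ChatGPT."""
    return f'"{phrase}" -chatgpt'

def scholar_url(phrase: str) -> str:
    """Build a Google Scholar URL for manual inspection of the results."""
    return "https://scholar.google.com/scholar?q=" + quote_plus(scholar_query(phrase))

for p in PHRASES:
    print(scholar_url(p))

Each printed URL opens the same kind of result list that was reviewed manually in this study.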

Another frequently occurring phrase returned by the AI model is the information that it does not have access to specific data. Therefore, the string ‘I don't have access to’ was entered into the Google Scholar search engine, yielding further results containing this string. In this case, the snowball method was also applied to discover other possible variants of this formulation leading to the discovery of content generated by ChatGPT. Based on a previously identified paper published in Surfaces and Interfaces (Zhang et al., 2024), which was found to contain AI-generated content, two other search queries added to the corpus are ‘certainly, here is’ and ‘certainly, here are’.

The literature has already documented cases of a fragment of the ChatGPT user interface containing the phrase ‘regenerate response’ being mistakenly inserted into a scientific article (Conroy, 2023). This phrase was also used to find scientific articles where it appears in the content without any justified connection to the surrounding text. ‘Regenerate response’ was a button that, for a time, appeared in the ChatGPT interface after content was generated, allowing the content to be regenerated according to the user's prompt. In this way, users of the AI model could explore its capability to create different-sounding text.

The next step involved querying ChatGPT itself about the most common phrases it returns that could indicate a text was written by it. The prompt was: ‘What are your typical responses, that could serve as a watermark to recognize that the content comes from you. Can you provide a list of sample typical responses?’ The model responded that it strives for its outputs to be diversified and reflect a variety of expressions, but suggested a few phrases that tend to recur. The response is presented in Fig. 1.

Figure 1. ChatGPT's response to the prompt ‘What are your typical responses, that could serve as a watermark to recognize that the content comes from you. Can you provide a list of sample typical responses?’.

An initial analysis of these formulations revealed only one previously identified piece of text: ‘As of my last update in’. Searches for the other phrases suggested by ChatGPT did not return any results indicating text generated by the language model. Therefore, in a second iteration, ChatGPT was asked to provide further formulations that would occur frequently, with the phrases that had already proven effective supplied to it as examples. The prompt was: ‘Ok, can you provide more which are close to examples as "I'm sorry, but I don't have access to", "As an AI language model", "as of my last knowledge update", "I don't have access to specific", "As of my last update"’. In response, the model proposed several formulations that were indeed reflected in published scientific articles. The response is presented in Fig. 2.

Figure 2. ChatGPT's response to the prompt ‘Ok, can you provide more which are close to examples as "I'm sorry, but I don't have access to", "As an AI language model", "as of my last knowledge update", "I don't have access to specific", "As of my last update"’.
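For readers who wish to reproduce this step programmatically rather than through the web interface, a minimal sketch using the OpenAI Python client follows; the client usage and model name are assumptions for illustration, since the study queried ChatGPT directly:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The second-iteration prompt from the text, reproduced verbatim.
prompt = (
    'Ok, can you provide more which are close to examples as '
    '"I\'m sorry, but I don\'t have access to", "As an AI language model", '
    '"as of my last knowledge update", "I don\'t have access to specific", '
    '"As of my last update"'
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice, not the one used in the study
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

Because model outputs are stochastic and models are updated over time, the returned phrase list will differ between runs and from Fig. 2.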

Arranging

Organization

In the organization stage, the author adopted a coding method similar to that used in systematic literature reviews based on bibliometric review. The coding of the articles is based on paper references, journal names, publisher names, metrics received from scientific databases such as Web of Science and Scopus, the quartile in which the journal is ranked, and the current number of citations displayed by Google Scholar.

Purification

During the purification stage, the author set a single criterion for papers to qualify for further study: the papers must be, or will in the future be, indexed in the Web of Science or Scopus databases. This inclusion criterion was applied by checking the journal in which an article was published and determining whether the journal is listed in a scientific database. Often, articles are first indexed in Google Scholar and only later appear in scientific databases; consequently, many of these papers have not yet been indexed there, and analyzing the journal title allowed the determination to be made. The purification stage took place before organizing the articles because this was a primary assumption. This deviates from the established protocol in the SPAR4SLR framework; however, as this study is not a systematic analysis of articles already published and recognized in the field, this change is justified.
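Taken together, the organization and purification steps amount to coding each paper into a structured record and filtering on the indexing criterion. A minimal sketch (field names are illustrative, not the actual coding sheet used in this study):

from dataclasses import dataclass
from typing import Optional

@dataclass
class PaperRecord:
    reference: str                     # full bibliographic reference
    journal: str
    publisher: str
    impact_factor: Optional[float]     # Clarivate IF, if any
    cite_score: Optional[float]        # Scopus CiteScore, if any
    scopus_percentile: Optional[int]   # highest Scopus percentile
    citations: int                     # Google Scholar citation count
    indexed_in_wos_or_scopus: bool

def purify(records: list[PaperRecord]) -> list[PaperRecord]:
    """Keep only papers whose journal is (or will be) indexed in WoS or Scopus."""
    return [r for r in records if r.indexed_in_wos_or_scopus]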

Assessing

Evaluation

In the evaluation stage, typical bibliographic analyses proposed in the SPAR4SLR framework should be performed. However, it is not feasible to conduct co-authorship analysis, co-citation analysis, social network analysis or thematic modelling for papers in which the only commonality is the presence of different AI-generated text fragments and where papers come from various fields and cover different topics. Nevertheless, there are several assessments proposed that involve classifying articles into the main disciplinary areas and determining whether any actions have been taken by the publisher in response to findings that the text was partially generated by AI.

Reporting

Reporting the results is entirely drawn from the SPAR4SLR protocol, as it involves discussions, visual summaries in the form of tables and charts, and highlighting the limitations of the analyses conducted. There was no deviation from the reporting protocol.

A visual illustration of the applied protocol is presented in Fig. 3. Figure 3 includes the three stages of the systematic analysis: assembling, arranging and assessing, as well as the two sub-stages within each: identification and acquisition, organization and purification, and evaluation and reporting.

Figure 3. Protocol for the study based on SPAR4SLR.

RESULTS

Table 1 presents the identified queries entered into the Google Scholar search engine and the number of results obtained as of 30 September 2024. The search identified 1,362 scientific articles whose content contains unequivocal confirmation that portions of the text were generated by ChatGPT. The number of articles was small enough for them to be analyzed manually, paper by paper. The majority of the results returned by Google Scholar linked to publications in journals not indexed in quality scientific databases such as Web of Science and Scopus, or on preprint platforms such as arXiv, Research Square, SSRN and others. However, a smaller portion of the results comes from publishers recognized as major scientific publishers with significant influence on readers. Many of the identified articles were published in journals that are listed in the Web of Science and Scopus databases and have quality indicators like Impact Factors and CiteScores derived from their citation counts.

TABLE 1. Search queries used to find papers with AI-written content and the tentative total number of results on 30 September 2024.

Search query | 2022 | 2023 | 2024 | Total
‘as of my last knowledge update’ -chatgpt | 2 | 51 | 65 | 127
‘as of my knowledge cutoff’ -chatgpt | 2 | 38 | 25 | 68
‘as of my last update’ -chatgpt | 3 | 49 | 78 | 140
‘my last training cut-off’ -chatgpt | 0 | 2 | 2 | 4
‘certainly, here are’ -chatgpt | 11 | 165 | 217 | 507
‘certainly, here is’ -chatgpt | 6 | 43 | 64 | 716
‘certainly, here's’ -chatgpt | 9 | 69 | 117 | 223
‘certainly, let's’ -chatgpt | 4 | 18 | 44 | 167
‘I don't have access to specific’ -chatgpt | 0 | 12 | 9 | 36
‘I don't have access to real-time’ -chatgpt | 1 | 3 | 15 | 30
‘as an AI language model’ -chatgpt | 6 | 64 | 49 | 152
‘regenerate response’ -chatgpt | 5 | 84 | 22 | 147
‘I can provide you with some insights into’ -chatgpt | 0 | 3 | 0 | 3
‘my responses are generated based on patterns’ -chatgpt | 0 | 1 | 0 | 1
‘please refer to the source material for a more detailed description’ -chatgpt | 0 | 1 | 0 | 1
‘keep in mind that the situation may have evolved’ -chatgpt | 0 | 1 | 2 | 2
Total | 49 | 604 | 709 |

Table 1 also presents the detailed number of articles found for each year for each query entered into Google Scholar. It should be noted that the sum over these 3 years (2022 to 2024) does not add up to the total number of articles found without time restrictions, because some of these phrases also appear in articles from earlier years. This is particularly visible with the phrases ‘certainly, here is’, ‘certainly, here's’ and ‘certainly, here are’. Results from before 2022 show that these phrases are sometimes used in scientific articles, such as in interviews where someone is directly quoted. For this reason, the analysis of texts was conducted manually, by reviewing each full text found in search of content that was automatically generated.

Figure 4 contains a screenshot of Google Scholar search results illustrating how the search engine displays the previously identified queries used to detect AI-generated text in the content of scholarly articles. Figure 4 includes the query ‘"as of my last knowledge update" -chatgpt’; filtering out texts containing the string chatgpt excludes papers whose authors used the tool openly and intentionally published some results of its work. Figure 4 shows a result that contains this search query.

Figure 4. Search results for a search query ‘"as of my last knowledge update" -chatgpt’ in Google Scholar.

Table 2 lists the identified articles located in peer-reviewed journals published by recognized publishers, where the journals are indexed by scientific databases such as Web of Science and Scopus. The results in Table 2 are sorted in descending order by the highest percentile values in the categories of journal indexing in the Scopus database. Several articles were also identified in journals not yet indexed in WoS and Scopus but published by well-known publishers. However, they are not presented as examples, because the cut-off for inclusion is indexing in Scopus with the journal above the 50th percentile. Table 2 also includes text excerpts identifying that the content was written by ChatGPT and the current status of each article. Identification is based on phrases usually associated with ChatGPT output. Typically, articles are active, meaning they are available in the form in which they were accepted and published. In some cases, articles were altered after publication: three articles were retracted and one was corrected. There is also a case where an article has not yet been changed, but the following notice has appeared next to it: ‘Journal Notice: The Editorial Office is aware of the concerns raised regarding the potential use of AI or AI-assisted tools during the preparation of this publication and is currently investigating these concerns in collaboration with members of the Editorial Board’.

TABLE 2. Published papers using AI-generated text, with journal, publisher, statistics and status.

Source | Journal | Publisher | IF (a) | CS (b) | Scopus highest percentile | Excerpt | Status | Citations (c)
(Liu, Liao, et al., 2024) | TrAC, Trends in Analytical Chemistry | Elsevier | 11.8 | 20.0 | 99% | ‘Certainly, here are some key research gaps in the current field of’ | Active | 19
(Lazăr et al., 2024) | Trends in Food Science & Technology | Elsevier | 15.1 | 32.5 | 99% | ‘Certainly, here are some areas for future research regarding’ | Active | 2
(Su et al., 2023) | Energy | Elsevier | 9.0 | 15.3 | 98% | ‘Certainly, here are some potential areas for future research that could be explored’ | Active | 10
(Adel, 2024) | Smart Cities | MDPI | 7.0 | 11.2 | 98% | ‘As of my knowledge cutoff date in January 2022, there may be recent developments’ | Active | 17
(Koyunoğlu, 2024) | Sustainable Technology and Entrepreneurship | Elsevier | n/a | 12.3 | 98% | ‘As of my knowledge cutoff in 2021’ | Active | 5
(Behiry & Aly, 2024) | Journal of Big Data | Springer | 12.4 | 17.8 | 97% | ‘Certainly! The use of a validation set in’ | Active | 14
(Ahmad et al., 2024) | Partial Differential Equations in Applied Mathematics | Elsevier | n/a | 6.2 | 97% | ‘Certainly, Here are further data regarding’ | Active | 1
(Gath-Morad et al., 2024) | Building and Environment | Elsevier | 7.1 | 12.5 | 97% | ‘Certainly, here is the revised paragraph’ | Active | 0
(Wang, Zhu, et al., 2024) | Virtual and Physical Prototyping | Taylor & Francis | 10.2 | 13.6 | 97% | ‘Certainly, here is an expanded description’ | Active | 0
(Guo, Yu, et al., 2024) | International Journal of Applied Earth Observation and Geoinformation | Elsevier | 7.6 | 12.0 | 96% | ‘Certainly, here is the pseudo code for the specially designed spatial scene’ | Active | 3
(Huang et al., 2024) | IEEE Transactions on Reliability | IEEE | 5.0 | 12.2 | 96% | ‘Certainly, here are a few examples of discussing possible protection measures’ | Active | 1
(Majrashi et al., 2024) | International Journal of Biological Macromolecules | Elsevier | 7.7 | 13.7 | 96% | ‘Certainly, here's Table 1 outlining the formulations’ | Active | 1
(Hassan et al., 2024) | International Journal of Hydrogen Energy | Elsevier | 8.1 | 13.5 | 95% | ‘As of my last update in 2021, policies and regulations around hydrogen are’ | Active | 30
(Yadav et al., 2023) | Urban Climate | Elsevier | 6.0 | 9.7 | 95% | ‘Certainly, here are a few examples of evidence supporting the WHO definition […] Certainly, here are a few examples’ | Active | 40
(Salami et al., 2023) | Environment, Development and Sustainability | Springer | 4.7 | 10.2 | 95% | ‘Certainly, here are the limitations associated with a review’ | Active | 2
(Alsagri et al., 2024) | Process Safety and Environmental Protection | Elsevier | 6.9 | 11.4 | 95% | ‘Certainly, here are the formulas for’ | Active | 3
(Xu et al., 2024) | IEEE Transactions on Geoscience and Remote Sensing | IEEE | 7.5 | 11.5 | 95% | ‘Certainly, here is the adjusted equation with reduced’ | Active | 3
(Abdelfattah & El-Shamy, 2024) | Journal of Environmental Management | Elsevier | 7.9 | 13.7 | 95% | ‘Certainly, let's delve into the global status of’ | Active | 26
(Chang & Lin, 2024) | IEEE Access | IEEE | 3.4 | 9.8 | 92% | ‘Certainly, here are the three major contributions in a more compact form’ | Active | 2
(Wang, Huang, et al., 2024) | IEEE Access | IEEE | 3.4 | 9.8 | 92% | ‘My training only goes up until September 2021’ | Active | 0
(Muzammul et al., 2024) | IEEE Access | IEEE | 3.4 | 9.8 | 92% | ‘Certainly! Let's discuss the evaluation metrics of’ | Active | 7
(Xie et al., 2024) | Scientific Reports | Springer | 4.3 | 7.5 | 92% | ‘Certainly, let's formulate a system of’ | Active | 0
(Wasim et al., 2024) | Scientific Reports | Springer | 4.3 | 7.5 | 92% | ‘Certainly, let's consider a business scenario’ | Active | 0
(Khan et al., 2023) | Energy Conversion and Management: X | Elsevier | 7.1 | 8.8 | 92% | ‘Certainly, let's focus on a detailed comparison between’ | Active | 11
(Hegazy et al., 2023) | Journal of Energy Storage | Elsevier | 8.9 | 11.8 | 91% | ‘Certainly, here's a table summarizing’ | Active | 25
(Niaz et al., 2024) | Journal of Energy Storage | Elsevier | 8.9 | 11.8 | 91% | ‘Certainly, here's the text without bullet points’ | Active | 6
(He et al., 2023) | Journal of Energy Storage | Elsevier | 8.9 | 11.8 | 91% | ‘Certainly, let's refine the focus’ | Active | 4
(Zhou et al., 2023) | Surface and Coatings Technology | Elsevier | 5.3 | 10.0 | 90% | ‘Regenerate response’ | Active | 3
(Liu, Wang, & Wang, 2024) | International Journal of Fuzzy Systems | Springer | 3.6 | 7.8 | 89% | ‘Certainly, here is an edited version of the criteria for selecting’ | Active | 1
(Shoukat et al., 2024) | PLoS One | PLOS | 2.9 | 6.2 | 89% | ‘Regenerate response’ | Retracted 18 Apr. 2024 | 6
(Kovtun et al., 2024) | Royal Society Open Science | The Royal Society | 3.0 | 6.0 | 89% | ‘Certainly, let's formulate the system of’ | Active | 0
(Arya & Pal, 2024) | International Journal of Information Technology | Springer | n/a | 6.0 | 89% | ‘Certainly, here are the steps to find’ | Active | 0
(Kancharapu & Ayyagari, 2024) | International Journal of Information Technology | Springer | n/a | 6.0 | 89% | ‘Certainly, let's delve into how’ | Active | 6
(Hwang & Ballouli, 2024) | Sport Management Review | Taylor & Francis | 5.0 | 9.0 | 88% | ‘Certainly! Let's expand on the intriguing difference between’ | Active | 0
(Senapati et al., 2024) | Engineering Applications of Artificial Intelligence | Elsevier | 7.4 | 9.6 | 88% | ‘Certainly, let's consider four key’ | Active | 1
(Lou et al., 2024) | Chemical Research in Toxicology | ACS | 3.7 | 7.9 | 88% | ‘Certainly, here are the top 10 SAs for both data sets’ | Active | 3
(Abd El-Latef et al., 2023) | Biocatalysis and Agricultural Biotechnology | Elsevier | 3.4 | 7.7 | 88% | ‘Certainly, here's a textual description of a synthesis diagram for’ | Active | 16
(Ferdowsi & Razmi, 2024) | Asian-Pacific Journal of Second and Foreign Language Education | Springer | 1.7 | 2.9 | 87% | ‘Certainly! Here's the expanded implications section in one unified paragraph’ | Active | 0
(Sumit et al., 2024) | Cluster Computing | Springer | 3.6 | 9.7 | 87% | ‘As of my last knowledge update in January 2022, there isn't a widely’ | Active | 2
(Shafiq et al., 2024) | IEEE Transactions on Consumer Electronics | IEEE | 4.3 | 7.7 | 86% | ‘Certainly! Here are the equations for calculating the mean values of’ | Active | 1
(Saglam et al., 2024) | Energies | MDPI | 3.0 | 6.2 | 85% | ‘Certainly, here is the revised paragraph’ | Active (spotted by publisher) | 1
(Alqarafi et al., 2024) | Biomedical Signal Processing and Control | Elsevier | 4.9 | 9.8 | 85% | ‘Certainly, here's a literature survey for’ | Active | 3
(Jiang et al., 2024) | Frontiers in Medicine | Frontiers Media | 3.1 | 5.1 | 85% | ‘Regenerate response’ | Active | 2
(Koondhar et al., 2024) | Energy Strategy Reviews | Elsevier | 9.1 | 12.8 | 85% | ‘Certainly, let's delve deeper into the’ | Active | 2
(Remzan et al., 2023) | Multimedia Tools and Applications | Springer | 3.0 | 7.2 | 84% | ‘Regenerate response’ | Active | 3
(Jain et al., 2024) | Environmental Science and Pollution Research | Springer | 5.8 (d) | 8.7 | 83% | ‘Certainly, here are a few additional ideas (being researched heavily across the globe)’ | Active | 0
(Li & Yao, 2023) | Environmental Science and Pollution Research | Springer | 5.8 (d) | 8.7 | 83% | ‘Regenerate response’ | Retracted 1 Jul. 2024 | 0
(Shekhar et al., 2024) | Annals of Operations Research | Springer | 4.4 | 7.9 | 82% | ‘Certainly, let's examine the scenario’ | Active | 0
(Tsai et al., 2023) | Toxins | MDPI | 3.9 | 7.5 | 82% | ‘Regenerate response’ | Corrected | 2
(Rinaldi, 2023) | International Journal for the Semiotics of Law | Springer | 0.9 | 2.0 | 82% | ‘As of my last knowledge update in September 2021’ | Active | 0
(EL-Omairi & El Garouani, 2023) | Heliyon | Springer | 3.4 | 4.5 | 82% | ‘Certainly, here's the expanded text translated’ | Active | 17
(Al-Qahtani et al., 2024) | Environmental Technology (United Kingdom) | Taylor & Francis | 2.2 | 6.5 | 80% | ‘Certainly, here are some advantages, disadvantages, and limitations’ | Active | 18
(Khalaf et al., 2024) | Journal of Molecular Structure | Elsevier | 4.0 | 7.1 | 80% | ‘Certainly, here's a comparison of’ | Active | 12
(Bertini et al., 2023) | International Journal of Theoretical Physics | Springer | 1.2 | 2.5 | 80% | ‘Can you provide me more examples of three entangled concepts? Certainly, here are some examples of three entangled concepts’ | Active | 3
(Kamalasekaran & Sundramoorthy, 2024) | RSC Advances | Royal Society of Chemistry | 3.9 | 7.5 | 79% | ‘Certainly, here are some potential future directions and areas of research and development in the field of’ | Active | 0
(Usman et al., 2024) | Non-coding RNA Research | Elsevier | 5.9 | 7.7 | 78% | ‘Certainly, let's delve even deeper into the pivotal role of’ | Active | 5
(Tarla et al., 2023) | Physica Scripta | IOP Publishing | 2.6 | 3.7 | 78% | ‘Regenerate response’ | Retracted 14 Sep. 2023 | 6
(Saraswat et al., 2024) | Physica Scripta | IOP Publishing | 2.6 | 3.7 | 78% | ‘Certainly, let's shift our focus to the’ | Active | 0
(Asiri et al., 2023) | PeerJ Computer Science | PeerJ Publishing | 3.4 | 6.1 | 78% | ‘Certainly, let's break down the steps outlined in the pseudocode’ | Active | 1
(Abdelshafeek & El-Shamy, 2023) | Food Bioscience | Elsevier | 4.8 | 6.4 | 78% | ‘Certainly, here's an example that illustrates’ | Active | 11
(Mahdi et al., 2024) | Irish Journal of Medical Science | Springer | 1.7 | 3.7 | 77% | ‘Regenerate response’ | Active | 1
(Qian et al., 2024) | Annals of Nuclear Energy | Elsevier | 1.9 | 4.3 | 77% | ‘Certainly, here's a brief discussion of each’ | | 0
(de Lima Dias et al., 2024) | Polymer Bulletin | Springer | 3.1 | 6.0 | 76% | ‘Certainly! Here is the translation into’ | Active | 0
(Sharma & Bhende, 2024) | Polymer Bulletin | Springer | 3.1 | 6.0 | 76% | ‘Certainly, here are some current challenges’ | Active | 3
(Marwan et al., 2024) | Bioresource Technology Reports | Elsevier | n/a | 7.2 | 74% | ‘Certainly, here's Table 1 outlining the formulations’ | Active | 6
(Moutsopoulou et al., 2024) | Materials | MDPI | 3.1 | 5.8 | 73% | ‘Certainly, here are the mathematical’ | Active | 0
(Upadhyay & Gupta, 2024) | Journal of Food Process Engineering | Wiley | 2.7 | 5.7 | 73% | ‘Certainly, here is a list of the top 10 countries’ | Active | 0
(Khan et al., 2024) | Health and Technology | Springer | 3.1 | 7.1 | 71% | ‘As of my last update in September 2021’ | Active | 0
(Jocelyn et al., 2024) | Frontiers in Public Health | Frontiers Media | 3.0 | 4.8 | 70% | ‘As of my last update in April 2023’ | Active | 3
(Sabour-Takanlou et al., 2024) | Clinical Genetics | Wiley | 2.9 | 6.5 | 70% | ‘as of my last knowledge update in January 2022’ | Active | 0
(Guo, Yang, et al., 2024) | IET Image Processing | Wiley | 2.0 | 5.4 | 69% | ‘Certainly, here are the formulas for calculating’ | Active | 1
(Goel & Digalwar, 2024) | Procedia Computer Science | Elsevier | n/a | 4.5 | 69% | ‘Certainly, here are key contributions of the authors’ | Active | 1
(Tarafder et al., 2024) | World Journal of Microbiology and Biotechnology | Springer | 4.0 | 6.3 | 68% | ‘As of my last knowledge update in September 2021’ | Active | 0
(Joshi et al., 2023) | International Journal on Interactive Design and Manufacturing | Springer | n/a | 4.0 | 66% | ‘Certainly, let's look at a practical example of the’ | Active | 12
(Bhatti et al., 2024) | Revista Española de Documentación Científica | CSIC | 1.0 | 2.2 | 65% | ‘Certainly! Here are 50 references related to’ | Active | 21
(Husnain et al., 2024) | Revista Española de Documentación Científica | CSIC | 1.0 | 2.2 | 65% | ‘Certainly! Here are 30 references related’ | Active | 27
(Nithya et al., 2024) | International Journal of System Assurance Engineering and Management | Springer | 1.7 | 4.3 | 65% | ‘Certainly, let's delve into a more detailed explanation of’ | Active | 0
(Alsaif et al., 2024) | Optical and Quantum Electronics | Springer | 3.3 | 4.6 | 64% | ‘As of my knowledge cutoff in September 2021’ | Active | 4
(Batool et al., 2024) | Optical and Quantum Electronics | Springer | 3.3 | 4.6 | 64% | ‘Certainly, here are some properties of the operator’ | Active | 3
(Kharya et al., 2024) | Frontiers in Artificial Intelligence | Frontiers Media | 3.0 | 6.1 | 63% | ‘To the best of the authors' last knowledge updated on September 2021’ | Active | 0
(Tyagi et al., 2024) | Communications in Soil Science and Plant Analysis | Taylor & Francis | 1.3 | 3.3 | 63% | ‘Certainly, here are the main steps’ | Active | 0
(Singh et al., 2024) | Diagnostics | MDPI | 3.0 | 4.7 | 62% | ‘Certainly, here are the research questions’ | Active | 0
(Progression & Strategies, 2024) | Applied Mathematics and Nonlinear Sciences | Sciendo | n/a | 2.9 | 62% | ‘Certainly! Let's delve deeper into the topic of’ | Active | 0
(Marouani et al., 2023) | Processes | MDPI | 2.8 | 5.1 | 60% | ‘Certainly, here are some additional points for further evaluation’ | Active | 69
(Madani et al., 2024) | Batteries | MDPI | 4.6 | 4.0 | 59% | ‘As of my last update’ | Active | 3
(Joshi et al., 2024) | Measurement: Sensors | Elsevier | n/a | 3.1 | 58% | ‘Certainly, let's look at a specific example’ | Active | 0
(Patil et al., 2023) | Materials Today: Proceedings | Elsevier | n/a | 4.9 | 58% | ‘Regenerate response’ | Active | 16
(Jannet et al., 2024) | Journal of The Institution of Engineers (India): Series D | Springer | n/a | 2.0 | 53% | ‘Certainly, let's delve into the comparison of’ | Active | 0
(Al-Mahmud et al., 2024) | Journal of Crystal Growth | Elsevier | 1.7 | 3.6 | 52% | ‘Certainly, here's a comparison of’ | Active | 1

  • (a) IF: Impact Factor value by Clarivate as of 2023.
  • (b) CS: CiteScore value by Scopus as of 2023.
  • (c) Citations: number of citations in Google Scholar as of 30 September 2024.
  • (d) On hold after the release of the 2023 IF.

The analysis of 89 articles listed in Table 2 shows that as many as 28 of them are in journals with Scopus percentile values of 90 and above. Two journals have a 99th percentile, indicating that they are the top journals in their field. The journal TrAC, Trends in Analytical Chemistry holds the first and second position in the categories Chemistry/Spectroscopy and Chemistry/Analytical Chemistry, respectively. The journal Trends in Food Science & Technology ranks second in the category Agricultural and Biological Sciences/Food Science.

Twenty-five articles are published in journals that have a percentile between 80 and 89. In total, 64 articles were found in journals considered to be in Q1, top quartile, recognized as the group of the best journals in their respective fields. Twenty-five articles are in the percentile range between 50 and 75, indicating that the journals in which these articles are found belong to Q2.

Table 2 also includes information on the number of citations each article has received, as measured by the citations listed in Google Scholar. According to the received results, 60 articles have already been cited, with a total number of citations amounting to 528. One article had received over 60 citations, six articles had received 20 to 40 citations, and 41 articles had received one or more citations but fewer than 10. The status of these citations was checked on 30 September 2024.

The articles have also been grouped according to the scientific disciplines to which they pertain. This was achieved using a matching system that aligns the title and abstract of an article with keywords indicative of a major discipline. This classification process utilizes a lexicon of discipline-specific keywords. For instance, the discipline of Medicine is marked by keywords that include: clinical, healthcare, disease, surgery, diagnostic and patient. Engineering literature is often signified by terms like: sensor, electrode, nanoparticle, mechanical and electrical, which mirror the field's technical aspects. Research in Computer Science is usually connected with: algorithm, network, software, data, security and system. Works in Environmental Science might be identified by: ecology, environment, climate, pollution and sustainable. Economic research often encompasses: economy, financial, market, business and trade, whereas sociological studies frequently engage with: society, social, culture, religious and inequalities. This keyword-based system facilitates the structured and systematic categorization of articles into their appropriate fields within the corpus of scientific literature.
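As an illustration, a minimal Python sketch of such a keyword matcher follows; the lexicon copies only the example keywords named above (the full lexicon used for the classification is presumably larger, and disciplines such as Education are omitted here):

# Abbreviated lexicon built from the example keywords given in the text.
LEXICON = {
    "Medicine": ["clinical", "healthcare", "disease", "surgery", "diagnostic", "patient"],
    "Engineering": ["sensor", "electrode", "nanoparticle", "mechanical", "electrical"],
    "Computer Science": ["algorithm", "network", "software", "data", "security", "system"],
    "Environmental Science": ["ecology", "environment", "climate", "pollution", "sustainable"],
    "Economics": ["economy", "financial", "market", "business", "trade"],
    "Sociology": ["society", "social", "culture", "religious", "inequalities"],
}

def classify(title: str, abstract: str) -> str:
    """Assign the discipline whose keywords occur most often in title + abstract."""
    text = (title + " " + abstract).lower()
    scores = {
        discipline: sum(text.count(keyword) for keyword in keywords)
        for discipline, keywords in LEXICON.items()
    }
    return max(scores, key=scores.get)

Ties, and articles matching no keywords at all, would need manual resolution.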

After analyzing the 89 articles from Table 2, the results have been presented in Fig. 5. Nineteen articles have been categorized under the field of Medicine, 17 articles under Computer Science, 16 articles under Engineering, 10 articles under Environmental Science, seven articles each have been placed under Sociology, Education and Management, and six articles have been aligned with the field of Economics.

Figure 5. An overview of the disciplines covered by the scientific articles discovered to contain content generated by ChatGPT and included in scientific databases.

Figure 6 contains results from a Google Scholar search narrowed to papers published by a single publisher. In this case, IGI Global texts that bear the marks of content generation by ChatGPT are presented; these are usually chapters in published books. The general results in Google Scholar also include book chapters published by other publishers like Springer and conference proceedings published and indexed by IEEE. Another group of texts containing content generated by ChatGPT is preprints. Since preprints are still at the pre-review stage, the final versions of these texts may turn out not to contain signs of content generation.

Figure 6. Search results for a search query ‘"as of my last" -chatgpt source:igi’ in Google Scholar.

DISCUSSION

When considering content generated by AI, the first issue is whether the number of identified articles is large enough to constitute a serious problem. For comparison, it is noteworthy that in the year 2023 the Scopus database included 2,511,962 scientific publications with the status of an article. Compared against the 89 articles found in journals indexed in the Scopus database, the problem discussed constitutes only about 0.0035% of the total, well within any margin of error. A similar result (about 0.0035%) is obtained for the 80 articles found in journals indexed in the Web of Science, which reported 2,291,309 publications with the status of an article for the year 2023. There has been a noticeable increase in the number of research papers incorporating content generated by ChatGPT since its release in November 2022. However, it is important to note that the data for 2024 currently covers only part of the year, while data representing the whole of 2023 is already available. From a statistical point of view, therefore, such works constitute a very small, minor share. Nevertheless, the potential impact of even a small percentage of papers containing AI-generated content, especially in terms of implications for trust in scientific publishing, can be negative.
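As a quick check of these shares (the ratio of identified papers to all indexed articles, expressed as a percentage):

\[
\frac{89}{2{,}511{,}962} \approx 3.5 \times 10^{-5} \approx 0.0035\%,
\qquad
\frac{80}{2{,}291{,}309} \approx 3.5 \times 10^{-5} \approx 0.0035\%.
\]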

If it were just these calculations, there would seem to be no problem. However, when analyzing the texts of publications, it can be observed that in some sections authors have masked the tell-tale information: for example, where ChatGPT wrote ‘as of my last knowledge update’ followed by a given month and year, only the month and year were removed, leaving the phrase ‘as of my last knowledge’. Similarly, the phrase ‘certainly, here are’ appears suspect. While there are situations in which, for example, an article authored by one person refers to its author in the plural, a style also seen in academic articles, it is highly probable in such cases that the text was generated by ChatGPT. The provided results, therefore, certainly do not capture the entirety of articles that bear signs of having been prepared using ChatGPT. If the tool had been used only to improve and correct the spelling and grammar of the English language, and its use declared, everything would be in accordance with current publishers' policies. However, it should be emphasized that none of the papers found and analyzed in this study contains any information that an AI model was used, for example, for language correction or to improve the content of the text from a linguistic perspective. If such information had been stated, these publications would not have been included in this set, because a declared use of the tool for language improvement would have excluded them at the stage of preparing and entering queries into Google Scholar.

Another observation is that if AI-generated content appears in one part of a publication, there is also a chance that other parts of the article contain AI-generated content without leaving any trace. Following this line of reasoning, it can be acknowledged that there are articles partially generated by AI that do not contain the typical traces identified by the phrases left by AI within the content. This could even result from authors intentionally using the tool merely for text correction; however, it is important to remember that ChatGPT is prone to hallucinations and can perform actions that were not requested. For example, while correcting a paragraph of text in English, if there is more content to correct, ChatGPT may add a summarizing sentence at the end of the correction, which might go unnoticed.

To address this issue, there have been attempts to use detectors for AI-generated content. For a while, a tool published by OpenAI was available to assess whether content was generated by ChatGPT. However, this tool was eventually withdrawn because its results were not reliable and often included false positives and false negatives, meaning that text written by a human was assessed as generated automatically and vice versa (Elkhatat et al., 2023). Given this uncertainty, even from the company that built and released ChatGPT, there is no assurance that text checked by a detection tool can be automatically assessed and classified (Popkov & Barrett, 2024). Many major publishers use automated tools to identify whether submitted text bears similarities to other already published works (plagiarism detection tools), with most publishers collaborating with Turnitin (Perkins et al., 2024). At the end of 2023, Turnitin also withdrew from providing information on what percentage of a text was generated by AI, after many cases in which text written by a human was reported by the tool as generated by an AI model. Therefore, when it comes to the automatic identification of texts, the possibilities are currently limited (Hosseini & Resnik, 2024; Weber-Wulff et al., 2023).

There are three elements that have emerged from the results of this study and that are concerning. The first is that papers containing AI-generated content have been published in journals with quality indicators such as Impact Factors and CiteScores. Some of these journals are in the top quartile of publications and are published by well-known publishers such as Elsevier, Springer and Wiley. This situation can potentially compromise trust in both these journals and their publishers. The second concerning element is that some of these papers have already been cited by other research works. Thus, other authors are citing and referring to results presented in papers that may contain obvious errors. If parts of the text within these articles are written by an LLM, this undermines trust in the results presented and casts doubt on whether the rest of the content has been checked and verified. The third element worth noting is that the journals and papers in the discussed sample fall predominantly within the main disciplines of medicine, computer science and engineering. This may be because researchers in these fields are closer to new solutions such as AI, language models, and machine learning, and may find it easier to apply these technologies. However, caution should also be exercised with papers that contain content produced by an AI model and belong to fields such as medicine or environmental science. In such disciplines, experiments are usually conducted and their results reported. If the content of a paper already includes obvious text generated by an AI model, it may be harder to trust the results presented, especially in fields that affect human lives.

Throughout the entire publication process, attention should be paid to the points at which text prepared using AI tools may be encountered and identified, from preparation and submission to a journal through review. The pathway should start with the article's authors carefully reading the text. If the authors have used ChatGPT to improve the text, they should declare this in accordance with the current publisher policies on using LLMs. The general policy among publishers states that AI tools must not be used to create, alter or manipulate original research data and results (Elsevier, 2023; Roche, 2024). Some publishers, like Emerald, expect the corresponding author to declare during the manuscript submission process that no part of the text, results, tables or figures has been generated by AI. Other publishers have not yet implemented such declarations or are currently modifying their submission procedures. If the authors adhere to these guidelines, such a situation should not arise at this stage.

If this is not the case, the next step is the submission of the paper to the publisher and the journal, where the text undergoes evaluation that is partly automated, using tools that collaborate with large publishers, for example, to check text similarity. This automatic evaluation is also limited, because journal editors should not enter the text, or any part of it, into an external tool outside the publisher's system; at least, such is the policy at Elsevier, whose internal tools check the text's correctness. Initially, one or several editors review the text and decide whether it is good enough to send for external peer review, which is the next possible stage at which AI-generated text can be identified.

The text is then reviewed by reviewers who, at the inviting editor's request, undertake to review the text. As a result of their review, the reviewer expresses an opinion on how to improve the article so that it can be published or points out obvious flaws that lead to a recommendation of rejection. At this stage, the reviewer has the opportunity to notice that parts of the text bear signs of being generated by AI. The review process can consist of two or more rounds and involve two or more reviewers; much depends here on the journal editor's arrangements and the policy on the number of review rounds. After external review has taken place, the text moves to the decision phase, and the editor making the acceptance decision again has the opportunity to examine it, where AI-generated content may again go unnoticed. After acceptance, the text goes to the production team responsible for the layout and formatting of the text. The text may also undergo language correction at this stage, performed by a language editor from the publisher. If no one notices by this point, the final version of the article published in the journal will contain AI-generated content. It is also possible that AI-generated content that was not originally in the manuscript sent to the journal is added during the revision and correction stage. Unfortunately, the examples indicated in Table 2 passed fully through the publishing process and were ultimately published in their final versions containing AI-generated content.

A separate stage in the publication process involves the use of preprints. It sometimes happens that a copy of the text submitted to a journal is also made available as a preprint. If the preprint contains AI-generated content, its final version, if accepted, may no longer contain this content. On the other hand, there have been preprints, for example on researchsquare.com, that contained such content but were later withdrawn by the authors (Cabanac, 2023b). Beyond ChatGPT, it should also be noted that other language models, such as Google Gemini, Meta Llama, Mixtral and OpenChat, may not leave such traces in generated text and could also be used in the publishing of scientific works.

It is relevant here to address the question posed in the opinion paper by Dwivedi et al. (2023), ‘So what if ChatGPT wrote it?’. The exact strings identifying the use of ChatGPT in the referenced papers are usually found alongside significant themes in the article. ChatGPT does not respond to trivial questions but instead attempts to provide answers to the research questions posed by the authors. However, how can one know whether the content provided by the AI model is true and validated? Only the authors bear responsibility for what has been published, yet instances of hallucination and the provision of false information by ChatGPT are well known. If false information is published and others rely on it, because the papers in the journal are peer-reviewed, the journal ranks in the top quartile with a high Impact Factor and CiteScore, the article is already cited, and the publisher is reputable, then potentially false information may continue to spread. Therefore, in response to the question ‘So what if ChatGPT wrote it?’, one must check what was written and ensure that it is true.

Access to tools like ChatGPT and other supportive technologies is becoming increasingly widespread. From this revolution, there is no turning back. Therefore, it is crucial to consider how to use these tools so that scholarly work remains impeccable. First, these tools should be used with caution, bearing in mind what the creators themselves communicate. These are tools that are subject to active research and development and are known to have issues such as the generation of bias and misinformation. These tools should not be used for high-stakes decisions. Second, the text should be carefully read by authors and by editors and reviewers to ensure accuracy and integrity. Third, the use of AI should be disclosed in the same manner that other declarations are inserted into manuscripts on how the study was conducted, about competitive interests, and so on. This transparency will help maintain the trustworthiness and ethical standards of scientific publications.

It may now be time to undertake a thorough reassessment and restructuring of existing publication frameworks, given characteristic flaws such as the promotion of erroneous content. As LLMs continue to advance, identifying such misapplications of AI tools may become increasingly challenging. Perhaps some publishers will contact the authors and establish further communication on how to correct the identified parts of a paper. Currently, publishers' actions vary widely, from doing nothing, through silently changing the paper (Cabanac, 2023a) or correcting it with corrigenda, to retracting or removing the manuscript entirely. The presence of these papers underscores a fundamental systemic concern related to the publish-or-perish pressure experienced by academics, frequently compounded by time constraints and resource limitations. This pressure may drive some to resort to questionable utilization of AI models. Consequently, the prevalence of research papers featuring unmistakably AI-generated content may necessitate ongoing monitoring, prompting institutions, funding bodies, peer reviewers and researchers to engage in critical self-reflection (Kshetri, 2024). It calls for the exploration of strategies to cultivate a culture of integrity in scientific publishing and the implementation of repercussions for breaches of such integrity (Gustilo et al., 2024).

Limitations

This study is not without limitations. First, some papers containing AI-generated text are not yet indexed in scientific databases, because Google Scholar indexes content faster and papers were primarily identified there. Second, the identification of research papers featuring evident AI-generated text relies solely on exact string matching, which has its limitations. More advanced methodologies for detecting text likely to be significantly influenced or generated by AI models are discussed by Liang et al. (2024) and Yeadon et al. (2024). Third, there may be many cases in which AI-generated content was used but the authors edited it lightly, removing both the errors and the typical phrases. Fourth, during the research process some journals were discontinued from Scopus and some papers were retracted from journals; the presented state of the papers using phrases from ChatGPT will inevitably change in the future.
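As a pointer for future work on the second limitation, exact string matching could be loosened slightly without resorting to full AI-text detectors. A small illustrative sketch (an assumption about one useful relaxation, not a method used in this study) that tolerates case differences and straight, curly, or missing apostrophes, which full-text extraction often mangles:

import re

def phrase_pattern(phrase: str) -> re.Pattern:
    """Compile a case-insensitive pattern tolerating apostrophe variants."""
    parts = [
        "['’]?" if ch == "'" else re.escape(ch)  # apostrophe: straight, curly, or absent
        for ch in phrase
    ]
    return re.compile("".join(parts), re.IGNORECASE)

pattern = phrase_pattern("I don't have access to")
print(bool(pattern.search("As an AI, I don’t have access to real-time data.")))  # True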

CONCLUSIONS

This paper seeks only to bring attention to the observed situation. However, other researchers have already taken direct action against this problem. Glynn (2024) pointed to a spotted fragment of a paper, ‘as of my last update in 2021’, with simple questions about why the authors did not update their research since 2021 and why multiple authors chose to use the singular first-person pronoun. Cabanac (2023a, 2023b) marks such papers with a simple question: have the authors forgotten to declare the use of an LLM in preparing the work? This paper tries to show that even the best journals with high rankings can fail to review papers at the level required to prevent such output.

AI will inevitably have an impact on scientific research in the future, from copyediting to writing literature reviews. It is important to be open and transparent about the extent to which AI has been and will continue to be used, particularly in publications and scientific research.

CONFLICT OF INTEREST STATEMENT

The author declares that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

AUTHOR CONTRIBUTIONS

A.S. wrote and revised the manuscript entirely.

ACKNOWLEDGEMENTS

During the preparation of this work, the author used ChatGPT in order to refine language. After using this tool/service, the author reviewed and edited the content as needed and takes full responsibility for the content of the publication.
