Abstract
The goal is to carry out a text mining analysis of scientific publications to find out which journals publish on the conspiracy theories surrounding Covid-19 and which aspects of them are being studied. For this purpose, all publications available in the National Center for Biotechnology Information (NCBI) database were consulted, as they are peer-reviewed papers. Only the abstracts of these papers were studied, using artificial intelligence techniques to determine, for example, whether the subject is of importance depending on the journals where it has been published and, above all, what possible relationships could be extracted from the information published in them. In addition, the "Net Prevalence per Covid19" index was defined; in those countries with a high value, greater campaigns should be sponsored to counter the misinformation generated around Covid-19, although this suggestion should be verified in future publications. The main challenge was to unify the abstracts, and for this purpose a text summarizer based on artificial intelligence was used. The word frequencies obtained indicate a tendency toward certain topics, with the Covid-19 vaccines as the focus of the conspiracy theories, but further work on this methodology is still needed to unify the results.
Copyright© 2023
Isea Raúl, et al.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Advances in information technologies allow information to be distributed anywhere in the world through a wide range of database repositories focused on the management of publications and, in addition, through social networks. The confinement due to the Covid-19 pandemic generated a large amount of information about what was being done, what preventive measures should be taken, and what treatments and measures were carried out to contain the epidemic. In fact, a range of conspiracy theories has been propagated claiming, for example, that vaccines are an attempt to commercialize health or that Covid-19 did not exist. Other factors could be the failure to adopt measures to contain the spread of the disease, such as the use of face masks, as well as the belittling of vaccination campaigns, motivated solely by conspiracy theories. In fact, a study published in Sweden reported that, in 2021, one in three people thought that Covid-19 was a product of pharmaceutical campaigns. In view of this, the present work conducts a search of peer-validated scientific literature in order to elucidate the extent of conspiracy theories concerning Covid-19 in the scientific community, i.e., to determine analytically whether it is possible to speak of conspiracy from the realm of science. Such knowledge can help to contain disinformation in future pandemics that may strike humanity, or in campaigns orchestrated to confuse society. From the first incident in December 2019 until April 2023, the Covid-19 pandemic recorded more than six hundred and ninety million cases worldwide, of which six million eight hundred thousand people died, according to data recorded at Johns Hopkins University (available at coronavirus.jhu.edu).
Since then, quarantine protocols (more or less strict in the different countries) as well as vaccination schedules have been established. In fact, the director of the World Health Organization (WHO) pointed out on February 15, 2020, that there should be another front in combating the epidemic caused by Covid-19, focused on the fight against the infodemic. This was due to the polarization of the debate and the questioning of the health measures taken by the various bodies responsible for counteracting the epidemic. It is well known that conspiracy theories proliferate as a result of disinformation spread mainly through social networks. In fact, a wide range of conspiracy theories arose from the Covid-19 confinement, with many sources claiming, for example, that the coronavirus came from a biological laboratory in China, or that Bill Gates wanted to sponsor a vaccine for humanity, not to mention the false claim that it was a product of telecommunications resulting from 5G radiation. These conspiracy theories are usually the product of factors such as helplessness
Results
The first step was to find out how many scientific publications are available in the NCBI whose title contains the words "Covid-19" and "Conspiracy". That search yielded 501 publications as of April 1, 2023, distributed as follows: 104 published in 2020, 189 in 2021, 216 in 2022, and 69 in 2023. From all this, it can be seen that there was an increase in the number of publications in 2021 of just over 55%, while in 2022 the increase was 87.5% with respect to the previous year. An examination of the types of publications shows that 470 are Scientific Articles, 20 are Reviews, 13 are Editorials, 10 are Comments, and 8 are Letters. In other words, the scientific community has made a significant effort to explain the role of conspiracy theories in Covid-19.
In the list of journals, only some have an h5 index in Google Scholar, which indicates that this topic is not usually published in high-impact journals (NOTE: a hyphen means that the journal has no h5 value from Google Scholar).
The next step was to determine the "Net Prevalence per Covid19" parameter (defined in this manuscript) from the data obtained from Johns Hopkins University (NOTE: "-" means data not indicated).
The next step was to determine the number of words in each abstract. The longest abstract had 765 words; to find out whether this is a trend across all the papers, we determined the average number of words per article, which was 196 (exactly 195.87). The distribution of words per article is not uniform, so we proceeded to calculate the number of unique words in each abstract.
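The word counts described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the tokenization rule, the function names, and the two sample texts are assumptions introduced for the example and are not taken from the corpus.

```python
import re
from statistics import mean


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is a run of letters or digits, optionally joined
    by hyphens or apostrophes (so "Covid-19" stays a single token).
    """
    return re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())


def word_count_stats(abstracts):
    """Return the longest, mean total, and mean unique word counts."""
    totals = [len(tokenize(a)) for a in abstracts]
    uniques = [len(set(tokenize(a))) for a in abstracts]
    return {
        "longest": max(totals),
        "mean_total": mean(totals),
        "mean_unique": mean(uniques),
    }


# Hypothetical sample texts, used only to exercise the functions.
sample = [
    "Covid-19 conspiracy beliefs and vaccine conspiracy beliefs",
    "Trust in public health",
]
stats = word_count_stats(sample)
print(stats)
```

Counting unique words with a set removes repetitions within one abstract, which is why the unique count can never exceed the total count for the same text.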
The average unique-word count was 117 words per article (117.29). An important point is that some words are frequent in the language and contribute nothing to the analysis being carried out in this work. These are known as stopwords, such as "is", "that", "by", "so much", "then", among many others, and they have been eliminated. Among the 30 most repeated words in the abstracts are the following, with the number of occurrences across all papers in parentheses (i.e., 'covid19' appears 6106 times): 'covid19' (6106), 'sarscov2' (2888), 'pandemic' (2026), 'health' (1671), 'origin' (1623), 'study' (1535), 'vaccine' (1513), 'patients' (1456), 'coronavirus' (1454), 'disease' (1366), 'virus' (1093), 'results' (1061), 'conspiracy' (1013), 'infection' (965), 'data' (938), 'public' (888), 'vaccination' (840), 'severe' (815), 'information' (809), 'respiratory' (806), 'analysis' (791), 'may' (759), 'cases' (728), 'social' (714), 'associated' (698), 'among' (684), and 'risk' (657).
The next step is to corroborate that all the abstracts deal with conspiracy theories and Covid-19. For this purpose, we extracted the keywords that indicate the general topics covered in the papers, with the help of the summary that the summarizer produced from each abstract. The most frequent keywords obtained after applying the summarizer are: Conspiracy (304 times), Covid (192), Belief and Beliefs (192), Theory(ies) (128), Study (110), Vaccine (93), Health (89), Pandemic (86), Vaccination(s) (92), Social (60), Public (45), Trust (38), and Perceived (34). According to this list, most of the papers deal with truth in conspiracy theories, where Vaccine and Vaccination(s) have a high occurrence.
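A minimal sketch of stopword removal followed by word-frequency counting is given below. It is not the authors' pipeline: the short stopword list, the normalization rule (stripping punctuation so that "Covid-19" becomes "covid19", which mirrors the token forms reported above), and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption); a real analysis would use a
# much fuller list for the language of the corpus.
STOPWORDS = {"is", "that", "by", "then", "the", "of", "and", "in", "a", "to", "are"}


def clean_tokens(text):
    """Lowercase, strip punctuation, split on whitespace, drop stopwords."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]


def top_words(abstracts, n=30):
    """Return the n most frequent non-stopword tokens with their counts."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(clean_tokens(abstract))
    return counts.most_common(n)


# Hypothetical sample texts, not taken from the corpus.
sample = [
    "Covid-19 is a pandemic.",
    "The Covid-19 vaccine is safe, and the pandemic is real.",
]
print(top_words(sample, n=2))
```

Removing stopwords before counting keeps function words such as "is" or "the" from dominating the ranking, so the top of the list reflects the subject matter of the abstracts.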
Finally, in addition to the words, we identified the phrases that occur most frequently in the texts and that can serve as an orientation of the topics being studied in the works (the frequency of each phrase is indicated in parentheses):
“this study aimed to evaluate covid 19 vaccine acceptance among” (0.20)
“this study explored the associations between covid 19 conspiracy beliefs” (0.20)
“conspiracy theories during the covid 19 pandemic” (0.29)
“feelings of anxiety and lack of control” (0.29)
“belief in conspiracy theories and the” (0.33)
“belief in covid 19 conspiracy theories” (0.33)
“belief in covid related conspiracy theories” (0.33)
“beliefs conspiracy theories about covid 19” (0.33)
“conspiracy beliefs about covid 19 and” (0.33)
“conspiracy theories related to covid 19” (0.33)
“and covid 19 conspiracy beliefs” (0.40)
“as well as conspiracy mentality” (0.40)
“association between conspiracy mentality and” (0.40)
“belief in the conspiracy theory” (0.40)
“beliefs in conspiracy theories and” (0.40)
“conspiracy beliefs are associated with” (0.40)
“vaccine acceptance results showed that” (0.40)
“and vaccine conspiracy beliefs” (0.50)
“believing in conspiracy theories” (0.50)
“compliance with public health” (0.67)
“conspiracy theories was positively” (0.50)
“conspiratorial thinking and the” (0.50)
“covid related conspiracy beliefs” (0.50)
“study examined the association between conspiracy beliefs” (0.29)
“infodemics conspiracy beliefs and religious fatalism” (0.33)
“less support for public health policies” (0.40)
“web interest in conspiracy hypotheses and” (0.40)
“pollution and climate change” (0.50)
“an infodemic of” (0.67)
“anxiety and depression” (0.90)
“and conspiracy mentality” (0.67)
From all the above phrases, it is easy to observe that the studies focus on the association between the occurrence of Covid-19 cases and a belief in a conspiracy associated with vaccines. A relationship between conspiracy and religion could also be inferred, but it is a very specific aspect. The last step (left as future work) is to calculate the similarity between texts using cosine similarity, which is based on the cosine function and measures how similar texts are once they are reduced to vectors.
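To illustrate the cosine measure mentioned above, the sketch below reduces each text to a vector of raw term counts and computes cos θ = (A · B) / (|A| |B|). The choice of raw term counts as the vector representation is an assumption made for the example; the source does not state which vectorization will be used, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Represent a text as a sparse vector of term counts (an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    vec_a, vec_b = term_vector(text_a), term_vector(text_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical phrases used only to exercise the function.
print(cosine_similarity("conspiracy beliefs", "conspiracy theories"))
print(cosine_similarity("vaccine acceptance", "religious fatalism"))
```

With non-negative counts the value ranges from 0 (no shared terms) to 1 (identical term proportions), so pairs of abstracts could be ranked or grouped by this score.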
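| Country | Number of publications |
|---|---|
| United States | 155 |
| United Kingdom | 76 |
| China | 37 |
| Canada | 29 |
| Italy | 28 |
| Australia | 24 |
| Germany | 23 |
| Korea | 17 |
| Poland | 16 |
| Pakistan | 15 |
| Sweden | 15 |
| Switzerland | 15 |
| Netherlands | 14 |
| Austria | 11 |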
| Journal | Number of publications | Google Scholar h5 | Journal nationality |
|---|---|---|---|
| Int J Environ Res Public Health | 32 | 152 | Nigeria |
| J Med Internet Res | 22 | - | Germany |
| Front Psychol | 21 | - | Switzerland |
| PLoS One | 16 | 198 | USA |
| Pers Individ Dif | 16 | - | England |
| Vaccines | 15 | - | Netherlands |
| Health Commun | 14 | - | USA |
| JMIR Infodemiology | 10 | - | Canada |
| BMJ | 9 | 190 | UK |
| Country | Confirmed cases | Population | Net Prevalence | Number of publications |
|---|---|---|---|---|
| France | 39,850,030 | 65,584,518 | 60.76% | - |
| Greece | 5,965,643 | 10,316,637 | 57.83% | - |
| Switzerland | 4,399,088 | 8,773,637 | 50.14% | 15 |
| Netherlands | 8,610,372 | 17,211,447 | 50.13% | 14 |
| Germany | 38,368,891 | 83,883,596 | 45.74% | 23 |
| Australia | 11,352,930 | 26,068,792 | 43.55% | 24 |
| Italy | 27,715,384 | 60,262,770 | 42.67% | 28 |
| United Kingdom | 24,448,729 | 68,497,907 | 35.69% | 76 |
| United States | 106,363,949 | 334,805,269 | 31.77% | 155 |
| Spain | 13,798,747 | 46,719,142 | 29.54% | - |
| Turkey | 17,232,066 | 85,561,976 | 20.14% | - |
| Poland | 6,504,194 | 37,739,785 | 17.23% | 16 |
| Pakistan | 1,580,153 | | 17 | 15 |
| Canada | 4,634,277 | 38,388,419 | 12.07% | 29 |
| Venezuela | 552,483 | 29,266,991 | 1.89% | |
| China | 503,302 | - | <1% | 37 |
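The percentages tabulated above are consistent with an index computed as confirmed cases divided by population. The sketch below assumes that definition, which is inferred from the tabulated values rather than stated explicitly in this section; the function name is hypothetical, and the three rows are copied from the table.

```python
def net_prevalence(cases, population):
    """Confirmed cases as a percentage of the population.

    Assumption: this ratio is the "Net Prevalence" index; the definition is
    inferred from the tabulated values, not quoted from the source.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * cases / population


# (confirmed cases, population) pairs taken from the table above.
rows = {
    "France": (39_850_030, 65_584_518),
    "Germany": (38_368_891, 83_883_596),
    "Canada": (4_634_277, 38_388_419),
}

for country, (cases, population) in rows.items():
    print(f"{country}: {net_prevalence(cases, population):.2f}%")
```

Normalizing by population makes countries of very different sizes comparable, which raw case counts alone would not allow.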
Conclusion
The paper performs a search of all publications available in the NCBI database that have focused on conspiracy theories centered on Covid-19. There is a significant correlation between the Covid-19 Net Prevalence rate and the number of cases and publications in some countries, so it should be analyzed country by country to determine the reasons for this. In fact, it was surprising that China and Russia do not publish about conspiracy theories, while the United States and the United Kingdom are the most prominent in this regard. On the other hand, the keywords as well as the phrases indicate that conspiracy theories are focused on vaccines. Also, an examination of the kinds of articles written on the subject shows that the vast majority are scientific articles, not reviews or letters, so the subject cannot be taken lightly and should serve as a warning for facing future pandemics that may strike mankind. Finally, we are beginning to work on a cluster concept that groups texts according to their similarity. However, this is not a simple calculation because of the number of variables involved, and we are still developing a new cluster concept that will take into account the degree of similarity between papers.