Abstract
To contribute to the effort against the COVID-19 pandemic and to improve knowledge of the driving forces shaping the evolution of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) isolated from Tunisian patients, a comparison was conducted with other coronaviruses infecting humans (SARS-CoV-1, MERS-CoV, HCoV/229E, HCoV/NL63, HCoV/OC43, and HCoV/HKU1) as well as animals (SARS-CoVs in tiger, bats, civet, pangolin, and bovine, and MERS-CoV in dromedary/camel). An in-depth analysis was carried out on 115 sequences of the spike glycoprotein-coding gene retrieved from international databases. Phylogenetic inference produced a bifurcating tree in which four distinct groups were delineated; at the same time, three animal accessions (SARS-CoV-2/tiger, MERS-CoV/camel, and SARS-CoV/bovine) shifted from the animal group and joined the human coronavirus clades. However, in the presence of reticulate events such as recombination, networks describe phylogenetic relationships better than a classic dendrogram. Networks were therefore produced and identified four clusters containing sharply demarcated subgroups (eight subdivisions in total). Except for the networked phylogenies of SARS-CoV-1, SARS-CoV-2, and HCoV/HKU1, all the others showed edges and boxes illustrating incompatibilities among the sequences of the spike glycoprotein-coding gene. To consolidate this result, three methods (RDP package, GARD, and RECCO) were used to detect breakpoints in the aligned sequences. Except for the SARS-CoV-1 and SARS-CoV-2 clades, all the remaining phylogenetic subdivisions were subject to recombination. Furthermore, screening for selection pressure in all studied sequences with various statistical models of the HyPhy package showed that, similarly, the lineages belonging to the SARS-CoV-1 and SARS-CoV-2 clades were not under selection. In contrast, all members of the remaining clades underwent, to different extents, adaptive as well as purifying selection.
Author Contributions
Copyright © 2021 Boulila Moncef.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Seven distinct zoonotic human coronaviruses are currently known; they are described as spillovers because they crossed species boundaries and jumped from animals to humans. Four of them (HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV/HKU1) are responsible for the common cold, whereas the remaining three, SARS-CoV-1 (2003), MERS-CoV (2012), and most recently SARS-CoV-2 (2019), can cause mild to severe respiratory disease in humans. Taxonomically, HCoV-229E and HCoV-NL63 are alpha-coronaviruses, whereas HCoV-OC43, HCoV/HKU1, SARS-CoV-1, MERS-CoV, and SARS-CoV-2 are beta-coronaviruses (subfamily …).
Humans now face a real threat from viruses whose rapid evolution enables them to escape natural immunity and currently available medicines, which can lead to many deaths. To evolve, viruses take advantage of compact genomes, huge population sizes, and short generation times. High mutation rates, antagonistic epistasis, and large selection coefficients are additional assets contributing to their adaptation and survival. The evolution of viruses can be defined as the change over time in the frequency distribution of genetic variants or mutants in a population. Studying evolution aims at exploring the various driving forces that shape the genetic structure of virus populations.
Viruses can evolve under the influence of various mechanisms, among which: (i) mutation (a change in DNA or RNA sequence that creates a new allele: point mutation, insertion/deletion, reversion); (ii) reassortment (exchange of genetic material in segmented virus genomes following co-infection of a host by two or more viruses, which shuffles gene portions and generates progeny viruses with novel genome combinations); (iii) gene duplication, a genetic process responsible for the genesis of novelty and redundancy, which is considered a major mechanism in the evolutionary history of different eukaryotes such as plants …
To further our understanding of the molecular evolution of the 2019 novel coronavirus (SARS-CoV-2), an in-depth investigation of the main evolutionary forces that shape its genetic diversity was conducted across the protein S-coding gene sequences. In addition, a comparison was made with other coronaviruses infecting humans as well as animals in order to explore possible relationships among all analyzed sequences.
Materials And Methods
A dataset of 115 spike glycoprotein-coding gene sequences was used in this study. It comprised 15 sequences from each of the seven coronavirus species infecting humans (SARS-CoV-1, SARS-CoV-2, MERS-CoV, HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1) and 10 accessions of other coronaviruses infecting animals (bats, civet, pangolin, tiger, bovine, and dromedary/camel). CLUSTAL X 2.1 …
To detect possible recombination signatures in all 115 accessions, three different methods were used: the RDP4.97 package, RECCO, and GARD. RDP4.97 package …
To search for selection signatures in the 115 sequences of the spike glycoprotein-coding gene, partitioned into clusters and subclusters, three approaches were explored: site, branch, and gene. The first approach consisted of finding individual sites under positive or negative selection, using four different methods, i.e., Single-Likelihood Ancestor Counting (SLAC), Fixed Effects Likelihood (FEL), Random Effects Likelihood (REL) …
| Host | Coronavirus species | Cluster | Subgroup | Country | Accession(s) |
|---|---|---|---|---|---|
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup I | Canada | NC_004718 |
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup I | Germany | AY310120, AY291315 |
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup I | China | AP006561 |
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup I | Taiwan | AY291451 |
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup I | USA | MK062184, MK062183, MK062182, MK062181, MK062180, MK062179, JX163928, JX163924, KF514412 |
| Homo sapiens | SARS-CoV-2 | Cluster I | Subgroup II | Tunisia | MT955168, MT955169, MT955171, MT955172, MT955173, MT955174, MT559037, MT499215, MT499216, MT499218, MT499219, MT499220 |
| Tiger | SARS-CoV-2 | Cluster I | Subgroup II | USA | MT365033 |
| Bat | SARS-CoV/Bat RaTG13 | Cluster I | Subgroup III | China | MN996532 |
| Bat | SARS-CoV-HKU3-1 | Cluster I | Subgroup III | China | DQ022305 |
| Bat | SARS-CoV/SL-CoV-ZC45 | Cluster I | Subgroup III | China | MG772933 |
| Bat | SARS-CoV/SL-CoV-ZXC21 | Cluster I | Subgroup III | China | MG772934 |
| Bat | SARS-CoV/WIV1 | Cluster I | Subgroup III | China | KF367457 |
| Civet | SARS-CoV/A022 | Cluster I | Subgroup III | China | AY686863 |
| Pangolin | SARS-CoV/PCOV_GX-P4L | Cluster I | Subgroup III | China | MT040333 |
| Homo sapiens | SARS-CoV-1 | Cluster I | Subgroup III | China | AY508724 |
| Homo sapiens | SARS-CoV-2 | Cluster I | Subgroup III | Tunisia | MT499217, MT955170, MT559038 |
| Homo sapiens | MERS-CoV | Cluster II | None | Saudi Arabia | MG757605, MH013216, MG011362, MG011361, MG011360, KU710264, KU851859, KT806053, KT806048, MG011344, MG366483, MG011341, KX154684, KT806047 |
| Homo sapiens | MERS-CoV | Cluster II | None | Qatar | MN507638 |
| Camel | MERS-CoV-Camel1-1 | Cluster II | None | Saudi Arabia | KF917527 |
| Homo sapiens | HCoV-229E | Cluster III | Subgroup I | USA | KF514433, KF514432, MN369046, MN306046, KY967357, KY621348, KY996417, MT438700, KY674919, KF514431, KY674914, KF514429 |
| Homo sapiens | HCoV-229E | Cluster III | Subgroup I | Germany | NC_002645, AF304460 |
| Homo sapiens | HCoV-229E | Cluster III | Subgroup I | Italy | JX503061 |
| Homo sapiens | HCoV-NL63 | Cluster III | Subgroup II | USA | KF530114, KF530113, KF530112, KF530111, KF530110, KF530109, KF530107, KF530106, KF530105 |
| Homo sapiens | HCoV-NL63 | Cluster III | Subgroup II | China | MK334047, MK334046, MK334044, MK334043, JX104161 |
| Homo sapiens | HCoV-NL63 | Cluster III | Subgroup II | South Korea | MG772808 |
| Homo sapiens | HCoV-OC43 | Cluster IV | Subgroup I | USA | KF530099, KF530098, KF530097, KF530096, KF530095, KF530094, KF530092, KF530091, KF530090, KF530089, KF530088, KF530087, KF530086, KF530085 |
| Homo sapiens | HCoV-OC43 | Cluster IV | Subgroup I | Mexico | KX344031 |
| Bovine | SARS-CoV/B-ENT | Cluster IV | Subgroup I | USA | NC_003045 |
| Homo sapiens | HCoV-HKU1 | Cluster IV | Subgroup II | USA | KF430201, KF686346, KF686344, KF686343, KF686342, KF686341, KF686340, KY674943, KY674942, KY674941, KY674921, MK167038 |
| Homo sapiens | HCoV-HKU1 | Cluster IV | Subgroup II | China | NC_006577, AY597011 |
| Homo sapiens | HCoV-HKU1 | Cluster IV | Subgroup II | France | HM034837 |
Results
The Maximum Likelihood (ML) algorithm implemented in MEGA-X allowed the reconstruction of a phylogenetic tree in which the 115 spike glycoprotein-coding gene sequences were split into four distinct groups (…).
Pairwise nucleotide identity was compared among the 115 sequences of the spike glycoprotein-coding gene, subdivided into clusters and subclusters (eight subdivisions, according to the NeighborNet phylogeny inferred in SplitsTree4). The sequences came from coronavirus species infecting humans (SARS-CoV-1, SARS-CoV-2, MERS-CoV, HCoV/229E, HCoV/NL63, HCoV/OC43, and HCoV/HKU1) and from a mixed group containing various animal species, particularly bat, civet, camel, pangolin, tiger, and bovine. The analyses resulted in the following percentage intervals: 89.99-100.00, 99.86-100.00, 26.76-99.92, 99.40-99.97, 40.35-100.00, 35.22-99.97, 31.12-100.00, and 54.47-100.00; the numbers of the most distant and closest pairs of accessions were as follows: 9-16, 1-15, 1-1, 2-3, 4-2, 1-3, 4-2, and 1-4, for members of subgroups I, II, and III (cluster I), cluster II, subgroups I and II (cluster III), and subgroups I and II (cluster IV), respectively (…).
Whereas mutations are changes in nucleotide sequences due to errors in replication or repair, substitutions are mutations that have passed through the filter of selection. Substitutions often involve amino acids with similar chemical characteristics, supporting two evolutionary principles: (i) mutations are rare events; (ii) the most dramatic changes are removed by natural selection. Analysis of both the number and the type of substitutions that have occurred during evolution is of central importance for the study of molecular evolution. To determine the substitution pattern and rates in the spike glycoprotein-coding gene of the 115 lineages studied here, a substitution matrix was estimated for each subdivision as described earlier.
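The two quantities discussed above, pairwise nucleotide identity and the split of substitutions into transitions and transversions, can be illustrated with a minimal Python sketch. It is not the procedure used in the study (the rates reported here were estimated with dedicated software); the function names, the toy sequences, and the choice to skip alignment columns containing gaps or ambiguous bases are all assumptions made for illustration.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def compare_aligned(seq_a, seq_b):
    """Return (percent identity, transitions, transversions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # assumption: columns with gaps or ambiguous bases are skipped
        compared += 1
        if a == b:
            matches += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    identity = 100.0 * matches / compared if compared else 0.0
    return identity, transitions, transversions


def identity_range(alignment):
    """Return (lowest, highest) pairwise percent identity within one subdivision."""
    values = [
        compare_aligned(alignment[x], alignment[y])[0]
        for x, y in combinations(alignment, 2)
    ]
    return min(values), max(values)


# Toy alignment with made-up sequences (not data from the study).
toy = {"seq1": "ACGTACGT", "seq2": "ACGTGCTT", "seq3": "ACGTAC-T"}
print(compare_aligned(toy["seq1"], toy["seq2"]))
print(identity_range(toy))
```

Applied to the aligned sequences of one subdivision, `identity_range` yields an interval of the same kind as those listed above, although the exact figures depend on how gaps are treated.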
Accordingly, in subgroup I (cluster I), rates of the different transitional substitutions ranged from 11.87 to 18.75, while rates of transversional substitutions ranged from 3.76 to 5.93. In subgroup II (cluster I), transitional rates ranged from 10.96 to 18.42 and transversional rates from 3.85 to 6.46. In subgroup III (cluster I), transitional rates ranged from 9.51 to 26.03 and transversional rates from 2.77 to 8.10. In cluster II, transitional rates ranged from 4.92 to 43.28 and transversional rates from 1.78 to 3.14. In subgroup I (cluster III), transitional rates ranged from 7.12 to 36.08 and transversional rates from 2.65 to 5.13. In subgroup II (cluster III), transitional rates ranged from 7.51 to 45.11 and transversional rates from 1.46 to 3.31. In subgroup I (cluster IV), transitional rates ranged from 6.49 to 39.25 and transversional rates from 2.19 to 4.86. In subgroup II (cluster IV), transitional rates ranged from 9.77 to 28.64 and transversional rates from 2.37 to 6.94 (…).
Generally, recombination consists of an exchange of genetic material in segmented as well as unsegmented virus genomes. It allows the virus to acquire new traits. To obtain reliable and relevant results, three bioinformatics methods were used: the RDP package, RECCO, and GARD. The use of RDP package version 4.97 had a threefold objective: (i) to identify the recombinant coronaviruses; (ii) to determine the locations of the recombination sites; (iii) to recognize the potential parental sequences.
Accordingly, seven lineages belonging to the coronavirus species HCoV/NL63 were recombinants: KF530112 from the USA; MK334043, MK334044, MK334046, MK334047, and JX104161 from China; and MG772808 from South Korea. The numbers of recombination sites were as follows: 5, 5, 3, 2, 3, 4, and 4, respectively. Furthermore, there was no geographical impediment with regard to the donor of genetic material; in other words, both major and minor parents could have any geographical origin (…).
*** significant at …
Gene duplication, an evolutionary mechanism defined as the existence of genetic elements that encode the same function, resulting in sequence redundancy, was searched for in all sequences of the spike glycoprotein-coding gene, considering each subdivision separately, as in the previous analyses. Based on a Newick tree reconstructed with MEGA-X, the search covered all branching points of the topology. Duplication events were sought by determining the position of the root on the branch(es) that generated the minimum number of duplication events, using an unrooted gene tree (Newick) analysis. As a result, although gene duplication events were revealed in all nodes of each tree of seven subdivisions (cluster I, subgroups I and II; cluster II; cluster III, subgroups I and II; and cluster IV, subgroups I and II), in subgroup III of cluster I they were located in only two branching points: the root node and the branching point of two Tunisian SARS-CoV-2 lineages (MT955170 and MT499217) (…).
Natural selection, briefly defined as the unequal survival and reproduction of hereditary material due to environmental forces, resulting in the preservation of favorable adaptations, was screened in the aligned sequences of all eight subdivisions. To find selection signatures across sequences, site-specific models (SLAC, REL, FEL, IFEL, FUBAR, and MEME) and branch-specific models (aBSREL and GA-branch) were used.
Legend: N/A: not applicable. *SLAC, IFEL, FEL, and MEME models were used at a significance level of 0.1; *the REL model was used at a significance level of 0.02; *FUBAR was used at a posterior probability ≥ 0.9.
Nucleotide substitution matrix (rows and columns: A, T, C, G), given per phylogroup and subgroup: Cluster I (Subgroups I, II, and III), Cluster II (no subgroup), Cluster III (Subgroups I and II), and Cluster IV (Subgroups I and II). Only the diagonal entries (-) of each matrix are present here; the off-diagonal rate values are missing.
Recombination pattern

| Host | Coronavirus | Recombinant isolate (accession #) | Position in the alignment | Position in the sequence (without gaps) | Putative parentals (Major x Minor) | R | G | B | M | C | S | 3S |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens | HCoV/NL63 | KF530112 | 2456-4262 | 2225-3997 | MK334047 x KF530110 | 3.350E-3 | - | - | 1.893E-3 | 6.725E-3 | 1.106E-2 | - |
| Homo sapiens | HCoV/NL63 | KF530112 | 2488-4144 | 2257-3882 | MK334047 x MK334044 | 2.151E-5 | 4.755E-6 | 1.914E-2 | 1.953E-4 | 1.106E-4 | 4.744E-2 | 3.542E-6 |
| Homo sapiens | HCoV/NL63 | KF530112 | 3821-4262 | 3560-3997 | MK334047 x Unknown (MK334044) | 3.738E-2 | - | 1.914E-2 | 4.459E-4 | - | 5.568E-3 | 3.542E-6 |
| Homo sapiens | HCoV/NL63 | KF530112 | 4145-end | 3883-end | JX104161 x Unknown (KF530105) | 1.028E-3 | 4.571E-3 | - | 5.329E-6 | 2.714E-4 | 3.366E-30 | |
| Homo sapiens | HCoV/NL63 | KF530112 | 4262-end | 3997-end | MK334044 x MK334047 | - | - | - | 1.237E-2 | 7.498E-4 | 4.757E-27 | 1.849E-6 |
| Homo sapiens | HCoV/NL63 | MK334047 | 2456-4262 | 2225-3997 | KF530112 x Unknown (KF530110) | 3.350E-3 | - | - | 4.180E-2 | 3.502E-3 | 1.106E-2 | 3.020E-2 |
| Homo sapiens | HCoV/NL63 | MK334047 | 3603-end | 3345-end | JX104161 x Unknown (KF530105) | 1.028E-3 | 4.671E-3 | - | 5.329E-6 | 2.714E-4 | 3.366E-30 | 3.488E-4 |
| Homo sapiens | HCoV/NL63 | MK334047 | 3911-4206 | 3650-3941 | KF530112 x Unknown (KF530105) | 7.334E-3 | 1.745E-2 | 5.902E-4 | - | - | 1.106E-2 | 1.404E-2 |
| Homo sapiens | HCoV/NL63 | MK334046 | 2457-end | 2226-end | KF530112 x Unknown (KF530110) | 3.350E-3 | - | - | 4.180E-2 | 3.502E-3 | 1.106E-2 | 3.020E-4 |
| Homo sapiens | HCoV/NL63 | MK334046 | 4284-end | 4019-end | JX104161 x Unknown (KF530105) | 1.028E-3 | 4.671E-3 | - | 5.329E-6 | 2.714E-4 | 3.366E-30 | 3.488E-4 |
| Homo sapiens | HCoV/NL63 | MK334044 | 129-1044 | 36-914 | KF530112 x KF530105 | 7.936E-8 | 2.906E-5 | 4.919E-5 | 4.811E-7 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | MK334044 | 1044-end | 914-end | KF530105 x KF530112 | 4.426E-3 | 1.937E-2 | 2.558E-2 | 2.181E-10 | 6.383E-8 | - | - |
| Homo sapiens | HCoV/NL63 | MK334044 | 1044-1323 | 914-1192 | KF530105 x MG772808 | 2.041E-5 | 6.929E-5 | 2.775E-3 | 1.746E-4 | 1.684E-3 | 2.817E-3 | 1.286E-4 |
| Homo sapiens | HCoV/NL63 | MK334043 | 117-1044 | 24-914 | KF530112 x KF530105 | 7.936E-8 | 2.906E-5 | 2.950E-7 | 4.811E-7 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | MK334043 | 129-1044 | 36-914 | KF530112 x KF530105 | 7.936E-8 | 2.906E-5 | 4.919E-5 | 3.502E-8 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | MK334043 | 144-1044 | 51-914 | KF530112 x KF530105 | 7.936E-8 | 2.906E-5 | 4.919E-5 | 4.811E-7 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | MK334043 | 1044-1323 | 914-1192 | KF530105 x MG772808 | 2.041E-5 | 6.929E-5 | 2.775E-3 | 1.746E-4 | 1.687E-3 | 2.817E-3 | 1.286E-4 |
| Homo sapiens | HCoV/NL63 | MK334043 | 1044-end | 914-end | KF530105 x KF530112 | 4.426E-3 | 1.937E-2 | 2.558E-2 | 2.181E-10 | 6.383E-8 | 1.758E-8 | - |
| Homo sapiens | HCoV/NL63 | MG772808 | 1863-4266 | 1668-3961 | MK334047 x KF530110 | 3.350E-3 | - | - | 1.893E-3 | 6.725E-3 | 1.106E-2 | - |
| Homo sapiens | HCoV/NL63 | MG772808 | 2913-4144 | 2661-3882 | MK334047 x MK334044 | 2.151E-5 | 4.755E-6 | 1.914E-2 | 1.953E-4 | 1.106E-[?] | 4.744E-2 | 3.542E-6 |
| Homo sapiens | HCoV/NL63 | MG772808 | 3821-4144 | 3560-3882 | MK334047 x Unknown (MK334044) | 3.738E-2 | - | 1.914E-2 | 4.459E-4 | - | 5.568E-3 | 3.542E-6 |
| Homo sapiens | HCoV/NL63 | MG772808 | 4288-end | 4023-end | MK334044 x MK334047 | - | - | - | 1.237E-2 | 7.498E-4 | 4.757E-4 | 1.849E-6 |
| Homo sapiens | HCoV/NL63 | JX104161 | 59-1044 | 0-917 | KF530112 x KF530105 | 7.936E-8 | 2.906E-5 | 2.950E-7 | 4.811E-7 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | JX104161 | 129-1044 | 36-917 | KF530112 x KF530105 | 7.938E-6 | 2.906E-5 | 4.919E-5 | 4.811E-7 | 1.219E-4 | 5.959E-29 | 1.979E-6 |
| Homo sapiens | HCoV/NL63 | JX104161 | 1044-1344 | 917-1216 | KF530105 x MG772808 | 2.041E-5 | 6.929E-5 | 2.775E-3 | 1.746E-4 | 1.684E-3 | 2.817E-3 | 1.286E-4 |
| Homo sapiens | HCoV/NL63 | JX104161 | 1044-end | 917-end | KF530105 x KF530112 | 4.426E-3 | 1.937E-2 | 2.558E-2 | 2.181E-10 | 6.383E-8 | 1.758E-3 | - |
| Homo sapiens | HCoV-HKU1 | MK167038 | 1344-1615 | 1252-1495 | KY674921 x KF686344 | 4.764E-52 | 2.256E-50 | 1.396E-48 | 3.8E-14 | 3.639E-14 | 1.277E-18 | 3.6E-12 |
| Homo sapiens | HCoV-HKU1 | MK167038 | 1344-1615 | 1252-1495 | KY674921 x NC_006577 | 8.3E-52 | 4.007E-50 | 2.414E-48 | 4.409E-14 | 3.603E-14 | 1.967E-19 | 3.6E-12 |
| Animal (Bat) | SARS-CoV | MG772934 | 1178-2640 | 967-2172 | MN996532 x DQ022305 | 2.808E-2 | - | 2.590E-7 | 1.403E-6 | 1.899E-7 | 2.760E-5 | 4.353E-9 |
| Animal (Bat) | SARS-CoV | MG772934 | 1202-1494 | 991-1267 | | 2.760E-2 | - | - | - | 3.935E-2 | 2.456E-4 | 3.031E-5 |
| Animal (Bat) | SARS-CoV | MG772934 | 1792-2640 | 1438-2172 | MN996532 x DQ022305 | 2.808E-2 | - | 6.570E-4 | 2.010E-4 | 2.283E-5 | 2.760E-5 | 4.353E-9 |
| Animal (Bat) | SARS-CoV | MG772934 | 1850-2632 | 1478-2164 | MN996532 x AY686863 | 1.826E-3 | - | - | 2.009E-3 | 2.373E-5 | 2.786E-2 | - |
| Animal (Bat) | SARS-CoV | MG772933 | 1149-1494 | 989-1270 | | 2.760E-2 | - | - | - | 3.935E-2 | 2.456E-4 | 3.031E-5 |
| Animal (Bat) | SARS-CoV | MG772933 | 1163-2640 | 957-2175 | MN996532 x DQ022305 | 2.808E-2 | - | 6.570E-4 | 2.010E-4 | 2.283E-5 | 2.760E-5 | 4.353E-9 |
| Animal (Bat) | SARS-CoV | MG772933 | 1866-2623 | 1497-2158 | MN996532 x AY686863 | 1.826E-3 | - | - | 2.009E-3 | 2.373E-5 | 2.786E-2 | - |
| Animal (Bat) | SARS-CoV | MG772933 | 2197-2561 | 1782-2096 | MN996532 x DQ022305 | 2.808E-2 | - | 6.570E-4 | 2.010E-4 | 2.283E-5 | 2.760E-5 | 4.353E-9 |
| Animal (Bat) | SARS-CoV | DQ022305 | 1178-1661 | 955-1365 | MN996532 x MG772934 | 2.369E-3 | - | 4.270E-3 | 9.977E-3 | 1.038E-3 | 1.560E-2 | 2.687E-4 |
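| Cluster | Subgroup | Coronavirus species | AICc | ∆AICc | Breakpoint location | LHS | RHS | Significance |
|---|---|---|---|---|---|---|---|---|
| Cluster I | Subgroup III | SARS-CoV-1, SARS-CoV-2 (Homo sapiens), and SARS-CoV (Animals) | 77709.3 | 221.891 | 206 | 0.00120 | 0.00060 | *** |
| | | | | | 795 | 1.00000 | 0.00060 | N.S. |
| | | | | | 2043 | 0.99960 | 0.00060 | N.S. |
| Cluster II | None | MERS-CoV/Homo sapiens and Camel/Dromedary | 12135.5 | 12.2531 | 2382 | 0.23880 | 0.00020 | N.S. |
| Cluster III | Subgroup I | HCoV/229E | 32821.7 | 19.5223 | 680 | 1.00000 | 0.00080 | N.S. |
| | | | | | 1049 | 0.00080 | 0.00080 | *** |
| | | | | | 1223 | 1.00000 | 0.64400 | N.S. |
| | | | | | 3332 | 0.27040 | 0.00080 | N.S. |
| Cluster III | Subgroup II | HCoV/NL63 | 24759.4 | 7.68904 | 325 | 0.00100 | 0.98000 | N.S. |
| | | | | | 907 | 1.00000 | 1.00000 | N.S. |
| | | | | | 1533 | 0.74300 | 0.01100 | N.S. |
| | | | | | 2260 | 0.00100 | 0.00100 | *** |
| | | | | | 3432 | 0.26100 | 0.01400 | N.S. |
| Cluster IV | Subgroup I | HCoV/OC43 | 50702.4 | 9.07933 | 82 | 1.00000 | 1.00000 | N.S. |
| | | | | | 448 | 0.00100 | 0.00100 | *** |
| | | | | | 1510 | 0.01300 | 0.25900 | N.S. |
| | | | | | 1649 | 1.00000 | 0.00400 | N.S. |
| | | | | | 3557 | 1.00000 | 0.15200 | N.S. |
| Cluster IV | Subgroup II | HCoV/HKU1 | 31777.5 | 175.713 | 1251 | 1.00000 | 0.00120 | N.S. |
| | | | | | 1500 | 0.26880 | 1.00000 | N.S. |
| | | | | | 1677 | 0.00120 | 0.00120 | *** |
| | | | | | 2136 | 0.00120 | 0.00120 | *** |
| | | | | | 2254 | 0.07440 | 1.00000 | N.S. |
| | | | | | 3120 | 0.00120 | 0.96960 | N.S. |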
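To make the columns of the breakpoint table easier to read, the following Python sketch shows the standard small-sample corrected Akaike criterion (AICc) and a simple rule that reproduces the Significance column: a breakpoint is flagged only when the p-values on both its left-hand side (LHS) and right-hand side (RHS) are small. The function names and the 0.01 threshold are assumptions chosen for illustration, not values stated in the text.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC: 2k - 2 lnL + 2k(k + 1) / (n - k - 1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("n_obs must be larger than n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def breakpoint_supported(lhs_p, rhs_p, alpha=0.01):
    """True when both sides of a breakpoint are significant.

    alpha = 0.01 is an assumed threshold, used here only for illustration.
    """
    return lhs_p < alpha and rhs_p < alpha


# Rows taken from the table above: (breakpoint location, LHS, RHS).
rows = [(206, 0.00120, 0.00060), (795, 1.00000, 0.00060), (1510, 0.01300, 0.25900)]
for location, lhs, rhs in rows:
    label = "***" if breakpoint_supported(lhs, rhs) else "N.S."
    print(location, label)
```

With this threshold the rule matches every row of the table; a ∆AICc value is then simply the difference between the AICc of two competing models.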
Number of selected sites (positive and negative) per method

| Cluster | Subgroup | Coronavirus species | SLAC* positive | SLAC* negative | IFEL* positive | IFEL* negative | FEL* positive | FEL* negative | MEME* positive | MEME* negative | FUBAR* positive | FUBAR* negative | REL* positive | REL* negative |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster I | Subgroup I | SARS-CoV-1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | N/A | 0 | 0 | 0 | 0 |
| Cluster I | Subgroup II | SARS-CoV-2 (Humans + Tiger) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | N/A | 0 | 0 | 0 | 0 |
| Cluster I | Subgroup III | SARS-CoV-1, SARS-CoV-2, and SARS-CoV (Animals) | 5 | 16 | 27 | 52 | 27 | 65 | 46 | N/A | 0 | 16 | 1 | 34 |
| Cluster II | None | MERS-CoV (Humans + Camel) | 0 | 3 | 1 | 1 | 0 | 8 | 0 | N/A | 1 | 24 | 0 | 35 |
| Cluster III | Subgroup I | HCoV/229E | 0 | 6 | 19 | 54 | 7 | 45 | 43 | N/A | 0 | 12 | 1 | 23 |
| Cluster III | Subgroup II | HCoV/NL63 | 0 | 6 | 10 | 99 | 9 | 137 | 17 | N/A | 0 | 54 | 0 | 195 |
| Cluster IV | Subgroup I | HCoV/OC43 | 0 | 6 | 25 | 50 | 12 | 41 | 34 | N/A | 0 | 14 | 0 | 13 |
| Cluster IV | Subgroup II | HCoV/HKU1 | 0 | 4 | 9 | 59 | 2 | 123 | 34 | N/A | 0 | 116 | 206 | 113 |
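| Cluster | Subgroup | Coronavirus species | ω rate classes | Number of branches | % of branches | % of tree length | Number of branches under selection |
|---|---|---|---|---|---|---|---|
| Cluster I | Subgroup I | SARS-CoV-1 | 1 | 6 | 100 | 100 | 0 |
| Cluster I | Subgroup II | SARS-CoV-2 (Humans + Tiger) | 1 | 10 | 100 | 100 | 0 |
| Cluster I | Subgroup III | SARS-CoV-1 + SARS-CoV-2 + SARS-CoV (Animals) | 1 | 8 | 44 | 0.020 | 1 |
| | | | 2 | 10 | 56 | 100 | 0 |
| Cluster II | None | MERS-CoV (Homo sapiens + Camel/Dromedary) | 1 | 20 | 95 | 8.4 | 0 |
| | | | 2 | 1 | 4.8 | 92 | 1 |
| Cluster III | Subgroup I | HCoV/229E | 1 | 15 | 71 | 0.00077 | 0 |
| | | | 2 | 6 | 29 | 100 | 5 |
| Cluster III | Subgroup II | HCoV/NL63 | 1 | 20 | 87 | 0.66 | 0 |
| | | | 2 | 3 | 13 | 99 | 1 |
| Cluster IV | Subgroup I | HCoV/OC43 | 1 | 15 | 65 | 0.0010 | 0 |
| | | | 2 | 8 | 35 | 100 | 7 |
| Cluster IV | Subgroup II | HCoV/HKU1 | 1 | 10 | 59 | 0.0019 | 0 |
| | | | 2 | 7 | 41 | 100 | 5 |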
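| Cluster | Cluster I | Cluster I | Cluster I | Cluster II | Cluster III | Cluster III | Cluster IV | Cluster IV |
|---|---|---|---|---|---|---|---|---|
| Subgroup | Subgroup I | Subgroup II | Subgroup III | N/A | Subgroup I | Subgroup II | Subgroup I | Subgroup II |
| Coronavirus species | SARS-CoV-1 (humans) | SARS-CoV-2 (humans + tiger) | SARS-CoV (animals + humans) | MERS-CoV (humans + camel) | HCoV/229E (humans) | HCoV/NL63 (humans) | HCoV/OC43 (humans + bovine) | HCoV/HKU1 (humans) |
| # rates | 2 | 2 | 2 | 4 | 4 | 5 | 4 | 5 |
| c-AIC | 10018.3 | 10161.8 | 77653.7 | 11737 | 36279.4 | 24556.9 | 52150.6 | 35343.6 |
| ∆c-AIC | 4.66823 | 6.29354 | 3.5932 | 3.26432 | 1.06512 | 5.41797 | 2.85 | 1.44849 |
| Number of models in 95% confidence set | 275 | 1876 | 7288 | 2831 | 1526 | 1871 | 1395 | 2548 |
| Class I | 1 | 0 | 1.5271 | 1 | 0.0553 | 1.0716 | 0.8282 | 1.0410 |
| # branches | 4 | 3 | 7 | 5 | 6 | 3 | 6 | 7 |
| % | 60 | 29 | 18 | 10 | 0 | 0 | 42 | 65 |
| Class II | 0 | 1 | 0.4878 | 0.6042 | 0.3115 | 0 | 0.1008 | 0.6476 |
| # branches | 3 | 10 | 12 | 6 | 7 | 7 | 5 | 2 |
| % | 40 | 71 | 82 | 36 | 1 | 0 | 0 | 24 |
| Class III | - | - | - | 0.000 | 1.3927 | 0.1893 | 1.7708 | 0.0687 |
| # branches | - | - | - | 13 | 5 | 7 | 7 | 3 |
| % | - | - | - | 37 | 25 | 1 | 58 | 11 |
| Class IV | - | - | - | 0.1050 | 0.9925 | 0.059 | 0.0127 | 0.1798 |
| # branches | - | - | - | 5 | 5 | 6 | 7 | 3 |
| % | - | - | - | 17 | 74 | 0 | 0 | 0 |
| Class V | - | - | - | - | - | 21.9245 | - | 0 |
| # branches | - | - | - | - | - | 4 | - | 4 |
| % | - | - | - | - | - | 99 | - | 0 |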
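The class values in the table above are read here as ω (dN/dS) estimates, which are conventionally interpreted as follows: ω above 1 suggests positive (adaptive) selection, ω close to 1 suggests neutral evolution, and ω below 1 suggests purifying selection. The Python sketch below encodes only this conventional reading; the function name and the tolerance around 1 are assumptions for illustration, and a point estimate alone does not establish selection without the statistical support provided by the models themselves.

```python
def classify_omega(omega, tolerance=0.05):
    """Label an omega (dN/dS) estimate by its conventional interpretation.

    tolerance is an assumed band around 1 treated as neutral; it is not a
    statistical test and is used only to illustrate the reading of the table.
    """
    if omega > 1.0 + tolerance:
        return "positive"
    if omega < 1.0 - tolerance:
        return "purifying"
    return "neutral"


# A few class values taken from the table above.
for value in (21.9245, 1.5271, 1.0, 0.0553):
    print(value, classify_omega(value))
```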
Discussion
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has spread across the globe, and many aspects of its evolution remain unresolved. It is therefore essential to understand whether the transmissibility, virulence, and pathogenicity of SARS-CoV-2 are still in an active phase of adaptation to new individuals and environments. Elucidating the molecular evolutionary mechanisms controlling the COVID-19 pandemic and its devastating effects is crucial for forecasting and for choosing the best way to eradicate it. In this study, the author conducted an in-depth analysis to determine under which evolutionary driving forces SARS-CoV-2 evolves in order to improve its chances of survival by escaping the body's defenses. Since the crown-shaped spikes enable binding to and penetration into host cells, they are constantly challenged and undergo rapid molecular evolution. For this reason, this glycoprotein and the gene encoding it were chosen. Accordingly, 15 Tunisian accessions of the SARS-CoV-2 spike glycoprotein-coding gene were part of a set that also encompassed sequences from other coronaviruses infecting humans and animals. Based on networked phylogenetic relationships, eight subdivisions were delineated after a lineage rearrangement performed with SplitsTree4. A panel of analyses involving various methods and algorithms showed that SARS-CoV-2 experienced neither recombination nor selection. These results were consistent with those of Bai et al. …
In March 2020, Tunisian officials imposed a stringent lockdown in order to curb the number of infected people and reduce the negative impact on the country's economy. At the end of six weeks, about 50 deaths had been recorded. By the end of September 2020, there was an upsurge of SARS-CoV-2 infections due to the relaxation of part of the population. Unfortunately, many citizens trivialized the disease and its incidence. As a result, as of February 25, 2021, 230,443 confirmed cases of COVID-19 had been reported, resulting in 7,869 deaths (worldometers.info/coronavirus/country/tunisia/). To date, there are no specific anti-SARS-CoV-2 drugs; in contrast, there are vaccines that offer hope for a path out of the pandemic, but they are not yet available in the country (according to officials, vaccination should start in March 2021, and medical staff will benefit first).
More importantly, people in Tunisia are now concerned about the recent emergence in the country of the British variant (B.1.1.7) of SARS-CoV-2 (announced on March 2, 2021 by the Tunisian authorities), which has been estimated to be more contagious than the previously circulating form of the virus. In addition, other countries have reported the presence of another variant from South Africa, as well as the variant B.1.1.248, detected in Japan in travelers returning from Brazil. The latter possesses both the N501Y mutation, considered the most infectious, and the E484K mutation, which is present in the "South African" variant. The combination of the two mutations could reduce the effectiveness of vaccines and, at the same time, make the variant very contagious. Similarly, a new variant called B.1.526 was identified in New York. It contains the mutations E484K and S477N, which could reduce the effectiveness of vaccines (to be ascertained). Therefore, in order to prevent the further appearance and spread of these variants in Tunisia, it is compulsory to comply with the health protocol, which requires social distancing, the wearing of masks, and frequent hand washing. Further, the ventilation of occupied areas is of considerable importance.
The current pandemic is the sixth to strike humanity since the Spanish flu of 1918. However, the frequency and severity of these global epidemics may well increase in the years to come, due to our way of life and the remarkable adaptive capacities of viruses. Changes in land use, the expansion and intensification of agriculture, and unsustainable trade, production, and consumption are disrupting nature and increasing contact between wildlife, livestock, pathogens, and humans. This path leads straight to pandemics. We are much more threatened by viruses, whose genetic plasticity is decisive for the jump from one species to another. SARS-CoV-2 undoubtedly remained dormant for years in an animal reservoir; we now know that the virus most likely originates from bats, in which it would have differentiated from other lineages 40 to 70 years ago. In intensive farming, thousands of animals with great genetic homogeneity are crammed together in one place; these are ideal conditions for a virus to develop and acquire the mutations necessary to adapt to humans.
A new threat to humans is the possible future emergence of SADS-CoV (Swine Acute Diarrhea Syndrome Coronavirus), which has been recorded in pigs and is able to infect and replicate in human cells. Similarly, another zoonotic illness, caused by Nipah virus, may constitute an additional threat to humanity. It can cause severe disease and death in people. Nipah virus transmission is thought to result from direct contact with sick pigs as well as from consumption of fruits contaminated by saliva or urine from infected fruit bats. To date, no drug or vaccine targets Nipah virus. In addition, researchers in the United States recently discovered human-to-human transmission of Chapare virus, a rare virus that can cause hemorrhagic fevers. They believe that rats carry this virus and then transmit it to humans. Symptoms of the disease are fever, abdominal pain, vomiting, bleeding gums, rash, and pain behind the eyes. However, its transmissibility appears to be lower than that of respiratory viruses such as influenza or COVID-19.
Finally, it is recommended to monitor landscapes dominated by human activities more closely than wild areas. In addition, protecting natural areas and restoring habitats degraded by humans could benefit both the environment and public health. Moreover, it is necessary to think about global biosecurity, evaluate weaknesses, and strengthen health systems in developing countries.