The Taiwan Centers for Disease Control (CDC) worked with Taiwan AI Labs (AI Labs) to set up an AI transmission tracing system that uses virus sequencing data and unstructured multi-omics data. The system identifies transmission links by combining virus sequence data with contact history. This study also shows how adding sequencing data established transmission paths for clusters and cases of unknown source. One case had no sequencing data, but its connection was established through correlated sequenced cases. The system also helps to trace international transmission paths and to confirm infection clusters. Taiwan identified the transmission path of every virus for which sequencing data were available.
With the rapid development of next-generation sequencing (NGS) technology in recent years, researchers can quickly obtain the whole genome sequence (WGS) of pathogens. Using WGS in epidemic transmission analysis facilitates rapid and accurate identification of the relatedness of samples, which can be used to identify the path of transmission within a population and to provide information on the probable source.
Since the beginning of the SARS-CoV-2 outbreak on January 22, 2020, when the CDC announced the first case of SARS-CoV-2, Taiwan's Executive Yuan (EY) has subsidized the cost of virus sequencing for confirmed cases. The CDC works with hospitals and research institutes to accelerate data accumulation. These data provide perspectives for epidemic researchers and lay a foundation for subsequent infectious disease research.
To discover the virus transmission path for confirmed cases, Taiwan started using virus sequence data to trace transmission, with a tool branched from NextStrain and developed as an open-source tool (https://covirus.cc/phylogeny/COVID19-latest).
AI Labs designed an AI algorithm that takes features from confirmed case information, including travel history, daily activities, and relationships between cases, together with features from sequencing data. These data are provided by the CDC. The algorithm is applied to establish a case-to-case propagation map. All confirmed cases in Taiwan are displayed in an association graph built from these features, including geographic whereabouts and phylogenetic relationships.
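The idea of a case-to-case association graph can be illustrated with a minimal sketch. It is not the AI Labs algorithm, whose details are not given here; the `Case` fields, the scoring weights, and the threshold are all hypothetical choices made only to show how epidemiological and phylogenetic features might be combined into graph edges.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Case:
    """Hypothetical record for one confirmed case."""
    case_id: str
    travel: frozenset = frozenset()    # countries visited before onset
    contacts: frozenset = frozenset()  # case_ids of known contacts
    branch: Optional[str] = None       # phylogenetic branch label; None if not sequenced


def link_score(a: Case, b: Case) -> float:
    """Score the association between two cases (weights are illustrative assumptions)."""
    score = 0.0
    if b.case_id in a.contacts or a.case_id in b.contacts:
        score += 1.0   # documented contact between the two cases
    if a.travel & b.travel:
        score += 0.5   # overlapping travel history
    if a.branch is not None and a.branch == b.branch:
        score += 1.0   # both sequenced and placed on the same branch
    return score


def build_case_graph(cases, threshold: float = 1.0) -> dict:
    """Return {(id_a, id_b): score} for every pair scoring at or above the threshold."""
    edges = {}
    for a, b in combinations(cases, 2):
        score = link_score(a, b)
        if score >= threshold:
            edges[(a.case_id, b.case_id)] = score
    return edges


cases = [
    Case("A", contacts=frozenset({"B"}), branch="b1"),
    Case("B", branch="b1"),
    Case("C", travel=frozenset({"UK"}), branch="b2"),
    Case("D", travel=frozenset({"UK"})),
]
graph = build_case_graph(cases)
```

In this toy input only cases A and B are linked, because they share both a contact record and a branch. Shared travel alone stays below the default threshold.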
In the genetic analysis, the system automatically retrieves international virus data from GISAID and reconstructs the evolution and spreading history of the epidemic. The display uses the tool branched from NextStrain for data visualization.
For cases with an unknown source, Taiwan CDC leverages the WGS phylogenetic tree to find the cases whose virus strains are most closely related, and combines this with each sample's travel information, geographical location, and activity information to identify transmission routes (Figure 2).
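As a rough illustration of finding the sequenced cases most closely related to a source-unknown case, the sketch below ranks cases by pairwise SNP distance between aligned genomes. This is a simplification: the system described here works on a phylogenetic tree, not raw pairwise distances, and the function names and toy sequences are assumptions.

```python
def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ, ignoring ambiguous bases and gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def closest_cases(query_id: str, sequences: dict, k: int = 3) -> list:
    """Return the k sequenced cases nearest to query_id as (distance, case_id) pairs."""
    query = sequences[query_id]
    distances = [
        (snp_distance(query, seq), case_id)
        for case_id, seq in sequences.items()
        if case_id != query_id
    ]
    return sorted(distances)[:k]


# Toy aligned sequences; real genomes are about 30,000 bases long.
sequences = {
    "Q": "ACGTAC",
    "X": "ACGTAT",
    "Y": "ACNTAC",
    "Z": "TTTTTT",
}
nearest = closest_cases("Q", sequences, k=2)
```

The nearest cases would then be cross-checked against travel, location, and activity records before a transmission route is proposed.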
Taiwan identified infection clusters using the phylogenetic tree together with tour group membership. Two clusters, for example, were tour groups returning from Egypt (Figure 3) and Turkey (Figure 4). The cases from each group are all positioned on the same branch of the phylogenetic tree. Based on this result, the cases in each group share the same source of infection, which confirms each group as a cluster.
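One way to make "positioned on the same branch" concrete is to check whether a group's cases form a single clade, that is, whether the subtree under their most recent common ancestor contains only members of the group. The sketch below assumes a tree stored as a hypothetical child-to-parent dictionary with made-up node names. The strict test is also an assumption, since a real analysis may tolerate unrelated samples on the same branch.

```python
def ancestors(node: str, parent: dict) -> list:
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def common_ancestor(leaves: list, parent: dict) -> str:
    """Return the most recent common ancestor of the given leaves."""
    paths = [ancestors(leaf, parent) for leaf in leaves]
    shared = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking up from a leaf, the first shared node is the most recent one.
    return next(node for node in paths[0] if node in shared)


def leaves_under(node: str, parent: dict) -> set:
    """Return all leaves in the subtree rooted at node."""
    internal = set(parent.values())
    return {
        leaf for leaf in parent
        if leaf not in internal and node in ancestors(leaf, parent)
    }


def forms_single_branch(group: set, parent: dict) -> bool:
    """True if the group is exactly the set of leaves under its common ancestor."""
    mrca = common_ancestor(list(group), parent)
    return leaves_under(mrca, parent) == set(group)


# Toy tree: n1 holds one tour group; n2 holds another group plus an unrelated case x.
parent = {
    "n1": "root", "n2": "root",
    "e1": "n1", "e2": "n1", "e3": "n1",
    "t1": "n2", "t2": "n2", "x": "n2",
}
egypt_ok = forms_single_branch({"e1", "e2", "e3"}, parent)
turkey_ok = forms_single_branch({"t1", "t2"}, parent)
```

Here the first group passes the strict test, while the second does not because an unrelated case shares its branch.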
Case 34 is a 50-year-old woman with an unknown source of infection. Figure 4 shows that the samples related to this case, including Case 48, are positioned on the same branch. Case 48 traveled from the United Kingdom. Based on this information, Taiwan CDC was able to trace the transmission via the associated Case 48, who had traveled from the UK.
Case 336 is a confirmed case without sequencing data. To trace the transmission path of Case 336, we analyzed Case 347, who was infected by Case 336. Phylogenetic analysis shows that Case 347 is associated with Case 77, Case 143, and Case 144, whose transmission history traces back to Europe (Figure 6). Taiwan CDC was therefore able to establish the associated transmission path.
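The reasoning for Case 336 can be sketched as a lookup that falls back to a sequenced epidemiological contact when the case itself has no sequence. The case numbers below come from the text; the data structures, the function name, and the idea of a precomputed `genetic_neighbors` table are hypothetical.

```python
def related_sequenced_cases(case_id, infected_by, genetic_neighbors, sequenced):
    """Map each usable proxy case to its genetically associated cases.

    If case_id is sequenced it is its own proxy. Otherwise the proxies are
    sequenced cases directly linked to it by a known infection event.
    """
    if case_id in sequenced:
        proxies = {case_id}
    else:
        # Cases that case_id infected, plus the case that infected it, if sequenced.
        proxies = {
            child for child, source in infected_by.items()
            if source == case_id and child in sequenced
        }
        source = infected_by.get(case_id)
        if source in sequenced:
            proxies.add(source)
    return {proxy: genetic_neighbors.get(proxy, set()) for proxy in proxies}


infected_by = {"347": "336"}                         # Case 347 was infected by Case 336
sequenced = {"347", "77", "143", "144"}              # Case 336 has no sequence
genetic_neighbors = {"347": {"77", "143", "144"}}    # from phylogenetic analysis

links = related_sequenced_cases("336", infected_by, genetic_neighbors, sequenced)
```

For Case 336 the lookup returns Case 347 as the proxy, together with Cases 77, 143, and 144, which matches the path described above.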
With the deployment of the AI-based virus transmission tracing tool in Taiwan, Taiwan CDC can identify the source of 100% of the viruses for which sequence data are available. Infection clusters can be confirmed with sequencing data. The source of a source-unknown case can be identified by sequencing the virus. A case without sequencing data can also be traced through other sequenced cases.