A measure to estimate the risk of imported COVID-19 cases and its application for evaluating travel-related control measures

Overall framework

Figure 3 outlines the methodological framework of the study. First, we developed a measure of imported COVID-19 risk. The rationale for the measure is that the risk of importation is proportional to the number of entrants and to the prevalence of COVID-19 in the country of departure. Thus, air travel volumes from selected countries to Seoul and the prevalence of COVID-19 cases in the countries of departure were multiplied to derive the measure. The main improvements over the measures used in previous studies are that real-time mobile data were used to estimate air travel volume, and that time-varying detection rates were considered when estimating the prevalence of COVID-19. The use of air travel data from the previous year10 and the failure to consider detection efficiency13 have been suggested as limitations of former studies.

Figure 3

The methodological framework of the study.

Second, the developed measure was used to evaluate the travel-related control measures. Using the measure, we fit our model to the period when the detection rate was assumed to be 100%, as testing on the day of arrival and post-entry quarantine for 14 days were in effect. Then, we calculated the number of expected imported cases during the period when only body temperature screening, health surveys, and the declaration of travel records were required. The expected number was compared with the observed number of imported cases to estimate the number of undetected imported cases and the detection rate before mandatory testing.

Data sources

Data were obtained from three sources. First, the daily numbers of confirmed COVID-19 cases in Seoul were obtained18 to identify the number of imported COVID-19 cases in Seoul for the period between 24 January and 30 June 2020, since the earliest COVID-19 case in Seoul was confirmed on 24 January 2020. Second, the daily numbers of roaming users from each country to Seoul between 1 January and 30 June were acquired from Korea Telecom (KT). Travelers from Korea to other countries use roaming services to make and receive calls in regions outside the coverage areas of their home networks. KT has the second largest market share among mobile operators in Korea, at 31%26; therefore, the KT data are sufficient to estimate the trend in the number of entrants to Korea. Third, the daily numbers of confirmed COVID-19 cases between 1 January and 7 July were used1 to estimate the prevalence of COVID-19 in countries outside Korea.

Dataset construction

Country selection

Countries of interest were selected based on the travel history of the reported imported COVID-19 cases in Seoul. For example, if an identified imported case had traveled to Italy, then Italy was selected. Cases with travel histories to more than a single country (24 cases) and unknown regions (two cases) were excluded from this procedure, as the source of infection could not be specified. Thirty countries were selected (Table 3). However, Austria, China, Malaysia, Poland, Singapore, Thailand, and Vietnam were excluded from the model fitting procedure as there were no reported imported cases from these countries after 1 April 2020, when testing on the day of arrival and 14-day post-entry quarantine for entrants became mandatory. Including countries with no imported cases after the implementation of mandatory testing may introduce bias to the model estimates.

Table 3 Countries selected as eligible for model fitting.

Entrants from each country to Seoul

The number of entrants from the selected countries to Seoul was calculated using the data provided by KT. These data provided the daily number of roaming users by country of departure and residential region in Korea during 2020. As the travel volume was derived from a single mobile operator, the data do not represent the exact travel volume. Nevertheless, these data have been reported to be representative of the trends in domestic27 and international travel volume28.

Estimating the prevalence of COVID-19

The prevalence of COVID-19 in the selected countries was estimated to assess the risk of exposure to COVID-19 among entrants traveling to Seoul, Korea. The local prevalence of COVID-19 in the selected countries was derived from the daily number of confirmed new cases1. The reported incidence of COVID-19 is considered to be underestimated due to incomplete testing29,30. Thus, we extended a method used previously11. The daily strength of the testing policy for each country was derived using the Oxford COVID-19 Government Response Tracker (OxCGRT)31. The OxCGRT classifies the strength of testing policy as follows: 0, no testing policy; 1, those who have both symptoms and meet specific criteria (key workers, classified as contacts, traveled overseas); 2, anyone showing symptoms; and 3, open public testing. Based on the reporting rate previously suggested, 0.092 (95% confidence interval [CI] 0.05, 0.20)32, we assigned reporting rates of 0.05, 0.092, and 0.20 to testing policies 1, 2, and 3, respectively. The reporting rate for no testing policy (0) was assumed to be 0.01. The detection rates of two additional studies were considered as a sensitivity analysis. The results are provided in the Supplementary File.
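The assignment of reporting rates to testing-policy levels can be sketched as a simple lookup. This is a minimal illustration, not the authors' code; the names `REPORTING_RATE_BY_TESTING_POLICY`, `reporting_rates`, and the sample series `daily_policy` are hypothetical, and only the four rates come from the text.

```python
# Reporting rate assigned to each OxCGRT testing-policy level (values from the text).
REPORTING_RATE_BY_TESTING_POLICY = {
    0: 0.01,   # no testing policy (rate assumed by the authors)
    1: 0.05,   # symptomatic AND meeting specific criteria (lower CI bound)
    2: 0.092,  # anyone showing symptoms (point estimate)
    3: 0.20,   # open public testing (upper CI bound)
}


def reporting_rates(policy_levels):
    """Map a daily series of testing-policy levels (0-3) to reporting rates."""
    return [REPORTING_RATE_BY_TESTING_POLICY[level] for level in policy_levels]


# Hypothetical daily policy levels for one country, for illustration only.
daily_policy = [0, 1, 1, 2, 3]
print(reporting_rates(daily_policy))
```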

The methods for estimating the incident (\(I_{t}\)) and prevalent (\(P_{t}\)) infectious cases on day t, which consider the ascertainment period and the reporting rate, are described in detail by Fauver et al.11 The model used prevalent cases because existing (prevalent) cases as well as new (incident) cases serve as sources of infection. Briefly, \(I_{t - d - 2}\) was estimated using the reported new cases (\(C_{t}\)) and the reporting rate (\(\rho_{t}\)) on day t. The time from symptom onset to isolation (testing), d, was assumed to be 5 days, and cases were assumed to become infectious 2 days before symptom onset33.

$$I_{t - d - 2} = \frac{{C_{t} }}{{\rho_{t} }}$$
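As a rough sketch of this back-shifting step, the function below divides each day's reported cases by that day's reporting rate and assigns the result to the day d + 2 days earlier. The function name, argument names, and list-based indexing are assumptions made for illustration. Note that the last d + 2 days of a series receive no estimate, because the reports that would inform them have not yet been observed.

```python
def estimate_incident_infections(reported_cases, reporting_rate, d=5, pre_onset_days=2):
    """Estimate incident infectious cases I_s = C_{s+d+2} / rho_{s+d+2}.

    reported_cases[t] and reporting_rate[t] are C_t and rho_t for day index t.
    The returned list is shorter than the input by d + pre_onset_days entries.
    """
    shift = d + pre_onset_days
    n_days = len(reported_cases) - shift
    return [
        reported_cases[s + shift] / reporting_rate[s + shift]
        for s in range(max(n_days, 0))
    ]


# Hypothetical 9-day series: cases reported on days 7 and 8 imply infections on days 0 and 1.
cases = [0, 0, 0, 0, 0, 0, 0, 10, 46]
rates = [0.05] * 8 + [0.092]
incident = estimate_incident_infections(cases, rates)
```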


Then, \(P_{t}\) was estimated by adding \(I_{t}\) to the incident cases of earlier days, each weighted by the probability that a patient who became infectious on day i was still infectious on day t.

$$P_{t} = \sum\limits_{i = 1}^{t - 1} I_{i} \left( {1 - \gamma \left( {t - i} \right)} \right) + I_{t}$$


The infectious period was assumed to follow a gamma distribution with a mean and standard deviation of 7 and 4.5 days, respectively, and \(\gamma \left( {t - i} \right)\) denotes its cumulative distribution function evaluated at \(t - i\) days. As Fauver et al. show, the shape (\(\alpha\)) and rate (\(1/\theta\)) of the gamma distribution were calculated from this mean and standard deviation34. The corresponding probability density function \(f\left( x \right)\) is

$$f\left( x \right) = \frac{1}{{\Gamma \left( \alpha \right)\theta^{\alpha } }}x^{\alpha - 1} e^{ - x/\theta }$$


where \(\Gamma \left( \alpha \right) = \int_{0}^{\infty } t^{\alpha - 1} e^{ - t} dt\).
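The calculation of prevalent cases can be sketched as follows, assuming the shape and rate are obtained by moment matching (shape = mean²/SD², rate = mean/SD²). The gamma cumulative distribution function is evaluated here with a standard series expansion of the regularized incomplete gamma function so that only the Python standard library is needed; all function names are hypothetical, and days are indexed from 0 rather than 1.

```python
import math


def gamma_shape_rate(mean, sd):
    """Moment-matched gamma parameters: shape = mean^2/sd^2, rate = mean/sd^2."""
    return (mean / sd) ** 2, mean / sd ** 2


def gamma_cdf(x, shape, rate):
    """Gamma CDF via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    value = total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))
    return min(value, 1.0)


def prevalent_cases(incident, shape, rate):
    """P_t = sum_{i<t} I_i * (1 - cdf(t - i)) + I_t for each day t."""
    prevalent = []
    for t, i_t in enumerate(incident):
        still_infectious = sum(
            incident[i] * (1.0 - gamma_cdf(t - i, shape, rate)) for i in range(t)
        )
        prevalent.append(still_infectious + i_t)
    return prevalent


shape, rate = gamma_shape_rate(mean=7.0, sd=4.5)
prevalent = prevalent_cases([100.0, 0.0, 0.0], shape, rate)
```

With a mean of 7 days and a standard deviation of 4.5 days, moment matching gives a shape of about 2.42 and a rate of about 0.35 per day.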

The calculated \(P_{t}\) was divided by the total population of each country in 2020 to estimate the prevalence per 100,000. The datasets of the entrants and the COVID-19 prevalence were merged according to country and date. The weekly average entrant volume and COVID-19 prevalence per 100,000 were calculated using the merged dataset. Finally, the weekly sum of reported imported COVID-19 cases in Seoul was merged with the dataset containing the weekly average number of entrants and the COVID-19 prevalence.
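The scaling and weekly aggregation described above might look like the sketch below. It assumes that week numbers follow the ISO calendar, which matches the dates quoted for weeks 14 and 16 but is not confirmed by the text; the record layout and the function names are likewise hypothetical.

```python
from collections import defaultdict
from datetime import date


def prevalence_per_100k(prevalent_cases, population):
    """Scale an absolute count of prevalent cases to a rate per 100,000 people."""
    return prevalent_cases * 100_000 / population


def weekly_means(daily_records):
    """Average entrants and prevalence by (country, week).

    daily_records: iterable of (country, date, entrants, prevalence_per_100k).
    Week numbers are ISO weeks (an assumption, see the lead-in).
    """
    buckets = defaultdict(list)
    for country, day, entrants, prevalence in daily_records:
        week = day.isocalendar()[1]
        buckets[(country, week)].append((entrants, prevalence))
    result = {}
    for key, rows in buckets.items():
        n = len(rows)
        result[key] = (
            sum(r[0] for r in rows) / n,
            sum(r[1] for r in rows) / n,
        )
    return result


# Hypothetical merged daily records for one country.
records = [
    ("X", date(2020, 4, 13), 10, 1.0),
    ("X", date(2020, 4, 14), 20, 3.0),
    ("X", date(2020, 4, 20), 30, 5.0),
]
weekly = weekly_means(records)
```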

Statistical analyses

Description of the new measure

The method used by de Salazar et al.10 was extended to estimate the number of expected imported COVID-19 cases. The measure, which indicates the risk of COVID-19 importation, was calculated as the product of the number of entrants and the COVID-19 prevalence in the selected countries. Specifically, the expected number of imported cases was assumed to follow an over-dispersed Poisson distribution and to depend on the product of the entrants (\(E_{w}\)) and the prevalence per 100,000 (\(P_{w}\)) in week w.

$$\begin{gathered} {\text{Expected number of imported cases}} = {\text{Quasi-Poisson}}\left( {\lambda_{w} } \right) \hfill \\ \lambda_{w} = \beta_{0} + \beta \left( {E_{w} \times P_{w} } \right) \hfill \\ \end{gathered}$$
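One way to fit such a model is iteratively reweighted least squares (IRLS). The sketch below assumes an identity link, as the linear form of \(\lambda_{w}\) suggests, and estimates the over-dispersion from the Pearson statistic. It is not the authors' implementation (their software is not stated), and the function name and sample data are hypothetical. With an identity link the working response equals the observed count and the weights are \(1/\lambda_{w}\).

```python
def fit_identity_quasipoisson(x, y, n_iter=50):
    """Fit lambda = b0 + b1 * x by IRLS with Poisson variance and identity link.

    Returns (b0, b1, dispersion), where dispersion is the Pearson chi-square
    statistic divided by the residual degrees of freedom (n - 2).
    """
    n = len(x)
    mu = [max(v, 0.5) for v in y]  # strictly positive starting means
    b0 = b1 = 0.0
    for _ in range(n_iter):
        w = [1.0 / m for m in mu]  # IRLS weights for the identity link
        sw = sum(w)
        x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
        b1 = sxy / sxx
        b0 = y_bar - b1 * x_bar
        mu = [max(b0 + b1 * xi, 1e-8) for xi in x]  # keep fitted means positive
    dispersion = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu)) / (n - 2)
    return b0, b1, dispersion


# Hypothetical weekly data: entrants, prevalence per 100,000, and imported cases.
entrants = [0.0, 100.0, 200.0, 300.0, 400.0]
prevalence = [0.1, 0.1, 0.1, 0.1, 0.1]
risk = [e * p for e, p in zip(entrants, prevalence)]  # E_w x P_w
imported = [1.0, 6.0, 11.0, 16.0, 21.0]
b0, b1, dispersion = fit_identity_quasipoisson(risk, imported)
```

The point estimates coincide with those of an ordinary Poisson fit; the dispersion only rescales the standard errors.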


Estimating the effectiveness of post-entry quarantine

The model was first fit conservatively based on the data from week 16 (13 April) onwards. Testing on the day of arrival and after the 14-day post-entry quarantine came into effect for all arrivals beginning on 1 April. However, as the median incubation period is 5.1 days and the 95th percentile is 11.7 days35, many of the imported cases reported during weeks 14 (2020.03.30–2020.04.05) and 15 (2020.04.06–2020.04.12) could have arrived before 1 April. The regression coefficient \(\beta\) was estimated from the data of week 16 and onwards using the maximum likelihood method. Then, the expected number of imported cases was calculated based on the estimated \(\beta\). A bootstrap sample of 500,000 was used to compute the 95% CI for the expected number of imported cases. The model fit was assessed by checking whether the reported imported cases were within the confidence intervals of the fitted estimates, and by using the \(R^{2}\) statistic.

All results are provided by week number. We used data from 1 January to 30 June and the corresponding weeks (weeks 1–26); the dates are provided in the tables. The number of undetected imported cases was computed by subtracting the number of reported imported cases from the number of expected cases. As in a previous study, the upper and lower bounds for the undetected imported cases were calculated by subtracting the reported cases from the upper and lower bounds of the expected imported cases36. The undetected cases were presented as 0 if the point estimate or CI bound of the undetected cases was < 0. Moreover, the reporting rate for the imported cases was computed as the ratio of reported imported cases to expected imported cases.

$${\text{Reporting rate }}\left( {\text{\% }} \right) = \frac{{\text{Reported imported COVID-19 cases}}}{{\text{Expected imported COVID-19 cases}}} \times 100$$


The lower bound for the reporting rate was computed as the ratio of the reported imported cases to the expected upper bound, and the upper bound for the reporting rate was calculated as the ratio of the reported imported cases to the expected lower bound. The reporting rate was presented as 100% if the reported cases exceeded the expected cases. By multiplying the calculated reporting rate by the reported imported cases after the testing and post-entry quarantine policies on the entrants came fully into effect, the number of imported cases without these policies was computed.
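The rules for undetected cases and reporting rates can be sketched as below, with negative undetected counts presented as 0 and reporting rates capped at 100%. The function name, the returned layout, and the sample numbers are hypothetical.

```python
def undetected_and_reporting_rate(reported, expected, expected_lower, expected_upper):
    """Return undetected cases and reporting rate (%), each as (point, lower, upper)."""

    def undetected(expected_value):
        # Negative differences are presented as 0.
        return max(expected_value - reported, 0.0)

    def rate(expected_value):
        # Presented as 100% when reported cases reach or exceed the expected cases.
        if reported >= expected_value:
            return 100.0
        return 100.0 * reported / expected_value

    return {
        "undetected": (
            undetected(expected),
            undetected(expected_lower),
            undetected(expected_upper),
        ),
        # A larger expected count gives a smaller rate, so the bounds swap.
        "reporting_rate_pct": (
            rate(expected),
            rate(expected_upper),
            rate(expected_lower),
        ),
    }


# Hypothetical week: 20 reported cases, 50 expected (95% CI 40 to 80).
summary = undetected_and_reporting_rate(20, 50.0, 40.0, 80.0)
```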

Ethical statement

The data used were either publicly available, completely non-identifiable data collected for the purpose of disease control, or in an aggregated form. Thus, no ethics approval was required.
