Real-time Privacy Preserving Framework for Covid-19 Contact Tracing

Abstract: The recent unprecedented threat from COVID-19 and past epidemics, such as SARS, AIDS, and Ebola, has affected millions of people in multiple countries. Countries have shut their borders, and their nationals have been advised to self-quarantine. The variety of responses to the pandemic has given rise to data privacy concerns. Infection prevention and control strategies as well as disease control measures, especially real-time contact tracing for COVID-19, require the identification of people exposed to COVID-19. Such tracing frameworks use mobile apps and geolocations to trace individuals. However, while the motive may be well intended, the limitations and security issues associated with using such a technology are a serious cause of concern. There are growing concerns regarding the privacy of an individual's location and personally identifiable information (PII) being shared with governments and/or health agencies. This study presents a real-time, trust-based contact-tracing framework that operates without the use of an individual's PII, location sensing, or gathering GPS logs. The focus of the proposed contact tracing framework is to ensure real-time privacy by using the Bluetooth range of an individual's device to determine others within that range. The research validates the trust-based framework using Bluetooth as practical and privacy-aware. Using our proposed methodology, personal information, health logs, and location data will be secure and not abused. This research analyzes 100,000 tracing dataset records from 150 mobile devices to identify infected users and active users.


Introduction
Epidemics such as H1N1, SARS, Ebola, and the recent coronavirus have impacted millions of people worldwide, resulting in a large death toll. The World Health Organization (WHO) has issued several advisories and courses of action to limit the spread of COVID-19 infection [1] by tracing infected individuals. The tracing process requires infected individuals to share information with governments and local medical agencies, which are tasked with tracking and quarantining individuals who may have been in close contact with infected victims, and with the subsequent collection of further information about the infected victims. Tracking involves acquiring the personal information of each infected individual, including their travel history, locations visited, recent contacts, and health details [2]. While most individuals may be comfortable with sharing this information for their own and the nation's benefit, privacy-aware individuals may not be so willing. This can hinder the contact tracing process, even as the virus continues to spread at alarming rates [3]. Secure and privacy-aware contact tracing methods can encourage everyone, infected or not and of all ages, to join the contact tracing, with an assurance that the data is processed confidentially and with no malicious intent. Globally, various contact tracing technologies, such as mobile apps and global positioning systems (GPS), are being used. The infected individual is expected to self-test and self-report health details using mobile applications and location data [4]. However, the sharing of data depends on local infrastructure and networks, which rely on unsecured external technologies such as wireless access points, GPS, data networks, or even those involved in the deployment and maintenance of the application itself. The government of Singapore has launched a contact tracing app called "TraceTogether."
The Indian government's contact tracing app, which is called Aarogya Setu, performs real-time tracking, as illustrated in Fig. 1. This study proposes the use of a mobile application with Bluetooth connectivity to perform real-time contact tracing. The authors propose the use of mobile devices to send anonymous beacons of encrypted random code messages via Bluetooth. This protects data privacy, and individuals remain anonymous. No privacy-sensitive data is collected, and the approach does not depend on external third-party IT infrastructure. The major highlights of this research are privacy-aware contact-tracing mobile applications with the following features.
• The proposed contact tracing application does not involve any public wireless or network infrastructure.
• No personally identifiable sensitive information, geolocation, or logs are shared or gathered.
• Real-time tracking enables the rapid identification of locations corresponding to new infected cases.
• The application focuses on complete privacy and the use of an individual's Bluetooth connection to determine others within a specific range.
• Around 100,000 contact tracing records from 150 individual mobile devices were used.
This technology process helps to monitor infected individuals as well as reduce the medical costs involved during quarantine measures. The focus is on testing a framework that ensures complete privacy. To evaluate the contact-tracing framework, T-test and regression analysis were used to validate datasets from real scenarios.
The remainder of this paper is organized as follows: Section 1 describes in detail contact tracing and privacy issues, and Section 2 presents the literature survey regarding the different contact-tracing methods employed by various researchers. The proposed Contact Tracking Framework (CTF) algorithm is illustrated in Section 3. Section 4 discusses the experimental results obtained and presents the T-test validation of the dataset reviewed.

Contact Tracing and Privacy Issues
Contact tracing involves the identification, assessment, and management of persons exposed to diseases to prevent any onward transmission. If scientifically applied, this can help break the transmission chain of infectious diseases and can be an effective health tool for managing outbreaks. With respect to COVID-19, contact tracing requires the identification of individuals who may have been exposed. Steps such as the quarantining of contacts and the isolation of cases need to be performed. The design and development of the contact tracing application system needed to consider various threat vectors in terms of privacy, as presented in Tab. 1.
• Perform malicious activities such as accessing and uploading the data of individuals and other users in proximity and selling them on the dark web.
• Snoop on data from other mobile apps running on the mobile device.
• Request additional app permissions for accessing storage, camera, SMS, emails, location, etc., without the user's permission.
• Analyze the app data to generate further insights that were not part of the privacy terms or the service.
Nation states
• Selectively analyze individuals or the community and retain personal, health, or discriminative user data even after the outbreak has ended or the app has been uninstalled.
• Perform mass surveillance.
• Analyze data to generate insights that were not part of the service.
• Can perform penetration testing to discover zero-day exploits and release the vulnerability worldwide to cause chaos.
• Access the user's device without their consent or proper authorization.
Gathering personal mobility details for health application purposes presents challenges, even if privacy ethics and issues are upheld. The analysis of any individual's mobility and health data can only be justified if the benefits are related to public health. Most existing contact-tracing solutions rely on wireless infrastructure for contact tracing to preserve privacy.

Literature Survey
During the 2014 Ebola outbreak, the WHO expounded on the significance of contact tracing and even proposed protocols for tracing infected individuals. However, no mobile application or data-gathering technologies were deployed. The WHO has proposed recommendations for medical staff and those on the front line to improve the safety of using contact tracing applications. With COVID-19, several countries have mandated the use of mobile-based contact tracing, thus gathering and making use of data obtained from mobile applications. Monitoring and regulating interactions are vital for preventing the spread of this disease. Internet- and mobile-based technologies have aided surveillance, modeling of infection, remote sensing, etc., to predict and control the disease spread [5]. This tactic of using new-age technology to deal with global epidemics is classified under a new domain, digital epidemiology, as described by Chancay-García et al. [6]. Recently, several researchers have assessed the categorization of mobile call data records. Dede et al. [7] and Christak et al. [8] tracked user mobility patterns to model and evaluate epidemic sickness. Tizooni et al. [9] explored the use of proxy systems for individual users. The authors evaluated the mobility flow to predict the spread of epidemics. The accuracy of the predictive analysis, which was performed using mobility data sources, varied with the epidemic's rate of propagation and the timing of the data gathered.
Salathe et al. [10] discussed the use of wireless technologies, such as the ZigBee protocol and Bluetooth, to detect and trace infected people. The authors obtained detailed data on the social contacts of infected persons during the infection period. Then, the authors recreated the social networks of potentially infected users. To evaluate the spread, diffusion, and impact, the authors also proposed the SEIR model based on the susceptible, exposed, infectious, and recovered states. Mastrandrea et al. [11] presented a prototype of wearable sensors for determining contacts amongst individuals and students. The authors matched the results with contacts from personal records, and compared the spread of an epidemic estimated from sensors and from diaries, noting a difference in dynamics. Interest in contact-tracing strategies has increased in recent times, and different methods have been used to estimate the impact and rate of spread before and during outbreaks, as well as the efficiency of measures against contagious epidemics. In many outbreaks, contact tracing is the only feasible option to identify infected individuals, as presented by Lima et al. [12], Rubrichi et al. [13], and Fraser et al. [14], who also tested factors that aid in controlling an outbreak.
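To make the SEIR compartments mentioned above concrete, the following is a minimal sketch of a standard deterministic SEIR model integrated with small Euler steps. It is a textbook formulation and not necessarily the exact model used in [10]; the function name simulate_seir and all rate values (beta, sigma, gamma) are hypothetical and chosen only for illustration.

```python
def simulate_seir(s, e, i, r, beta, sigma, gamma, days, dt=0.1):
    """Integrate a deterministic SEIR model with Euler steps.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: recovery rate (all per day; illustrative assumptions).
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    n = s + e + i + r                      # total population stays constant
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt      # S -> E
            new_infectious = sigma * e * dt          # E -> I
            new_recovered = gamma * i * dt           # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical scenario: 1,000 people, 10 initially infectious.
trajectory = simulate_seir(990.0, 0.0, 10.0, 0.0,
                           beta=0.5, sigma=0.2, gamma=0.1, days=100)
```

Lowering beta in such a sketch is one simple way to represent the effect of quarantining traced contacts, although the appropriate mapping depends on the tracing model being studied.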
Contact tracing methods adopt two primary models, namely, population-based and agent-based approaches. Klinkenberg et al. [15] proposed a population-based top-down approach for analyzing system research data from a macroscopic perspective. Then, Kwok et al. [16] and Müller et al. [17] presented an agent-based bottom-up approach, considering every individual as a self-regulating agent entity. Each agent is responsible for its own infection state, movement, and location to estimate unrelated and adaptive activities. The stochastic model introduced by Farrahi et al. [18] and Keeling et al. [19] involves grouping the associated measures and fundamental dynamics of epidemics using a deterministic approach. In previous years, contact-tracing models have focused on a generic network of contacts. To improve the precision of such network contact models, Huerta et al. [20] presented a similar model as part of the epidemic regulation tactics. This method helped to estimate the impact of contact tracing and the random tracing of complex contact networks. Yang et al. [21] proved that by tracing the contacts at a low additional cost, the spread of an outbreak may be considerably reduced, and even eradicated. The FluPhone project developed at Cambridge University [22] was one of the first attempts to use mobile apps to determine contacts. Using wireless Bluetooth as a proxy, the application was able to estimate physical contacts. The application prompted users to report symptoms to determine the rate and risk of infection. Similar contact tracing schemes focus on privacy issues, such as the pan-European privacy-preserving proximity tracing (PEPP-PT) [23] and the MIT project Safe Paths [24]. Corporate enterprises such as Apple, Facebook, and Google teamed up to integrate their web portals with handheld and sensor devices to provide similar solutions for Android and iOS mobiles. Isella et al. [25] claimed that the practice of contact tracing and isolation did not prevent the COVID-19 epidemic. This is attributed primarily to asymptomatic infected individuals who go undetected and who are believed to contribute to the spread of the COVID-19 outbreak. Using mobile apps to find previous contacts, the authors showed mathematically that such epidemic diseases can be checked even when not everyone uses the mobile application.

Proposed Framework: CTF
A real-time contact tracking framework (CTF) was designed and developed as a secure mobile application using the Android platform, SDK tools, and Java. Instead of using a data network or IT infrastructure such as wireless or office networks, the lightweight application uses Bluetooth and needs only limited computing resources on the individual's mobile. However, threat vectors can still cause unauthorized and malicious privacy impacts. The CTF process is trust-based; individuals own the process, and it is their prerogative to join or exit, and to perform a regular 15-day self-assessment to determine any infection. The generated logs comprise a unique ID (user's Bluetooth), timestamp (date and time), and health status code (random salted number) for each application user. The contact tracking framework follows five phases, as shown in Tab. 2.
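The log entry described above (unique ID, timestamp, and health status code) can be sketched as follows. This is only an illustration in Python, not the authors' Android implementation: the names LogRecord and make_log_record are hypothetical, and deriving the status code as a salted SHA-256 digest is an assumption about what "random salted number" might mean in practice.

```python
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    unique_id: str    # identifier taken from the user's Bluetooth interface
    timestamp: str    # date and time of the entry (ISO 8601)
    status_code: str  # salted code standing in for the health status


def make_log_record(bluetooth_id: str, health_status: str,
                    now: Optional[datetime] = None) -> LogRecord:
    """Build one log entry; a fresh random salt hides the raw health status."""
    now = now or datetime.now(timezone.utc)
    salt = secrets.token_hex(16)  # assumption: new 128-bit salt per record
    digest = hashlib.sha256((salt + health_status).encode()).hexdigest()
    return LogRecord(unique_id=bluetooth_id,
                     timestamp=now.isoformat(),
                     status_code=f"{salt}:{digest}")


# Hypothetical usage with a made-up identifier.
record = make_log_record("device-0001", "not-infected")
```

Because the salt changes with every record, two entries with the same health status produce different codes, so an observer of the logs cannot group users by status.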
Entities involved in the CTF process require privacy protection. These include individuals (mobile IMEI and number), location (IP address and geolocation), health data, and command server communication. The proposed CTF application ensures that the data collected is never shared with any of these entities, and keeps the individuals anonymous. The proposed workflow is shown in Fig. 2. Each installed app enables Bluetooth, records the individual information, and encrypts the log locally on the mobile SD card.
Report generation: Opt-in individuals are provided with two types of reports, detailed and basic. With their consent, the basic-level report is encrypted with a public key and uploaded to the command server, while the detailed report is encrypted with the private key.
This application was designed for Android mobile devices. Both reports are saved locally on the mobile in encrypted form. The first report is a detailed description accessible only to the individual. If the individual is infected, they can share all of the details with the medical teams in full confidence, using a private key, in order to determine the treatment. The second report is a basic-level code encrypted with a public key and uploaded to the command server. Whenever an individual goes outside, the application scans for other mobile devices using Bluetooth, sending and receiving anonymous encrypted beacons to and from those devices. If the application can decrypt the basic report of other individuals, Bluetooth alerts are generated immediately.

Bluetooth Beacons
• Should not reveal any personal information, location, reports, or other individual information.
• While scanning, any personally identifying information should not be revealed to other users.
• Should be arranged, encrypted with a symmetric key to prevent any log being revealed to any other user.
• Should be randomly generated every 24 h to prevent the identification of transmitted information.
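The rotation requirement in the list above can be illustrated with a small sketch. It assumes a 16-byte random beacon drawn from the operating system's secure random source and a caller-supplied clock in seconds; the class name BeaconRotator is hypothetical, and symmetric encryption of the payload is deliberately left out.

```python
import secrets

ROTATION_SECONDS = 24 * 60 * 60  # beacons are regenerated every 24 h
BEACON_BYTES = 16                # assumption: 128-bit random beacon value


class BeaconRotator:
    """Hands out a random beacon that carries no personal information."""

    def __init__(self, start_time: float):
        self._issued_at = start_time
        self._beacon = secrets.token_bytes(BEACON_BYTES)

    def current(self, now: float) -> bytes:
        # Replace the beacon once the 24 h lifetime has elapsed, so that
        # broadcasts from different days cannot be linked to each other.
        if now - self._issued_at >= ROTATION_SECONDS:
            self._beacon = secrets.token_bytes(BEACON_BYTES)
            self._issued_at = now
        return self._beacon


# Hypothetical usage: time is passed in explicitly to keep the sketch testable.
rotator = BeaconRotator(start_time=0.0)
beacon_today = rotator.current(now=3600.0)
```

Since the beacon is pure randomness, it reveals nothing about the device, its owner, or its location; only the party that generated it can later recognize it.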

Uninfected Individuals
These are individuals who have not been, or are not currently, infected, and who are never:
• Mandated to upload their details on the command server.
• Notified by the command server to verify potential contact with other non-infected individuals.
• Receive a medical certificate encrypted with the medical teams' public key.

Infected Individuals
These are individuals who are infected:
• Are given the option to opt in so that others can determine if they are near any infected user.
• Can check if they are close to others or to those who opted to join.
• Can stay anonymous even from the admin teams of the command server.
• Can find an infected user and determine when or where the actual contact happened.
• Should be assured of the privacy of their IMEI number or MAC address.
• Can use the TOR browser to upload or download their logs and reports, thus remaining anonymous.
These alerts warn about an infected person in an individual's proximity, indicating the presence of an infected individual within a range of 8-10 m. In this case, the user should then proceed to be tested. The flow of the secure contact tracing process is shown in Fig. 3. This process is anonymous, and no information about the individual is shared with the command server or other individuals. Individuals can opt in or opt out, as the process is trust-based. Only the reports shared on the command server and the individual infections are verified using Bluetooth. The authors formally prove that the application guarantees privacy-sensitive features and trust verification for the individuals observed. The following features were considered when designing the framework.

Proposed Algorithm: CTF
Algorithm 1 presents the proposed secure application workflow and the CTF process. Tab. 3 lists the notations used in the proposed algorithm. The proposed contact detection runs as a service utilizing Bluetooth beaconing. This confirms the proximity detection of data exchange with nearby phones, even as the advertisements are nonconnectable and undirected. Fig. 4 illustrates the flow of advertisements between the application and remote Bluetooth devices. Contact detection and advertisement services are run on devices with the Bluetooth 16-bit UUID 0xFA5F to enable proximity sensing between devices. Devices advertise and scan using a 128-bit proximity identifier that is periodically modified. Each advertisement scan is timestamped, and the discoverable bit is initially set to 1 and captured. The scan interval window is 5 min, which is sufficient to provide discovery of advertisers and coverage. The advertiser address and proximity identifier are changed so that they cannot be linked in any way. The advertising intervals are changed every few hundred milliseconds. The scanning interval window performs periodic sampling every few minutes.
Fig. 5 illustrates the dataflow process and behavior for the device scans, which ensure that privacy is maintained as the most critical specification while designing the application. This is achieved with the Bluetooth protocol, which is location independent, yet uses the Bluetooth beacon to detect device proximity. The user proximity ID correlates and obtains the IDs of other devices every 15 min. This reset window reduces the privacy loss from advertisements, and the IDs are processed exclusively on the local device. If any user is detected to have COVID-19 symptoms, the user can consent to the sharing of the diagnosis keys with the main server. Thus, users have control and transparency regarding their participation in contact tracing. These precautions are implemented in the framework design to ensure user privacy.
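The periodically changing 128-bit proximity identifier and the on-device matching against shared diagnosis keys can be sketched as below. The derivation shown (HMAC-SHA256 over the index of the 15 min window, truncated to 16 bytes) is an assumption made for illustration and is not taken from Algorithm 1; the function names proximity_id and find_exposures are likewise hypothetical.

```python
import hashlib
import hmac
from typing import Iterable, List, Set

WINDOW_SECONDS = 15 * 60   # identifiers are reset every 15 min
WINDOWS_PER_DAY = 96       # 24 h divided into 15 min windows
ID_BYTES = 16              # 128-bit proximity identifier


def proximity_id(daily_key: bytes, unix_time: int) -> bytes:
    """Derive the identifier advertised during the window containing unix_time."""
    window = unix_time // WINDOW_SECONDS
    mac = hmac.new(daily_key, window.to_bytes(8, "big"), hashlib.sha256)
    return mac.digest()[:ID_BYTES]


def find_exposures(observed_ids: Set[bytes],
                   diagnosis_keys: Iterable[bytes],
                   day_start: int) -> List[bytes]:
    """Return the diagnosis keys whose identifiers were seen on this device.

    Runs entirely on the local device: only keys that infected users
    consented to share are downloaded, and observed IDs never leave the phone.
    """
    matches = []
    for key in diagnosis_keys:
        ids_for_day = {
            proximity_id(key, day_start + w * WINDOW_SECONDS)
            for w in range(WINDOWS_PER_DAY)
        }
        if ids_for_day & observed_ids:
            matches.append(key)
    return matches
```

With this kind of design, an eavesdropper who records advertisements sees unrelated 16-byte values in each window, while a device holding a consented diagnosis key can regenerate that user's identifiers and compare them with its own scan log.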

Experimental Results
The results varied between randomly selected individuals and those infected. Moreover, this research considered different time slots during which users turn on their Bluetooth to evaluate the effectiveness of the protocols in different scenarios. The validations were repeated to capture the randomness of the simulations for 150 devices. The authors conducted a parametric statistical t-test and regression analysis on random samples from the 100,000 records to check that the datasets did not violate the underlying assumptions. Regression was used to validate the prediction of continuous dependent variables from independent variables in the datasets; deviations from the fitted regression line are the errors. The t-test assumes that the distribution of the sample mean is normal and that the variances of the different parameters are similar. If the data violate these assumptions, the actual probability of a Type I error may be higher or lower than the nominal alpha. The T-test validation parameters are interpreted as presented in Tab. 4. The requirement for performing the T-test is the use of two independent samples with normally distributed data and the same variance. The authors take the null hypothesis, H0: H1 − H2 = 0, where H1 and H2 are the means of the two datasets; that is, there is no difference between the means of the two datasets. Tab. 5 presents the T-test validation of the CTF dataset sample.
0    5   11   18
10   11   17   31
20   13   26   39
30   19   34   47
40   21   43   55
50   24   49   66
60   28   53   73
70   31   57   79
80   34   65   83
90   37   74   88
100  39   79   91
Considering the datasets for i1 and i2, the authors used a significance level of 0.05 with a two-tailed hypothesis. The difference scores that were calculated are presented in Tabs. 6 and 7.
Because the P-value is less than the significance level (alpha = 0.05), the null hypothesis (H0) is rejected and the alternative hypothesis (Ha) is accepted. From the graph shown in Fig. 6, individuals who randomly turn on their Bluetooth when going out or in crowded places display better performance and contagiousness probability than those who turn on Bluetooth only when outside their homes, only at certain specific hours, or for a set duration.
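The two-sample t-test used in this section can be sketched in a few lines. The code assumes the pooled-variance (equal variance) form that matches the stated requirements; the two sample lists reuse values from the number series reported in this section, and treating them as the independent samples i1 and i2 is purely an assumption, so the resulting statistic is illustrative and is not the value reported in the paper's tables.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample1, sample2):
    """Return (t statistic, degrees of freedom) for H0: mean1 - mean2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_error, df


# Assumed mapping of reported values to the two samples (illustrative only).
i1 = [5, 11, 13, 19, 21, 24, 28, 31, 34, 37, 39]
i2 = [11, 17, 26, 34, 43, 49, 53, 57, 65, 74, 79]

# Two-tailed critical value at alpha = 0.05 for 20 degrees of freedom,
# taken from a standard t table.
T_CRITICAL_DF20 = 2.086

t_stat, df = pooled_t_test(i1, i2)
reject_h0 = abs(t_stat) > T_CRITICAL_DF20
```

Comparing the absolute t statistic with the tabulated critical value is equivalent to comparing the P-value with alpha, which keeps the sketch free of external statistics libraries.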

Conclusion
The presented research work successfully demonstrates the real-time, trust-based contact tracing framework (CTF) as a feasible privacy-aware solution. Nation-states need not use methods or applications that pose privacy-related risks or face issues when an individual's personal information or health logs can be misused. This study considers the features and entities that are related to protecting the privacy of an individual. The focus is to build a trust-based framework with a lightweight Bluetooth-based mobile application. Using sample datasets, the authors have shown how contact tracing with three options can mitigate the spread of COVID-19. Existing contact tracing applications do not provide open-source software for research or experimentation purposes. In the future, the authors plan to release this research as an open-source software implementation for both Android and iOS devices.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.