COVID-19 Data Sharing/BR makes more datasets available


Open-access repository established to facilitate research on the disease holds anonymized data including clinical examinations and laboratory test results from 485,000 patients processed by five institutions.

Published on 03/15/2021

By Elton Alisson  |  Agência FAPESP – COVID-19 Data Sharing/BR has just been updated by the institutions that are participating in Brazil’s first open-access repository of demographic, clinical and blood work data from patients tested for COVID-19.

The platform was launched in June 2020 by FAPESP in partnership with the University of São Paulo (USP) to facilitate sharing of patient data and support scientific research on the disease in various knowledge areas. It now offers anonymized data from 485,000 patients, including approximately 47,000 outcome records, and more than 23 million clinical examination and laboratory test records.

The datasets cover the period from November 2019 to December 2020. Although the first case of COVID-19 in Brazil was reported in February 2020 by the Albert Einstein Jewish Hospital (HIAE) in São Paulo, the longer period covered by the data enables researchers to analyze patient histories and look for evidence of symptoms in patients treated before that date.

“It’s not just an increase in the volume of data, which was expected as a result of the longer period covered. The repository now also holds data from two more institutions, enabling researchers to study a larger universe,” said Claudia Bauzer Medeiros, a professor at the University of Campinas’s Institute of Computing (IC-UNICAMP) and a participant in the project. “The diversity of outcome data has increased, for example. The new datasets cover the entirety of 2020, from the first reported case of the disease in Brazil onwards, enabling researchers to understand better how COVID-19 has progressed and to analyze the various ‘waves’ of the disease in Brazil.”

The previous upload to the repository took place in August 2020, comprising data on patients, outcomes, clinical examinations, and laboratory tests from Fleury Group (a diagnostic medicine company) throughout Brazil as well as HIAE and Hospital Sírio-Libanês (leading private hospitals in São Paulo). The latest upload comprises fresh data from the same institutions and from Hospital das Clínicas (the hospital complex run by the University of São Paulo’s Medical School, FM-USP) and Beneficência Portuguesa (BP, another major private hospital in São Paulo).

“There was demand for recent data in the repository,” Medeiros said. “The injection of data by these new institutions has considerably extended the variety of information available from the platform, as well as its temporal coverage. This extension will enable researchers to cover more ground and facilitate more comprehensive studies.”

Besides data, the participating institutions provide infrastructure, technology, and human resources to make data sharing possible. “Hospital das Clínicas is contributing a significant amount of data on the thousands of patients treated there in 2020 during an extraordinary operation that mobilized the entire institution,” said Tarcisio Eloy Pessoa de Barros Filho, head of FM-USP. “We’re not only contributing a lot of data but also greatly increasing the diversity of the data held by this multi-institutional repository.”

BP has also gone out of its way to provide high-quality data of use to researchers, including date, city, and state of birth, gender, and all COVID-19 diagnostic test results, both positive and negative.

“The inclusion of outcome data also enables researchers to track how cases progressed. All the data we export is completely anonymized in compliance with Brazil’s General Law on the Protection of Personal Data,” said Lilian Quintal Hoffmann, BP’s head of technology and operations.

Other participants in the initiative include the São Paulo State Association for the Development of Medicine (SPDM), Pensi Child Health Institute (linked to Sabará Children’s Hospital), and Real Hospital Português de Beneficência in Recife, Pernambuco State.

Data types

The repository holds three types of data: demographics (gender, year of birth, region of residence); clinical examinations and laboratory tests; and patient transfers and outcomes, when available (e.g. recovery or death).
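For readers who plan to work with the data, the three data types described above can be pictured as three record kinds linked by an anonymized patient identifier. The sketch below is only an illustration under that assumption: the class names, field names, and the `build_history` helper are hypothetical and do not reflect the repository's actual file layout or column names.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Demographics; all field names here are hypothetical.
    patient_id: str   # anonymized identifier
    gender: str
    birth_year: int
    region: str       # region of residence


@dataclass
class ExamRecord:
    # One clinical examination or laboratory test result.
    patient_id: str
    exam_name: str
    result: str


@dataclass
class Outcome:
    # Transfer or outcome, present only when available.
    patient_id: str
    status: str       # e.g. "recovery" or "death"


def build_history(patients, exams, outcomes):
    """Group exam records and the optional outcome under each patient."""
    history = {
        p.patient_id: {"patient": p, "exams": [], "outcome": None}
        for p in patients
    }
    for exam in exams:
        if exam.patient_id in history:
            history[exam.patient_id]["exams"].append(exam)
    for outcome in outcomes:
        if outcome.patient_id in history:
            history[outcome.patient_id]["outcome"] = outcome
    return history
```

Because outcomes are recorded only when available, a patient's entry may keep `None` as its outcome; analyses that track how cases progressed would need to filter on that.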

Based on an analysis of the laboratory test data pertaining to almost 179,000 people tested for SARS-CoV-2 in Brazil, 33,200 of whom tested positive, a group of Brazilian researchers identified different clinical profiles influenced by the patient’s biological sex and age, as well as the severity of the disease (read more at: agencia.fapesp.br/34191/). 

Besides scientific research, the data available from the platform has also been used by private enterprise to develop technology for combating COVID-19, such as an artificial intelligence (AI) system developed by the startup DiagoNow to help diagnose diseases and predict outcomes.

The AI system uses patient data to create indicators and help medical teams make clinical decisions. It can detect COVID-19 false negatives with 95% accuracy using no more than a full blood count (FBC). To reach this level of accuracy, the researchers who participated in the project used more than 30,000 FBC results available from the repository.

“The system isn’t available commercially yet,” said Molina Garcia, strategic leader of the project. “We’re moving ahead in partnership with Fleury and institutions such as Unimed [medical insurance and medical co-op] in Florianópolis [Santa Catarina State], Santa Casa de Ourinhos [São Paulo State] and BP that are helping us validate the platform.” Garcia is an undergraduate at the University of São Paulo in São Carlos, where he is studying computer engineering.

Source: https://agencia.fapesp.br/35348