By Veronika Kyrylenko
Last week, a nonprofit watchdog group posted evidence suggesting the National Institutes of Health (NIH) deleted genetic sequencing data on SARS-CoV-2 at the request of Wuhan University. An NIH representative denied the data was “deleted,” while admitting that it was, in fact, “suppressed.”
In a March 31 email to The Epoch Times, NIH Media Branch Chief Amanda Fine wrote, “They [genetic sequences of the virus] were not deleted. This is a really important point, and I’ve highlighted what did happen from what we provided to you earlier this week.”
The outlet said the information Fine referred to had been included in a story it published on March 29:
“‘In June 2020, in response to a request by the same [Wuhan] researcher, National Center for Biotechnology [NCBI] gave the sequence data the status of “withdrawn,” which removes sequencing data from all public means of access but does not delete them.
“‘NCBI subsequently reassigned the status of the sequence data to “suppressed,” which means that sequence data are removed from the search process but can be directly found by accession number. This action to reassign the data was identified as part of NLM’s [National Library of Medicine] ongoing review into the matter. We are working to make more information available,’ the spokesperson said.”
NCBI, the National Center for Biotechnology Information, is a subunit of the NIH’s National Library of Medicine (NLM) and serves as the U.S. partner of the International Nucleotide Sequence Database Collaboration (INSDC).
On March 29, the Empower Oversight Whistleblowers & Research (EO) group published 238 pages of NIH emails and other documents that were obtained through a Freedom of Information Act (FOIA) request.
The documents show that a Wuhan University researcher submitted SARS-CoV-2 sequence information to the NIH’s Sequence Read Archive (SRA) in March 2020.
The unnamed scientist made an additional submission on the virus in June 2020. Later that day, however, he asked the NIH to retract the submission “because of error.”
The NIH declined the request, arguing that it preferred editing or replacing submissions over deleting them.
The emails showed that a few days later, the researcher resubmitted his request. This time, the NIH agreed to the request, and asked for clarification on whether a previous submission should be deleted as well.
“Yes, I want to withdraw both 2 submissions” as well as all “The Bioprojects, Biosamples and all SRA objects,” confirmed the Wuhan researcher.
The NIH then confirmed that it “had withdrawn everything.”
Correspondence between the NIH and the media, together with internal NIH emails, suggests that the institute misled reporters about its policy on removing sequences from its database.
For example, on June 19, 2021, an NIH official from the information and engineering branch wrote in an internal email, “The only way data is removed from the SRA … is if a submitter notifies us that the submission was in error.” However, those were not the stated grounds for the June 2020 removal of the genetic sequences identified in the submissions in question.
“Moreover, INSDC policy does not require that data be removed in the case of [an] erroneous submission, NIH refused to remove [the first] Submission ID SUB7554642 when Wuhan University initially claimed that it had been submitted in error,” EO points out.
In an email responding to an American Association for the Advancement of Science (AAAS) inquiry about the withdrawn sequence, the NIH’s Renate Myles wrote on June 23, 2020, that researchers who submit data to the SRA hold the rights to that data, implying that those rights include having the data removed from the SRA. “Submitting investigators hold the rights to their data and can request withdrawal of the data,” she said.
Myles also wrote, “The requestor indicated the sequence information had been updated, was being submitted to another database, and wanted the data removed from the SRA to avoid version control issues.”
EO found that this rationale appears to contradict INSDC’s written statement on data-sharing during the pandemic, which encourages submissions to multiple databases: “In cases where scientists have already established submissions to other databases, these submissions should continue in parallel to the INSDC submission.”
Furthermore, despite the fact that the NIH still has copies of all withdrawn sequences “for preservation purposes,” it has refused to examine them.
In October 2021, Fred Hutchinson Cancer Research Center evolutionary biologist Dr. Jesse Bloom contacted the NIH to discuss cooperating to analyze the deleted sequences. The NIH’s Steve Sherry dismissed the proposal, saying, “As you know, when data sets are withdrawn from the database, that status does not permit use for further analyses.”
Professor Bloom was the first to spot the deleted sequences. According to FredHutch.org, “In a report first published on the preprint server bioRxiv on June 22, [Bloom] reported uncovering SARS-CoV-2 sequences from early in the Wuhan outbreak that had been deleted from a National Institutes of Health database.” The professor retrieved the data through Google Cloud.
While Bloom’s discovery sparked some media and academic attention, the NIH did not respond to congressional oversight requests or to EO’s FOIA request about the sequence deletions. Only after EO sued to enforce its request did the institute release the redacted documents.
“Discovering the origin of COVID-19 is vital to ensuring that no pandemic like it ever happens again. Yet, shortly after it began, the NIH bowed to the wishes of researchers in Wuhan to terminate public access to genetic sequences that could shed light on how it began,” said Jason Foster, founder and president of EO, in a statement when releasing the FOIA documents.