The cardinal rule of science is that hypotheses must be proved. Neither strongly held beliefs nor good faith is good enough.

It is a wonder, then, that the Federal Bureau of Investigation, which invokes science when it uses forensic evidence in prosecutions, refuses to release a large body of data that could strengthen the validity of DNA testing, or tell us, perhaps, what we’re doing wrong.

Last month, Science magazine, America’s leading scientific journal, published a letter from 41 research scientists, forensic scientists, statisticians and legal scholars calling on the FBI to give qualified researchers access to the FBI’s 15-year-old National DNA Index System. We were among the signatories.

The FBI has never published the NDIS data or publicly described the results of its own research with the data. It is time to see what we can learn.

The NDIS data set contains the genetic profiles of more than 7 million people, most of whom have been convicted of serious crimes, such as rape. Apart from recording the individual’s gender, each genetic profile targets only 13 points, or “loci” (out of millions that are available), on the human DNA double helix. The 13 targeted loci are believed to be “noncoding,” meaning the genetic information at those locations is very good for identifying individuals but has no direct role in traits such as eye or hair color, or susceptibility to certain diseases.

The letter to Science calls on the FBI to release the NDIS data after stripping out any identifying information that could link a DNA profile to a specific person.

The free and open sharing of data is at the very heart of the scientific process. Scientists are expected to show their evidence and share data, even—and perhaps especially—when the data do not support their own theories.

The NDIS data contain information on whether the practice of forensic DNA profiling aligns with the facts of human genetics. There is much that science could learn from the data.

In many rape cases, for example, the number of “contributors” (that is, the number of individuals who may have had sexual contact with the victim) is an important issue. The NDIS data could help us learn how often three-person mixtures produce DNA profiles that appear to have been produced by just two people.
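To make the mixture question concrete, here is a minimal simulation sketch. It is not the FBI’s method or any laboratory’s protocol. It rests on assumptions of our own that real data would have to replace: every locus has the same number of equally common alleles (the hypothetical `n_alleles`), the three contributors are unrelated, and every allele in the mixture is detected. Under the simple allele-counting rule, two people can show at most four distinct alleles at a locus, so a three-person mixture “looks like” a two-person one if no locus shows more than four.

```python
import random


def simulate_masking(n_loci=13, n_alleles=8, trials=20000, seed=0):
    """Estimate the share of three-person mixtures that show at most
    four distinct alleles at every locus (so they could pass as
    two-person mixtures under a simple allele count).

    n_alleles is a made-up, uniform allele count per locus; real loci
    have unequal allele frequencies, which is what NDIS could supply.
    """
    rng = random.Random(seed)
    masked = 0
    for _ in range(trials):
        looks_like_two = True
        for _ in range(n_loci):
            # Three people contribute two alleles each at this locus.
            alleles = {rng.randrange(n_alleles) for _ in range(6)}
            if len(alleles) > 4:  # more than two people could explain
                looks_like_two = False
                break
        if looks_like_two:
            masked += 1
    return masked / trials


if __name__ == "__main__":
    for k in (6, 8, 12):
        print(k, "alleles per locus:", simulate_masking(n_alleles=k))
```

The figures this prints depend entirely on the invented allele frequencies, so they illustrate the shape of the question rather than answer it. An answer would require real profiles of the kind held in NDIS.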

The statistical weight of a DNA profile match is affected by the number of close relatives in the pool of alternative suspects. Scientific “kinship analysis” of NDIS data would allow researchers to assess how the probability of a match is affected by the presence or absence of relatives in NDIS.

The data could also yield valuable insights into how often, and under what circumstances, data errors occur.

Humans are ultimately responsible for the processing, interpretation and maintenance of the DNA profiles entered into the database. A review of a much smaller Australian offender DNA database suggests that errors occur in one out of every 300 or so entries. At the very least, this raises significant concerns about missed opportunities to develop investigative leads. Knowing more about the kinds of errors that most frequently occur will enable us to avoid or fix them.
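For a rough sense of scale, the arithmetic below applies the Australian rate to a database the size of NDIS. Whether that rate carries over to NDIS at all is purely an assumption made here for illustration; only access to the data could show what the real figure is. The function name `expected_errors` is ours.

```python
def expected_errors(n_profiles, error_rate):
    """Expected number of erroneous entries at a given per-entry error rate."""
    return n_profiles * error_rate


if __name__ == "__main__":
    ndis_profiles = 7_000_000   # "more than 7 million" profiles
    assumed_rate = 1 / 300      # Australian review; assumed, not measured, for NDIS
    print(round(expected_errors(ndis_profiles, assumed_rate)))  # prints 23333
```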

The 1994 legislation that established NDIS explicitly anticipated that database records would be made available for research and quality control “if personally identifiable information is removed.” Release of the data would also be consistent with a March 2009 presidential memorandum to federal agency and executive department heads stating, “If scientific and technological information is developed and used by the Federal Government, it should ordinarily be made available to the public.”

The time has come to release the FBI’s DNA data to qualified researchers. Some of the things that are learned may make it harder for the government to secure convictions with DNA evidence.

But it is in everyone’s interest that scientific evidence is actually scientific.