Quality Data, Quality Findings


Stanford’s John Ioannidis recently joined 71 other methodologists in proposing that we lower the p-value threshold for claiming statistical significance in research from .05 to .005. This proposal is intended to reduce the rate of false positives and improve reproducibility in scientific research.
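For readers who want a concrete sense of what a p-value measures, here is a minimal Python sketch, using only the standard library and invented outcome scores, that estimates a two-sided p-value with a permutation test: how often would a group difference at least as large as the observed one arise if group labels were assigned at random?

```python
import random
import statistics

random.seed(42)

# Hypothetical data: outcome scores for a treated and a control group.
treated = [8.1, 7.9, 9.2, 8.5, 8.8, 9.0, 7.7, 8.4]
control = [7.2, 7.8, 7.5, 8.0, 7.1, 7.6, 7.9, 7.3]

observed = statistics.mean(treated) - statistics.mean(control)

pooled = treated + control
n_treated = len(treated)
n_permutations = 10_000
extreme = 0

for _ in range(n_permutations):
    # Shuffle the pooled scores, then split them into two pseudo-groups
    # of the original sizes, simulating "no real group difference."
    random.shuffle(pooled)
    diff = (statistics.mean(pooled[:n_treated])
            - statistics.mean(pooled[n_treated:]))
    if abs(diff) >= abs(observed):  # two-sided: count extremes in either direction
        extreme += 1

p_value = extreme / n_permutations
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")
```

Under the proposed stricter threshold, a result like this would need p below .005, not merely .05, to be called statistically significant.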

On the other hand, given the lax, inaccurate, or confusing ways the p-value has been applied, other researchers, such as Jonas Ranstam, have called for abandoning the p-value entirely in favor of confidence intervals. (A confidence interval is a range of values that, with a stated level of confidence, is likely to contain the true value, such as the degree to which a medication improves sleep or the likelihood that two samples came from the same genus.)
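To make the confidence-interval idea concrete, here is a minimal Python sketch, again standard library only and with made-up sleep-improvement data, that computes a 95% confidence interval for a sample mean using the normal approximation:

```python
import math
import statistics

# Hypothetical data: minutes of extra sleep for 20 trial participants.
extra_sleep = [34, 12, 45, 27, 18, 39, 22, 31, 16, 41,
               25, 33, 19, 28, 37, 14, 30, 23, 36, 21]

n = len(extra_sleep)
mean = statistics.mean(extra_sleep)
sd = statistics.stdev(extra_sleep)  # sample standard deviation
sem = sd / math.sqrt(n)             # standard error of the mean

# 95% confidence interval via the normal approximation (z = 1.96).
z = 1.96
ci_low, ci_high = mean - z * sem, mean + z * sem

print(f"mean = {mean:.1f} min, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The interval, rather than a single yes/no significance verdict, conveys both the size of the estimated effect and the precision of the estimate.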

Of course, given the nature of science and scientific research—namely, that we must rely upon sampling because we cannot study every person or every cell—neither the p-value nor confidence intervals can be the perfect arbiter of scientific “truth.” Instead, these figures only help us determine the extent to which scientific results can be generalized beyond the specific group (of people, of cells) tested or experimented upon.

So, p-value or confidence interval—take your pick.

To me, the important issue is whether the findings of a research project can guide, with sufficient confidence, the next researcher's work or a clinician's selection of a therapeutic course.

Ultimately, quality findings require quality data.

That’s why one of the goals of our new strategic plan targets improving the data gathered and analyzed in research studies.

NLM is investing in clinical terminologies to improve how the data collected during research and clinical care are labeled. Properly labeled data not only provide trustworthy indicators of the phenomena under study; they also allow researchers to more readily combine data from different studies, thus supporting data reuse and expanding the possibility of new findings from those data.

NLM is also investing in strategies to improve data capture and curation, both of which will improve the integrity and precision of data collected during research. Thoughtful, intentional curation and disciplined annotation will also make it easier to locate data sets, increasing the efficiency with which new studies can be designed and implemented.

So pick your side in the debate on the best way to signal significance, but remember that NLM resources are invested at multiple points along the research process, helping to ensure data quality, simplify data discovery, and apply analytical tools to uncover the insights the data hold.

The results of NLM’s investments promise to improve our ability to conduct research and interpret its findings, and in turn, those research improvements will be good for science, good for clinical care, and good for health.

Author: Patti Brennan

Director, US National Library of Medicine
