Context Matters: Using Modern Comparative Genomics Resources to Revolutionize Your Research

Guest post by Valerie Schneider, PhD, Acting Chief of the Information Engineering Branch at the National Library of Medicine (NLM) National Center for Biotechnology Information (NCBI).

Over the last few years, we’ve shared with you how NCBI has focused on maximizing the impact of eukaryotic organisms and their genomic data on biomedical research through development of the NIH Comparative Genomics Resource (CGR). Eukaryotes are any single-celled or multicellular organisms whose cells contain a distinct and membrane-bound nucleus—essentially, all organisms that aren’t bacteria, archaea, or viruses. Yes, CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI toolkit of interconnected and interoperable data and tools. But more importantly, the project is (and has been from its outset) a collaboration with those in the genomics community.

Learn more about how comparative genomic analyses and CGR are powerful tools for scientific discovery.

Engaging with the community and gathering stakeholder feedback remains a key priority. We have hosted and participated in feedback sessions, user interviews, scientific conferences, webinars, and more. Members of our NLM Board of Regents CGR Working Group have helped us raise awareness of CGR with professional societies and individual researchers alike. To date, we’ve logged more than 350 distinct engagements with users, and we’re looking forward to many more!

You might be surprised to learn that the feedback we receive doesn’t only inform the development of resources in the NCBI toolkit; it also creates a parallel feedback loop that helps improve our communications about CGR itself. What have we learned about our outreach from feedback? In a nutshell… context matters! Our stakeholders get the most excited about CGR when they can relate to our messaging, whether that means understanding how resources in the NCBI Toolkit can answer their research questions or how their data types are valuable to CGR collaboration.

In response, our outreach has evolved. First, we’ve revised the descriptions of CGR tools to be more outcome-focused. For example:

What are some things I can do with the NCBI Toolkit?

  • Search and Compare Genomic Sequences. Find regions of similarity between eukaryotic sequences with the Basic Local Alignment Search Tool (also known as BLAST). Try the new ClusteredNR database to explore evolutionary relationships and identify related organisms. 
  • Explore and Download NCBI Sequence Data. Browse genomic data, including sequences, annotation, and metadata, using NCBI Datasets. Download comprehensive packages of genomic or gene sequence data and metadata through the web interface and the command-line tools. 
  • Improve Data Quality Before Submission. Quickly find potential contamination in your sequence submission data using the Foreign Contamination Screening tool, a quality assurance process that you can run yourself. Evaluate your human, mouse, or rat genome assembly for completeness, correctness, and base accuracy with the Assembly Quality Control service.

Second, in a series of NCBI Insights blog posts, we’ve shared how some common areas of comparative genomics research can benefit from CGR. These have included the use of animal models to study cancer, understanding human susceptibility to viral infections such as SARS-CoV-2, and expanding our knowledge of less-researched organisms.

Finally, we’re using CGR Impact Spotlights, a new channel where we explain how CGR tools can benefit research on a variety of topics of broad interest to comparative genomics. These brief summaries highlight previously published research. By using author input to inform the content, NCBI is working with the research community to demonstrate the current and potential impact of CGR resources. The first two spotlights, available at the NCBI website, explored recent work in bioinformatics resource development and in protein family evolution and orthology inference for a particular gene family. Are you interested in spotlighting your work and helping put CGR into context? Email us at!

These recent outreach updates are just one way in which user feedback has informed the CGR effort. As we continue our journey to spur innovation and discovery, we do so with the knowledge that context matters! Please subscribe to receive CGR updates and take advantage of opportunities to provide us with feedback that will improve your user experience.

Valerie Schneider, PhD

Acting Chief, Information Engineering Branch, National Center for Biotechnology Information (NCBI), NLM

Dr. Schneider oversees NCBI’s collection, creation, analysis, organization, curation, and dissemination of data and analysis tools in the areas of molecular biology and genetics, as well as the collection and management of bibliographic information. Most recently, Dr. Schneider served as the Deputy Director of Sequence Offerings and the head of the Sequence Plus Program at NCBI. In those roles, she coordinated efforts associated with the curation, enhancement, and organization of sequence data and oversaw tools and resources that enable the public to access, analyze, and visualize biomedical data. She has managed NCBI’s involvement in the Genome Reference Consortium, an international collective of academic and research institutes tasked with maintaining the value of the human reference genome assembly.
