Let’s meet Zhiyong Lu, PhD, a Senior Investigator for our National Center for Biotechnology Information (NCBI) Computational Biology Branch and member of the NLM Intramural Research Program. He is leading a team of research scientists to harness the power of machine learning and artificial intelligence (ML/AI) to advance the ways our products and services work—advancements that benefit scientific and bioinformatics communities every day!
For example, if you’ve ever used LitCOVID to access PubMed and PubMed Central articles and research topics related to COVID-19, you can thank Dr. Lu and his lab for developing the COVID-19 literature search system and the cutting-edge ML algorithms needed to retrieve relevant information even before March 2020. They tackled the many ways COVID-19 was being studied, and the varied language researchers used in those studies, by putting themselves in the shoes of anyone looking for relevant information:
What were they likely to search for? What keywords would they use in those searches? Would their searches turn up everything they needed, or could they potentially miss things because of ever-evolving lingo?
Another tool he and his team created is DeepSeeNet, an AI tool that helps doctors and scientists use information from one organ (the eyes) to diagnose eye diseases such as age-related macular degeneration and to gain insight into conditions elsewhere in the body, such as heart attack and cognitive function decline. But keep reading: we’ll have him speak for himself on these and many other efforts.

To say that he’s been instrumental to NLM’s efforts to accelerate discovery and improve health care is a vast understatement! And he knows that behind him is a talented team of budding biomedical researchers who bring fresh, new perspectives to ML/AI, which will only further NLM’s goal of enabling biomedical research and supporting health care across the lifespan.
Now let’s turn to Dr. Lu: Learn more about the person behind the research and see what he has to say about his team and their work!
What makes your team unique? Tell us more about the people working in your lab.
When I think of our group, I am amazed by the breadth and diversity of everyone’s knowledge and experience.
While their primary background is computer science, it also includes physics, chemistry, and medicine. And the computer science backgrounds include systems, language, machine learning, etc. We also have team members from around the world at different career stages, ranging from students in their teens and 20s to senior scientists in their 60s and 70s.
What we study is broad. Some of us study text, others work on images. Our text studies include research literature and clinical notes. We not only perform cutting-edge research; we also make practical improvements to search algorithms and tools, making the PubMed literature more available and useful to the public.
What is your advice for young scientists or people interested in pursuing a career in research?
Choosing a good research topic and project is critically important. It is never too late to study something that excites you. Also important is to find a good mentor who is willing to share their wisdom with you and help you navigate challenges and difficulties throughout the course of your study.
What do you enjoy about working at NLM?
I have enjoyed working with many talented people at NLM every day for the last 15 years! In addition, seeing our research being used by millions of users worldwide is very rewarding.
Where is your favorite place to travel, and where are you planning to travel this year?
My hometown—Suzhou, China—is one of my favorite places to visit, although such a trip has been disrupted by the pandemic.
I am planning a trip to Denmark in March 2023.
What inspires you?
I chose this career because research is a fun thing to do! Having the opportunity to see my research in real-world use is a plus.
You’ve read his words, and now you can hear him for yourself! Follow our NLM YouTube page for more exciting content from the NLM staff that makes it all possible. If you’d like to learn more about our Intramural Research Program (IRP), view job opportunities, and see research highlights, I invite you to visit our NLM IRP webpage.
Transcript [Lu]*: Our research focuses on AI and machine learning for processing both biomedical text and image data. Using machine learning and AI techniques, we address a set of different problems. So these are different examples:
We find that it’s increasingly difficult for individual researchers to keep up with the rapid growth of the biomedical literature. We’re talking about over a million articles in PubMed every year, and that means two or three papers every minute.
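As a rough check on that figure, the short Python sketch below converts a yearly article count into a per-minute rate. The function name and the 1.5 million input are illustrative assumptions, not official PubMed statistics; only the "over a million" figure comes from the transcript.

```python
# Back-of-the-envelope check: articles per year -> articles per minute.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def papers_per_minute(papers_per_year: float) -> float:
    """Average number of new papers per minute for a given yearly total."""
    return papers_per_year / MINUTES_PER_YEAR


# 1,000,000 is the lower bound quoted above; 1,500,000 is a hypothetical
# higher value used only to show how the rate scales.
for yearly in (1_000_000, 1_500_000):
    print(f"{yearly:,} papers/year is about {papers_per_minute(yearly):.1f} per minute")
```

At one million articles a year the rate is a little under two per minute, and it approaches three as the yearly total nears 1.5 million, which is consistent with "two or three papers every minute."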
We developed a new search algorithm called “Best Match.” Our priority is to put the most relevant papers at the top of the search results, on the first or second page, because that’s where most of the user action is. This algorithm is used in PubMed by millions of users on a daily basis.
The second example is our response to the COVID-19 pandemic. At the beginning of the pandemic, it was difficult to locate all the papers on COVID-19 because the term “COVID-19” was not used by the research community until later in that year [2020]. As a result, there are many different expressions used in the literature to refer to the COVID-19 pandemic. So a simple keyword search would miss many relevant papers.
In response, we put together a database called “LitCOVID,” using machine learning and, in particular, text classification algorithms that go beyond keyword matching to find all the papers relevant to COVID-19, regardless of what kind of expressions were used by the authors.
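To make the difference between keyword matching and text classification concrete, here is a minimal Python sketch. It is not the LitCOVID method, which the transcript does not describe in detail. It swaps in a toy naive Bayes classifier, and the training titles, labels, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical labeled titles (True = about COVID-19). Not real LitCOVID data.
TRAIN = [
    ("novel coronavirus pneumonia outbreak in wuhan", True),
    ("clinical features of patients infected with 2019-ncov", True),
    ("sars-cov-2 transmission in households", True),
    ("covid-19 vaccine trial results", True),
    ("retinal imaging for macular degeneration", False),
    ("influenza vaccine effectiveness in children", False),
    ("deep learning for chest x-ray interpretation", False),
    ("statin therapy and heart attack risk", False),
]


def keyword_match(title, keyword="covid-19"):
    """Plain keyword search: true only if the exact term appears."""
    return keyword in title.lower()


def train(examples):
    """Count words per class for a bag-of-words naive Bayes model."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for title, label in examples:
        counts[label].update(title.lower().split())
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def classify(title, model):
    """Return True if the title scores higher for the COVID-19 class."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in (True, False):
        n_tokens = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)  # class prior
        for tok in title.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            score += math.log((counts[label][tok] + 1) / (n_tokens + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]


model = train(TRAIN)
query = "pneumonia outbreak linked to novel coronavirus"
print(keyword_match(query))    # the term "covid-19" never appears in this title
print(classify(query, model))  # the classifier relies on related wording instead
```

The example title never uses the term "COVID-19," so the keyword search misses it, while the classifier flags it based on words it has seen in related titles. A production system would need far more training data and a stronger model than this toy.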
My last example is our joint work with our clinical researchers at the National Eye Institute.
We developed an AI tool called “DeepSeeNet” for making diagnosis and prognosis in retinal diseases such as age-related macular degeneration. Not only can we use retinal images for predicting eye diseases; we can use retinal images to gain insights for other systemic diseases in other parts of the body.
In one of the recent works, we used retinal images to predict heart attacks with very high accuracy, and right now we are pursuing a similar project where we use retinal images to predict brain diseases such as dementia and cognitive function decline.
In the long run, what we really aim to do is to teach computers to read and understand scientific papers like scientists [and] to interpret x-rays or retinal images like radiologists and ophthalmologists for disease diagnosis at a speed and at an accuracy that’s above and beyond human ability.
*Transcript edited for clarity.