Picture of Heart Lake from the summit of Mount Jo in the Adirondacks

Current projects

Stay tuned for updates!

Past projects

Applications of automated information extraction in the plant sciences

My dissertation research focused on the application of automated information extraction to plant science problems, seeking to simplify and streamline the process of information propagation among scientists to advance scientific progress.


In a PICKLE: Building out training resources for natural language processing in the plant sciences

Many state-of-the-art algorithms for information extraction tasks require high-quality labeled data in the target domain, in which entities like genes and proteins, as well as the relationships between entities, are labeled according to a set of annotation guidelines. While guidelines and datasets exist for other domains, these resources still need development in the plant sciences. In this project, we develop the Plant ScIenCe KnowLedgE Graph (PICKLE) corpus and provide our annotation guidelines along with an initial corpus of 250 annotated abstracts. Additionally, we perform an analysis of the impact of adding new types to the evaluation of

This work has been accepted to in silico Plants! The dataset is available on Zenodo and Hugging Face.

Bridging disciplinary gaps in desiccation tolerance research through bibliometric research

Using a citation network, we have demonstrated that while desiccation tolerance is a biological phenomenon that exists across kingdoms of life, researchers studying those kingdoms very rarely cite literature from other study systems. Using bibliometric analysis, we examine some of the disciplinary biases in the study of desiccation tolerance and propose a rule-based algorithm that recommends new attendees for specialized academic conferences to improve integration between biological study systems.

This work is currently available as a pre-print.

Updated: 13 January 2025