Scraping Meaning from Technical Information: Semantic Mining of Chemical Information From Non-Chemistry Disciplines

Abstract

Presented at the Georgia Tech Career, Research, and Innovation Development Conference (CRIDC), January 27-28, 2020, Georgia Tech Global Learning Center, Atlanta, GA. CRIDC is designed to equip on-campus and online graduate students with tools and knowledge to thrive in an ever-changing job market. Aaron Pital, in the School of Chemistry and Biochemistry at Georgia Tech, was the winner of a College of Science Travel Award.

The pace of publication in the sciences has long since outstripped human ability to read and synthesize information. While interdisciplinary work can mitigate some of this burden, there remain fundamental questions about whether attentional blindness and the opportunity cost of reaching beyond the comfort of one’s expertise hold back innovation in speculative fields such as the origins of life. I present a brief model of associative information in scientific publication and propose tools derived from information theory, natural language processing, and data science to search for physical and chemical contexts embedded in literature from fields as diverse as soil science and drug design. The goals of these efforts are 1) to identify physical and chemical information of interest to select communities that would otherwise be unlikely to reach those communities’ attention and 2) to define general rules for correlated information in order to improve literature cataloging, cross-referencing, and searching.
