Paper accepted for publication in Journal of Information Systems. Retrieved 6/26/2006 from http://www.ischool.drexel.edu/faculty/thu/My%20Publication/Journal-papers/JIS_hu2006.pdf.

The novel connection between Raynaud disease and fish oils was
uncovered from two disjoint biomedical literature sets by Swanson in 1986.
Since then, there have been many approaches to uncover novel connections
by mining the biomedical literature. One of the popular approaches is to adapt
the Association Rule (AR) method to automatically identify implicit novel
connections between concept A and concept C from two disjoint sets of
documents through an intermediate concept B. Since the A and C concepts do not
occur together in the same data set, the mining goal is to find novel connections
between A and C concepts across the disjoint data sets. The approach first applies association rule
mining to the two disjoint biomedical literature sets separately to generate two rule
sets (A→B, B→C), and then applies the transitive law to obtain the novel connections
A→C. However, this approach generates a huge number of possible
connections among the millions of biomedical concepts, and many of these
hypothetical connections are spurious, useless and/or biologically meaningless.
Thus it is essential to develop a new approach that generates highly likely, novel, and
biologically relevant connections among the biomedical concepts. This paper
presents a Biomedical Semantic-based Association Rule System (Bio-SARS)
that significantly reduces spurious, useless, and biologically irrelevant connections
through semantic filtering. Compared to other approaches such as latent semantic indexing (LSI) and the
traditional association rule-based approach, our approach generates far fewer
rules, and many of these rules represent relevant connections among biological
concepts.
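
The two-step procedure described above (mine rules in each literature set separately, then chain them through a shared intermediate concept) can be illustrated with a minimal Python sketch. It is not the Bio-SARS implementation: the function names, the support and confidence thresholds, the toy documents, and the `semantic_type` lookup used as a filter are all assumptions made for illustration, and the abstract does not specify how the semantic filter is actually defined.

```python
from collections import defaultdict
from itertools import permutations


def mine_rules(docs, min_support=2, min_confidence=0.5):
    """Mine pairwise rules x -> y from one document set.

    Each document is a set of concepts. A rule is kept when x and y co-occur
    in at least `min_support` documents and
    confidence = count(x, y) / count(x) >= min_confidence.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    single = defaultdict(int)
    pair = defaultdict(int)
    for doc in docs:
        for concept in doc:
            single[concept] += 1
        for x, y in permutations(doc, 2):
            pair[(x, y)] += 1
    return {
        (x, y): n / single[x]
        for (x, y), n in pair.items()
        if n >= min_support and n / single[x] >= min_confidence
    }


def candidate_connections(rules_ab, rules_bc, keep=lambda a, b, c: True):
    """Chain A -> B and B -> C rules into A -> C candidates (transitive law).

    `keep` is a hypothetical hook for a semantic filter; by default every
    chained triple is accepted. Returns {(a, c): set of linking B concepts}.
    """
    targets_by_b = defaultdict(list)
    for b, c in rules_bc:
        targets_by_b[b].append(c)
    found = defaultdict(set)
    for a, b in rules_ab:
        for c in targets_by_b.get(b, []):
            if c != a and keep(a, b, c):
                found[(a, c)].add(b)
    return dict(found)


# Toy, hand-made document sets (not real data): the two literatures never
# mention "raynaud disease" and "fish oil" together.
raynaud_docs = [
    {"raynaud disease", "blood viscosity"},
    {"raynaud disease", "blood viscosity", "platelet aggregation"},
    {"raynaud disease", "platelet aggregation"},
]
fish_oil_docs = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
]

rules_ab = mine_rules(raynaud_docs)
rules_bc = mine_rules(fish_oil_docs)
connections = candidate_connections(rules_ab, rules_bc)

# Hypothetical semantic filter: accept a chain only if the intermediate
# concept has an allowed semantic type (the type labels are assumptions).
semantic_type = {"blood viscosity": "physiologic function"}
allowed_types = {"disease or syndrome"}
filtered = candidate_connections(
    rules_ab,
    rules_bc,
    keep=lambda a, b, c: semantic_type.get(b) in allowed_types,
)
```

With these toy inputs, `connections` links "raynaud disease" to "fish oil" through "blood viscosity", while the deliberately strict filter in `filtered` rejects that chain. The point of the hook is only to show where a semantic constraint could prune candidate A→C connections before they are reported.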