10 research outputs found

    Agreeing to disagree: reconciling conflicting taxonomic views using a logic-based approach

    Taxonomy alignment is a way to integrate two or more taxonomies. Semantic interoperability between datasets, information systems, and knowledge bases is facilitated by combining the different input taxonomies into merged taxonomies that reconcile apparent differences or conflicts. We show how alignment problems can be solved with a logic-based region connection calculus (RCC-5) approach, using five base relations to compare concepts: congruence, inclusion, inverse inclusion, overlap, and disjointness. To illustrate this method, we use different “geo-taxonomies”, which organize the United States into several, apparently conflicting, geospatial hierarchies. For example, we align T(CEN), a taxonomy derived from the Census Bureau’s regions map, with T(NDC), from the National Diversity Council (NDC), and with T(TZ), a taxonomy capturing the U.S. time zones. Using these case studies, we show how this logic-based approach can reconcile conflicts between taxonomies. We have implemented these case studies with an open-source tool called Euler/X, which has been applied primarily for solving complex alignment problems in biological classification. In this paper, we demonstrate the feasibility and broad applicability of this approach to other domains and alignment problems in support of semantic interoperability. Funding: DEB-1155984, DBI-1342595, DBI-1643002.
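
The five RCC-5 base relations named in this abstract can be illustrated with a minimal Python sketch. It assumes each concept is modelled as a non-empty set of atomic members, which is a simplification for illustration and not how Euler/X is described as working (the abstract describes a logic-based approach). The names `RCC5` and `rcc5_relation` and the region sets are hypothetical, and the sets are not the actual contents of T(CEN), T(NDC), or T(TZ).

```python
from enum import Enum


class RCC5(Enum):
    """The five RCC-5 base relations (enum names are illustrative)."""
    CONGRUENCE = "congruence"
    INCLUSION = "inclusion"                  # first concept lies properly inside the second
    INVERSE_INCLUSION = "inverse inclusion"  # first concept properly contains the second
    OVERLAP = "overlap"
    DISJOINTNESS = "disjointness"


def rcc5_relation(a: frozenset, b: frozenset) -> RCC5:
    """Return the RCC-5 relation between two non-empty sets of members.

    Assumption: concepts are compared purely by their members.
    """
    if not a or not b:
        raise ValueError("RCC-5 regions must be non-empty")
    if a == b:
        return RCC5.CONGRUENCE
    if a < b:  # proper subset
        return RCC5.INCLUSION
    if a > b:  # proper superset
        return RCC5.INVERSE_INCLUSION
    if a & b:  # some shared members, neither contains the other
        return RCC5.OVERLAP
    return RCC5.DISJOINTNESS


# Hypothetical regions built from placeholder member names.
region_a = frozenset({"s1", "s2", "s3"})
region_b = frozenset({"s2", "s3", "s4"})
print(rcc5_relation(region_a, region_b).value)
```

Exactly one of the five relations holds for any pair of non-empty regions, which is why the function can return on the first matching test.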

    Context determines content: an approach to resource recommendation in folksonomies

    By means of tagging in social bookmarking applications, so-called folksonomies emerge collaboratively. Folksonomies have been shown to contain information that is beneficial for resource recommendation. However, because folksonomies are not designed to support recommendation tasks, the various recommendation techniques have drawbacks. Graph-based recommendation in folksonomies, for example, suffers from the problem of concept drift. Vector-space-based recommendation approaches in folksonomies suffer from sparseness of available data. In this paper, we propose the flexible framework VSScore, which incorporates context-specific information into the recommendation process to tackle these issues. Additionally, as an alternative to the evaluation methodology LeavePostOut, we propose an adaptation, LeaveRTOut, for resource recommendation in folksonomies. In a subset of the resource recommendation tasks evaluated, the proposed recommendation framework VSScore performs significantly more effectively than the baseline algorithm FolkRank.

    Resolving "orphaned" non-specific structures using machine learning and natural language processing methods

    Scholarly publications of biodiversity literature contain a vast amount of information in human-readable format. The detailed morphological descriptions in these publications contain rich information that can be extracted to facilitate analysis and computational biology research. However, the idiosyncrasies of morphological descriptions still pose a number of challenges to machines. In this work, we demonstrate the use of two different approaches to resolve meronym (i.e. part-of) relations between anatomical parts and their anchor organs: a syntactic rule-based approach and an SVM-based (support vector machine) method. Both methods made use of domain ontologies. We compared the two approaches with two other baseline methods, and the evaluation results show that the syntactic methods (92.1% F1 score) outperformed the SVM methods (80.7% F1 score) and that the part-of ontologies were valuable knowledge sources for the task. It is notable that the mistakes made by the two approaches rarely overlapped. Additional tests will be conducted on the development version of the Explorer of Taxon Concepts toolkit before we make the functionality publicly available. Meanwhile, we will further investigate and leverage the complementary nature of the two approaches to further drive down the error rate, as in practical application, even a 1% error rate could lead to hundreds of errors. Funding: National Science Foundation [NSF DBI-1147266]. Open access. This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries.

    Bringing a Semantic MediaWiki Flora to Life

    The existing web representation of the Flora of North America (FNA) project needs improvement. Despite being electronically available, it has little more functionality than its printed counterpart. Over the past few years, our team has been working diligently to build a new, more effective online presence for the FNA. The main objective is to capitalize on modern Natural Language Processing (NLP) tools built for biodiversity data (Explorer of Taxon Concepts or ETC; Cui et al. 2016), and to present the FNA online in both machine- and human-readable formats. With machine-comprehensible data, the mobilization and usability of flora treatments are enhanced and capabilities for data linkage to a Biodiversity Knowledge Graph (Page 2016) are enabled. For example, usability of treatments increases when morphological statements are parsed into finely grained pieces of data using ETC, because these data can be easily traversed across taxonomic groups to reveal trends. Additionally, the development of new features in our online FNA is facilitated by FNA data parsing and processing in ETC, including a feature to enable users to explore all treatments and illustrations generated by an author of interest. The current status of the ongoing project to develop a Semantic MediaWiki (SMW) platform for the FNA is presented here. New features recently implemented are introduced, challenges in assembling the Semantic MediaWiki are discussed, and future opportunities, which include the integration of additional floras and data sources, are explored. Furthermore, implications of the standardization of taxonomic treatments, which work such as this entails, will be discussed.