20 research outputs found

    Ontology Approach in Lens Design

    Using Drools rule-platform for the optical CAD web-application development

    This paper describes the development of a rule-based web application for the problem of starting-point selection in optical design. The system architecture is briefly discussed. A formalization of the optical system representation, based on a formal syntax, is given. Three examples of rules, in natural language and in Drools Rule Language syntax, illustrate how the solution works.
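    The core idea of rule-based starting-point selection can be sketched as follows. This is a minimal Python illustration, not the paper's Drools implementation: the design families, parameter names (f_number, field_angle), and thresholds are all invented assumptions.

```python
# Hypothetical sketch of rule-based starting-point selection for lens design.
# The rules and parameter names below are illustrative assumptions; the paper
# encodes comparable rules in Drools Rule Language instead.

def select_starting_points(spec, rules):
    """Return design families whose rule conditions all match the spec."""
    return [name for name, conditions in rules
            if all(cond(spec) for cond in conditions)]

# Each rule: (design family, list of predicates over the requirements spec).
RULES = [
    ("singlet",       [lambda s: s["f_number"] >= 8,
                       lambda s: s["field_angle"] <= 5]),
    ("Cooke triplet", [lambda s: 4 <= s["f_number"] < 8,
                       lambda s: s["field_angle"] <= 25]),
    ("double Gauss",  [lambda s: s["f_number"] < 4]),
]

spec = {"f_number": 2.8, "field_angle": 20}
print(select_starting_points(spec, RULES))  # ['double Gauss']
```

    A production rule engine such as Drools adds pattern matching, conflict resolution, and externalized rule maintenance on top of this basic condition-action pattern.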

    Geocontext extraction methods analysis for determining the new approach to automatic semantic places recognition

    The goal of this paper is to identify current trends in geocontext extraction methods and to understand which types of geocontext information are most interesting to users. For this purpose, recent research on geocontext analysis was compared. The studies were compared by the type of result achieved, the formalism used, the source data, and their limitations. As the main result of this comparison, a new approach to automatic semantic place recognition is proposed. The approach is based on marking up geotags with user-defined semantic tags. The solution allows extracting information about locations that interest users of location-based services: coordinates together with a set of corresponding natural-language semantic tags. The main advantage of the approach is its simplicity: the method does not rely on any syntactic analysis algorithms during the semantic labeling stage. To illustrate the approach, a general-purpose accident-monitoring service for the Geo2Tag platform is described.
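    The tag-grouping idea described above can be sketched in a few lines of Python. The data model (flat (lat, lon, tag) reports), the centroid aggregation, and the min_reports threshold are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch: users attach free-form semantic tags to coordinates,
# and places are recovered by grouping reports per tag, with no syntactic
# analysis. The data model and threshold are assumptions for illustration.
from collections import defaultdict

def recognize_places(geotags, min_reports=2):
    """Group (lat, lon, tag) reports by tag; a tag reported at least
    min_reports times becomes a recognized place at the reports' centroid."""
    groups = defaultdict(list)
    for lat, lon, tag in geotags:
        groups[tag].append((lat, lon))
    places = {}
    for tag, pts in groups.items():
        if len(pts) >= min_reports:
            places[tag] = (sum(p[0] for p in pts) / len(pts),
                           sum(p[1] for p in pts) / len(pts))
    return places

reports = [(59.93, 30.31, "accident"), (59.94, 30.32, "accident"),
           (59.95, 30.30, "cafe")]
print(recognize_places(reports))  # only 'accident' passes the threshold
```

    An accident-monitoring service of the kind the paper mentions would then surface the recognized places whose tag matches the category of interest.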

    Relation Extraction Datasets in the Digital Humanities Domain and their Evaluation with Word Embeddings

    In this research, we manually create high-quality datasets in the digital humanities domain for the evaluation of language models, specifically word embedding models. The first step comprises the creation of unigram and n-gram datasets for two fantasy novel book series, with two task types each: analogy and doesn't-match. This is followed by the training of models on the two book series with various popular word embedding model types such as word2vec, GloVe, fastText, and LexVec. Finally, we evaluate the suitability of word embedding models for such specific relation extraction tasks in a setting with comparatively small corpus sizes. In the evaluations, we also investigate and analyze particular aspects such as the impact of corpus term frequencies and task difficulty on accuracy. The datasets, the underlying system, and the word embedding models are available on GitHub; they can be easily extended with new datasets and tasks, used to reproduce the presented results, or transferred to other domains.
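    The two task types can be sketched over any set of word vectors via cosine similarity. The toy 2-d vectors below are invented for illustration; the paper evaluates trained word2vec/GloVe/fastText/LexVec models instead.

```python
# Minimal sketch of the analogy and doesn't-match task types over word
# vectors, using cosine similarity. The toy vectors are invented.
import math

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def analogy(vecs, a, b, c):
    """a : b :: c : ?  -- nearest word to vec(b) - vec(a) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vecs[b], vecs[a], vecs[c])]
    candidates = (w for w in vecs if w not in (a, b, c))
    return max(candidates, key=lambda w: cos(vecs[w], target))

def doesnt_match(vecs, words):
    """Word least similar, on average, to the others in the list."""
    return min(words, key=lambda w: sum(cos(vecs[w], vecs[o])
                                        for o in words if o != w))

vecs = {"king": [1.0, 1.0], "queen": [1.0, -1.0],
        "man":  [2.0, 1.0], "woman": [2.0, -1.0], "apple": [-1.0, 0.0]}
print(analogy(vecs, "man", "woman", "king"))                  # 'queen'
print(doesnt_match(vecs, ["king", "queen", "man", "apple"]))  # 'apple'
```

    Libraries such as gensim expose equivalent operations (e.g. most_similar and doesnt_match on KeyedVectors) for trained models.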

    RuLegalNER: a new dataset for Russian legal named entity recognition

    We address the scarcity of datasets specifically tailored for legal NER in the Russian language and investigate how models generalize to unseen named entities. A rule-based program developed by legal experts at Tag-Consulting Company was employed to automatically annotate legal texts and create the RuLegalNER dataset. Part of the named entities exist only in the development and test splits and are unseen in the training set. RuBERT was used as the base architecture for the experimental evaluation, and two architectural extensions were explored: RuBERT with CRF and RuBERT with adapters. These architectures were used to train and evaluate NER models on the RuLegalNER dataset. RuLegalNER can thus be used to train and evaluate legal NER models, improving performance in the legal domain and enabling the study of generalization to unseen entities. A published version of RuLegalNER is presented with detailed statistics, and its usefulness is demonstrated by evaluating modern architectures on it.
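    The "unseen entities" property described above amounts to a set difference over annotated splits. The toy examples and labels below are invented for illustration; RuLegalNER's actual tag set and format may differ.

```python
# Sketch of checking which entity surface forms in a test split never
# occur in training. Toy annotated examples; not RuLegalNER's real data.

def entity_set(split):
    """Collect (entity_text, label) pairs from annotated examples."""
    return {(text, label) for example in split for text, label in example}

train = [[("ГК РФ", "LAW"), ("Иванов", "PERSON")]]
test  = [[("ГК РФ", "LAW"), ("УК РФ", "LAW"), ("Петров", "PERSON")]]

unseen = entity_set(test) - entity_set(train)
print(sorted(unseen))  # entities that appear only in the test split
```

    Evaluating a model separately on the seen and unseen subsets is one way to quantify the generalization gap the paper studies.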

    kOre: Using Linked Data for OpenScience Information Integration

    While the amount of data on the Web grows at 57% per year, the Web of Science maintains considerable inertia, with yearly growth varying between 1.6% and 14%. On the other hand, the Web of Science consists of high-quality information created and reviewed by the international community of researchers. While switching from traditional publishing methods to methods that enable data publishing in machine-readable formats is a complicated process, the situation can be improved by at least exposing metadata about scientific publications in a machine-readable format. In this paper we target metadata hidden inside universities' internal databases, reports, and other hard-to-discover sources. We extend the VIVO ontology to create the VIVO+ ontology, and we define and describe a framework for the automatic conversion of university data to RDF. We showcase the VIVO+ ontology and the framework using the example of ITMO University.
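    The record-to-RDF conversion idea can be sketched as a simple field-to-property mapping that emits N-Triples. The namespace URIs and property names below are invented placeholders, not the actual VIVO+ vocabulary or the paper's framework.

```python
# Hedged sketch of mapping a row from an internal university database to
# RDF triples (N-Triples syntax). URIs and property names are assumptions.

BASE = "http://example.org/university/"      # assumed namespace
PROPS = {"name": BASE + "name", "dept": BASE + "department"}

def record_to_ntriples(record_id, record):
    """Emit one triple per mapped field of a person record."""
    subject = f"<{BASE}person/{record_id}>"
    triples = []
    for field, value in record.items():
        if field in PROPS:
            triples.append(f'{subject} <{PROPS[field]}> "{value}" .')
    return triples

row = {"name": "A. Researcher", "dept": "Photonics"}
for t in record_to_ntriples(42, row):
    print(t)
```

    A real pipeline would additionally handle datatypes, escaping, and links to ontology classes; libraries such as rdflib automate most of this.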

    Development of the St. Petersburg's linked open data site using Information Workbench

    This paper discusses Russian projects that publish open government data. It also describes the development of the St. Petersburg linked open data portal and the approach used to convert open government data into linked open data. Information Workbench is used to build the system; it allows storing, visualizing, and converting data files in Semantic Web formats.