Multi-layer Architecture For Storing Visual Data Based on WCF and Microsoft SQL Server Database
In this paper we present a novel architecture for storing visual data.
Effective storage, browsing, and searching of image collections is one of the
most important challenges in computer science. Designing an architecture for
storing such data requires a set of tools and frameworks such as SQL database
management systems and service-oriented frameworks. The proposed solution is
based on a multi-layer architecture, which allows any component to be replaced
without recompiling the others. The approach contains five
components, i.e. Model, Base Engine, Concrete Engine, CBIR service and
Presentation. They were based on two well-known design patterns: Dependency
Injection and Inversion of Control. For experimental purposes we implemented
the SURF local interest point detector as a feature extractor and k-means
clustering as the indexer. The presented architecture is intended both for
simulating content-based retrieval systems and for real-world CBIR tasks.
Comment: Accepted for the 14th International Conference on Artificial
Intelligence and Soft Computing, ICAISC, June 14-18, 2015, Zakopane, Poland
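The Dependency Injection pattern the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation (which uses WCF and C#): the class and method names are hypothetical, and toy stand-ins replace the SURF extractor and k-means indexer so the example stays self-contained.

```python
from abc import ABC, abstractmethod

# Hypothetical interfaces illustrating the Dependency Injection idea:
# concrete extractor and indexer implementations are injected into the
# engine, so either can be swapped without touching other components.

class FeatureExtractor(ABC):
    @abstractmethod
    def extract(self, image): ...

class Indexer(ABC):
    @abstractmethod
    def assign(self, descriptor): ...

class MeanPixelExtractor(FeatureExtractor):
    """Toy stand-in for SURF: one descriptor = the mean pixel value."""
    def extract(self, image):
        flat = [p for row in image for p in row]
        return [sum(flat) / len(flat)]

class ThresholdIndexer(Indexer):
    """Toy stand-in for a k-means codebook: two 'clusters' split at 128."""
    def assign(self, descriptor):
        return 0 if descriptor[0] < 128 else 1

class BaseEngine:
    """Engine layer: depends only on the abstract interfaces above."""
    def __init__(self, extractor: FeatureExtractor, indexer: Indexer):
        self.extractor = extractor   # injected dependency
        self.indexer = indexer       # injected dependency
        self.index = {}              # cluster id -> list of image ids

    def add(self, image_id, image):
        cluster = self.indexer.assign(self.extractor.extract(image))
        self.index.setdefault(cluster, []).append(image_id)

engine = BaseEngine(MeanPixelExtractor(), ThresholdIndexer())
engine.add("dark.png", [[10, 20], [30, 40]])
engine.add("light.png", [[200, 210], [220, 230]])
```

Replacing `ThresholdIndexer` with a real k-means codebook would change no line of `BaseEngine`, which is the point of inverting the dependency.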
Identifying Data Sharing in Biomedical Literature
Many policies and projects now encourage investigators to share their raw research data with other scientists. Unfortunately, it is difficult to measure the effectiveness of these initiatives because data can be shared in such a variety of mechanisms and locations. We propose a novel approach to finding shared datasets: using NLP techniques to identify declarations of dataset sharing within the full text of primary research articles. Using regular expression patterns and machine learning algorithms on open access biomedical literature, our system was able to identify 61% of articles with shared datasets with 80% precision. A simpler version of our classifier achieved higher recall (86%), though lower precision (49%). We believe our results demonstrate the feasibility of this approach and hope to inspire further study of dataset retrieval techniques and policy evaluation.
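The regular-expression side of such a classifier can be sketched as below. The patterns here are illustrative guesses at phrases that commonly signal dataset sharing, not the patterns actually used in the study.

```python
import re

# Illustrative (not the study's actual) patterns for phrases that often
# declare dataset sharing in the full text of biomedical articles.
SHARING_PATTERNS = [
    r"\bdata (?:are|is|were) available (?:at|from|in)\b",
    r"\bdeposited (?:in|at) (?:the )?(?:GEO|ArrayExpress|GenBank)\b",
    r"\bsupplementary data (?:are|is) available\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in SHARING_PATTERNS]

def declares_sharing(text: str) -> bool:
    """Flag a passage if any sharing pattern matches."""
    return any(p.search(text) for p in COMPILED)
```

In the described system, matches like these would serve as features for a downstream machine-learning classifier rather than as the final decision, which helps explain the precision/recall trade-off between the two classifier versions.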

Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization
Information retrieval (IR) for precision medicine (PM) often involves looking
for multiple pieces of evidence that characterize a patient case. This
typically includes at least the name of a condition and a genetic variation
that applies to the patient. Other factors such as demographic attributes,
comorbidities, and social determinants may also be pertinent. As such, the
retrieval problem is often formulated as ad hoc search but with multiple facets
(e.g., disease, mutation) that may need to be incorporated. In this paper, we
present a document reranking approach that combines neural query-document
matching and text summarization toward such retrieval scenarios. Our
architecture builds on the basic BERT model with three specific components for
reranking: (a) document-query matching, (b) keyword extraction, and (c)
facet-conditioned abstractive summarization. The outcomes of (b) and (c) are
used to essentially transform a candidate document into a concise summary that
can be compared with the query at hand to compute a relevance score. Component
(a) directly generates a matching score of a candidate document for a query.
The full architecture benefits from the complementary potential of
document-query matching and the novel document transformation approach based on
summarization along PM facets. Evaluations using NIST's TREC-PM track datasets
(2017--2019) show that our model achieves state-of-the-art performance. To
foster reproducibility, our code is made available here:
https://github.com/bionlproc/text-summ-for-doc-retrieval
Comment: Accepted to EMNLP 2020 Findings as Long Paper (11 pages, 4 figures)
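The way components (a) and (b)/(c) combine into one relevance score can be sketched as below. This is a heavily simplified, hypothetical analogue: BERT-based scoring is replaced by cosine similarity over word counts, and the weighting scheme `alpha` is an assumption, not the paper's method.

```python
import math
from collections import Counter

# Toy analogue of the reranker: the final relevance mixes a direct
# query-document matching score (component (a)) with the similarity of
# the query to a condensed, facet-oriented summary of the document
# (components (b) and (c)). Cosine over bags of words stands in for
# the neural models used in the actual system.

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def relevance(query: str, document: str, summary: str,
              alpha: float = 0.5) -> float:
    q = Counter(query.lower().split())
    match_score = cosine(q, Counter(document.lower().split()))  # ~ (a)
    summ_score = cosine(q, Counter(summary.lower().split()))    # ~ (b)+(c)
    return alpha * match_score + (1 - alpha) * summ_score

score = relevance(
    "melanoma BRAF V600E",
    "We report a melanoma case with a long discussion of methods",
    "melanoma with BRAF V600E mutation",  # facet-conditioned summary
)
```

The intuition the sketch captures is that a short summary built along PM facets (disease, mutation) matches a facet-style query far better than the full document text does, so the summary channel complements direct matching.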