Finding Academic Experts on a MultiSensor Approach using Shannon's Entropy
Expert finding is an information retrieval task concerned with the search for
the most knowledgeable people on some topic, based on documents
describing people's activities. The task involves taking a user query as input
and returning a list of people sorted by their level of expertise regarding the
user query. This paper introduces a novel approach for combining multiple
estimators of expertise based on a multisensor data fusion framework together
with the Dempster-Shafer theory of evidence and Shannon's entropy. More
specifically, we defined three sensors which detect heterogeneous information
derived from the textual contents, from the graph structure of the citation
patterns for the community of experts, and from profile information about the
academic experts. Given the evidence collected, each sensor may identify
different candidates as experts, so the sensors may not agree on a final
ranking decision. To deal with these conflicts, we applied the Dempster-Shafer
theory of evidence combined with Shannon's Entropy formula to fuse this
information and come up with a more accurate and reliable final ranking list.
Experiments on two datasets of academic publications from the Computer
Science domain attest to the adequacy of the proposed approach over
traditional state-of-the-art approaches. We also ran experiments against
representative supervised state-of-the-art algorithms. Results revealed that
the proposed method achieved performance similar to that of these
supervised techniques, confirming the capabilities of the proposed framework.
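The fusion step described in this abstract rests on two standard formulas, Dempster's rule of combination and Shannon's entropy. The following is a minimal sketch of both, not the authors' implementation: the candidate names, the mass values, and the function names `combine_dempster` and `shannon_entropy` are illustrative assumptions, and the abstract does not say how the entropy value enters the fusion.

```python
import math
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass assigned to empty intersections is treated as conflict and
    removed by renormalising the remaining mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


# Hypothetical evidence from two sensors about two candidate experts.
ALICE, BOB = frozenset({"alice"}), frozenset({"bob"})
EITHER = ALICE | BOB
text_sensor = {ALICE: 0.6, BOB: 0.3, EITHER: 0.1}
citation_sensor = {ALICE: 0.5, BOB: 0.2, EITHER: 0.3}

fused = combine_dempster(text_sensor, citation_sensor)
```

In this toy example the two sensors disagree on 27% of the combined mass, which the rule discards before renormalising, so `fused[ALICE]` is 0.53 / 0.73.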
The study of probability model for compound similarity searching
The main task of an Information Retrieval (IR) system is to retrieve relevant documents according to the user's query. One of the most popular IR retrieval models is the Vector Space Model. This model assumes relevance based on similarity, which is defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adapted the vector space model to calculate the similarity of a database entry to a query compound. However, it assumes that fragments represented by the bits are independent of one another, which is not necessarily true. Hence, the possibility of applying another IR model, the Probabilistic Model (PM), to chemical compound searching is explored. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that by ranking chemical structures in decreasing order of their probability of relevance to the query structure, the effectiveness of a molecular similarity searching system can be increased. Both fragment dependence and independence assumptions are taken into consideration in improving the compound similarity searching system. After conducting a series of simulated similarity searches, it is concluded that the PM approaches performed better than the existing similarity searching approach, giving better results on all evaluation criteria. Of the probability models, the BD model showed improvement over the BIR model.
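The abstract names a BIR model but gives no formula. As a rough illustration only, the sketch below assumes the textbook binary independence form from text retrieval, in which each fingerprint bit receives a log-odds weight estimated from known active and inactive structures. The function names, the 0.5 smoothing constant, and the toy fingerprints are assumptions, not details taken from the study.

```python
import math


def bir_weights(active_fps, inactive_fps, n_bits):
    """Estimate one log-odds weight per fingerprint bit.

    Fingerprints are sets of on-bit indices. The 0.5 / 1.0 smoothing
    keeps the probabilities strictly between 0 and 1.
    """
    weights = []
    for bit in range(n_bits):
        hits_active = sum(1 for fp in active_fps if bit in fp)
        hits_inactive = sum(1 for fp in inactive_fps if bit in fp)
        p = (hits_active + 0.5) / (len(active_fps) + 1.0)
        q = (hits_inactive + 0.5) / (len(inactive_fps) + 1.0)
        weights.append(math.log(p * (1.0 - q) / (q * (1.0 - p))))
    return weights


def bir_score(fingerprint, weights):
    """Score a structure by summing the weights of its on bits."""
    return sum(weights[bit] for bit in fingerprint)


# Hypothetical 4-bit fingerprints for known actives and inactives.
actives = [{0, 1}, {0, 2}]
inactives = [{2, 3}, {3}]
weights = bir_weights(actives, inactives, n_bits=4)
```

Ranking a database by `bir_score` in decreasing order then orders structures by estimated odds of activity under the independence assumption that the abstract says is relaxed in the dependence-based model.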
Interacting Attention-gated Recurrent Networks for Recommendation
Capturing the temporal dynamics of user preferences over items is important
for recommendation. Existing methods mainly assume that all time steps in
user-item interaction history are equally relevant to recommendation, which
however does not apply in real-world scenarios where user-item interactions can
often happen accidentally. More importantly, they learn user and item dynamics
separately, thus failing to capture their joint effects on user-item
interactions. To better model user and item dynamics, we present the
Interacting Attention-gated Recurrent Network (IARN) which adopts the attention
model to measure the relevance of each time step. In particular, we propose a
novel attention scheme to learn the attention scores of user and item history
in an interacting way, so as to account for the dependencies between user and
item dynamics in shaping user-item interactions. By doing so, IARN can
selectively memorize different time steps of a user's history when predicting
her preferences over different items. Our model can therefore provide
meaningful interpretations for recommendation results, which could be further
enhanced by auxiliary features. Extensive validation on real-world datasets
shows that IARN consistently outperforms state-of-the-art methods. Comment: Accepted by ACM International Conference on Information and Knowledge
Management (CIKM), 201
Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments
In this work, we present an extension of CORE [8], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies. The new version of the system incorporates several novelties, such as its implementation as a web application; the incorporation of an NLP module to manage the problem definitions; modifications to the automatic ontology retrieval strategies; and a collaborative framework to find potentially relevant terms according to previous user queries. Finally, we present some early experiments on ontology retrieval and evaluation, showing the benefits of our system.
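The abstract does not say which rank fusion technique combines the per-criterion lists. One common choice is a Borda count, sketched here purely as an assumption; the function name `borda_fuse` and the ontology names are hypothetical.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse several ranked lists with a Borda count.

    An item at position pos (0-based) in a list of length n earns
    n - pos points; items are returned by decreasing total score,
    with ties broken alphabetically for a deterministic order.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - pos
    return sorted(scores, key=lambda item: (-scores[item], item))


# One hypothetical ranked list per automatic evaluation criterion.
per_criterion = [
    ["ontoA", "ontoB", "ontoC"],
    ["ontoB", "ontoA", "ontoC"],
    ["ontoB", "ontoC", "ontoA"],
]
fused_ranking = borda_fuse(per_criterion)
```

Here `ontoB` collects 8 points, `ontoA` 6 and `ontoC` 4, so the fused list places `ontoB` first even though one criterion preferred `ontoA`.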
Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech
Searching spontaneous speech can be enhanced by combining automatic speech transcriptions with semantically
related metadata. An important question is what can be expected from search of such transcriptions and different
sources of related metadata in terms of retrieval effectiveness. The Cross-Language Speech Retrieval (CL-SR) track at recent CLEF workshops provides a spontaneous speech
test collection with manually and automatically derived metadata fields. Using this collection, we investigate the comparative search effectiveness of individual fields comprising automated transcriptions and the available metadata. A further important question is how transcriptions and metadata should be combined for the greatest benefit to search accuracy. We compare simple merging of individual fields with the extended BM25 model for weighted field combination (BM25F). Results indicate that BM25F can produce improved search accuracy, but that it is currently important to set its parameters using a suitable training set.
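To make the contrast between simple field merging and weighted field combination concrete, here is a minimal sketch of a simplified BM25F scorer. It follows the commonly cited formulation in which length-normalised, field-weighted term frequencies are summed before saturation; the field names, weights, collection statistics, and the single shared `b` parameter are illustrative assumptions rather than the settings used in the study.

```python
import math


def bm25f_score(query_terms, doc_fields, field_weights, avg_field_len,
                doc_freq, n_docs, k1=1.2, b=0.75):
    """Score one document with a simplified BM25F.

    doc_fields maps a field name to its token list. Term frequencies
    are length-normalised per field, weighted, summed into a pseudo
    frequency, and only then passed through the BM25 saturation.
    """
    score = 0.0
    for term in query_terms:
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            if not tokens:
                continue
            norm = 1.0 - b + b * len(tokens) / avg_field_len[field]
            pseudo_tf += field_weights[field] * tokens.count(term) / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Hypothetical document with a transcript field and a metadata field.
doc = {"asr": ["we", "moved", "to", "prague"],
       "keywords": ["prague", "emigration"]}
avg_len = {"asr": 4.0, "keywords": 2.0}
stats = {"prague": 10}

flat = bm25f_score(["prague"], doc, {"asr": 1.0, "keywords": 1.0},
                   avg_len, stats, n_docs=1000)
boosted = bm25f_score(["prague"], doc, {"asr": 1.0, "keywords": 3.0},
                      avg_len, stats, n_docs=1000)
```

With equal weights the scorer behaves like merged fields, while raising the `keywords` weight increases the score of documents whose metadata matches the query. The weights, `k1` and `b` are the parameters that would need tuning on training data.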
Evaluation of a Bayesian inference network for ligand-based virtual screening
Background
Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity.
Results
Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially out-performs a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much less when structurally heterogeneous sets of actives are being sought.
Conclusion
A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening
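For reference, the conventional Tanimoto-based baseline mentioned in the Results can be sketched as follows. This is a minimal illustration assuming binary fingerprints stored as sets of on-bit indices; the function names and the toy database are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def rank_by_similarity(query_fp, database):
    """Return (name, similarity) pairs sorted by decreasing similarity."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical query and database fingerprints.
query = {1, 2, 3, 4}
database = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {3, 4, 5, 6},
    "mol_c": {7, 8},
}
ranking = rank_by_similarity(query, database)
```

The coefficient is the number of shared on bits divided by the number of bits set in either fingerprint, so `mol_b` scores 2 / 6 against the query and `mol_c` scores 0.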
Solving multiple-criteria R&D project selection problems with a data-driven evidential reasoning rule
In this paper, a likelihood-based evidence acquisition approach is proposed
to acquire evidence from experts' assessments as recorded in historical
datasets. Then a data-driven evidential reasoning rule based model is
introduced into the R&D project selection process by combining multiple pieces of
evidence with different weights and reliabilities. As a result, the total
belief degrees and the overall performance can be generated for ranking and
selecting projects. Finally, a case study on the R&D project selection for the
National Science Foundation of China is conducted to show the effectiveness of
the proposed model. The data-driven evidential reasoning rule based model for
project evaluation and selection (1) utilizes experimental data to represent
experts' assessments by using belief distributions over the set of final
funding outcomes, and through these historical statistics it helps experts and
applicants understand the funding probability for a given assessment grade,
(2) implies the mapping relationships between the evaluation grades and the
final funding outcomes by using historical data, and (3) provides a way to make
fair decisions by taking experts' reliabilities into account. In the
data-driven evidential reasoning rule based model, experts play different roles
in accordance with their reliabilities which are determined by their previous
review track records, and the selection process is made interpretable and
fairer. The newly proposed model reduces the time-consuming panel review work
for both managers and experts, and significantly improves the efficiency and
quality of the project selection process. Although the model is demonstrated for
project selection in the NSFC, it can be generalized to other funding agencies
or industries. Comment: 20 pages, forthcoming in International Journal of Project Management
(2019