From Data Fusion to Knowledge Fusion
The task of {\em data fusion} is to identify the true values of data items
(e.g., the true date of birth for {\em Tom Cruise}) among multiple observed
values drawn from different sources (e.g., Web sites) of varying (and unknown)
reliability. A recent survey\cite{LDL+12} has provided a detailed comparison of
various fusion methods on Deep Web data. In this paper, we study the
applicability and limitations of different fusion techniques on a more
challenging problem: {\em knowledge fusion}. Knowledge fusion identifies true
subject-predicate-object triples extracted by multiple information extractors
from multiple information sources. These extractors perform the tasks of entity
linkage and schema alignment, thus introducing an additional source of noise
that is quite different from that traditionally considered in the data fusion
literature, which only focuses on factual errors in the original sources. We
adapt state-of-the-art data fusion techniques and apply them to a knowledge
base with 1.6B unique knowledge triples extracted by 12 extractors from over 1B
Web pages, which is three orders of magnitude larger than the data sets used in
previous data fusion papers. We show great promise of the data fusion
approaches in solving the knowledge fusion problem, and suggest interesting
research directions through a detailed error analysis of the methods.
Comment: VLDB'201
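As a rough illustration of the data fusion task described in this abstract, the sketch below picks a value for each data item by accuracy-weighted voting and then re-estimates each source's accuracy from how often it agrees with the chosen values. It is a minimal sketch, not one of the methods the paper adapts: the function name `fuse`, the initial accuracy of 0.8, the fixed iteration count, and the toy source and item names are all assumptions made for the example.

```python
from collections import defaultdict


def fuse(claims, iterations=10, initial_accuracy=0.8):
    """Accuracy-weighted voting over conflicting claims (illustrative only).

    claims maps source -> {data item -> observed value}.
    Returns (chosen value per item, estimated accuracy per source).
    """
    # Assumption: every source starts with the same prior accuracy.
    accuracy = {source: initial_accuracy for source in claims}
    truth = {}
    for _ in range(iterations):
        # Each source votes for its value with weight equal to its accuracy.
        votes = defaultdict(lambda: defaultdict(float))
        for source, items in claims.items():
            for item, value in items.items():
                votes[item][value] += accuracy[source]
        # The value with the largest total weight is taken as true.
        truth = {item: max(vals, key=vals.get) for item, vals in votes.items()}
        # A source's accuracy is the share of its claims that match.
        for source, items in claims.items():
            if items:
                hits = sum(1 for item, value in items.items()
                           if truth[item] == value)
                accuracy[source] = hits / len(items)
    return truth, accuracy


# Hypothetical toy input: three sources, two data items.
toy_claims = {
    "site_a": {"item_x": "1", "item_y": "2"},
    "site_b": {"item_x": "1", "item_y": "2"},
    "site_c": {"item_x": "9", "item_y": "2"},
}
fused_values, source_accuracy = fuse(toy_claims)
```

On this toy input the two agreeing sources outvote the third on `item_x`, and the dissenting source ends with a lower estimated accuracy. Knowledge fusion as described above would also have to model errors introduced by the extractors, which this sketch does not attempt.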
Statistical methods for serially correlated zero-inflated proportions
Proportion data falling in the continuum (0, 1) are very common in practice. It can also happen that an inflated number of zeros (or ones) occurs with proportion data. There are extensive studies of zero-inflated data in the literature. Almost all of them, however, focus on zero-inflated count data. Furthermore, because of sampling or experimental procedures, correlation commonly exists in data collected through time and/or space. The contribution of this research is to develop methods for analyzing zero-inflated proportion data with serial correlation.
We first propose multiple hypothesis tests for assessing homogeneity of two zero-inflated Beta populations under the assumption of independence of the observations. Fisher’s method is adopted to combine independent likelihood ratio tests and asymptotically independent score tests to assess the equivalence of the populations. We also develop non-parametric and semi-parametric permutation-based tests for simultaneously comparing two or three features of unknown populations. In Chapter 3, we develop a Hidden Markov Model with zero-inflated Beta emission densities. We show that the standard EM algorithm for Hidden Markov Model parameter estimation can be applied in this case with emission distributions that are mixtures of discrete and continuous parts. In Chapter 4, we develop a generalized linear mixed model with an autoregressive random effect. This model involves the non-standard distribution (zero-inflated Beta) as well as components to account for the dependence among observations. We use Bayesian methodology for generalized linear mixed model parameter estimation and statistical inference. We examine our methods by simulation and by analyzing a real marine science dataset where interest lies in distinguishing two serially correlated samples. We provide code for simultaneous hypothesis testing and for the Hidden Markov Model in the open-source software R, as well as for the generalized linear mixed model in WinBUGS.
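The abstract mentions Fisher's method for combining independent tests. As a general reminder of how that combination works (not the thesis's own R code), the sketch below computes the combined p-value from k independent p-values: the statistic X = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom under the null, and for even degrees of freedom the upper tail has a closed form. The function name `fisher_combined_p` is an assumption made for the example.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (illustrative).

    Uses X = -2 * sum(ln p_i), which is chi-square with 2k degrees of
    freedom under the null. For even degrees of freedom the tail is
    P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


# Two hypothetical independent tests, each with p = 0.05.
combined = fisher_combined_p([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check. The method assumes the tests are independent, which is why the abstract stresses independence (or asymptotic independence) of the tests being combined.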
Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? A.K.A. Will LLMs Replace Knowledge Graphs?
Since the recent prosperity of Large Language Models (LLMs), there have been
interleaved discussions regarding how to reduce hallucinations from LLM
responses, how to increase the factuality of LLMs, and whether Knowledge Graphs
(KGs), which store the world knowledge in a symbolic form, will be replaced
with LLMs. In this paper, we try to answer these questions from a new angle:
How knowledgeable are LLMs?
To answer this question, we constructed Head-to-Tail, a benchmark that
consists of 18K question-answer (QA) pairs regarding head, torso, and tail
facts in terms of popularity. We designed an automated evaluation method and a
set of metrics that closely approximate the knowledge an LLM confidently
internalizes. Through a comprehensive evaluation of 14 publicly available LLMs,
we show that existing LLMs are still far from being perfect in terms of their
grasp of factual knowledge, especially for facts of torso-to-tail entities.
Perspectives on public involvement in health research from Singapore: The potential of a supported group model of involvement.
BACKGROUND: Singapore is an international research hub, with an emphasis on translational clinical research. Despite growing evidence of the positive impact of public involvement (PPI) in research, it remains rare in Singapore. AIMS: To investigate Singaporean public perspectives around the rationale, role and scope for being involved in health research, and to identify the potential, challenges, facilitators and strategies for implementing PPI in Singapore. DESIGN: Semi-structured qualitative interviews with members of the public, analysed using thematic framework analysis. RESULTS: Twenty people participated. Four main themes emerged: potential benefits; challenges; facilitators; and strategies for implementation. Whilst initially unfamiliar with the concept, all interviewees recognized potential benefits for the research itself and those involved, including researchers. PPI was seen to offer opportunities for public empowerment and strengthening of relationships and understanding between the public, academics and health professionals, resulting in more impactful research. Challenges included a Singaporean culture of passive citizenship and an education system that inculcates deferential attitudes. Facilitators comprised demographic and cultural changes, including trends towards greater individual openness and community engagement. Implementation strategies included formal government policies promoting involvement and informal community-based collaborative approaches. CONCLUSION: Given the socio-political framework in Singapore, a community-based approach has potential to address challenges to PPI and maximize impact. Careful consideration needs to be given to issues of resource and support to enable members of the public to engage in culturally sensitive and meaningful ways that will deliver research best placed to effectively address patient needs.
Toward a Noninvasive Automatic Seizure Control System in Rats With Transcranial Focal Stimulations via Tripolar Concentric Ring Electrodes
Epilepsy affects approximately 1% of the world population. Antiepileptic drugs are ineffective in approximately 30% of patients and have side effects. We are developing a noninvasive, or minimally invasive, transcranial focal electrical stimulation system through our novel tripolar concentric ring electrodes to control seizures. In this study, we demonstrate the feasibility of an automatic seizure control system in rats with pentylenetetrazole-induced seizures through single and multiple stimulations. These stimulations are automatically triggered by a real-time electrographic seizure activity detector based on a disjunctive combination of detections from a cumulative sum algorithm and a generalized likelihood ratio test. An average seizure onset detection accuracy of 76.14% was obtained for the test set (n = 13). Detection of electrographic seizure activity was accomplished in advance of the early behavioral seizure activity in 76.92% of the cases. Automatically triggered stimulation significantly (p = 0.001) reduced the electrographic seizure activity power in the once-stimulated group compared to controls in 70% of the cases. To the best of our knowledge, this is the first closed-loop automatic seizure control system based on noninvasive electrical brain stimulation using tripolar concentric ring electrode electrographic seizure activity as feedback.
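To make the phrase "disjunctive combination of detections" concrete, the sketch below runs a one-sided cumulative sum (CUSUM) detector and a windowed generalized likelihood ratio (GLR) detector for a mean shift on the same scalar signal and raises an alarm as soon as either one fires. It is only a sketch under strong assumptions: the abstract does not give the signal features, noise model, thresholds, or window used in the study, so the Gaussian known-variance model, all parameter values, and the function names here are hypothetical.

```python
def cusum_alarm(samples, mean0, drift, threshold):
    """Index of the first sample where a one-sided CUSUM exceeds threshold."""
    s = 0.0
    for i, x in enumerate(samples):
        # Accumulate evidence of an upward shift; reset at zero.
        s = max(0.0, s + (x - mean0 - drift))
        if s > threshold:
            return i
    return None


def glr_alarm(samples, mean0, sigma, threshold, window=20):
    """Index of the first sample where a windowed GLR statistic for a mean
    shift (Gaussian noise, known sigma, unknown shift) exceeds threshold."""
    for n in range(len(samples)):
        start = max(0, n - window + 1)
        total = 0.0
        best = 0.0
        # Try every candidate change point k in the window, newest first.
        for k in range(n, start - 1, -1):
            total += samples[k] - mean0
            length = n - k + 1
            best = max(best, total * total / (2.0 * sigma * sigma * length))
        if best > threshold:
            return n
    return None


def disjunctive_alarm(*alarms):
    """Logical OR of detectors: the earliest alarm among those that fired."""
    fired = [a for a in alarms if a is not None]
    return min(fired) if fired else None


# Synthetic signal: baseline at 0, then an upward shift to 2 at index 10.
signal = [0.0] * 10 + [2.0] * 10
cusum_index = cusum_alarm(signal, mean0=0.0, drift=0.5, threshold=4.0)
glr_index = glr_alarm(signal, mean0=0.0, sigma=1.0, threshold=3.0)
onset_index = disjunctive_alarm(cusum_index, glr_index)
```

On this synthetic signal the two detectors fire at different samples, and the combined detector reports the earlier one. An OR combination of this kind trades a higher false-alarm risk for earlier detection, which fits the abstract's emphasis on detecting electrographic activity ahead of behavioral seizure activity.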
Tumor-associated macrophages in multiple myeloma: Advances in biology and therapy
Multiple myeloma (MM) is a cancer of plasma cells in the bone marrow (BM) and represents the second most common hematological malignancy in the world. The MM tumor microenvironment (TME) within the BM niche consists of a wide range of elements which play important roles in supporting MM disease progression, survival, proliferation, angiogenesis, as well as drug resistance. Together, the TME fosters an immunosuppressive environment in which immune recognition and response are repressed. Macrophages are a central player in the immune system with diverse functions, and it has long been established that macrophages play a critical role in inducing both direct and indirect immune responses in cancer. Tumor-associated macrophages (TAMs) are a major population of cells in the tumor site. Rather than contributing to the immune response against tumor cells, TAMs in many cancers are found to exhibit protumor properties including supporting chemoresistance, tumor proliferation and survival, angiogenesis, immunosuppression, and metastasis. Targeting TAMs represents a novel strategy for cancer immunotherapy, which has potential to indirectly stimulate cytotoxic T cell activation and recruitment, and synergize with checkpoint inhibitors and chemotherapies. In this review, we will provide an updated and comprehensive overview of the current knowledge on the roles of TAMs in MM, as well as the therapeutic targets that are being explored as macrophage-targeted immunotherapy, which may hold the key to future therapeutics against MM.