Handling uncertainty in information extraction
This position paper proposes an interactive approach for developing information extractors based on the ontology definition process with knowledge about possible (in)correctness of annotations. We discuss the problem of managing and manipulating probabilistic dependencies
Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art
Information extraction, data integration, and uncertain data management are distinct research areas that have received considerable attention over the last two decades. Many studies have addressed these areas individually. However, information extraction systems should be integrated with data integration methods in order to make use of the extracted information. Handling uncertainty in the extraction and integration process is important for improving the quality of the data in such integrated systems. This article presents the state of the art in these research areas, identifies their common ground, and shows how to integrate information extraction and data integration under the umbrella of uncertainty management
Toponym extraction and disambiguation enhancement using loops of feedback
Toponym extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and the semantic web. This paper addresses two problems with toponym extraction and disambiguation. First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process. In this paper we investigate both avenues and show that explicit handling of annotation uncertainty has much potential for making both extraction and disambiguation more robust. We conducted experiments on a set of holiday home descriptions with the aim of extracting and disambiguating toponyms. We show that the extraction confidence probabilities are useful in enhancing the effectiveness of disambiguation. Reciprocally, retraining the extraction models with information automatically derived from the disambiguation results improves the extraction models. This mutual reinforcement is shown to still have an effect after several automatic iterations
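The abstract does not say how the confidence probabilities enter disambiguation. One plausible reading, offered only as a sketch, is a confidence-weighted vote: each extracted mention spreads its extraction confidence over its candidate referents, and the best-supported country wins. The function name `vote_country`, the gazetteer layout, and the sample values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def vote_country(mentions, gazetteer):
    """Pick the country best supported by confidence-weighted mentions.

    mentions:  list of (surface_form, extraction_confidence) pairs
    gazetteer: dict mapping a surface form to its candidate countries
    Both structures are illustrative assumptions, not the paper's format.
    """
    scores = defaultdict(float)
    for name, confidence in mentions:
        candidates = gazetteer.get(name, [])
        for country in candidates:
            # An ambiguous mention splits its confidence evenly
            # across all of its candidate countries.
            scores[country] += confidence / len(candidates)
    return max(scores, key=scores.get) if scores else None


# Hypothetical example: a confident "Lyon" outweighs a doubtful "Texas".
mentions = [("Paris", 0.9), ("Lyon", 0.8), ("Texas", 0.2)]
gazetteer = {"Paris": ["FR", "US"], "Lyon": ["FR"], "Texas": ["US"]}
best = vote_country(mentions, gazetteer)
```

With these made-up inputs, low-confidence extractions contribute little to the vote, which is the intuition behind letting extraction uncertainty inform disambiguation.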
Improving named entity disambiguation by iteratively enhancing certainty of extraction
Named entity extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and the semantic web. This paper addresses two problems with named entity extraction and disambiguation. First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process. The aim of this paper is to investigate both avenues and to show that explicit handling of annotation uncertainty has much potential for making both extraction and disambiguation more robust. We conducted experiments on a set of holiday home descriptions with the aim of extracting and disambiguating toponyms as a representative example of named entities. We show that the effectiveness of extraction influences the effectiveness of disambiguation and, reciprocally, that retraining the extraction models with information automatically derived from the disambiguation results improves the extraction models. This mutual reinforcement is shown to still have an effect after several iterations
Information extraction for social media
The rapid growth in IT over the last two decades has led to a growth in the amount of information available online. Social media is a new style of sharing information, and it is a continuously and instantly updated source of information. In this position paper, we propose a framework for Information Extraction (IE) from unstructured user-generated content on social media. The framework proposes solutions to the IE challenges in this domain, such as short context, noisy and sparse content, and uncertain content. To overcome these challenges, state-of-the-art approaches need to be adapted to suit the nature of social media posts. The key components and aspects of our proposed framework are noisy text filtering, named entity extraction, named entity disambiguation, feedback loops, and uncertainty handling
Encapsulation of Soft Computing Approaches within Itemset Mining - A Survey
Data mining discovers patterns and trends by extracting knowledge from large databases. Soft computing techniques such as fuzzy logic, neural networks, genetic algorithms, and rough sets aim to exploit the tolerance for imprecision and uncertainty in order to achieve tractability, robustness, and low-cost solutions. Fuzzy logic and rough sets are suitable for handling different types of uncertainty. Neural networks provide good learning and generalization. Genetic algorithms provide efficient search algorithms for selecting a model from mixed media data. Data mining refers to information extraction, while soft computing is used for information processing. For effective knowledge discovery from large databases, soft computing and data mining can be merged. Association rule mining (ARM) and itemset mining focus on finding the most frequent itemsets and the corresponding association rules, and on extracting rare itemsets, including temporal and fuzzy concepts in the discovered patterns. This survey paper explores the usage of soft computing approaches in itemset utility mining
Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality
Real-time occlusion handling is a major problem in outdoor mixed reality systems because it incurs a high computational cost, mainly due to the complexity of the scene. Using only segmentation, it is difficult to accurately render a virtual object occluded by complex objects such as trees and bushes. In this paper, we propose a novel occlusion handling method for real-time, outdoor, and omni-directional mixed reality systems using only the information from a monocular image sequence. We first present a semantic segmentation scheme for predicting the amount of visibility for different types of objects in the scene. We also simultaneously calculate a foreground probability map using depth estimation derived from optical flow. Finally, we combine the segmentation result and the probability map to render the computer-generated object and the real scene using a visibility-based rendering method. Our results show great improvement in handling occlusions compared to existing blending-based methods
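The combination step can be pictured with a per-pixel blend. The sketch below is an assumption about how such a combination might work, not the authors' rendering method: `class_visibility` stands for the fraction of the virtual object assumed to show through a real object of the pixel's semantic class, and `foreground_prob` for the probability that the real scene lies in front of the virtual object. The function name and both parameters are hypothetical.

```python
def composite_pixel(real, virtual, class_visibility, foreground_prob):
    """Blend one RGB pixel of the real scene with the virtual object.

    real, virtual:     RGB tuples with components in [0, 1]
    class_visibility:  assumed share of the virtual object visible through
                       the real object's class (0 = solid, 1 = transparent)
    foreground_prob:   assumed probability that the real scene is in front
    """
    # The virtual object loses weight only where the real scene is likely
    # in front of it AND the occluding class is largely opaque.
    virtual_weight = 1.0 - foreground_prob * (1.0 - class_visibility)
    return tuple(
        virtual_weight * v + (1.0 - virtual_weight) * r
        for r, v in zip(real, virtual)
    )


# Hypothetical pixel: white real scene, black virtual object,
# half-transparent class, 50% chance of being in front.
pixel = composite_pixel((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), 0.5, 0.5)
```

Under these assumptions a solid occluder that is certainly in front hides the virtual object completely, while a sparse class such as foliage lets part of it show through.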
Risk-based evaluation for underground mine planning
As underground mine planning tools become more sophisticated, mine planners have the capacity to investigate numerous mine sequencing options to identify the best strategy for a given project, creating higher value for shareholders. The information required for mine planning decisions goes beyond the external sources of uncertainty recognised by typical evaluation techniques used in the mining industry, to include technical factors (e.g. mine development layout) and the ability of a mineral extraction project to achieve planned production levels. Due to the individual characteristics that define underground mining projects, each will exhibit its individual risk profile, and thus advanced evaluation techniques must capture this information. This paper describes a Risk-based Evaluation Methodology that accounts for financial and technical scheduling risk in the evaluation of underground mining projects. It provides decision-makers with more information early in the mine planning cycle by combining planning and design methodologies with evaluation techniques to identify, optimise and evaluate strategies for mining extraction sequences. Standard evaluation practices used in the mining industry (Discounted Cash Flow, Real Options and Monte Carlo Simulation) are combined with the concepts of Modern Portfolio Theory to establish an evaluation methodology that recognises financial uncertainty in the context of technical scheduling factors. This paper will show that the Risk-based Evaluation Methodology can be used at the tactical level, as it is applied in combination with the Schedule Optimisation Tool (SOT), for the purpose of recommending a materials handling system to be implemented in a mining project. For the case study, the inclusion of more information in the decision-making process not only provides a more accurate valuation and allows for the recognition of risk, but it also alters the ultimate decision
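To make the combination of Discounted Cash Flow and Monte Carlo Simulation concrete, here is a minimal sketch that is not the paper's methodology. It perturbs each future cash flow with a multiplicative Gaussian shock and reports the mean and standard deviation of the resulting net present values, a return-versus-risk pair in the spirit of Modern Portfolio Theory. The cash flows, discount rate, volatility, and function names are all illustrative assumptions.

```python
import random
import statistics


def npv(cash_flows, rate):
    """Discounted cash flow: cash_flows[t] is received at the end of year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def simulate_npv(cash_flows, rate, volatility, runs=10_000, seed=0):
    """Monte Carlo estimate of the mean and standard deviation of NPV.

    Every cash flow after the initial outlay is scaled by a Gaussian
    factor with mean 1 and the given volatility (an assumed risk model).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    results = []
    for _ in range(runs):
        scenario = [cash_flows[0]] + [
            cf * rng.gauss(1.0, volatility) for cf in cash_flows[1:]
        ]
        results.append(npv(scenario, rate))
    return statistics.fmean(results), statistics.pstdev(results)


# Hypothetical project: outlay of 100, then two inflows of 60, 10% discount rate.
expected, risk = simulate_npv([-100.0, 60.0, 60.0], rate=0.10, volatility=0.2)
```

Comparing alternative extraction sequences on both numbers, rather than on expected NPV alone, is one simple way a risk measure can change which option is preferred.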