A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web
Over the past decade, rapid advances in web technologies, coupled with
innovative models of spatial data collection and consumption, have generated a
robust growth in geo-referenced information, resulting in spatial information
overload. Increasing 'geographic intelligence' in traditional text-based
information retrieval has become a prominent approach to respond to this issue
and to fulfill users' spatial information needs. Numerous efforts in the
Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the
Linking Open Data initiative have converged in a constellation of open
knowledge bases, freely available online. In this article, we survey these open
knowledge bases, focusing on their geospatial dimension. Particular attention
is devoted to the crucial issue of the quality of geo-knowledge bases, as well
as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic
Network, is outlined as our contribution to this area. Research directions in
information integration and Geographic Information Retrieval (GIR) are then
reviewed, with a critical discussion of their current limitations and future
prospects.
Mathematical practice, crowdsourcing, and social machines
The highest level of mathematics has traditionally been seen as a solitary
endeavour, to produce a proof for review and acceptance by research peers.
Mathematics is now at a remarkable inflexion point, with new technology
radically extending the power and limits of individuals. Crowdsourcing pulls
together diverse experts to solve problems; symbolic computation tackles huge
routine calculations; and computers check proofs too long and complicated for
humans to comprehend.
Mathematical practice is an emerging interdisciplinary field which draws on
philosophy and social science to understand how mathematics is produced. Online
mathematical activity provides a novel and rich source of data for empirical
investigation of mathematical practice: for example, the community question
answering system mathoverflow contains around 40,000 mathematical
conversations, and polymath collaborations provide transcripts of the
process of discovering proofs. Our preliminary investigations have demonstrated
the importance of "soft" aspects such as analogy and creativity, alongside
deduction and proof, in the production of mathematics, and have given us new
ways to think about the roles of people and machines in creating new
mathematical knowledge. We discuss further investigation of these resources and
what it might reveal.
Crowdsourced mathematical activity is an example of a "social machine", a new
paradigm, identified by Berners-Lee, for viewing a combination of people and
computers as a single problem-solving entity, and the subject of major
international research endeavours. We outline a future research agenda for
mathematics social machines, a combination of people, computers, and
mathematical archives to create and apply mathematics, with the potential to
change the way people do mathematics, and to transform the reach, pace, and
impact of mathematics research.
Comment: To appear, Springer LNCS, Proceedings of Conferences on Intelligent
Computer Mathematics, CICM 2013, July 2013 Bath, U
Hybrid intelligent framework for automated medical learning
This paper investigates automated medical learning and proposes a hybrid intelligent framework called Hybrid Automated Medical Learning (HAML). The goal is to combine several intelligent components efficiently in order to learn automatically from medical data. A multi-agent system is proposed that uses distributed deep learning and a knowledge graph to learn from medical data. Distributed deep learning is used for efficient learning by the different agents in the system, whereas the knowledge graph is used to deal with heterogeneous medical data. To demonstrate the usefulness and accuracy of the HAML framework, intensive simulations on medical data were conducted. A wide range of experiments was conducted to verify the efficiency of the proposed system. Three case studies are discussed. The first is related to process mining, and more precisely to the ability of HAML to detect relevant patterns in medical event data. The second is related to smart buildings and the ability of HAML to recognize the different activities of patients. The third is related to medical image retrieval and the ability of HAML to find the most relevant medical images for a given image query. The results show that HAML achieves good performance compared to the most up-to-date medical learning models regarding both the computational cost and the quality of the returned solutions.
Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases
Recent years have witnessed the proliferation of large-scale Knowledge Bases (KBs). However, many entities in KBs have incomplete type information, and some are totally untyped. Even worse, fine-grained types (e.g., BasketballPlayer), which carry rich semantic meaning, are more likely to be incomplete, as they are more difficult to obtain. Existing machine-based algorithms use predicates (e.g., birthPlace) of entities to infer their missing types, but these predicates may be insufficient to infer fine-grained types. In this paper, we utilize crowdsourcing to solve the problem, and address the challenge of controlling crowdsourcing cost. To this end, we propose a hybrid machine-crowdsourcing approach for fine-grained entity type completion. It first determines the types of some “representative” entities via crowdsourcing and then infers the types of the remaining entities based on the crowdsourcing results. To support this approach, we first propose an embedding-based influence for type inference which considers not only the distance between entity embeddings but also the distances between entity and type embeddings. Second, we propose a new difficulty model for entity selection which can better capture the uncertainty of the machine algorithm when identifying the entity types. We demonstrate the effectiveness of our approach through experiments on real crowdsourcing platforms. The results show that our method outperforms the state-of-the-art algorithms by improving the effectiveness of fine-grained type completion at affordable crowdsourcing cost.
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing system (CSP) is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that compared to a pure stream processing system,
AIDR can achieve a higher data classification accuracy, while compared to a
pure crowdsourcing solution, the system makes better use of human workers by
requiring much less manual work effort.
SMAP: A Novel Heterogeneous Information Framework for Scenario-based Optimal Model Assignment
The increasing maturity of big data applications has led to a proliferation
of models targeting the same objectives within the same scenarios and datasets.
However, selecting the most suitable model in a way that considers each model's
features while taking specific requirements and constraints into account still poses a
significant challenge. Existing methods have focused on worker-task assignments
based on crowdsourcing and neglect the scenario-dataset-model assignment
problem. To address this challenge, a new problem named the Scenario-based
Optimal Model Assignment (SOMA) problem is introduced and a novel framework
entitled Scenario and Model Associative percepts (SMAP) is developed. SMAP is a
heterogeneous information framework that can integrate various types of
information to intelligently select a suitable dataset and allocate the optimal
model for a specific scenario. To comprehensively evaluate models, a new score
function that utilizes multi-head attention mechanisms is proposed. Moreover, a
novel memory mechanism named the mnemonic center is developed to store the
matched heterogeneous information and prevent duplicate matching. Six popular
traffic scenarios are selected as study cases and extensive experiments are
conducted on a dataset to verify the effectiveness and efficiency of SMAP and
the score function.