2,152 research outputs found

    Ontology of core data mining entities

    Get PDF
    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines themost essential datamining entities in a three-layered ontological structure comprising of a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend

    The Digital Earth Observation Librarian: A Data Mining Approach for Large Satellite Images Archives

    Get PDF
    Throughout the years, various Earth Observation (EO) satellites have generated huge amounts of data. The extraction of latent information in the data repositories is not a trivial task. New methodologies and tools, being capable of handling the size, complexity and variety of data, are required. Data scientists require support for the data manipulation, labeling and information extraction processes. This paper presents our Earth Observation Image Librarian (EOLib), a modular software framework which offers innovative image data mining capabilities for TerraSAR-X and EO image data, in general. The main goal of EOLib is to reduce the time needed to bring information to end-users from Payload Ground Segments (PGS). EOLib is composed of several modules which offer functionalities such as data ingestion, feature extraction from SAR (Synthetic Aperture Radar) data, meta-data extraction, semantic definition of the image content through machine learning and data mining methods, advanced querying of the image archives based on content, meta-data and semantic categories, as well as 3-D visualization of the processed images. EOLib is operated by DLR’s (German Aerospace Center’s) Multi-Mission Payload Ground Segment of its Remote Sensing Data Center at Oberpfaffenhofen, Germany

    Semantic data mining and linked data for a recommender system in the AEC industry

    Get PDF
    Even though it can provide design teams with valuable performance insights and enhance decision-making, monitored building data is rarely reused in an effective feedback loop from operation to design. Data mining allows users to obtain such insights from the large datasets generated throughout the building life cycle. Furthermore, semantic web technologies allow to formally represent the built environment and retrieve knowledge in response to domain-specific requirements. Both approaches have independently established themselves as powerful aids in decision-making. Combining them can enrich data mining processes with domain knowledge and facilitate knowledge discovery, representation and reuse. In this article, we look into the available data mining techniques and investigate to what extent they can be fused with semantic web technologies to provide recommendations to the end user in performance-oriented design. We demonstrate an initial implementation of a linked data-based system for generation of recommendations

    Comprehensive Review of Opinion Summarization

    Get PDF
    The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe

    How you move reveals who you are: understanding human behavior by analyzing trajectory data

    Get PDF
    The widespread use of mobile devices is producing a huge amount of trajectory data, making the discovery of movement patterns possible, which are crucial for understanding human behavior. Significant advances have been made with regard to knowledge discovery, but the process now needs to be extended bearing in mind the emerging field of behavior informatics. This paper describes the formalization of a semantic-enriched KDD process for supporting meaningful pattern interpretations of human behavior. Our approach is based on the integration of inductive reasoning (movement pattern discovery) and deductive reasoning (human behavior inference). We describe the implemented Athena system, which supports such a process, along with the experimental results on two different application domains related to traffic and recreation management

    SOME APPROACHES TO TEXT MINING AND THEIR POTENTIAL FOR SEMANTIC WEB APPLICATIONS

    Get PDF
    In this paper we describe some approaches to text mining, which are supported by an original software system developed in Java for support of information retrieval and text mining (JBowl), as well as its possible use in a distributed environment. The system JBowl1 is being developed as an open source software with the intention to provide an easily extensible, modular framework for pre-processing, indexing and further exploration of large text collections. The overall architecture of the system is described, followed by some typical use case scenarios, which have been used in some previous projects. Then, basic principles and technologies used for service-oriented computing, web services and semantic web services are presented. We further discuss how the JBowl system can be adopted into a distributed environment via technologies available already and what benefits can bring such an adaptation. This is in particular important in the context of a new integrated EU-funded project KP-Lab2 (Knowledge Practices Laboratory) that is briefly presented as well as the role of the proposed text mining services, which are currently being designed and developed there

    Semantic technologies for supporting KDD processes

    Get PDF
    209 p.Achieving a comfortable thermal situation within buildings with an efficient use of energy remains still an open challenge for most buildings. In this regard, IoT (Internet of Things) and KDD (Knowledge Discovery in Databases) processes may be combined to solve these problems, even though data analysts may feel overwhelmed by heterogeneity and volume of the data to be considered. Data analysts could benefit from an application assistant that supports them throughout the KDD process. This research work aims at supporting data analysts through the different KDD phases towards the achievement of energy efficiency and thermal comfort in tertiary buildings. To do so, the EEPSA (Energy Efficiency Prediction Semantic Assistant) is proposed, which aids data analysts discovering the most relevant variables for the matter at hand, and informs them about relationships among relevant data. This assistant leverages Semantic Technologies such as ontologies, ontology-driven rules and ontology-driven data access. More specifically, the EEPSA ontology is the cornerstone of the assistant. This ontology is developed on top of three ODPs (Ontology Design Patterns) and it is designed so that its customization to address similar problems in different types of buildings can be approached methodically
    corecore