Search CORE

35,006 research outputs found

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post-Genomic Clinical Trials

Author: Anguita Sanchez Alberto
Crespo del Arco Jose
Martín Martín Luis
Tsiknakis Manolis
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2008
Field of study

The increasing amount of information available for biomedical research has led to issues related to knowledge discovery in large collections of data. Moreover, Information Retrieval techniques must consider heterogeneities present in databases, initially belonging to different domains—e.g. clinical and genetic data. One of the goals, among others, of the ACGT European is to provide seamless and homogeneous access to integrated databases. In this work, we describe an approach to overcome heterogeneities in identifiers inside queries. We present an ontology classifying the most common identifier semantic heterogeneities, and a service that makes use of it to cope with the problem using the described approach. Finally, we illustrate the solution by analysing a set of real queries

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Thematic Annotation: extracting concepts out of documents

Author: Andrews Pierre
Rajman Martin
Publication venue
Publication date: 29/12/2004
Field of study

Contrarily to standard approaches to topic annotation, the technique used in this work does not centrally rely on some sort of -- possibly statistical -- keyword extraction. In fact, the proposed annotation algorithm uses a large scale semantic database -- the EDR Electronic Dictionary -- that provides a concept hierarchy based on hyponym and hypernym relations. This concept hierarchy is used to generate a synthetic representation of the document by aggregating the words present in topically homogeneous document segments into a set of concepts best preserving the document's content. This new extraction technique uses an unexplored approach to topic selection. Instead of using semantic similarity measures based on a semantic resource, the later is processed to extract the part of the conceptual hierarchy relevant to the document content. Then this conceptual hierarchy is searched to extract the most relevant set of concepts to represent the topics discussed in the document. Notice that this algorithm is able to extract generic concepts that are not directly present in the document.Comment: Technical report EPFL/LIA. 81 pages, 16 figure

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Comparison of System Call Representations for Intrusion Detection

Author: A Sharma
AP Kosoresow
E Eskin
FA Gers
G Creech
H He
J McHugh
N Srivastava
S Hochreiter
SA Hofmeyr
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/05/2019
Field of study

Over the years, artificial neural networks have been applied successfully in many areas including IT security. Yet, neural networks can only process continuous input data. This is particularly challenging for security-related non-continuous data like system calls. This work focuses on four different options to preprocess sequences of system calls so that they can be processed by neural networks. These input options are based on one-hot encoding and learning word2vec or GloVe representations of system calls. As an additional option, we analyze if the mapping of system calls to their respective kernel modules is an adequate generalization step for (a) replacing system calls or (b) enhancing system call data with additional information regarding their context. However, when performing such preprocessing steps it is important to ensure that no relevant information is lost during the process. The overall objective of system call based intrusion detection is to categorize sequences of system calls as benign or malicious behavior. Therefore, this scenario is used to evaluate the different input options as a classification task. The results show, that each of the four different methods is a valid option when preprocessing input data, but the use of kernel modules only is not recommended because too much information is being lost during the mapping process.Comment: 12 pages, 1 figure, submitted to CISIS 201

arXiv.org e-Print Archive

Crossref

Predictive Maintenance on the Machining Process and Machine Tool

Author: Box
Breiman
Djeziri
Extrapolation
Farokhzad
Galar
Guyon
Jin
Maronna
Mills
Mobley
Tsymbal
Tyagi
Winkler
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

This paper presents the process required to implement a data driven Predictive Maintenance (PdM) not only in the machine decision making, but also in data acquisition and processing. A short review of the different approaches and techniques in maintenance is given. The main contribution of this paper is a solution for the predictive maintenance problem in a real machining process. Several steps are needed to reach the solution, which are carefully explained. The obtained results show that the Preventive Maintenance (PM), which was carried out in a real machining process, could be changed into a PdM approach. A decision making application was developed to provide a visual analysis of the Remaining Useful Life (RUL) of the machining tool. This work is a proof of concept of the methodology presented in one process, but replicable for most of the process for serial productions of pieces

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

TECNALIA Publications

Face recognition in different subspaces - A comparative study

Author: Batagelj Borut
Solina Franc
Publication venue: Insticc Press, cop. 2006
Publication date: 01/01/2006
Field of study

Face recognition is one of the most successful applications of image analysis and understanding and has gained much attention in recent years. Among many approaches to the problem of face recognition, appearance-based subspace analysis still gives the most promising results. In this paper we study the three most popular appearance-based face recognition projection methods (PCA, LDA and ICA). All methods are tested in equal working conditions regarding preprocessing and algorithm implementation on the FERET data set with its standard tests. We also compare the ICA method with its whitening preprocess and find out that there is no significant difference between them. When we compare different projection with different metrics we found out that the LDA+COS combination is the most promising for all tasks. The L1 metric gives the best results in combination with PCA and ICA1, and COS is superior to any other metric when used with LDA and ICA2. Our results are compared to other studies and some discrepancies are pointed ou

ePrints.FRI