193,797 research outputs found

Novel Metaknowledge-based Processing Technique for Multimedia Big Data clustering challenges

Author: Bari Nima
Berkovich Simon Y.
Kowsari Kamran
Vichr Roman
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2015

Past research has challenged us with the task of showing relational patterns between text-based data and then clustering for predictive analysis using the Golay Code technique. We focus on a novel approach to extract metaknowledge in multimedia datasets. Our collaboration has been an ongoing task of studying the relational patterns between datapoints based on metafeatures extracted from metaknowledge in multimedia datasets. Those selected are significant to suit the mining technique we applied, the Golay Code algorithm. In this research paper we summarize findings in optimization of metaknowledge representation for 23-bit representation of structured and unstructured multimedia data in order to

Comment: IEEE Multimedia Big Data (BigMM 2015
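The 23-bit width named in this abstract matches the length of the binary Golay code, a perfect (23, 12) code in which every 23-bit word lies within Hamming distance 3 of exactly one codeword. The sketch below only illustrates that arithmetic and a 23-bit packing of yes/no answers; the function names and the idea of packing binary metafeature answers are assumptions for illustration, not the authors' implementation.

```python
from math import comb

N_BITS = 23  # width of the representation named in the abstract


def pack_features(answers):
    """Pack 23 yes/no answers into one 23-bit integer (hypothetical encoding)."""
    if len(answers) != N_BITS:
        raise ValueError("expected exactly 23 binary answers")
    word = 0
    for bit in answers:
        word = (word << 1) | (1 if bit else 0)
    return word


def hamming_distance(a, b):
    """Number of bit positions in which two packed words differ."""
    return bin(a ^ b).count("1")


def sphere_size(radius, n=N_BITS):
    """Count the n-bit words within the given Hamming radius of a fixed word."""
    return sum(comb(n, i) for i in range(radius + 1))


# The (23, 12) Golay code has 2**12 codewords; spheres of radius 3 around
# them tile the whole 23-bit space, which is what makes the code perfect.
assert 2**12 * sphere_size(3) == 2**N_BITS
```

Two data points whose packed words fall in the same radius-3 sphere could then be treated as candidates for the same cluster, which is one plausible reading of how such a code supports clustering.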

arXiv.org e-Print Archive

Crossref

Ensuring Cyber-Security in Smart Railway Surveillance with SHIELD

Author: DELLI PRISCOLI Francesco
DI GIORGIO Alessandro
Esposito Mariana
Fiaschetti Andrea
Flammini Francesco
Mignanti Silvano
Pragliola Concetta
Publication venue: Inderscience Publishers
Publication date: 01/01/2017

Modern railways feature increasingly complex embedded computing systems for surveillance, which are moving towards fully wireless smart sensors. Those systems are aimed at monitoring system status from a physical-security viewpoint in order to detect intrusions and other environmental anomalies. However, the same systems used for physical-security surveillance are vulnerable to cyber-security threats, since they feature distributed hardware and software architectures often interconnected by ‘open networks’, like wireless channels and the Internet. In this paper, we show how the integrated approach to Security, Privacy and Dependability (SPD) in embedded systems provided by the SHIELD framework (developed within the EU-funded pSHIELD and nSHIELD research projects) can be applied to railway surveillance systems in order to measure and improve their SPD level. SHIELD implements a layered architecture (node, network, middleware and overlay) and orchestrates SPD mechanisms based on ontology models, appropriate metrics and composability. The results of prototypical application to a real-world demonstrator show the effectiveness of SHIELD and justify its practical applicability in industrial settings.

Archivio della ricerca- Università di Roma La Sapienza

Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications

Author: Clement Lieven
Dudoit Sandrine
Love Michael I
Perraudeau Fanny
Risso Davide
Robinson Mark D
Soneson Charlotte
Van den Berge Koen
Vert Jean-Philippe
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Dropout events in single-cell RNA sequencing (scRNA-seq) cause many transcripts to go undetected and induce an excess of zero read counts, leading to power issues in differential expression (DE) analysis. This has triggered the development of bespoke scRNA-seq DE methods to cope with zero inflation. Recent evaluations, however, have shown that dedicated scRNA-seq tools provide no advantage compared to traditional bulk RNA-seq tools. We introduce a weighting strategy, based on a zero-inflated negative binomial model, that identifies excess zero counts and generates gene- and cell-specific weights to unlock bulk RNA-seq DE pipelines for zero-inflated data, boosting performance for scRNA-seq.
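The abstract does not state the weight formula. One natural reading of a zero-inflated negative binomial (ZINB) weighting is the posterior probability that an observed count was generated by the negative binomial component rather than by the excess-zero component. The sketch below assumes that reading; the parameter names `pi` (mixing probability of excess zeros), `mu` (NB mean) and `theta` (NB dispersion) are illustrative and would have to be estimated per gene and cell by a fitted model.

```python
def nb_pmf_zero(mu, theta):
    """P(Y = 0) under a negative binomial with mean mu and dispersion (size) theta."""
    return (theta / (theta + mu)) ** theta


def zero_weight(count, pi, mu, theta):
    """Posterior probability that a count comes from the NB component of a ZINB.

    Assumed mixture: P(y) = pi * [y == 0] + (1 - pi) * NB(y; mu, theta).
    """
    if count > 0:
        return 1.0  # a positive count cannot be an excess zero
    nb_zero = (1.0 - pi) * nb_pmf_zero(mu, theta)
    return nb_zero / (pi + nb_zero)


# A zero in a highly expressed gene is probably a dropout, so it gets a low weight.
low = zero_weight(0, pi=0.4, mu=50.0, theta=1.0)
# A zero in a lowly expressed gene is plausible under the NB, so it keeps more weight.
high = zero_weight(0, pi=0.4, mu=0.1, theta=1.0)
assert low < high
```

Weights of this kind could then be passed as observation weights to a bulk RNA-seq DE tool, so that likely excess zeros contribute less to the fit.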

Ghent University Academic Bibliography

Directory of Open Access Journals

Carolina Digital Repository

eScholarship - University of California

Archivsystem Ask23

ZORA

HAL-MINES ParisTech

Archivio istituzionale della ricerca - Università di Padova

Theory and Practice of Data Citation

Author: Silvello Gianmaria
Publication venue: Wiley
Publication date: 24/06/2017

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.

Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Assessing Causation in Breast Implant Litigation: The Role of Science Panels

Author: Cecil Joe S.
Hooper Laural L.
Willging Thomas E.
Publication venue: Duke University School of Law
Publication date: 01/10/2001

In two recent cases, federal judges appointed panels of scientific experts to help assess conflicting scientific testimony regarding causation of systemic injuries by silicone gel breast implants. This article will describe the circumstances that gave rise to the appointments, the procedures followed in making the appointments and reporting to the courts, and the reactions of the participants in the proceedings.

bepress Legal Repository

Duke Law Scholarship Repository

Characterisation of FAD-family folds using a machine learning approach

Author: Gilbert D
Tan A C
Tuson A
Publication venue: INCOB
Publication date: 01/01/2002

Flavin adenine dinucleotide (FAD) and its derivatives play a crucial role in biological processes. They are major organic cofactors and electron carriers in both enzymatic activities and biochemical pathways. We have analysed the relationships between sequence and structure of FAD-containing proteins using a machine learning approach. Decision trees were generated using the C4.5 algorithm as a means of automatically generating rules from biological databases (TOPS, CATH and PDB). These rules were then used as background knowledge for an ILP system to characterise the four different classes of FAD-family folds classified in Dym and Eisenberg (2001). These FAD-family folds are: glutathione reductase (GR), ferredoxin reductase (FR), p-cresol methylhydroxylase (PCMH) and pyruvate oxidase (PO). Each FAD-family was characterised by a set of rules. The “knowledge patterns” generated from this approach are a set of rules containing conserved sequence motifs, secondary structure sequence elements and folding information. Every rule was then verified using statistical evaluation on the measured significance of each rule. We show that this machine learning approach is capable of learning and discovering interesting patterns from large biological databases and can generate “knowledge patterns” that characterise the FAD-containing proteins, and at the same time classify these proteins into four different families.

Brunel University Research Archive

Knowledge Discovery in Online Repositories: A Text Mining Approach

Author: Afolabi I. T.
Ayo C. K.
Musa G. A.
Sofoluwe A. B.
Publication venue: EuroJournals Publishing
Publication date: 01/01/2008

Before the advent of the Internet, newspapers were the prominent instrument of mobilization for independence and political struggles. Since independence in Nigeria, the political class has adopted newspapers as a medium of political competition and communication. Consequently, most political information exists in unstructured form, hence the need to tap into it using a text mining algorithm. This paper implements a text mining algorithm on unstructured data from some newspapers. The algorithm involves the following natural language processing techniques: tokenization, text filtering and refinement. As a follow-up to the natural language techniques, the association rule mining technique of data mining is used to extract knowledge using the Modified Generating Association Rules based on Weighting scheme (GARW). The main contribution of the technique is that it integrates an information retrieval scheme, Term Frequency Inverse Document Frequency (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rules generation), with a data mining technique for association rules discovery. The program is applied to pre-election information obtained from the website of the Nigerian Guardian newspaper. The extracted association rules contained important features and described the informative news included in the document collection when related to the concluded 2007 presidential election. The system presented useful information that could help sanitize the polity as well as protect the nascent democracy.
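The pipeline described in this abstract (tokenization and filtering, TF-IDF keyword selection, then association rule discovery) can be sketched roughly as follows. This is not the Modified GARW algorithm, whose details the abstract does not give; plain support/confidence mining over keyword pairs stands in for it, and the stopword list, function names and thresholds are all illustrative assumptions.

```python
import math
import re
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "of", "in", "and", "to", "for", "is"}  # illustrative only


def tokenize(text):
    """Lowercase, split on non-letters, and filter out stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def top_keywords(docs, k):
    """Keep the k highest-scoring TF-IDF terms of each document as a keyword set."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    selected = []
    for toks in tokenized:
        tf = Counter(toks)
        scores = {t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()}
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected.append(set(ranked[:k]))
    return selected


def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) rules over keyword pairs."""
    n = len(transactions)
    item_count = Counter(i for t in transactions for i in t)
    pair_count = Counter(p for t in transactions for p in combinations(sorted(t), 2))
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules
```

In use, each document's keyword set from `top_keywords` would serve as one transaction for `pair_rules`, so that only the most discriminative terms take part in rule generation.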

Covenant University Repository

The Genomic HyperBrowser: inferential genomics at the sequence level

Author: Clancy Trevor
Ferkingstad Egil
Frigessi Arnoldo
Glad Ingrid K.
Gundersen Sveinung
Holden Lars
Holden Marit
Hovig Eivind
Johansen Morten
Liestøl Knut
Nygaard Vegard
Rydbeck Halfdan
Sandve Geir K.
Tøstesen Eivind
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

The immense increase in the generation of genomic scale data poses an unmet analytical challenge, due to a lack of established methodology with the required flexibility and power. We propose a first principled approach to statistical analysis of sequence-level genomic information. We provide a growing collection of generic biological investigations that query pairwise relations between tracks, represented as mathematical objects, along the genome. The Genomic HyperBrowser implements the approach and is available at http://hyperbrowser.uio.no

arXiv.org e-Print Archive

Springer - Publisher Connector

PubMed Central

NORA - Norwegian Open Research Archives