Cloud-based platform for intelligent healthcare monitoring and risk prevention in hazardous manufacturing contexts

Authors: Boun L., Caggiano A., Grant R., Simeone A.
Publication venue: Elsevier
Publication date: 01/01/2021

This paper presents an intelligent cloud-based platform for workers' healthcare monitoring and risk prevention in potentially hazardous manufacturing contexts. The platform is structured as sequential modules dedicated to data acquisition, processing and decision-making support. Several sensors and data sources, including smart wearables, machine tool embedded sensors and environmental sensors, are employed for data collection, comprising information on offline clinical background, operational and environmental data. The cloud data processing module is responsible for extracting relevant features from the acquired data in order to feed a machine learning-based decision-making support system. The latter provides a classification of workers' health status so that a prompt intervention can be performed in particularly challenging scenarios.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The Influence of Data Mining in Increasing Benefits of Libraries in Jordanian Governmental Universities

Author: Niqresh Mohammad
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 21/05/2021

This study examined the impact of data mining on increasing the benefits of university libraries in Jordanian governmental universities. The researcher adopted data mining techniques including association, classification, clustering, prediction, sequential patterns and decision trees. Using a quantitative approach with a questionnaire as the study tool, 412 participants responded to an online survey, and the primary data were then screened and processed using SPSS v. 27. The results supported the main hypothesis: data mining appeared to influence better organization flow and accumulation of library data and better development of the library's services for users. Among the data mining techniques, sequential patterns, decision trees and prediction were the most influential on the library services followed by librarians in developing those services, as shown by their high correlation with the dependent variable. The remaining variables also had a positive influence, with a medium correlation. The study recommended that the responsible parties within Jordanian universities improve their application of data mining: the current level of application appeared acceptable, but the application isn't used to its maximum capacity.

DigitalCommons@University of Nebraska

Feature selection and nearest centroid classification for protein mass spectrometry

Author: Levner Ilya
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. RESULTS: This study examines the performance of the nearest centroid classifier coupled with several feature selection algorithms. The Student's t-test, the Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting-based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross-validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms does not hold across all data sets.
CONCLUSION: This study tested a number of popular feature selection methods using the nearest centroid classifier and found that several reportedly state-of-the-art algorithms in fact perform rather poorly when tested via stratified cross-validation. The revealed inconsistencies provide clear evidence that algorithm evaluation should be performed on several data sets using a consistent (i.e., non-randomized, stratified) cross-validation procedure in order for the conclusions to be statistically sound.
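The filter-then-classify idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes a two-class problem with labels 0 and 1, ranks features by the absolute pooled-variance t-statistic, and uses hypothetical function names (`t_scores`, `select_top_k`, `fit_centroids`, `predict`). In a real evaluation the selection step would be run on the training folds only, as the abstract stresses.

```python
import math
from statistics import mean, variance


def t_scores(X, y):
    """Absolute two-sample t-statistic (pooled variance) for each feature.

    Assumes labels are 0/1 and each class has at least two samples.
    """
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
            len(a) + len(b) - 2
        )
        se = math.sqrt(pooled * (1 / len(a) + 1 / len(b)))
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Indices of the k features with the largest t-scores."""
    scores = t_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, features):
    """Per-class mean vector, restricted to the selected features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    return centroids


def predict(centroids, features, row):
    """Label of the centroid closest (Euclidean distance) to the sample."""
    point = [row[j] for j in features]
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))
```

A typical call sequence would be `features = select_top_k(X_train, y_train, k)`, then `centroids = fit_centroids(X_train, y_train, features)`, then `predict(centroids, features, x_new)` for each held-out sample.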

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Out-of-core KNN Awakens: The light side of computation force on large datasets

Authors: Chiluka Nitin, Kermarrec Anne-Marie, Olivares Javier
Publication venue: HAL CCSD
Publication date: 18/05/2016

K-Nearest Neighbors (KNN) is a crucial tool for many applications, e.g. recommender systems, image classification and web-related applications. However, KNN is a resource-greedy operation, particularly for large datasets. We focus on the challenge of KNN computation over large datasets on a single commodity PC with limited memory. We propose a novel approach to compute KNN on large datasets by leveraging both disk and main memory efficiently. The main rationale of our approach is to minimize random accesses to disk, maximize sequential accesses to data, and make efficient use of only the available memory. We evaluate our approach on large datasets, in terms of performance and memory consumption. The evaluation shows that our approach requires only 7% of the time needed by an in-memory baseline to compute a KNN graph.
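To make the memory constraint concrete, here is a small sketch of a brute-force KNN graph computed block by block, so that at most two blocks of points are loaded at once and each block is read sequentially. It is not the authors' algorithm, only an illustration of the general idea. The `read_block` callable is a hypothetical stand-in for reading one partition from disk, and in this sketch the neighbour lists themselves are still kept in memory.

```python
import heapq
import math


def knn_graph_blockwise(read_block, n_blocks, k):
    """Brute-force KNN graph that loads at most two blocks of points at a time.

    read_block(i) is assumed to return a list of (point_id, coordinates)
    pairs for partition i (hypothetical interface, e.g. one file per block).
    """
    # point_id -> heap of (-distance, neighbour_id); the root is the
    # farthest of the current k candidates, so it is the one to evict.
    neighbours = {}
    for bi in range(n_blocks):
        block_i = read_block(bi)
        for bj in range(n_blocks):
            block_j = block_i if bi == bj else read_block(bj)
            for pid, p in block_i:
                heap = neighbours.setdefault(pid, [])
                for qid, q in block_j:
                    if pid == qid:
                        continue
                    item = (-math.dist(p, q), qid)
                    if len(heap) < k:
                        heapq.heappush(heap, item)
                    elif item > heap[0]:
                        heapq.heapreplace(heap, item)
    # Return neighbour ids ordered from nearest to farthest.
    return {
        pid: [qid for _, qid in sorted(heap, reverse=True)]
        for pid, heap in neighbours.items()
    }
```

Each block is read `n_blocks` times in total, always front to back, which trades extra sequential reads for a bounded working set.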

HAL-CentraleSupelec

HAL-Inserm

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Temporal learning using echo state network for human activity recognition

Authors: Basterrech Sebastián, Ojha Varun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/09/2016

Several works have applied non-temporal classification techniques in the Human Activity Recognition area. In contrast, we present an approach for modelling human activities using a temporal learning tool. Here, the activities are considered as time-dependent events, and we use a temporal learning method for their classification. We employ a well-known learning tool named Echo State Network (ESN). An ESN is a specific type of Recurrent Neural Network that has performed well on benchmark problems with sequential and time-series data. Another advantage is that the method is very robust and fast to train. Therefore, it is well suited to real-time contexts. We apply the proposed approach to a well-known benchmark dataset, and we obtain promising results.

Central Archive at the University of Reading

Crossref

Design and implementation of a cyberinfrastructure for RNA motif search, prediction and analysis

Author: Wen Dongrong
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2012

RNA secondary and tertiary structure motifs play important roles in cells. However, very few web servers are available for RNA motif search and prediction. In this dissertation, a cyberinfrastructure, named RNAcyber, capable of performing RNA motif search and prediction, is proposed, designed and implemented. The first component of RNAcyber is a web-based search engine, named RmotifDB. This web-based tool integrates an RNA secondary structure comparison algorithm with the secondary structure motifs stored in the Rfam database. With a user-friendly interface, RmotifDB provides the ability to search for ncRNA structure motifs in both structural and sequential ways. The second component of RNAcyber is an enhanced version of RmotifDB. This enhanced version combines data from multiple sources, incorporates a variety of well-established structure-based search methods, and is integrated with the Gene Ontology. To display RmotifDB's search results, a software tool, called RSview, is developed. RSview is able to display the search results in a graphical manner. Finally, RNAcyber contains a web-based tool called Junction-Explorer, which employs a data mining method for predicting tertiary motifs in RNA junctions. Specifically, the tool is trained on solved RNA tertiary structures obtained from the Protein Data Bank, and is able to predict the configuration of coaxial helical stacks and families (topologies) in RNA junctions at the secondary structure level. Junction-Explorer employs several algorithms for motif prediction, including a random forest classification algorithm, a pseudoknot removal algorithm, and a feature ranking algorithm based on the Gini impurity measure. A series of experiments including 10-fold cross-validation has been conducted to evaluate the performance of the Junction-Explorer tool. Experimental results demonstrate the effectiveness of the proposed algorithms and the superiority of the tool over existing methods.
The RNAcyber infrastructure is fully operational, with all of its components accessible on the Internet.
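For readers unfamiliar with the Gini impurity measure mentioned above, the following sketch shows how it is computed and how it can rank features. It is a simplification under stated assumptions, not Junction-Explorer's implementation: features are scored by the best impurity decrease of a single threshold split, whereas a random forest would aggregate such decreases over many trees. The function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest impurity decrease over all 'value <= threshold' splits."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(X, y):
    """Feature indices ordered from most to least impurity decrease."""
    gains = [best_split_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)
```

A pure node (all labels equal) has impurity 0, and a balanced two-class node has impurity 0.5, so a feature that separates two balanced classes perfectly achieves the maximum possible decrease of 0.5.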

Digital Commons @ New Jersey Institute of Technology (NJIT)