Mixing and non-mixing local minima of the entropy contrast for blind source separation
In this paper, both non-mixing and mixing local minima of the entropy are
analyzed from the viewpoint of blind source separation (BSS); they correspond
respectively to acceptable and spurious solutions of the BSS problem. The
contribution of this work is twofold. First, a Taylor expansion is used to
show that the \textit{exact} output entropy cost function has a non-mixing
minimum when this output is proportional to \textit{any} of the non-Gaussian
sources, and not only when the output is proportional to the lowest-entropy
source. Second, in order to prove that mixing entropy minima exist when the
source densities are strongly multimodal, an entropy approximator is proposed.
The latter has the major advantage that an error bound can be provided. Even if
this approximator (and the associated bound) is used here in the BSS context,
it can be applied for estimating the entropy of any random variable with
multimodal density.
Comment: 11 pages, 6 figures. To appear in IEEE Transactions on Information
Theor
Feature Selection for Interpatient Supervised Heart Beat Classification
Supervised, interpatient classification of heart beats is essential in many applications requiring long-term monitoring of cardiac function. Several classification models able to cope with the strong class imbalance and a large variety of feature sets have been proposed for this task. In practice, over 200 features are often considered, and the features retained in the final model are chosen either using domain knowledge or through an exhaustive search of the feature sets, without evaluating the relevance of each individual feature included in the classifier. As a consequence, the results obtained by these models can be suboptimal and difficult to interpret. In this work, feature selection techniques are considered to extract optimal feature subsets for state-of-the-art ECG classification models. Performance is evaluated on real ambulatory recordings and compared to previously reported feature choices using the same models. Results indicate that only a small number of individual features actually serve the classification and that better performance can be achieved by removing useless features.
Machine Learning and Data Analysis in Astroinformatics
Astroinformatics is a new discipline at the crossroads of astronomy, advanced statistics and computer science. With next-generation sky surveys, space missions and modern instrumentation, astronomy will enter the Petascale regime, raising the demand for advanced computer science techniques with hardware and software solutions for data management, analysis, efficient automation and knowledge discovery. This tutorial reviews important developments in astroinformatics over the past years and discusses some relevant research questions and concrete problems. The contribution ends with a short review of the special session papers in these proceedings, as well as perspectives and challenges for the near future.
Geographical trends in research: a preliminary analysis on authors' affiliations
In the last decade, the research literature has reached an enormous volume, with an unprecedented current annual increase of 1.5 million new publications. As research becomes ever more global and new countries and institutions, from either academia or the corporate environment, start to contribute their share, it is important to monitor this complex scenario and understand its dynamics.
We present a study on a conference proceedings dataset extracted from Springer Nature SciGraph that illustrates insightful geographical trends and highlights the unbalanced growth of competitive research institutions worldwide. Results from our micro and macro analyses show that the distributions of institutions and papers among countries follow a power law, and thus very few countries keep producing most of the papers accepted by high-tier conferences. In addition, we found that the annual and overall turnover rate of the top 5, 10 and 25 countries is extremely low, suggesting a very static landscape in which new entries struggle to emerge. Finally, we highlight the presence of an increasing gap between the number of institutions initiating and overseeing research endeavours (i.e. first and last authors' affiliations) and the total number of institutions participating in research. As a consequence of our analysis, the paper also discusses our experience in working with affiliations: an utterly simple matter at first glance that instead turns out to be a complex research and technical challenge, still far from being solved.
Temperature dependence of the static and dynamic behaviour in a quenching and partitioning processed low-Si Steel
Because of their excellent combination of strength and ductility, quenching and partitioning (Q&P) steels have a great chance of being added to the third generation of advanced high strength steels. The large ductility of Q&P steels arises from the presence of 10% to 15% of retained austenite, which postpones necking due to the transformation induced plasticity (TRIP) effect. Moreover, Q&P steels show promising forming properties with favourable Lankford coefficients, while their planar anisotropy is low due to a weak texture. The stability of the metastable austenite is the key to obtaining tailored properties for these steels. To become part of the newest generation of advanced high strength steels, Q&P steels have to preserve their mechanical properties at dynamic strain rates and over a wide range of temperatures. Therefore, in the present study, a low-Si Q&P steel was tested at temperatures from -40 °C to 80 °C and strain rates from 0.001 s⁻¹ to 500 s⁻¹. Results show that the mechanical properties are well preserved at the lowest temperatures. Indeed, at -40 °C and room temperature, no significant loss of the deformation capacity is observed even at dynamic strain rates. This is attributed to the presence of a large fraction of austenite that is so (thermally) stable that it does not transform in the absence of deformation. In addition, the high stability of the austenite decreases the elongation at high test temperatures (80 °C). The additional adiabatic heating in the dynamic tests causes the largest reduction of the uniform strain for the samples tested at 80 °C. Quantification of the retained austenite fraction in the samples after testing confirmed that, at the highest temperature and strain rate, the TRIP effect is suppressed.
Median topographic maps for biomedical data sets
Median clustering extends popular neural data analysis methods such as the
self-organizing map or neural gas to general data structures given by a
dissimilarity matrix only. This offers flexible and robust global data
inspection methods which are particularly suited to the variety of data that
occur in biomedical domains. In this chapter, we give an overview of median
clustering and its properties and extensions, with a particular focus on
efficient implementations adapted to large-scale data analysis.
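The idea of clustering from a dissimilarity matrix alone can be illustrated with a minimal sketch. It is a plain median k-means variant, chosen here for brevity, and it omits the neighbourhood cooperation of the self-organizing map and neural gas mentioned in the abstract. The function names, the toy data and the initial prototypes are assumptions made for illustration and are not taken from the chapter.

```python
def assign(D, prototypes):
    """Label each item with the position of its closest prototype."""
    return [min(range(len(prototypes)), key=lambda k: D[i][prototypes[k]])
            for i in range(len(D))]


def median_kmeans(D, init_prototypes, max_iter=100):
    """Median k-means on a square dissimilarity matrix D (list of lists).

    Prototypes are indices of data items, so only D is ever needed.
    """
    prototypes = list(init_prototypes)
    n = len(D)
    for _ in range(max_iter):
        labels = assign(D, prototypes)
        updated = []
        for k, old in enumerate(prototypes):
            members = [i for i in range(n) if labels[i] == k]
            if not members:
                # Empty cluster: keep the previous prototype.
                updated.append(old)
                continue
            # Generalized median: the data item with the smallest summed
            # dissimilarity to the members of the cluster.
            updated.append(min(range(n),
                               key=lambda c: sum(D[i][c] for i in members)))
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes, assign(D, prototypes)


# Toy example (assumed data): two well-separated groups on a line.
values = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in values] for a in values]
prototypes, labels = median_kmeans(D, init_prototypes=[0, 1])
print(prototypes, labels)
```

On this toy matrix the prototypes settle on the middle item of each group (indices 1 and 4). Because every update scans all candidate items, the cost per iteration grows quadratically with the number of items, which is why efficient implementations matter for large data sets.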
A data-driven functional projection approach for the selection of feature ranges in spectra with ICA or cluster analysis
Prediction problems from spectra are widely encountered in chemometrics. In
addition to accurate predictions, it is often necessary to extract information
about which wavelengths in the spectra contribute in an effective way to the
quality of the prediction. This implies selecting wavelengths (or wavelength
intervals), a problem associated with variable selection. In this paper, it is
shown how this problem may be tackled in the specific case of smooth (for
example infrared) spectra. The functional character of the spectra (their
smoothness) is taken into account through a functional variable projection
procedure. Contrary to standard approaches, the projection is performed on a
basis that is driven by the spectra themselves, in order to best fit their
characteristics. The methodology is illustrated by two examples of functional
projection, using Independent Component Analysis and functional variable
clustering, respectively. The performance on two standard infrared spectra
benchmarks is illustrated.
Comment: A paraitr
Resampling methods for parameter-free and robust feature selection with mutual information
Combining the mutual information criterion with a forward feature selection
strategy offers a good trade-off between optimality of the selected feature
subset and computation time. However, it requires setting the parameter(s) of
the mutual information estimator and determining when to halt the forward
procedure. These two choices are difficult to make because, as the
dimensionality of the subset increases, the estimation of the mutual
information becomes less and less reliable. This paper proposes to use
resampling methods, namely K-fold cross-validation and the permutation test, to
address both issues. The resampling methods bring information about the
variance of the estimator, information which can then be used to automatically
set the parameter and to calculate a threshold to stop the forward procedure.
The procedure is illustrated on a synthetic dataset as well as on real-world
examples.
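The role of the permutation test as a stopping threshold can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: it uses a plug-in mutual information estimate for discrete variables and tests a single candidate feature, whereas the abstract concerns an estimator with tunable parameters inside a forward search. The function names, the 0.95 quantile, the number of permutations and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information estimate (in nats) for discrete sequences."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (cx[x] * cy[y]))
               for (x, y), c in cxy.items())


def permutation_threshold(xs, ys, n_perm=200, quantile=0.95, seed=0):
    """Quantile of the MI values obtained after shuffling the feature.

    Shuffling breaks any dependence between xs and ys, so the returned
    value approximates what MI looks like for an irrelevant feature.
    """
    rng = random.Random(seed)
    shuffled = list(xs)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(mutual_information(shuffled, ys))
    null.sort()
    return null[min(n_perm - 1, int(quantile * n_perm))]


# Toy example (assumed data): one informative and one constant feature.
ys = [0, 1] * 20
relevant = list(ys)
constant = [0] * len(ys)

mi_relevant = mutual_information(relevant, ys)
keep_relevant = mi_relevant > permutation_threshold(relevant, ys)

mi_constant = mutual_information(constant, ys)
keep_constant = mi_constant > permutation_threshold(constant, ys)

print(keep_relevant, keep_constant)
```

In a forward search, a test of this kind would be applied to each candidate addition, and the procedure would halt once no candidate exceeds its threshold.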