4,177 research outputs found
Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing
In recent years, the huge expansion of digital technologies has vastly increased the volume of data to be explored, such that reducing the dimensionality of data is an essential step in data exploration. The integrity of a dimensionality reduction technique relates to how well it maintains the data structure. Dimensionality reduction techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS) globally preserve the distance ranking at the expense of neglecting small-distance preservation. Conversely, the structure capturing of some other methods such as Isomap, Locally Linear Embedding (LLE), Laplacian Eigenmaps, t-Stochastic Neighbour Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and TriMap relies on the number of neighbours considered. This paper presents a dimensionality reduction technique, Same Degree Distribution (SDD), that does not rely on the number of neighbours, thanks to using degree-distributions in both high- and low-dimensional spaces. Degree-distribution is similar to Student-t distribution and is less expensive than Gaussian distribution. As such, it enables better global data preservation in less computational time. Moreover, to improve the data structure capturing, SDD has been extended to Multi-SDDs (MSDD), which employs various degree distributions on top of SDD. The proposed approach and its extension demonstrated greater performance compared with eight other benchmark methods, tested on several popular synthetic and real datasets such as Iris, Breast Cancer, Swiss Roll, MNIST, and Make Blob, evaluated by the co-ranking matrix and Kendall’s Tau coefficient. For further work, we aim to approximate the number of distributions and their degrees in relation to the given dataset. Reducing the computational complexity is another objective for further work.
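As an illustration of the Kendall's Tau evaluation mentioned in this abstract, the following is a minimal sketch of how distance-ranking preservation between a high-dimensional dataset and its embedding could be scored. It is not the authors' evaluation code: the function names, the toy points, and the choice of the simple tau-a variant (ties counted as neither concordant nor discordant) are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def pairwise_distances(points):
    """Return Euclidean distances for all point pairs, in a fixed pair order."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: 3-D points and a 2-D "embedding" that drops the last axis.
high_dim = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (0.0, 2.0, 0.2), (3.0, 3.0, 0.3)]
low_dim = [(x, y) for x, y, _ in high_dim]

# A value close to 1 means the ranking of pairwise distances is preserved.
tau = kendall_tau(pairwise_distances(high_dim), pairwise_distances(low_dim))
```

A value near 1 indicates that pairs that are far apart in the original space remain far apart in the embedding; a value near -1 indicates a reversed ranking.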
On the modulation instability development in optical fiber systems
Extensive numerical simulations were performed to investigate all stages of
modulation instability development from the initial pulse of pico-second
duration in photonic crystal fiber: quasi-solitons and dispersive waves
formation, their interaction stage and the further propagation. Comparison
between 4 different NLS-like systems was made: the classical NLS equation, NLS
system plus higher dispersion terms, NLS plus higher dispersion and
self-steepening and also fully generalized NLS equation with Raman scattering
taken into account. For the latter case a mechanism of energy transfer from
smaller quasi-solitons to the bigger ones is proposed to explain the dramatic
increase in the frequency of rogue wave appearance compared with systems in
which Raman scattering is not taken into account. Comment: 9 pages, 54 figures
Hydro-Physicochemical Changes in Domasi River Associated with Outbreak of Blackflies (Diptera; Simuliidae) in Zomba, Malawi
Blackflies impact human and animal health due to their biting nuisance and transmission of Onchocerca volvulus. This study presents an attempt to analyze hydro-physicochemical changes associated with an outbreak of blackflies in Zomba, Malawi. The study compared historical data of hydro-physicochemical parameters before (1985-2002) and after (2008) the outbreak to deduce the changes associated with mass occurrence of these flies. Changes in water quality between these two periods were assessed using t-tests. To establish the relationship between blackfly larval densities and water quality parameters, the data were subjected to both principal component and correlation analysis. Three principal components before the outbreak and two principal components during the outbreak (both dry and wet season) accounted for most of the variation in water quality in this river system. Nutrient load and increases in Total Suspended Solids (TSS) and Total Hardness (TH) were the main factors that had high loadings on these principal components over the years. A significant correlation was established between blackfly larval densities and total hardness (r = 0.86, p < 0.05) as well as total suspended solids (r = 0.755, p < 0.02). The potential role of anthropogenic influences on water quality and its cascading effect on blackfly population dynamics is discussed.
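The correlation analysis described in this abstract can be sketched as follows. This is only an illustration of how a Pearson correlation coefficient and its t statistic are computed; the hardness and larval density values below are invented for the example and are not data from the study, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.

    Undefined for |r| == 1 (division by zero).
    """
    return r * sqrt((n - 2) / (1 - r ** 2))


# Invented example values (NOT from the study): total hardness and larval density.
total_hardness = [40, 55, 60, 72, 80, 95]
larval_density = [12, 20, 18, 30, 33, 41]

r = pearson_r(total_hardness, larval_density)
t = t_statistic(r, len(total_hardness))
```

The resulting t statistic would then be compared against a Student-t distribution with n - 2 degrees of freedom to obtain a p-value such as those quoted in the abstract.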
A simple and surprisingly accurate approach to the chemical bond obtained from dimensional scaling
We present a new dimensional scaling transformation of the Schrödinger
equation for the two electron bond. This yields, for the first time, a good
description of the two electron bond via D-scaling. There also emerges, in the
large-D limit, an intuitively appealing semiclassical picture, akin to a
molecular model proposed by Niels Bohr in 1913. In this limit, the electrons
are confined to specific orbits in the scaled space, yet the uncertainty
principle is maintained because the scaling leaves invariant the
position-momentum commutator. A first-order perturbation correction,
proportional to 1/D, substantially improves the agreement with the exact ground
state potential energy curve. The present treatment is very simple
mathematically, yet provides a strikingly accurate description of the potential
energy curves for the lowest singlet, triplet and excited states of H_2. We
find the modified D-scaling method also gives good results for other molecules.
It can be combined advantageously with Hartree-Fock and other conventional
methods. Comment: 4 pages, 5 figures, to appear in Phys. Rev. Letters
Ensemble Habitat Suitability Modeling to Guide Conservation of Black-Backed Woodpeckers
Conservation of black-backed woodpecker (Picoides arcticus), a burned-forest specialist, is challenged by the unpredictable availability of suitable habitat. Habitat models calibrated with data from previous wildfires can be used to predict habitat suitability in newly fire-affected areas. Predictive accuracy of habitat models depends on how well statistical relationships reflect actual ecological relationships. We predicted habitat suitability for black-backed woodpecker in Montana post-wildfire forests (≤ 6 years postfire) east of the continental divide using models calibrated with nest location data from wildfire locations in Idaho, Oregon, and Washington. We developed six habitat models, including one partitioned Mahalanobis model, two Maxent models, and three weighted logistic regression models with combinations of seven environmental variables describing burn severity, topography, and pre-fire canopy cover. We converted continuous habitat suitability indices (HSIs) into binary predictions (suitable or unsuitable) and combined predictions using an ensemble approach; we compiled the number of models (0–6) predicting locations (30×30-m pixels) as suitable. Habitat models represented different hypotheses regarding true ecological relationships, making inferences from ensemble predictions robust to uncertainties in the form of these relationships. Thirty-five percent of the area burned by eastside Montana wildfires was predicted suitable by either all six habitat models or none of them (i.e., complete agreement among models). We recommend conservation of areas (e.g., exclusion of post-fire salvage logging) that were consistently predicted suitable by most models, e.g., 32 percent of burned areas predicted suitable by ≥ 5 models. Additionally, we recommend surveying areas where models disagree to help validate and refine models.
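The ensemble step described in this abstract (thresholding each model's continuous HSI into a binary prediction, then counting how many models call each pixel suitable) can be sketched roughly as below. The model names, HSI values, and the 0.5 cutoffs are assumptions chosen for illustration; the abstract does not state how the thresholds were selected.

```python
def ensemble_votes(hsi_by_model, thresholds):
    """Count, per pixel, how many models predict 'suitable' (HSI >= cutoff)."""
    n_pixels = len(next(iter(hsi_by_model.values())))
    votes = [0] * n_pixels
    for name, hsi in hsi_by_model.items():
        cutoff = thresholds[name]
        for i, value in enumerate(hsi):
            if value >= cutoff:
                votes[i] += 1
    return votes


def complete_agreement_fraction(votes, n_models):
    """Fraction of pixels that all models, or no models, predict as suitable."""
    agree = sum(1 for v in votes if v in (0, n_models))
    return agree / len(votes)


# Hypothetical HSI values for three models over four pixels.
hsi_by_model = {
    "mahalanobis": [0.9, 0.2, 0.6, 0.1],
    "maxent": [0.8, 0.4, 0.3, 0.2],
    "logistic": [0.7, 0.6, 0.55, 0.05],
}
# Assumed cutoff of 0.5 for every model; real cutoffs would be model-specific.
thresholds = {name: 0.5 for name in hsi_by_model}

votes = ensemble_votes(hsi_by_model, thresholds)
agreement = complete_agreement_fraction(votes, len(hsi_by_model))
```

Pixels with a high vote count correspond to the areas the abstract recommends for conservation, while pixels with intermediate counts are those where models disagree and surveys would be most informative.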
Non-Invasive Driver Drowsiness Detection System
Drowsiness when in command of a vehicle leads to a decline in cognitive performance that affects driver behavior, potentially causing accidents. Drowsiness-related road accidents lead to severe trauma, economic consequences, impact on others, physical injury and/or even death. Real-time and accurate driver drowsiness detection and warning systems are necessary to reduce tiredness-related driving accident rates. The research presented here aims at the classification of drowsy and non-drowsy driver states based on respiration rate detection by non-invasive, non-touch, impulse radio ultra-wideband (IR-UWB) radar. Chest movements of 40 subjects were acquired for 5 m using a lab-placed IR-UWB radar system, and respiration per minute was extracted from the resulting signals. A structured dataset was obtained comprising respiration per minute, age and label (drowsy/non-drowsy). Different machine learning models, namely, Support Vector Machine, Decision Tree, Logistic Regression, Gradient Boosting Machine, Extra Tree Classifier and Multilayer Perceptron, were trained on the dataset, amongst which the Support Vector Machine shows the best accuracy of 87%. This research provides a ground truth for verification and assessment of UWB to be used effectively for driver drowsiness detection based on respiration.
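To make the idea of extracting respiration per minute from a chest-movement signal concrete, here is a minimal sketch that counts rising zero crossings of a mean-centred signal. The abstract does not describe the extraction method actually used, so this approach, the synthetic signal, the 20 Hz sample rate, and the function name are all assumptions for illustration only.

```python
from math import cos, pi


def breaths_per_minute(signal, sample_rate_hz):
    """Estimate respiration rate by counting rising zero crossings.

    Assumes one rising crossing per breath in a clean, mean-centred signal.
    """
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    rising = sum(
        1 for prev, curr in zip(centered, centered[1:]) if prev < 0 <= curr
    )
    duration_min = len(signal) / sample_rate_hz / 60.0
    return rising / duration_min


# Synthetic chest displacement: 0.25 Hz breathing, sampled at 20 Hz for 60 s.
sample_rate_hz = 20
breathing_hz = 0.25
signal = [
    -cos(2 * pi * breathing_hz * (i / sample_rate_hz) + 0.1)
    for i in range(60 * sample_rate_hz)
]

rate = breaths_per_minute(signal, sample_rate_hz)
```

On real radar data, noise and body motion would likely require filtering before counting crossings; the resulting rate, together with age, would then form the feature vector fed to the classifiers listed in the abstract.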
A Novel Approach to Railway Track Faults Detection Using Acoustic Analysis
Regular inspection of railway track health is crucial for maintaining safe and reliable train operations. Faults such as cracks, ballast issues, rail discontinuity, loose nuts and bolts, burnt wheels, superelevation, and misalignment, which develop on the rails due to a lack of maintenance and pre-emptive investigation and to delayed detection, pose a grave threat to the safe operation of rail transport. The traditional procedure of manually inspecting the rail track using a railway cart is both inefficient and prone to human error and biases. In a country like Pakistan where train accidents have taken many lives, it is not unusual to automate such approaches to avoid such accidents and save countless lives. This study aims at enhancing the traditional railway cart system to address these issues by introducing an automatic railway track fault detection system using acoustic analysis. In this regard, this study makes two important contributions: data collection on Pakistan railway tracks using acoustic signals and the application of various classification techniques to the collected data. Initially, three types of tracks are considered, including normal track, wheel burnt and superelevation, due to their common occurrence. Several well-known machine learning algorithms are applied, such as support vector machines, logistic regression, random forest (RF) and decision tree (DT) classifiers, in addition to deep learning models like multilayer perceptron and convolutional neural networks. Results suggest that acoustic data can help determine the track faults successfully. The best results are obtained by RF and DT, with an accuracy of 97%.
Statistical Consequences of Devroye Inequality for Processes. Applications to a Class of Non-Uniformly Hyperbolic Dynamical Systems
In this paper, we apply Devroye inequality to study various statistical
estimators and fluctuations of observables for processes. Most of these
observables are suggested by dynamical systems. These applications concern the
co-variance function, the integrated periodogram, the correlation dimension,
the kernel density estimator, the speed of convergence of empirical measure,
the shadowing property and the almost-sure central limit theorem. We proved in
\cite{CCS} that Devroye inequality holds for a class of non-uniformly
hyperbolic dynamical systems introduced in \cite{young}. In the second appendix
we prove that, if the decay of correlations holds with a common rate for all
pairs of functions, then it holds uniformly in the function spaces. In the last
appendix we prove that for the subclass of one-dimensional systems studied in
\cite{young} the density of the absolutely continuous invariant measure belongs
to a Besov space. Comment: 33 pages; companion of the paper math.DS/0412166; corrected version;
to appear in Nonlinearity
Enhancing Cricket Performance Analysis with Human Pose Estimation and Machine Learning
Cricket has a massive following and is ranked as the second most popular sport globally, with an estimated 2.5 billion fans. Batting requires quick decisions based on ball speed, trajectory, fielder positions, etc. Recently, computer vision and machine learning techniques have gained attention as potential tools to predict cricket strokes played by batters. This study presents a cutting-edge approach to predicting batsman strokes using computer vision and machine learning. The study analyzes eight strokes: pull, cut, cover drive, straight drive, backfoot punch, on drive, flick, and sweep. The study uses the MediaPipe library to extract features from videos and several machine learning and deep learning algorithms, including random forest (RF), support vector machine, k-nearest neighbors, decision tree, linear regression, and long short-term memory, to predict the strokes. The study achieves an accuracy of 99.77% using the RF algorithm, outperforming the other algorithms used in the study. The k-fold validation accuracy of the RF model is 95.0% with a standard deviation of 0.07, highlighting the potential of computer vision and machine learning techniques for predicting batsman strokes in cricket. The study’s results could help improve coaching techniques and enhance batsmen’s performance in cricket, ultimately improving the game’s overall quality.