Box Drawings for Learning with Imbalanced Data
The vast majority of real world classification problems are imbalanced,
meaning there are far fewer data from the class of interest (the positive
class) than from other classes. We propose two machine learning algorithms to
handle highly imbalanced classification problems. The classifiers constructed
by both methods are created as unions of parallel axis rectangles around the
positive examples, and thus have the benefit of being interpretable. The first
algorithm uses mixed integer programming to optimize a weighted balance between
positive and negative class accuracies. Regularization is introduced to improve
generalization performance. The second method uses an approximation in order to
assist with scalability. Specifically, it follows a "characterize then
discriminate" approach, where the positive class is characterized first by
boxes, and then each box boundary becomes a separate discriminative classifier.
This method has the computational advantages that it can be easily
parallelized, and considers only the relevant regions of feature space.
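As a minimal sketch of the classifier form described in this abstract, the snippet below predicts the positive class whenever a point falls inside any axis-parallel box. It does not implement the mixed integer program or the characterize-then-discriminate procedure; the names `in_box`, `predict`, and `weighted_balance`, and the weight parameter `c`, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# A box is a (lower, upper) pair of per-feature bounds (hypothetical layout).
Box = Tuple[Sequence[float], Sequence[float]]


def in_box(x: Sequence[float], box: Box) -> bool:
    """Return True if x lies inside the axis-parallel box on every feature."""
    lower, upper = box
    return all(lo <= xi <= hi for lo, xi, hi in zip(lower, x, upper))


def predict(x: Sequence[float], boxes: List[Box]) -> int:
    """Label x positive (+1) if any box contains it, otherwise negative (-1)."""
    return 1 if any(in_box(x, b) for b in boxes) else -1


def weighted_balance(X, y, boxes: List[Box], c: float = 1.0) -> float:
    """Assumed objective: true positives plus c times true negatives."""
    tp = sum(1 for xi, yi in zip(X, y) if yi == 1 and predict(xi, boxes) == 1)
    tn = sum(1 for xi, yi in zip(X, y) if yi == -1 and predict(xi, boxes) == -1)
    return tp + c * tn
```

Because the decision rule is a plain list of feature ranges, each box can be read directly as an "if feature j is between a and b" statement, which is the interpretability benefit the abstract refers to.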
A semismooth Newton method for the nearest Euclidean distance matrix problem
The Nearest Euclidean distance matrix problem (NEDM) is a fundamental computational problem in applications such as multidimensional scaling and molecular conformation from nuclear magnetic resonance data in computational chemistry. Especially in the latter application, the problem is often large scale, with the number of atoms ranging from a few hundred to a few thousand. In this paper, we introduce a semismooth Newton method that solves the dual problem of (NEDM). We prove that the method is quadratically convergent. We then present an application of the Newton method to NEDM with -weights. We demonstrate the superior performance of the Newton method over existing methods, including the latest quadratic semi-definite programming solver. This research also opens a new avenue towards efficient solution methods for the molecular embedding problem.
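For orientation, one standard way to write the nearest Euclidean distance matrix problem is given below. The abstract does not state the formulation, so the notation is an assumption: $\Delta$ is the given symmetric matrix of (squared) dissimilarities, $D$ is the sought matrix of squared distances, and $e$ is the all-ones vector. The constraints are Schoenberg's classical characterization of Euclidean distance matrices.

```latex
\begin{aligned}
\min_{D \in \mathcal{S}^n} \quad & \tfrac{1}{2}\,\lVert D - \Delta \rVert_F^2 \\
\text{subject to} \quad & D_{ii} = 0, \quad i = 1, \dots, n, \\
& x^{\top} D\, x \le 0 \quad \text{for all } x \in \mathbb{R}^n \text{ with } e^{\top} x = 0 .
\end{aligned}
```

The second constraint says that $D$ is negative semidefinite on the subspace orthogonal to $e$, so the feasible set is the intersection of a linear subspace with a closed convex cone.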
Observation of Landau quantization and standing waves in HfSiS
Recently, HfSiS was found to be a new type of Dirac semimetal with a line of
Dirac nodes in the band structure. Meanwhile, Rashba-split surface states are
also pronounced in this compound. Here we report a systematic study of HfSiS by
scanning tunneling microscopy/spectroscopy at low temperature and high magnetic
field. The Rashba-split surface states are characterized by measuring Landau
quantization and standing waves, which reveal a quasi-linear dispersive band
structure. First-principles calculations based on density-functional theory are
conducted and compared with the experimental results. Based on these
investigations, the properties of the Rashba-split surface states and their
interplay with defects and collective modes are discussed.
Comment: 6 pages, 5 figures
A resource aware MapReduce based parallel SVM for large scale image classification
Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is notably a computationally intensive process especially when the training dataset is large.
This paper presents RASMO, a resource aware MapReduce based parallel SVM algorithm for large scale image classifications which partitions the training data set into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments.
The results show that the parallel SVM algorithm reduces the training time significantly compared with the sequential SMO algorithm while maintaining a high level of accuracy in classification.
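The partition-and-train pattern described in this abstract can be sketched as follows. This is not RASMO: the genetic-algorithm load balancer is replaced here by a plain proportional split, a thread pool stands in for the MapReduce cluster, and `chunk_sizes`, `partition`, `map_train`, and the `capacities` argument are hypothetical names. The `train` callable is supplied by the caller, for example a function that fits an SVM on one subset.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
M = TypeVar("M")


def chunk_sizes(n_samples: int, capacities: Sequence[float]) -> List[int]:
    """Split n_samples into chunk sizes proportional to node capacities."""
    total = float(sum(capacities))
    raw = [n_samples * c / total for c in capacities]
    sizes = [int(r) for r in raw]
    # Give leftover samples to the nodes with the largest fractional share.
    leftover = n_samples - sum(sizes)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


def partition(data: Sequence[T], capacities: Sequence[float]) -> List[List[T]]:
    """Cut data into consecutive chunks, one per node."""
    chunks, start = [], 0
    for size in chunk_sizes(len(data), capacities):
        chunks.append(list(data[start:start + size]))
        start += size
    return chunks


def map_train(data: Sequence[T], capacities: Sequence[float],
              train: Callable[[List[T]], M]) -> List[M]:
    """Map step: run the caller-supplied train function on each chunk in parallel."""
    chunks = partition(data, capacities)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return list(pool.map(train, chunks))
```

A faster node receives a larger chunk, which is the intuition behind resource-aware load balancing in a heterogeneous cluster; how the per-subset models are combined afterwards is left open here.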
Complex macrocycle exploration: parallel, heuristic, and constraint-based conformer generation using ForceGen.
ForceGen is a template-free, non-stochastic approach for 2D to 3D structure generation and conformational elaboration for small molecules, including both non-macrocycles and macrocycles. For conformational search of non-macrocycles, ForceGen is both faster and more accurate than the best of all tested methods on a very large, independently curated benchmark of 2859 PDB ligands. In this study, the primary results are on macrocycles, including results for 431 unique examples from four separate benchmarks. These include complex peptide and peptide-like cases that can form networks of internal hydrogen bonds. By making use of new physical movements ("flips" of near-linear sub-cycles and explicit formation of hydrogen bonds), ForceGen exhibited statistically significantly better performance for overall RMS deviation from experimental coordinates than all other approaches. The algorithmic approach offers natural parallelization across multiple computing-cores. On a modest multi-core workstation, for all but the most complex macrocycles, median wall-clock times were generally under a minute in fast search mode and under 2 min using thorough search. On the most complex cases (roughly cyclic decapeptides and larger) explicit exploration of likely hydrogen bonding networks yielded marked improvements, but with calculation times increasing to several minutes and in some cases to roughly an hour for fast search. In complex cases, utilization of NMR data to constrain conformational search produces accurate conformational ensembles representative of solution state macrocycle behavior. On macrocycles of typical complexity (up to 21 rotatable macrocyclic and exocyclic bonds), design-focused macrocycle optimization can be practically supported by computational chemistry at interactive time-scales, with conformational ensemble accuracy equaling what is seen with non-macrocyclic ligands. 
For more complex macrocycles, inclusion of sparse biophysical data is a helpful adjunct to computation.
Current status of research and application in vascular stents
Cardiovascular diseases have been the leading cause of death in modern society. Using vascular stents to treat coronary and peripheral artery diseases has been one of the most effective and rapidly adopted medical interventions. During the twenty-five years' development of vascular stents, revolutionary cardiovascular stents such as drug-eluting stents and endothelial progenitor cell capture stents have emerged. In this review, the evolution of vascular stents is summarized, aiming to provide a glimpse into the future of vascular stents. Advanced designs, focusing on the investigation of new substrates, new platforms, new drugs and new biomolecules, are currently under evaluation with promising clinical studies. The concept of a "time sequence functional stent" is raised in this paper. Such a stent presents anti-proliferative properties in the first phase after implantation and subsequently supports endothelialization. It also shows long-term inertness without release of toxic ions or toxic degradation products. The success of this concept is briefly presented with a clinical study of this model stent.
Near-bandgap wavelength-dependent studies of long-lived traveling coherent longitudinal acoustic phonon oscillations in GaSb/GaAs systems
We report first studies of long-lived oscillations in optical pump-probe
measurements on GaSb-GaAs heterostructures. The oscillations arise from a
photogenerated coherent longitudinal acoustic phonon wave, which travels from
the top surface of GaSb across the interface into the GaAs substrate, providing
information on the optical properties of the material as a function of
time/depth. Wavelength-dependent studies of the oscillations near the bandgap
of GaAs indicate strong correlations to the optical properties of GaAs.
Comment: 11 pages, 4 figures