Time Series Trend Analysis Based on K-Means and Support Vector Machine
In this paper, we apply both supervised and unsupervised machine learning techniques to predict the trend of financial time series based on trading rules. These techniques are K-means, for clustering similar groups of data, and a support vector machine, for training and testing on historical data to perform a one-day-ahead trend prediction. To evaluate the method, we compare it with a traditional back-propagation neural network and a standalone support vector machine. To implement this combined method, we use financial time series data obtained from the Yahoo Finance website, and the experimental results validate the effectiveness of the method.
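The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the prices are synthetic stand-ins for the Yahoo Finance data, and the window size, cluster count, and kernel are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic daily closing prices stand in for the Yahoo Finance series.
prices = 100 + np.cumsum(rng.normal(0.05, 1.0, 500))

# Features: the last `window` daily returns; label: sign of the next return.
window = 5
returns = np.diff(prices) / prices[:-1]
X = np.array([returns[i:i + window] for i in range(len(returns) - window)])
y = (returns[window:] > 0).astype(int)

split = int(0.8 * len(X))
X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

# Stage 1: K-means groups training days with similar return patterns.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_tr)

# Stage 2: one SVM per cluster, trained on that cluster's history.
models = {}
for c in range(4):
    mask = km.labels_ == c
    if mask.sum() > 1 and len(set(y_tr[mask])) > 1:
        models[c] = SVC(kernel="rbf").fit(X_tr[mask], y_tr[mask])

# One-day-ahead prediction: route each test day to its cluster's SVM.
preds = np.array([
    models[c].predict(x[None, :])[0] if c in models else 0
    for c, x in zip(km.predict(X_te), X_te)
])
accuracy = float((preds == y_te).mean())
```

Clustering first lets each SVM specialize on one regime of return patterns rather than fitting a single boundary over all market conditions.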
Molecular geometric deep learning
Geometric deep learning (GDL) has demonstrated huge power and enormous
potential in molecular data analysis. However, constructing highly efficient
molecular representations remains a great challenge. Currently, covalent-bond-based
molecular graphs are the de facto standard for representing molecular topology
at the atomic level. Here we demonstrate, for the first time, that molecular
graphs constructed only from non-covalent bonds can achieve similar or even
better results than covalent-bond-based models in molecular property
prediction. This demonstrates the great potential of novel molecular
representations beyond the de facto standard of covalent-bond-based molecular
graphs. Based on this finding, we propose molecular geometric deep learning
(Mol-GDL). The essential idea is to incorporate a more general molecular
representation into GDL models. In our Mol-GDL, molecular topology is modeled
as a series of molecular graphs, each focusing on a different scale of atomic
interactions. In this way, both covalent interactions and non-covalent
interactions are incorporated into the molecular representation on an equal
footing. We systematically test Mol-GDL on fourteen commonly-used benchmark
datasets. The results show that our Mol-GDL can achieve a better performance
than state-of-the-art (SOTA) methods. Source code and data are available at
https://github.com/CS-BIO/Mol-GDL
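A minimal sketch of the multi-scale graph idea: from 3-D atomic coordinates, one adjacency matrix is built per distance band, so covalent-range and non-covalent-range contacts each form their own graph on an equal footing. The cutoff bands and toy coordinates below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def multiscale_graphs(coords, cutoffs):
    """Build one adjacency matrix per distance band so that short-range
    (covalent-like) and longer-range (non-covalent) atomic contacts each
    get their own molecular graph."""
    # Pairwise Euclidean distances between atoms.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    graphs = []
    for lo, hi in cutoffs:
        adj = ((d >= lo) & (d < hi)).astype(float)
        np.fill_diagonal(adj, 0.0)  # no self-loops
        graphs.append(adj)
    return graphs

# Toy "molecule": four atoms in 3-D (coordinates are illustrative only).
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                   [0.0, 3.5, 0.0], [1.5, 3.5, 0.0]])
# Band 1 ~ covalent-bond distances; band 2 ~ non-covalent contacts (angstroms).
graphs = multiscale_graphs(coords, cutoffs=[(0.0, 2.0), (2.0, 5.0)])
```

Each adjacency matrix in `graphs` could then be fed to a separate GDL model, with the per-scale outputs pooled downstream.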
Transient Sources and How to Study Them: Selected Topics in Multi-Messenger Astronomy
The discovery of a cosmic neutrino flux by IceCube and the multi-messenger observations of the gravitational-wave event GW170817 ushered in the era of multi-messenger astronomy. Since the Universe itself is a natural laboratory, multi-messenger astronomy can help us study the most extreme physical processes in great detail. In this dissertation, we touch on some of the currently unanswered questions involving different types of transient sources and different "messengers" of multi-messenger astronomy. We employ a variety of analysis methods, including machine learning, a method that has not yet been widely adopted in astronomy but is rapidly gaining momentum. We start this dissertation with Chapter 1 and a brief introduction to transient sources and multi-messenger astronomy, as well as machine learning. The subsequent chapters are organized as follows: In Chapter 2, we use publicly released IceCube neutrino data and blazar locations to test the spatial correlation between neutrino events and blazars. We also scrutinize the correlation between the γ-ray flux and neutrino flux of blazars, and we find no compelling evidence that blazars are the main source of cosmic neutrinos. In Chapter 3, we utilize the IceCube neutrino and CHIME fast radio burst (FRB) catalogs to examine the possibility of an association between neutrinos and FRBs. Our results rule out such an association. We find the upper limit of the contribution of FRBs to the diffuse cosmic neutrino flux at 100 TeV to be ∼7.95×10⁻²¹ GeV⁻¹ cm⁻² s⁻¹ sr⁻¹, or ∼0.55% of the 10-year diffuse neutrino flux observed by IceCube. In Chapter 4, we conduct a global test of delay and jet models of binary neutron star mergers with short gamma-ray bursts (SGRBs) simulated with a Las Vegas algorithm. Our simulations suggest that all SGRBs can be understood with a universal structured jet viewed at different angles. Furthermore, models invoking a jet-plus-cocoon structure with a lognormal delay timescale are most favored, while the Gaussian delay with the Gaussian jet model and the entire set of power-law delay models are disfavored. In Chapter 5, we train machine learning algorithms on FRBs in the first CHIME/FRB catalog, labeling each FRB by its repetitiveness. We find that the models can predict most FRBs correctly, hinting at distinct mechanisms behind repeating and non-repeating FRBs. The two most important distinguishing factors between non-repeating and repeating FRBs are brightness temperature and rest-frame frequency bandwidth. We also identify some potentially repeating FRBs currently reported as non-repeating. In Chapter 6, we seek to build a GRB multi-parameter classification scheme with supervised machine learning methods. We utilize the GRB Big Table and Greiner's GRB catalog, and we divide the input features into three subgroups: prompt emission, afterglow, and host galaxy. We find that the prompt emission subgroup performs the best, and that the most important distinguishing features in prompt emission are T90, hardness ratio, and fluence. After building the machine learning model, we apply it to the classification of currently unclassified GRBs.
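The supervised-classification approach of Chapter 5 can be sketched as follows: train a classifier on catalog features labeled by repetitiveness, then read off feature importances. Everything below is a synthetic stand-in, not CHIME/FRB data; the two features mirror the ones the abstract highlights (brightness temperature and rest-frame bandwidth), with made-up distributions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 400
# Synthetic labels: 1 = repeating FRB, 0 = non-repeating.
repeater = rng.integers(0, 2, n)
# Made-up feature distributions (illustrative only): repeaters drawn with
# lower log brightness temperature and narrower rest-frame bandwidth.
log_tb = rng.normal(35.0 - 2.0 * repeater, 1.0)
bandwidth = rng.normal(200.0 - 80.0 * repeater, 30.0)
X = np.column_stack([log_tb, bandwidth])

# Fit the classifier and inspect which feature separates the classes.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, repeater)
importances = clf.feature_importances_  # one weight per input feature
```

With real catalog data, sources assigned high repeater probability but cataloged as non-repeating would be the "potentially repeating" candidates the abstract mentions.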
On the Iteration Complexity of Smoothed Proximal ALM for Nonconvex Optimization Problem with Convex Constraints
It is well-known that the lower bound on the iteration complexity of solving
nonconvex unconstrained optimization problems is O(1/ε²), which
can be achieved by the standard gradient descent algorithm when the objective
function is smooth. This lower bound still holds for nonconvex constrained
problems, but it has remained unknown whether a first-order method can
achieve it. In this paper, we show that a simple single-loop first-order
algorithm, the smoothed proximal augmented Lagrangian method (ALM), can
achieve this iteration complexity lower bound. The key technical contribution
is a strong local error bound for a general convex constrained problem, which
is of independent interest.
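The single-loop structure of a smoothed proximal ALM can be illustrated on a toy linearly constrained nonconvex problem: each iteration takes one gradient step on the smoothed augmented Lagrangian, then updates the dual variable and the proximal center. The toy objective, constraint, and step sizes below are illustrative choices, not the parameters analyzed in the paper:

```python
import numpy as np

# Toy instance: minimize a nonconvex f(x) subject to the linear constraint Ax = b.
f = lambda x: (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2
grad_f = lambda x: np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])
A, b = np.array([[1.0, 1.0]]), np.array([1.0])

rho, p = 10.0, 2.0          # penalty and proximal-smoothing weights
eta, alpha, beta = 0.02, 0.5, 0.5  # primal, dual, and center step sizes
x, z, y = np.zeros(2), np.zeros(2), np.zeros(1)

for _ in range(3000):
    r = A @ x - b  # constraint residual
    # Gradient of the smoothed augmented Lagrangian
    #   f(x) + y^T (Ax - b) + (rho/2)||Ax - b||^2 + (p/2)||x - z||^2
    g = grad_f(x) + A.T @ (y + rho * r) + p * (x - z)
    x = x - eta * g                 # single primal gradient step
    y = y + alpha * (A @ x - b)     # dual ascent step
    z = z + beta * (x - z)          # move the proximal center toward x

feasibility = float(abs((A @ x - b)[0]))
```

The extra proximal term (p/2)||x − z||² is the "smoothing": it keeps each subproblem well-conditioned so a single gradient step per iteration suffices, which is what makes the method single-loop.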
- …