Group-LASSO Estimation in High-Dimensional Factor Models with Structural Breaks
In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model that can detect changes in factor loadings, determine the number of factors, and detect the break date. We consider both the known and the unknown break-date cases and identify the type of instability. For the unknown break-date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date, and whether the factor loadings are unstable when the number of factors is constant. We also present the asymptotic properties of the penalized least squares estimator as both the cross-sectional and time dimensions tend to infinity.
Further, we develop a cross-validation procedure to select the tuning parameters for the penalty terms, and we use the least squares approach to estimate the break date once the number of factors is obtained. We also present a Monte Carlo simulation to evaluate the performance of the proposed procedure and analyze real data from the 2007-09 Great Recession. The proposed procedure generally detects the break date correctly during the Great Recession, but it performs relatively poorly in estimating the number of factors in the pre- and post-break date case.
Solving the Boolean satisfiability problem using multilevel techniques
Many complex problems in computer science arise in knowledge representation (artificial thinking), artificial learning, Very Large Scale Integration (VLSI) design, security protocols, and other areas. These problems may be reduced to satisfiability problems, to which the Boolean Satisfiability Problem (SAT) applies. The reduction simplifies a complex problem into a specific propositional logic problem. SAT is the best-known nondeterministic polynomial time (NP) complete problem in computer science. A SAT instance is a Boolean expression composed of a fixed number of variables (literals), clauses that are disjunctions of the literals, and a conjunction of the clauses. The literals take the logical values TRUE and FALSE; the task is to find a truth assignment that makes the entire expression TRUE. The main goal of the thesis is to solve the SAT problem using a clustering technique, Multilevel, combined first with Tabu Search and then with finite Learning Automata. Tabu Search and finite Learning Automata are two very efficient approaches that have been used to solve SAT. Benchmark experiments are conducted to determine whether combining Multilevel with existing solutions to SAT gives better results than the two approaches alone, mainly in terms of computational efficiency.
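As a rough illustration of the problem statement only, the sketch below encodes a formula in conjunctive normal form and searches for a satisfying truth assignment by exhaustive enumeration. It is not the thesis's method (Multilevel with Tabu Search or Learning Automata); the integer encoding of literals and the names `satisfies` and `brute_force_sat` are assumptions made for this example.

```python
from itertools import product

# A formula is a list of clauses; a clause is a list of non-zero integers.
# Literal k means "variable k is TRUE", -k means "variable k is FALSE".
# (This integer encoding is an assumption chosen for the example.)

def satisfies(clauses, assignment):
    """Return True if every clause contains at least one satisfied literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: value for i, value in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
example = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(example, 3)
```

Exhaustive search takes exponential time in the number of variables, which is why heuristic approaches such as those named in the abstract are of interest for larger instances.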
Two-Stage Bagging Pruning for Reducing the Ensemble Size and Improving the Classification Performance
Ensemble methods, such as the traditional bagging algorithm, can usually improve the performance of a single classifier. However, they usually require large storage space and relatively time-consuming predictions. Many approaches have been developed to reduce the ensemble size and improve the classification performance by pruning the traditional bagging algorithm. In this article, we propose a two-stage strategy to prune the traditional bagging algorithm by combining two simple approaches: accuracy-based pruning (AP) and distance-based pruning (DP). These two methods, as well as their two combinations, “AP+DP” and “DP+AP”, as the two-stage pruning strategy, were all examined. Compared with the single pruning methods, the two-stage pruning methods further reduce the ensemble size and improve the classification. The “AP+DP” method generally performs better than the “DP+AP” method when using four base classifiers: decision tree, Gaussian naive Bayes, K-nearest neighbor, and logistic regression. Moreover, compared with traditional bagging, the two-stage method “AP+DP” improved the classification accuracy by 0.88%, 4.06%, 1.26%, and 0.96%, respectively, averaged over 28 datasets under the four base classifiers. “AP+DP” also outperformed three other existing algorithms, Brag, Nice, and TB, when assessed on 8 common datasets. In summary, the proposed two-stage pruning methods are simple and promising approaches that can both reduce the ensemble size and improve the classification accuracy.
A novel fault diagnosis for hydraulic pump based on EEMD-LTSA and PNN
The hydraulic pump is the core part of the hydraulic system and directly affects the system's performance, so fault diagnosis for the pump is crucial. To diagnose hydraulic pump faults, a method is proposed that uses the vibration signal, which varies with the performance. First, ensemble empirical mode decomposition (EEMD) is used to decompose the original signal into a finite number of intrinsic mode functions (IMFs), and the energy values of the IMFs are extracted to form the feature vector. Second, local tangent space alignment (LTSA), a manifold learning method, is applied for dimension reduction. Third, a probabilistic neural network (PNN) is employed as the classifier to recognize the fault pattern. Finally, the effectiveness of the proposed method is validated on experimental data with different faults.
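The feature-extraction and classification steps can be sketched roughly as follows, assuming the IMFs have already been computed by some EEMD routine and skipping the LTSA step entirely. The normalization of the energies, the kernel width `sigma`, the class labels, and the toy numbers are all assumptions for illustration, not values from the paper.

```python
import math

def imf_energy_features(imfs):
    """Energy of each IMF (sum of squared samples), normalized to sum to 1."""
    energies = [sum(x * x for x in imf) for imf in imfs]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies

def pnn_classify(x, training_set, sigma=0.1):
    """Probabilistic neural network in its basic form: average a Gaussian
    kernel over the training patterns of each class and pick the class
    with the largest average."""
    scores = {}
    for label, patterns in training_set.items():
        kernel_sum = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in patterns
        )
        scores[label] = kernel_sum / len(patterns)
    return max(scores, key=scores.get)

# Toy stand-ins for IMFs; real IMFs would come from an EEMD implementation.
features = imf_energy_features([[1.0, -1.0], [0.5, 0.5]])

# Hypothetical labelled feature vectors for two pump conditions.
training_set = {
    "normal": [[0.80, 0.20], [0.75, 0.25]],
    "fault": [[0.30, 0.70], [0.35, 0.65]],
}
predicted = pnn_classify([0.78, 0.22], training_set)
```

A PNN needs no iterative training, since the stored training patterns act as the network's pattern layer, but its accuracy depends on the choice of `sigma`.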
Enhancing Low-Precision Sampling via Stochastic Gradient Hamiltonian Monte Carlo
Low-precision training has emerged as a promising low-cost technique to
enhance the training efficiency of deep neural networks without sacrificing
much accuracy. Its Bayesian counterpart can further provide uncertainty
quantification and improved generalization accuracy. This paper investigates
low-precision sampling via Stochastic Gradient Hamiltonian Monte Carlo (SGHMC)
with low-precision and full-precision gradient accumulators for both strongly
log-concave and non-log-concave distributions. Theoretically, our results show
that, to achieve $\epsilon$-error in the 2-Wasserstein distance for
non-log-concave distributions, low-precision SGHMC achieves a quadratic
improvement compared to the state-of-the-art low-precision sampler, Stochastic
Gradient Langevin Dynamics (SGLD).
Moreover, we prove that low-precision SGHMC is more robust to the quantization
error compared to low-precision SGLD due to the robustness of the
momentum-based update w.r.t. gradient noise. Empirically, we conduct
experiments on synthetic data and the MNIST, CIFAR-10, and CIFAR-100 datasets,
which validate our theoretical findings. Our study highlights the potential of
low-precision SGHMC as an efficient and accurate sampling method for
large-scale and resource-limited machine learning.
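To make the idea concrete, here is a minimal one-dimensional sketch of an SGHMC-style update in which the momentum is stored on a fixed-point grid using stochastic rounding. It is a simplification under stated assumptions, not the algorithm analyzed in the paper: the exact (non-stochastic) gradient, the step size `lr`, the `friction` value, the grid spacing `delta`, and the function names are all choices made for this example.

```python
import math
import random

def stochastic_round(x, delta, rng):
    """Round x to a multiple of delta, rounding up with probability equal to
    the fractional remainder, so the result is unbiased in expectation."""
    lower = math.floor(x / delta)
    frac = x / delta - lower
    return delta * (lower + (1 if rng.random() < frac else 0))

def low_precision_sghmc(grad, theta0, steps, lr=0.01, friction=0.1,
                        delta=1 / 64, seed=0):
    """Momentum-based sampler with quantized state (illustrative only)."""
    rng = random.Random(seed)
    theta = stochastic_round(theta0, delta, rng)
    v = 0.0
    samples = []
    for _ in range(steps):
        # Momentum update: friction, gradient step, and injected noise.
        noise = rng.gauss(0.0, math.sqrt(2 * friction * lr))
        v = (1 - friction) * v - lr * grad(theta) + noise
        # Store the momentum in low precision.
        v = stochastic_round(v, delta, rng)
        # Both terms lie on the grid, so their sum does too.
        theta = theta + v
        samples.append(theta)
    return samples

# Target: standard normal, whose negative log-density has gradient theta.
samples = low_precision_sghmc(lambda t: t, theta0=2.0, steps=2000)
```

Stochastic rounding is used here because it is unbiased in expectation, so the quantization acts as additional zero-mean noise on the momentum rather than as a systematic drift.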
Extrinsic Factors Affecting the Accuracy of Biomedical NER
Biomedical named entity recognition (NER) is a critical task that aims to
identify structured information in clinical text, which is often replete with
complex, technical terms and a high degree of variability. Accurate and
reliable NER can facilitate the extraction and analysis of important biomedical
information, which can be used to improve downstream applications including the
healthcare system. However, NER in the biomedical domain is challenging due to
limited data availability, since annotation demands considerable expertise,
time, and expense. In this paper, working with this limited data, we
explore various extrinsic factors including the corpus annotation scheme, data
augmentation techniques, semi-supervised learning and Brill transformation, to
improve the performance of an NER model on a clinical text dataset (i2b2 2012,
\citet{sun-rumshisky-uzuner:2013}). Our experiments demonstrate that these
approaches can significantly improve the model's F1 score from an original 73.74
to 77.55. Our findings suggest that considering different extrinsic factors and
combining these techniques is a promising approach for improving NER
performance in the biomedical domain where the size of data is limited.
Identification of Benzo[a]pyrene-metabolizing bacteria in forest soils by using DNA-based stable-isotope probing
DNA-based stable-isotope probing (DNA-SIP) was used in this study to investigate the uncultivated bacteria with benzo[a]pyrene (BaP) metabolism capacities in two Chinese forest soils (Mt. Maoer in Heilongjiang Province and Mt. Baicaowa in Hubei Province). We characterized three different phylotypes responsible for BaP degradation, none of which had previously been reported as BaP-degrading microorganisms by SIP. In Mt. Maoer soil microcosms, the putative BaP degraders were classified as belonging to the genus Terrimonas (family Chitinophagaceae, order Sphingobacteriales), whereas Burkholderia spp. were the key BaP degraders in Mt. Baicaowa soils. The addition of metabolic salicylate significantly increased BaP degradation efficiency in Mt. Maoer soils, and the BaP-metabolizing bacteria shifted to microorganisms in the family Oxalobacteraceae (genus unclassified). In contrast, salicylate addition changed neither BaP degradation nor the putative BaP degraders in Mt. Baicaowa. Polycyclic aromatic hydrocarbon ring-hydroxylating dioxygenase (PAH-RHD) genes were amplified, sequenced, and quantified in the DNA-SIP ¹³C heavy fraction to further confirm the BaP metabolism. By illuminating the microbial diversity and salicylate additive effects on BaP degradation across different soils, the results increased our understanding of BaP natural attenuation and provided a possible approach to enhance the bioremediation of BaP-contaminated soils.
Interfacing Nickel Nitride and Nickel Boosts Both Electrocatalytic Hydrogen Evolution and Oxidation Reactions
Electrocatalysts for the hydrogen evolution and oxidation reactions (HER and HOR) are of critical importance for the realization of a future hydrogen economy. To make electrocatalysts economically competitive for large-scale applications, increasing attention has been devoted to developing noble metal-free HER and HOR electrocatalysts, especially for alkaline electrolytes, due to the promise of emerging hydroxide exchange membrane fuel cells. Herein, we report that interface engineering of Ni3N and Ni results in a unique Ni3N/Ni electrocatalyst which exhibits exceptional HER/HOR activities in aqueous electrolytes. A systematic electrochemical study was carried out to investigate the superior hydrogen electrochemistry catalyzed by Ni3N/Ni, including nearly zero overpotential of catalytic onset, robust long-term durability, unity Faradaic efficiency, and excellent CO tolerance. Density functional theory computations were performed to aid the understanding of the electrochemical results and suggested that the real active sites are located at the interface between Ni3N and Ni.