Human pol II promoter prediction: time series descriptors and machine learning
Although several in silico promoter prediction methods have been developed to date, they are still limited in predictive performance. The limitations are due to the challenge of selecting appropriate features of promoters that distinguish them from non-promoters and the generalization or predictive ability of the machine-learning algorithms. In this paper we attempt to define a novel approach by using unique descriptors and machine-learning methods for the recognition of eukaryotic polymerase II promoters. In this study, non-linear time series descriptors along with non-linear machine-learning algorithms, such as support vector machine (SVM), are used to discriminate between promoter and non-promoter regions. The basic idea here is to use descriptors that do not depend on the primary DNA sequence and provide a clear distinction between promoter and non-promoter regions. The classification model, built on a set of 1000 promoter and 1500 non-promoter sequences, showed a 10-fold cross-validation accuracy of 87%, and on an independent test set the accuracy was >85% for both promoter and non-promoter identification. This approach correctly identified all 20 experimentally verified promoters of human chromosome 22. The high sensitivity and selectivity indicate that n-mer frequencies along with non-linear time series descriptors, such as Lyapunov component stability and Tsallis entropy, and supervised machine-learning methods, such as SVMs, can be useful in the identification of pol II promoters.
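As an illustration of two of the descriptors named in this abstract, the following minimal sketch computes overlapping n-mer frequencies for a DNA string and the Tsallis entropy of that distribution. The function names, the example sequence, and the choice q = 2 are assumptions made for illustration; the abstract does not say how the authors computed or parameterised these descriptors.

```python
from collections import Counter
import math


def nmer_frequencies(seq, n):
    """Relative frequencies of overlapping n-mers in a DNA sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def tsallis_entropy(probs, q=2.0):
    """Tsallis entropy S_q = (1 - sum_i p_i**q) / (q - 1).

    In the limit q -> 1 this reduces to the Shannon entropy (in nats).
    """
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Hypothetical example sequence, not taken from the paper's data set.
freqs = nmer_frequencies("ACGTACGTTTGA", 2)
print(tsallis_entropy(freqs.values(), q=2.0))
```

A descriptor vector for a classifier such as an SVM could then be assembled from the n-mer frequencies and one or more entropy values per sequence window.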
A Novel Method for Intelligent Single Fault Detection of Bearings Using SAE and Improved D–S Evidence Theory
In order to realize single fault detection (SFD) from multi-fault coupling bearing data and to support further research on the multi-fault situation of bearings, this paper proposes a method based on feature self-extraction by a Sparse Auto-Encoder (SAE) and result fusion by an improved Dempster–Shafer evidence theory (D–S). Multi-fault signal compression features of bearings were extracted by the SAE from the data of multiple vibration sensors. Data sets were constructed from the extracted compression features to train a Support Vector Machine (SVM) according to the rule of single fault detection (R-SFD) proposed in this paper. Fault detection results were obtained by the improved D–S evidence theory, which was implemented by correcting the zero factor in the Basic Probability Assignment (BPA) and modifying the evidence weight by the Pearson Correlation Coefficient (PCC). Extensive evaluations of the proposed method on the experiment platform datasets showed that it could realize single fault detection from multi-fault bearings. Fault detection accuracy increases as the output feature dimension of the SAE increases; when the feature dimension reached 200, the average detection accuracy of the three sensors for bearing inner, outer, and ball faults was 87.36%, 87.86% and 84.46%, respectively. After fusing the sensors’ results with the improved Dempster–Shafer evidence theory (IDS), the detection accuracy for the three fault types reached 99.12%, 99.33% and 98.46%, which is respectively 0.38%, 2.06% and 0.76% higher than with the traditional D–S evidence theory. This indicates the effectiveness of improving the D–S evidence theory by calculating the evidence weight with the PCC.
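To make the fusion step concrete, here is a minimal sketch of the classical Dempster rule of combination for two sensors, restricted to the case where every basic probability assignment puts mass only on single hypotheses. It also shows why a zero mass is a problem: one sensor assigning 0 to a hypothesis vetoes it regardless of the other sensor. The `smooth_zeros` helper is a hypothetical stand-in that floors small masses; it is not the zero-factor correction or the PCC weighting of the paper, whose details are not given in the abstract. All numbers are made up.

```python
def combine_dempster(m1, m2):
    """Dempster's rule for BPAs whose focal elements are singleton hypotheses."""
    hypotheses = set(m1) | set(m2)
    agreement = {h: m1.get(h, 0.0) * m2.get(h, 0.0) for h in hypotheses}
    norm = sum(agreement.values())  # equals 1 - K, where K is the conflict mass
    if norm == 0.0:
        raise ValueError("total conflict: the sources share no supported hypothesis")
    return {h: v / norm for h, v in agreement.items()}


def smooth_zeros(m, eps=1e-3):
    """Hypothetical zero correction: floor each mass at eps, then renormalise."""
    floored = {h: max(v, eps) for h, v in m.items()}
    total = sum(floored.values())
    return {h: v / total for h, v in floored.items()}


# Made-up BPAs from three sensors over three fault hypotheses.
sensor_a = {"inner": 0.6, "outer": 0.3, "ball": 0.1}
sensor_b = {"inner": 0.5, "outer": 0.4, "ball": 0.1}
sensor_c = {"inner": 0.0, "outer": 0.9, "ball": 0.1}

print(combine_dempster(sensor_a, sensor_b))                # agreement sharpens "inner"
print(combine_dempster(sensor_a, sensor_c))                # the zero vetoes "inner"
print(combine_dempster(sensor_a, smooth_zeros(sensor_c)))  # "inner" keeps a small mass
```

Fusing more than two sensors can be done by applying `combine_dempster` repeatedly, since Dempster's rule is associative.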
An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity
We study the problem of learning to rank from pairwise preferences, and solve
a long-standing open problem that has led to development of many heuristics but
no provable results for our particular problem. Given a set $V$ of $n$
elements, we wish to linearly order them given pairwise preference labels. A
pairwise preference label is obtained as a response, typically from a human, to
the question "which is preferred, $u$ or $v$?" for $u,v\in V$. [...] ${n\choose 2}$ possibilities only. We present an active learning algorithm for
this problem, with query bounds significantly beating general (non active)
bounds for the same error guarantee, while almost achieving the information
theoretical lower bound. Our main construct is a decomposition of the input
s.t. (i) each block incurs high loss at optimum, and (ii) the optimal solution
respecting the decomposition is not much worse than the true opt. The
decomposition is done by adapting a recent result by Kenyon and Schudy for a
related combinatorial optimization problem to the query efficient setting. We
thus settle an open problem posed by learning-to-rank theoreticians and
practitioners: What is a provably correct way to sample preference labels? To
further show the power and practicality of our solution, we show how to use it
in concert with an SVM relaxation.
Comment: Fixed a tiny error in theorem 3.1 statement
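The quantity being minimised in this setting can be sketched as follows: given a candidate linear order and a set of pairwise preference labels, count the pairs on which the order disagrees with the labels. This is only an illustration of the loss, not the active learning algorithm of the paper; the function name, the label encoding as (winner, loser) tuples, and the three-element example are assumptions. The example labels form a cycle, so no order can agree with all of them.

```python
from itertools import combinations
from math import comb


def disagreement_loss(order, labels):
    """Number of pairs on which a linear order contradicts the preference labels.

    `order` lists elements from most to least preferred; `labels` is a set of
    (winner, loser) tuples, one per queried pair.
    """
    loss = 0
    # combinations() yields (u, v) with u placed before v in `order`.
    for u, v in combinations(order, 2):
        if (v, u) in labels:  # the label says v should come first
            loss += 1
    return loss


# Hypothetical labels containing a preference cycle: a > b, b > c, c > a.
labels = {("a", "b"), ("b", "c"), ("c", "a")}

print(comb(3, 2))                                  # number of distinct pairs
print(disagreement_loss(["a", "b", "c"], labels))
print(disagreement_loss(["c", "b", "a"], labels))
```

An active learner would query only some of the pairs rather than all of them, while aiming for an order whose loss is close to the best achievable.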
An Ensemble Deep Convolutional Neural Network Model with Improved D-S Evidence Fusion for Bearing Fault Diagnosis
Intelligent machine health monitoring and fault diagnosis are becoming increasingly important for modern manufacturing industries. Current fault diagnosis approaches mostly depend on expert-designed features for building prediction models. In this paper, we propose IDSCNN, a novel bearing fault diagnosis algorithm based on ensemble deep convolutional neural networks and an improved Dempster–Shafer theory based evidence fusion. The convolutional neural networks take the root mean square (RMS) maps from the FFT (Fast Fourier Transformation) features of the vibration signals from two sensors as inputs. The improved D-S evidence theory is implemented via a distance matrix computed from the evidence and a modified Gini index. Extensive evaluations of IDSCNN on the Case Western Reserve Dataset showed that the algorithm can achieve better fault diagnosis performance than existing machine learning methods by fusing complementary or conflicting evidence from different models and sensors and by adapting to different load conditions.
The Survey, Taxonomy, and Future Directions of Trustworthy AI: A Meta Decision of Strategic Decisions
When making strategic decisions, we are often confronted with overwhelming
information to process. The situation can be further complicated when some
pieces of evidence contradict each other or are paradoxical. The challenge
then becomes how to determine which information is useful and which ones should
be eliminated. This process is known as meta-decision. Likewise, when it comes
to using Artificial Intelligence (AI) systems for strategic decision-making,
placing trust in the AI itself becomes a meta-decision, given that many AI
systems are viewed as opaque "black boxes" that process large amounts of data.
Trusting an opaque system involves deciding on the level of Trustworthy AI
(TAI). We propose a new approach to address this issue by introducing a novel
taxonomy or framework of TAI, which encompasses three crucial domains:
articulate, authentic, and basic for different levels of trust. To underpin
these domains, we create ten dimensions to measure trust:
explainability/transparency, fairness/diversity, generalizability, privacy,
data governance, safety/robustness, accountability, reproducibility,
reliability, and sustainability. We aim to use this taxonomy to conduct a
comprehensive survey and explore different TAI approaches from a strategic
decision-making perspective.
Understanding confounding effects in linguistic coordination: an information-theoretic approach
We suggest an information-theoretic approach for measuring stylistic
coordination in dialogues. The proposed measure has a simple predictive
interpretation and can account for various confounding factors through proper
conditioning. We revisit some of the previous studies that reported strong
signatures of stylistic accommodation, and find that a significant part of the
observed coordination can be attributed to a simple confounding effect - length
coordination. Specifically, longer utterances tend to be followed by longer
responses, which gives rise to spurious correlations in the other stylistic
features. We propose a test to distinguish correlations in length due to
contextual factors (topic of conversation, user verbosity, etc.) and
turn-by-turn coordination. We also suggest a test to identify whether stylistic
coordination persists even after accounting for length coordination and
contextual factors.
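The idea of accounting for a confounder through conditioning can be illustrated with a plug-in estimate of conditional mutual information I(X;Y|Z). This is a generic sketch, not the measure defined in the paper: here X is assumed to mark whether a stylistic feature appears in an utterance, Y whether it appears in the reply, and Z a length bin. The toy samples are constructed so that X and Y are related only through Z.

```python
from collections import Counter
import math


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X;Y|Z) in bits from a list of (x, y, z) tuples."""
    n = len(samples)
    c_xyz = Counter(samples)
    c_xz = Counter((x, z) for x, _, z in samples)
    c_yz = Counter((y, z) for _, y, z in samples)
    c_z = Counter(z for _, _, z in samples)
    total = 0.0
    for (x, y, z), c in c_xyz.items():
        # p(z) p(x,y,z) / (p(x,z) p(y,z)) expressed with raw counts.
        ratio = c * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)])
        total += (c / n) * math.log2(ratio)
    return total


# Toy data: in short exchanges the feature appears independently at random;
# in long exchanges it appears in both the utterance and the reply.
samples = [(0, 0, "short"), (0, 1, "short"), (1, 0, "short"), (1, 1, "short")]
samples += [(1, 1, "long")] * 4

# Ignoring length (constant Z) gives the plain mutual information I(X;Y).
unconditioned = conditional_mutual_information([(x, y, None) for x, y, _ in samples])
conditioned = conditional_mutual_information(samples)
print(unconditioned, conditioned)
```

In this toy data the unconditioned estimate is positive while the conditioned one vanishes, which is the pattern expected when the apparent coordination is driven entirely by length.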
Sentiment Paradoxes in Social Networks: Why Your Friends Are More Positive Than You?
Most people consider their friends to be more positive than themselves,
exhibiting a Sentiment Paradox. Psychology research attributes this paradox to
human cognition bias. With the goal to understand this phenomenon, we study
sentiment paradoxes in social networks. Our work shows that social connections
(friends, followees, or followers) of users are indeed (not just illusively)
more positive than the users themselves. This is mostly due to positive users
having more friends. We identify five sentiment paradoxes at different network
levels ranging from triads to large-scale communities. Empirical and
theoretical evidence is provided to validate the existence of such sentiment
paradoxes. By investigating the relationships between the sentiment paradox and
other well-developed network paradoxes, i.e., friendship paradox and activity
paradox, we find that user sentiments are positively correlated to their number
of friends but rarely to their social activity. Finally, we demonstrate how
sentiment paradoxes can be used to predict user sentiments.
Comment: The 14th International AAAI Conference on Web and Social Media (ICWSM 2020)
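The mechanism described in this abstract, that positive users having more friends makes the average friend more positive than the average user, can be reproduced on a toy network. The sketch below is an assumption-laden illustration: the star-shaped friendship graph, the sentiment scores, and the function name are all invented, and this simple user-level gap is not one of the five paradoxes as the paper defines them.

```python
def friend_mean_gap(friends, sentiment):
    """Average over users of (mean sentiment of their friends - own sentiment)."""
    gaps = []
    for user, nbrs in friends.items():
        if nbrs:  # users without friends contribute nothing
            friend_mean = sum(sentiment[f] for f in nbrs) / len(nbrs)
            gaps.append(friend_mean - sentiment[user])
    return sum(gaps) / len(gaps)


# Invented star network: one well-connected, positive user and three others.
friends = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
sentiment = {"hub": 0.9, "a": 0.2, "b": 0.3, "c": 0.1}

print(friend_mean_gap(friends, sentiment))
```

A positive gap means that, on average, a user's friends are more positive than the user, because the positive hub is counted once as a user but three times as a friend.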