110 research outputs found
Lifelong Machine Learning for Topic Modeling and Classification
Machine Learning (ML) has been successfully used as a prevalent approach for many computational tasks and applications. However, most ML algorithms are designed to address a specific problem using a single dataset. That is, given a dataset, an ML algorithm is run on the dataset to build a model. Although this one-shot learning is very important and useful, it can never make an AI system intelligent, and its accuracy is also limited.
Lifelong Machine Learning (LML), on the other hand, aims to design and develop computational systems and algorithms that learn as humans do, i.e., retaining the results learned in the past, abstracting knowledge from them, and using the knowledge to help future learning and problem solving. The rationale is that when faced with a new situation, we humans use our previous experience and knowledge to help deal with and learn from the new situation. It is essential to incorporate such a capability into a computational system to make it more versatile, holistic, and intelligent.
This thesis presents my Ph.D. research on designing lifelong machine learning approaches for both unsupervised and supervised learning. For unsupervised learning, we focus on topic modeling, which aims to discover coherent semantic topics from documents. For supervised learning, we propose to improve classification by integrating lifelong machine learning.
Topic modeling has been widely used to uncover topics from document collections. Such topics are important in many text mining and machine learning tasks such as classification, retrieval, clustering, and summarization. However, classic unsupervised topic models can generate many incoherent topics. To address this problem, we proposed several knowledge-based topic models (Chen et al., 2013d; Chen et al., 2013b; Chen et al., 2013c), which require the knowledge to be provided by domain experts. To further improve topic quality, we proposed in (Chen and Liu, 2014b; Chen and Liu, 2014a) to automatically extract, accumulate, and filter knowledge following the idea of LML. The experimental results reported in these papers demonstrate the effectiveness of the proposed LML approaches.
We also apply LML to supervised learning, specifically classification. Classification is a widely studied machine learning task whose goal is to assign objects to a fixed set of categories. Departing from the traditional classification problem, which focuses on a single domain, we proposed the Lifelong Sentiment Classification (LSC) model (Chen et al., 2015), which automatically extracts and accumulates sentiment-oriented knowledge. This knowledge is incorporated through regularization under the Naive Bayesian optimization framework. The experimental results demonstrate that the LSC model achieves progressively better classification performance as knowledge is accumulated from an increasing number of domains, which shows the advantage of LML.
Based on this thesis, we believe that the Lifelong Machine Learning (LML) capability can lead to more robust computational systems that overcome the dynamics and complexity of real-world problems and achieve better predictability.
Efficient Generation of Biologically Active H-Pyrazolo[5,1-a]isoquinolines via Multicomponent Reaction
A highly efficient multicomponent reaction of 2-alkynylbenzaldehyde, sulfonohydrazide, alcohol, and α,β-unsaturated aldehyde or ketone is disclosed, which generates diverse H-pyrazolo[5,1-a]isoquinolines in good yields. This reaction proceeds with good functional group tolerance under mild conditions with high efficiency and excellent selectivity. Preliminary biological assays show that some of these compounds display promising activities as CDC25B, TC-PTP, and PTP1B inhibitors.
One-Pot Assembly of the Highly Branched Tetradecasaccharide from Ganoderma lucidum Glycan GLSWA‑1 with Immune-Enhancing Activities
The highly branched tetradecasaccharide repeating unit and shorter sequences of GLSWA-1 with immune-enhancing activities from Ganoderma lucidum have been prepared via a one-pot glycan assembly strategy. The synthetic route features (1) orthogonal one-pot glycosylation on the basis of PVB glycosylation to streamline glycan synthesis avoiding such issues as aglycone transfer, (2) one-pot assembly of oligosaccharides with up to four different glycosyl linkages, and (3) modular and convergent [4+5+5] one-pot assembly of the highly branched tetradecasaccharide.
DDQ-Mediated Oxidative Radical Cycloisomerization of 1,5-Diynols: Regioselective Synthesis of Benzo[b]fluorenones under Metal-Free Conditions
A regio- and chemoselective oxidative cycloisomerization reaction of acyclic 1,5-diynols has been developed. The reaction proceeds under metal-free reaction conditions with high efficiency and broad functional group tolerance, which offers a general and straightforward access to benzo[b]fluorenones under metal-free conditions. The preliminary mechanistic studies revealed the possible involvement of a Meyer–Schuster rearrangement combined with an oxidative radical cyclization.
Pd/Norbornene Collaborative Catalysis on the Divergent Preparation of Heterocyclic Sulfoximine Frameworks
Pd/Norbornene cocatalyzed tandem C–H activation/annulation reactions of free NH-sulfoximines with aryl iodides to produce diverse polyheterocyclic sulfoximines in highly chemoselective models are reported. The reaction tolerated a broad range of functional groups under external oxidant-free conditions. The preliminary mechanistic studies using density functional theory (DFT) calculations highlighted the key role of a Pd(IV) intermediate.
Synthesis of Biaryl Tertiary Amines through Pd/Norbornene Joint Catalysis in a Remote C–H Amination/Suzuki Coupling Reaction
Here, we report on an efficient palladium/norbornene-catalyzed domino reaction of aryl halide, O-acyl hydroxylamine (R1R2N-OBz), and aromatic pinacol boronate (R-Bpin), selectively affording a series of biaryl tertiary amines in good to high yields with excellent functional group tolerance. This catalytic reaction provides a new opportunity for the convergent synthesis of biaryl amines from easily accessible starting materials.
A Facile Route to Polysubstituted Indoles via Three-Component Reaction of 2-Ethynylaniline, Sulfonyl Azide, and Nitroolefin
A copper-catalyzed three-component reaction of 2-ethynylaniline, sulfonyl azide, and nitroolefin is reported. This reaction generates functionalized indoles in good yields and proceeds smoothly under mild conditions. Preliminary biological screening identified some hits as HCT-116 inhibitors.
- …