Deep heterogeneous ensemble.
In recent years, deep neural networks (DNNs) have emerged as a powerful technique in many areas of machine learning. Although DNNs have achieved major breakthroughs in processing images, video, audio, and text, they also have limitations, such as requiring large amounts of labeled training data and having very many parameters. Ensemble learning, meanwhile, builds a model by combining many different classifiers so that the ensemble outperforms any single classifier. In this study, we propose a deep ensemble framework called Deep Heterogeneous Ensemble (DHE) for supervised learning tasks. In each layer of our algorithm, the input data is first passed through a feature selection method to remove irrelevant features and prevent overfitting. Cross-validation with K learning algorithms is then applied to the selected data to obtain the meta-data and the K base classifiers for the next layer. In this way, each layer outputs the meta-data (which serves as input to the next layer), the base classifiers, and the indices of the selected meta-data. A combining algorithm is then applied to the meta-data of the last layer to obtain the final class prediction. Experiments on 30 datasets confirm that the proposed DHE outperforms a number of well-known benchmark algorithms.
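The per-layer recipe of feature selection followed by cross-validated meta-data can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the variance-based feature filter, and the two nearest-centroid base learners standing in for the K heterogeneous algorithms are all assumptions.

```python
import numpy as np

def one_dhe_layer(X, y, n_folds=3, seed=0):
    """One DHE-style layer (hypothetical simplification):
    1) drop near-constant features, 2) build out-of-fold meta-data
    with K diverse base learners (here K=2 nearest-centroid variants)."""
    rng = np.random.default_rng(seed)
    # feature selection: keep columns with non-trivial variance
    keep = X.var(axis=0) > 1e-8
    Xs = X[:, keep]
    classes = np.unique(y)

    def centroid_scores(Xtr, ytr, Xte, ord_):
        # nearest-centroid scores under the given norm (L1 or L2)
        cents = np.stack([Xtr[ytr == c].mean(axis=0) for c in classes])
        d = np.linalg.norm(Xte[:, None, :] - cents[None], ord=ord_, axis=2)
        return -d  # higher score = closer centroid

    # cross-validation: each instance's meta-data comes from a model
    # that never saw it during training
    meta = np.zeros((len(y), 2 * len(classes)))
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    for k in range(n_folds):
        te = folds[k]
        tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        meta[te, :len(classes)] = centroid_scores(Xs[tr], y[tr], Xs[te], 2)
        meta[te, len(classes):] = centroid_scores(Xs[tr], y[tr], Xs[te], 1)
    return meta, keep
```

The returned `meta` array would feed the next layer, and `keep` records the indices of the selected features, mirroring the layer outputs described above.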
Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification
Stacking-based deep neural network (S-DNN) aggregates pluralities of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. Contrary to DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is trained decisively and independently, without BP intervention. In this paper, a ridge regression-based S-DNN, dubbed deep analytic network (DAN), along with its kernelization (K-DAN), are devised for multilayer feature re-learning from pre-extracted baseline features and structured features. Our theoretical formulation demonstrates that DAN/K-DAN re-learn by perturbing the intra/inter-class variations, apart from diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern classification on datasets of varying domains - faces, handwritten digits, and generic objects, to name a few. Unlike typical BP-optimized DNNs, which are trained on gigantic datasets by GPU, we disclose that DAN/K-DAN are trainable using only a CPU, even for small-scale training sets. Our experimental results disclose that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including the multilayer perceptron, deep belief network, etc., without data augmentation applied.
Comment: 14 pages, 7 figures, 11 tables
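The core idea of a ridge regression-based analytic layer is that its weights come from a closed-form solve rather than BP. A minimal sketch under stated assumptions: the function names, the regularization value, and the choice to feed each layer's class scores forward alongside the baseline features are illustrative, not the published DAN formulation.

```python
import numpy as np

def ridge_layer(X, Y, lam=1e-2):
    """Closed-form ridge solution W = (X^T X + lam*I)^{-1} X^T Y.
    No gradients, no backpropagation - one linear solve per layer."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def dan_like_forward(X, y, n_layers=2, lam=1e-2):
    """Stack analytic layers: each layer re-learns from the previous
    layer's class scores concatenated with the baseline features."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    H = X
    for _ in range(n_layers):
        W = ridge_layer(H, Y, lam)
        scores = H @ W
        H = np.hstack([X, scores])  # scores + baseline features feed forward
    return scores.argmax(axis=1)
```

Because each layer is a single regularized least-squares solve, the whole stack trains on CPU in closed form, which is the property the abstract emphasizes for small-scale training sets.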
Improving deep forest by confidence screening
Most studies of deep learning are based on neural network models, in which many layers of parameterized, differentiable nonlinear modules are trained by backpropagation. Recently, it has been shown that deep learning can also be realized by non-differentiable modules without backpropagation training, in a model called deep forest. Its representation learning process is based on a cascade of cascades of decision tree forests, where high memory requirements and time costs inhibit the training of large models. In this paper, we propose a simple yet effective approach to improve the efficiency of deep forest. The key idea is to pass instances with high confidence directly to the final stage rather than through all the levels. We also provide a theoretical analysis suggesting a means to vary the model complexity from low to high as the level of the cascade increases, which further reduces the memory requirement and time cost. Our experiments show that the proposed approach achieves highly competitive predictive performance while reducing the time cost and memory requirement by up to one order of magnitude.
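The confidence-screening idea can be sketched independently of the forest internals: at each cascade level, instances whose class-probability estimate already exceeds a threshold are finalized immediately instead of traversing the remaining levels. The function name, threshold value, and the placeholder per-level predictors below are assumptions for illustration.

```python
import numpy as np

def cascade_with_screening(X, level_predict_fns, threshold=0.9):
    """Route instances through cascade levels; emit any instance early
    once its max class probability reaches `threshold`.
    `level_predict_fns` is a list of callables returning per-class
    probability arrays - stand-ins for trained forest levels."""
    n = len(X)
    final = np.full(n, -1)          # -1 = not yet decided
    active = np.arange(n)           # indices still in the cascade
    for predict in level_predict_fns:
        proba = predict(X[active])              # (n_active, n_classes)
        conf = proba.max(axis=1)
        done = conf >= threshold                # high confidence: stop here
        final[active[done]] = proba[done].argmax(axis=1)
        active = active[~done]
        if len(active) == 0:
            break
    if len(active):                             # last level decides the rest
        proba = level_predict_fns[-1](X[active])
        final[active] = proba.argmax(axis=1)
    return final
```

Since later levels only see the shrinking `active` set, both the time cost and the memory needed for augmented features fall as confident instances exit early, which is the efficiency gain the paper targets.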
New Approaches to Developing an Artificial Intelligence Algorithm for Lung Cancer Diagnosis
The relevance of developing an intelligent automated diagnostic system (IADS) for lung cancer (LC) detection stems from the social significance of this disease and its leading position among all cancers. In theory, an IADS can be used both at the screening stage and at the stage of refined LC diagnosis. Recent approaches to training IADS do not take into account the clinical and radiological classification, or the peculiarities of the LC clinical forms, used by the medical community, which creates difficulties and obstacles in applying the available IADS. The authors hold that the closer a developed IADS is to the «doctor's logic», the better the reproducibility and interpretability of its results. Most IADS described in the literature are based on neural networks, which have several disadvantages affecting reproducibility. This paper proposes a composite algorithm combining machine learning methods such as Deep Forest and a Siamese neural network, which can be regarded as a more efficient approach when training data are scarce and as optimal from the reproducibility point of view. The open datasets used for training IADS include annotated objects that in some cases are not confirmed morphologically. The paper describes the LIRA dataset, built from the diagnostic results of the St. Petersburg Clinical Research Center of Specialized Types of Medical Care (Oncology), which includes only computed tomograms of patients with a verified diagnosis. The paper covers the stages of machine learning based on shape features and internal structure features, as well as a newly developed system for differential diagnosis of LC based on Siamese neural networks.
A new approach to feature dimension reduction is also presented in the paper, aimed at more efficient and faster learning of the system.
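The Siamese component above compares two lesions by passing both through the same encoder and scoring the distance between their embeddings. A minimal numpy sketch of that weight-sharing idea follows; the linear-`tanh` encoder, the L1 distance, and the exponential similarity are illustrative stand-ins, not the system's trained network.

```python
import numpy as np

def make_encoder(in_dim, emb_dim, seed=0):
    """A shared (weight-tied) embedding standing in for both Siamese
    branches; a real system would use a trained deep encoder."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((in_dim, emb_dim)) / np.sqrt(in_dim)
    return lambda x: np.tanh(x @ W)

def siamese_similarity(encode, a, b):
    """Both inputs pass through the SAME encoder; similarity is a
    squashed function of the L1 distance between the embeddings."""
    ea, eb = encode(a), encode(b)
    d = np.abs(ea - eb).sum(axis=-1)
    return np.exp(-d)  # 1.0 for identical inputs, -> 0 as they diverge
```

In a differential-diagnosis setting, a query case would be compared against reference cases of each LC form, with the highest-similarity reference suggesting the diagnosis; that pairwise formulation is what makes the approach workable with small training samples.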