Two-stage Gene Selection and Classification for a High-Dimensional Microarray Data
Microarray technology has benefited cancer diagnosis and classification. However, classifying cancer from microarray data is difficult because the data are high-dimensional. One strategy for dealing with the dimensionality problem is to perform feature selection before modeling. Lasso is a common regularization method for reducing the number of features, or predictors. However, Lasso retains too many features at the optimum regularization parameter, so feature selection can be continued in a second stage. We propose Classification and Regression Tree (CART) for the second-stage feature selection, which also produces a classification model. We used a dataset that compares gene expression in breast tumor tissues and other tumor tissues. This dataset has 10,936 predictor variables and 1,545 observations. The results show that the proposed method selects only a small number of genes while achieving high accuracy. The resulting model is also consistent with oncogenomics theory: GATA3, an important marker for breast tumors, was selected to split the root node of the decision tree.
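The two-stage idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (80 samples, 20 made-up "genes", with the label driven by gene 3), the penalty `alpha=0.1` is an arbitrary choice rather than a tuned optimum, the Lasso is a plain coordinate-descent solver on unstandardised features, and a single Gini-based split stands in for a full CART tree. All function names are hypothetical.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm_sq = sum(v * v for v in col) / n
            if norm_sq == 0.0:
                continue
            # Correlation of feature j with the residual after adding feature j back
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / norm_sq
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 2.0 * p1 * (1.0 - p1)

def best_split(X, y, features):
    """Return (feature, threshold, weighted Gini) of the best single split."""
    best = None
    n = len(y)
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Synthetic stand-in data (NOT the breast tumor dataset): the label depends only on gene 3.
rng = random.Random(0)
n_samples, n_genes = 80, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
y = [1 if row[3] > 0 else 0 for row in X]

# Stage 1: Lasso on the centred labels; keep genes with non-zero coefficients.
y_mean = sum(y) / n_samples
weights = lasso_coordinate_descent(X, [v - y_mean for v in y], alpha=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]

# Stage 2: the tree split searches only the genes that survived stage 1.
root_gene, threshold, impurity = best_split(X, y, selected)
print("Lasso kept genes:", selected)
print("Root split: gene", root_gene, "threshold", round(threshold, 3), "impurity", impurity)
```

Stage 1 may still keep a few noise genes, which is the situation the abstract describes. Stage 2 narrows the choice further because the tree only splits on genes that reduce impurity.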
Evaluating the Effectiveness of 2D and 3D Features for Predicting Tumor Response to Chemotherapy
2D and 3D tumor features are widely used in a variety of medical image
analysis tasks. However, for chemotherapy response prediction, the relative
effectiveness of different kinds of 2D and 3D features has not been
comprehensively assessed, especially in ovarian cancer-related applications.
This investigation aims to accomplish such a comprehensive evaluation. For this
purpose, CT images were collected retrospectively from 188 advanced-stage
ovarian cancer patients. All the metastatic tumors that occurred in each
patient were segmented and then processed by a set of six filters. Next, three
categories of features, namely geometric, density, and texture features, were
calculated from both the filtered results and the original segmented tumors,
generating a total of 1595 and 1403 features for the 3D and 2D tumors,
respectively. In addition to the conventional single-slice 2D and full-volume
3D tumor features, we also computed incomplete-3D tumor features, which
were obtained by adding one CT slice at a time and recalculating
the corresponding features. Support vector machine (SVM) based prediction
models were developed and optimized for each feature set. Five-fold
cross-validation was used to assess the performance of each individual model.
The results show that the 2D feature-based model achieved an AUC (area under
the receiver operating characteristic [ROC] curve) of 0.84 ± 0.02. As more
slices were added, the AUC first rose to a maximum and then gradually
decreased to 0.86 ± 0.02. The maximum AUC, 0.91 ± 0.01, was reached when two
adjacent slices were added. This initial result provides meaningful
information for optimizing machine learning-based decision-making support tools
in the future.
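As a rough illustration of how a figure such as "AUC = mean ± spread over 5 folds" can be produced, the sketch below computes the AUC from its rank interpretation and averages it across five folds. It rests on several assumptions: the scores are synthetic stand-ins for classifier outputs (no SVM is trained and no patient data are used), the ± is taken to be the standard deviation across folds (the abstract does not say which spread it reports), and the helper names `roc_auc` and `k_fold_indices` are hypothetical.

```python
import random
import statistics

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Toy stand-in for model outputs: 188 cases with alternating labels,
# each scored as its label plus Gaussian noise.
rng = random.Random(42)
labels = [i % 2 for i in range(188)]
scores = [label + rng.gauss(0.0, 0.8) for label in labels]

# In a real experiment the model would be refit on the other four folds
# before scoring each held-out fold; here the scores are simply reused.
folds = k_fold_indices(len(labels), k=5)
fold_aucs = [
    roc_auc([labels[i] for i in fold], [scores[i] for i in fold])
    for fold in folds
]
mean_auc = statistics.mean(fold_aucs)
std_auc = statistics.stdev(fold_aucs)
print(f"AUC = {mean_auc:.2f} +/- {std_auc:.2f}")
```

The pairwise comparison is quadratic in the number of samples, which is fine at this scale. A rank-based formula would be the usual choice for larger cohorts.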
Victoria Amazonica Optimization (VAO): An Algorithm Inspired by the Giant Water Lily Plant
The Victoria Amazonica plant, often known as the Giant Water Lily, has the
largest floating circular leaf in the world, with a maximum leaf diameter of 3
meters. It spreads its leaves by the force of its spines and creates a large
shadow underneath, killing any plants that require sunlight. These water
tyrants use their formidable spines to compel each other to the surface and
increase their strength to grab more space from the surface. As they spread
throughout the pond or basin, with the earliest-growing leaves having more room
to grow, each leaf gains a unique size. Its flowers are transsexual and when
they bloom, Cyclocephala beetles are responsible for the pollination process,
being attracted to the scent of the female flower. After entering the flower,
the beetle becomes covered with pollen and transfers it to another flower for
fertilization. After the beetle leaves, the flower turns into a male and
changes color from white to pink. The male flower dies and sinks into the
water, releasing its seed to help create a new generation. In this paper, a
mathematical model of this plant's life cycle is introduced, in which each
leaf and blossom is treated as a single entity. The proposed bio-inspired algorithm
is tested on 24 benchmark optimization test functions, such as Ackley, and
compared to ten other well-known algorithms, including the Genetic Algorithm. The
proposed algorithm is also tested on 10 optimization problems: Minimum Spanning
Tree, Hub Location Allocation, Quadratic Assignment, Clustering, Feature
Selection, Regression, Economic Dispatching, Parallel Machine Scheduling, Color
Quantization, and Image Segmentation, and compared to traditional and
bio-inspired algorithms. Overall, the performance of the algorithm in all tasks
is satisfactory.
Comment: 45 pages
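For readers unfamiliar with the Ackley benchmark named in the abstract, the sketch below evaluates it in its commonly used form (a = 20, b = 0.2, c = 2π) and runs a placeholder optimizer against it. The abstract does not state the dimension, bounds, or parameters used, so those are assumptions here, and the uniform random search is only a stand-in baseline: it is neither VAO nor any of the algorithms the paper compares against.

```python
import math
import random

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function in its common form; global minimum 0 at the origin."""
    d = len(x)
    mean_sq = sum(v * v for v in x) / d
    mean_cos = sum(math.cos(c * v) for v in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e

def random_search(objective, dim, bounds, n_evals, seed=0):
    """Placeholder baseline: sample uniformly in the box and keep the best point."""
    rng = random.Random(seed)
    lo, hi = bounds
    best_x, best_f = None, float("inf")
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for _ in range(dim)]
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

# Assumed setup: 2 dimensions, the conventional [-32.768, 32.768] box, 2000 evaluations.
best_x, best_f = random_search(ackley, dim=2, bounds=(-32.768, 32.768), n_evals=2000)
print("Ackley at origin:", ackley([0.0, 0.0]))
print("Best value found by random search:", best_f)
```

Any optimizer with the same `(objective, dim, bounds, n_evals)` interface could be swapped in for `random_search`, which makes it easy to compare several methods on the same evaluation budget.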