Evolutionary and deep mining models for effective biomarker discovery
With the advent of high-throughput biology, large amounts of molecular data are available for purposeful analysis and evaluation. Extracting relevant knowledge from high-throughput biomedical datasets has become a common goal of current approaches to personalised cancer medicine and understanding cancer genotype and phenotype. However, the datasets are characterised by high dimensionality and relatively small sample sizes with small signal-to-noise ratios. Extracting and interpreting relevant knowledge from such complex datasets therefore remains a significant challenge for the fields of machine learning and data mining. This is evidenced by the limited success these methods have had in detecting robust and reliable biomarkers for cancers and other complicated diseases. This could also explain the lack of finding generic biomarkers among the identified published genes for identical diseases or clinical conditions.
This thesis proposes and evaluates the efficacy of two novel feature mining models, built on the evolutionary computation and deep learning paradigms, that frame and solve biomarker discovery as an optimisation problem. Deep learning methods lack the transparency and interpretability found in the evolutionary paradigm. To overcome the poor explanatory power inherent in deep learning, this research also introduces a novel deep mining model that deconstructs the internal state of such deep learning models to reveal the key determinants underlying their latent representations and so aid feature selection. As a result, salient biomarkers for breast cancer and for the positivity of the Estrogen and Progesterone receptors are discovered robustly and validated reliably across a wide range of independently generated breast cancer data samples.
Analysis of Qualitative Behavior of Fifth Order Difference Equations
The main aim of this paper is to investigate the stability, global attractivity and periodic nature of the solutions of the difference equations x_{n+1} = a x_{n-1} ± (b x_{n-1} x_{n-2}) / (c x_{n-2} ± d x_{n-4}), n = 0, 1, 2, ..., where the initial conditions x_{-4}, x_{-3}, x_{-2}, x_{-1} and x_0 are arbitrary positive real numbers and a, b, c, d are constants.
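The recurrence above can be explored numerically. The following is a minimal sketch, not the paper's method: the function name, the sign arguments and the parameter values are illustrative assumptions, and the two ± choices are exposed as `outer_sign` and `inner_sign`.

```python
def iterate_difference_equation(a, b, c, d, initial, steps,
                                outer_sign=1, inner_sign=1):
    """Iterate x_{n+1} = a*x_{n-1} + outer_sign * (b*x_{n-1}*x_{n-2})
    / (c*x_{n-2} + inner_sign*d*x_{n-4}).

    `initial` holds (x_{-4}, x_{-3}, x_{-2}, x_{-1}, x_0).
    Returns the full trajectory, initial conditions included.
    """
    if len(initial) != 5:
        raise ValueError("five initial conditions are required")
    xs = [float(x) for x in initial]
    for _ in range(steps):
        x_nm1 = xs[-2]  # x_{n-1}
        x_nm2 = xs[-3]  # x_{n-2}
        x_nm4 = xs[-5]  # x_{n-4}
        denominator = c * x_nm2 + inner_sign * d * x_nm4
        if denominator == 0:
            raise ZeroDivisionError("denominator vanished; solution undefined")
        xs.append(a * x_nm1 + outer_sign * (b * x_nm1 * x_nm2) / denominator)
    return xs


# Illustrative parameters (assumed): with both signs positive and
# a + b/(c + d) = 1, every positive constant sequence is an equilibrium.
trajectory = iterate_difference_equation(0.5, 1.0, 1.0, 1.0,
                                         initial=(1, 1, 1, 1, 1), steps=10)
```

Perturbing the initial conditions slightly and re-running gives a quick empirical impression of whether trajectories stay near an equilibrium, which is the kind of behaviour the stability analysis addresses.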
A Comparative Study on Statistical and Machine Learning Forecasting Methods for an FMCG Company
Demand forecasting has been an area of study among scholars and business practitioners since the start of the industrial revolution, and it has gained further attention in recent years with advances in AI. Accurate forecasts are no longer a luxury but a necessity for effective production and marketing planning decisions. Many aspects of a business depend on demand, and this is particularly true for the Fast-Moving Consumer Goods (FMCG) industry, where high volume and demand volatility make it hard for planners to generate accurate forecasts as consumer demand grows more complex. Inaccurate demand forecasts lead to multiple issues, such as high holding costs on excess inventory and shortages of certain SKUs in the market, which cause lost sales and significantly affect both the top line and the bottom line of the business. Researchers have compared the performance of statistical time series models with machine learning methods to evaluate their robustness, computational time and power. In this paper, a comparative study was conducted using statistical and machine learning techniques to generate an accurate forecast from the shipment data of an FMCG company. The naïve method was used as a benchmark and was compared to exponential smoothing, ARIMA, KNN, Facebook Prophet and LSTM using the past 3 years of shipments. The methodology followed CRISP-DM, from data exploration, pre-processing and transformation through to applying the different forecasting algorithms and evaluating them. Secondary goals of this paper include understanding associations between SKUs through market basket analysis, and clustering using KNN based on brand, customer, order quantity and value to propose a product segmentation strategy.
The results of both the clustering and the forecasting models are then evaluated to choose the optimal forecasting technique, and the forecast and the exploratory analysis are visualised using R.
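To make the benchmarking idea concrete, here is a minimal sketch of comparing a naïve forecast with simple exponential smoothing on one-step-ahead error. It is not the paper's code (the paper reports using R); the shipment figures, the smoothing factor and all function names are made-up assumptions for illustration.

```python
def naive_forecast(series):
    """One-step-ahead naive forecasts for series[1:]:
    each prediction is simply the previous observation."""
    return list(series[:-1])


def ses_forecast(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts for series[1:].
    The level is initialised to the first observation (an assumption)."""
    level = series[0]
    forecasts = []
    for actual in series[1:]:
        forecasts.append(level)                      # forecast before seeing `actual`
        level = alpha * actual + (1 - alpha) * level  # then update the level
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly shipments, for illustration only.
shipments = [120, 130, 125, 140, 135, 150]
targets = shipments[1:]

naive_mae = mean_absolute_error(targets, naive_forecast(shipments))
ses_mae = mean_absolute_error(targets, ses_forecast(shipments, alpha=0.5))
print(f"naive MAE: {naive_mae:.2f}, SES MAE: {ses_mae:.2f}")
```

A candidate model is only worth its added complexity if it beats the naïve benchmark on held-out data; with `alpha=1.0` the smoothing forecast reduces to the naïve one, which is a useful sanity check.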
Measurement of key factors affecting employee extra-role behaviour in Ministry of Municipalities and Public Works in Iraq
This study examines employee extra-role behavior and trust within an organization. The
researcher examined several variables, including psychological support, trust in
management, reward expectation, management value and other motivational aspects.
Quantitative research methods were used to gather the information required for the
study, which is limited to the Ministry of Municipalities and Public Works (MMPW) in
Iraq. The gathered data were analysed using the Statistical Package for the Social
Sciences (SPSS), with tests such as regression and scale reliability used to check
reliability and validity. The results were highly significant when the hypotheses laid
down earlier in the study were tested.
A case study on cumulative logit models with low frequency and mixed effects
Master of Science. Department of Statistics. Perla E. Reyes Cuellar. Data with ordinal responses may be encountered in many research fields, such as the social, medical, agricultural or financial sciences. In this paper, we present a case study on cumulative logit models with low frequency and mixed effects and discuss some strengths and limitations of the current methodology. Two plant pathologists requested our statistical advice on fitting a cumulative logit mixed model to assess the effect of six commercial products on the control of a seed and seedling disease in soybeans in vitro. In their attempt to estimate the model parameters using a generalized linear mixed model approach with PROC GLIMMIX, the model failed to converge. Three alternative approaches to solve the problem were examined: 1) stratifying the data to search for the random effect; 2) assuming the random effect would be small and reducing the model to a fixed model; and 3) combining the original categories of the response variable into a smaller number of categories. In addition, we conducted a power analysis to evaluate the sample size required to detect treatment differences. The results of all the proposed solutions were similar. Collapsing categories for a cumulative/proportional odds model has little effect on estimation. The sample size used in the case study is enough to detect a large shift of frequencies between categories, but not moderate changes. Moreover, we do not have enough information to estimate a random effect. Even when it is present, the results regarding the fixed factors (pathogen, evaluation day, and treatment effects) are the same as those obtained by the fixed model alternatives. All six products had a significant effect in slowing the effect of the pathogen, but the effects vary between pathogen species and assessment timing or date.
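As a rough illustration of what a cumulative logit (proportional odds) model implies, and of why combining adjacent response categories is a natural simplification, the sketch below computes category probabilities from thresholds. It assumes the parameterisation logit P(Y <= j) = theta_j - eta, which is one common convention and may differ in sign from the software used in the study; the threshold values and function name are made up.

```python
import math


def category_probabilities(thresholds, eta):
    """Category probabilities under a cumulative logit model.

    Assumes logit P(Y <= j) = thresholds[j] - eta, with `thresholds`
    strictly increasing. Returns len(thresholds) + 1 probabilities.
    """
    if any(t2 <= t1 for t1, t2 in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)  # P(Y <= last category) is always 1
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Four ordered categories (hypothetical thresholds and linear predictor).
full = category_probabilities([-1.0, 0.0, 1.0], eta=0.3)

# Combining the two middle categories amounts to dropping the threshold
# between them; the linear predictor eta is unchanged.
collapsed = category_probabilities([-1.0, 1.0], eta=0.3)
```

Under this model, the collapsed middle probability equals the sum of the two original middle probabilities, so merging categories leaves the linear predictor interpretable in the same way.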
Estimation of Causal Effects Under K-Nearest Neighbors Interference
Considerable recent work has focused on methods for analyzing experiments
which exhibit treatment interference -- that is, when the treatment status of
one unit may affect the response of another unit. Such settings are common in
experiments on social networks. We consider a model of treatment interference
-- the K-nearest neighbors interference model (KNNIM) -- for which the response
of one unit depends not only on the treatment status given to that unit, but
also the treatment status of its "closest" neighbors. We derive causal
estimands under KNNIM in a way that allows us to identify how each of the
K-nearest neighbors contributes to the indirect effect of treatment. We
propose unbiased estimators for these estimands and derive conservative
variance estimates for these unbiased estimators. We then consider extensions
of these estimators under an assumption of no weak interaction between direct
and indirect effects. We perform a simulation study to determine the efficacy
of these estimators under different treatment interference scenarios. We apply
our methodology to an experiment designed to assess the impact of a
conflict-reducing program in middle schools in New Jersey, and we give evidence
that the effect of treatment propagates primarily through a unit's closest
connection.
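The exposure structure that KNNIM describes can be sketched as follows: for each unit, collect its own treatment status together with the treatment statuses of its K nearest neighbors, ordered from closest to farthest. This is only an illustration of the model's inputs, not the paper's estimators; the distance matrix, the tie-breaking rule and the function name are assumptions.

```python
def knn_exposures(distances, treatment, k):
    """For each unit i, return (own treatment, treatments of its k nearest
    neighbors ordered from closest to farthest).

    `distances[i][j]` is an assumed closeness measure between units i and j;
    ties are broken by unit index (an arbitrary choice for this sketch).
    """
    n = len(treatment)
    if k >= n:
        raise ValueError("k must be smaller than the number of units")
    exposures = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (distances[i][j], j))
        neighbors = others[:k]
        exposures.append((treatment[i], tuple(treatment[j] for j in neighbors)))
    return exposures


# Three hypothetical units placed at positions 0, 1 and 3 on a line.
distances = [[0, 1, 3],
             [1, 0, 2],
             [3, 2, 0]]
treatment = [1, 0, 1]
exposures = knn_exposures(distances, treatment, k=2)
```

Keeping the neighbors ordered by closeness is what would let an analysis attribute part of the indirect effect to the first-nearest neighbor, the second-nearest, and so on.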