A Multi-Gene Genetic Programming Application for Predicting Students Failure at School
Accurately predicting the student failure rate (SFR) at school remains a core
problem faced by many in the educational sector. Existing procedures for
forecasting SFR are rigid and often require data scaling or conversion into
binary form, as in the logistic model, which can lead to loss of information
and effect-size attenuation. In addition, the high number of factors,
incomplete and unbalanced datasets, and the black-box nature of artificial
neural networks and fuzzy logic systems expose the need for more efficient
tools. The application of Genetic Programming (GP) currently holds great
promise and has produced strong results in different sectors. In this regard,
this study developed GPSFARPS, a software
application to provide a robust solution to the prediction of SFR using an
evolutionary algorithm known as multi-gene genetic programming. The approach is
validated by feeding a test data set to the evolved GP models. Results
obtained from GPSFARPS simulations show its ability to evolve a suitable
failure rate expression, converging quickly at 30 generations out of a
specified maximum of 500 generations. The multi-gene system was also able to
simplify the evolved model expression and accurately predict the student failure
rate using a subset of the original expression.
Comment: 14 pages, 9 figures, journal paper. arXiv admin note: text overlap
with arXiv:1403.0623 by other authors
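The multi-gene idea described above can be sketched as follows. This is an illustrative toy, not the GPSFARPS code: the gene representation, the mutation-only hill climb, and the target curve are all assumptions, and the unweighted sum stands in for the least-squares gene weighting that real multi-gene GP uses.

```python
import random

# Toy multi-gene GP sketch (illustrative; NOT the GPSFARPS implementation).
random.seed(0)

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def random_gene(depth=2):
    """A gene is a tiny expression tree over a single input x."""
    if depth == 0 or random.random() < 0.3:
        return ('x',) if random.random() < 0.7 else ('const', random.uniform(-2, 2))
    op = random.choice(sorted(OPS))
    return (op, random_gene(depth - 1), random_gene(depth - 1))

def eval_gene(g, x):
    if g[0] == 'x':
        return x
    if g[0] == 'const':
        return g[1]
    return OPS[g[0]](eval_gene(g[1], x), eval_gene(g[2], x))

def predict(model, x):
    # A multi-gene model combines several genes; here a plain unweighted sum
    # (real multi-gene GP fits per-gene weights by least squares).
    return sum(eval_gene(g, x) for g in model)

def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def evolve(xs, ys, n_genes=3, pop=30, generations=200):
    """Hill-climb from the best random model by replacing one gene at a time."""
    population = [[random_gene() for _ in range(n_genes)] for _ in range(pop)]
    best = min(population, key=lambda m: mse(m, xs, ys))
    for _ in range(generations):
        child = list(best)
        child[random.randrange(n_genes)] = random_gene()  # mutate one gene
        if mse(child, xs, ys) <= mse(best, xs, ys):
            best = child
    return best

# Fit a made-up "failure rate" curve y = x^2 + x on a small training set.
xs = [i / 10 for i in range(20)]
ys = [x * x + x for x in xs]
model = evolve(xs, ys)
```

Because each gene is a separate tree, the search can improve one term of the model without disturbing the others, which is the property the abstract credits for fast convergence and compact final expressions.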
Statistical significance of variables driving systematic variation
There are a number of well-established methods such as principal components
analysis (PCA) for automatically capturing systematic variation due to latent
variables in large-scale genomic data. PCA and related methods may directly
provide a quantitative characterization of a complex biological variable that
is otherwise difficult to precisely define or model. An unsolved problem in
this context is how to systematically identify the genomic variables that are
drivers of systematic variation captured by PCA. Principal components (and
other estimates of systematic variation) are directly constructed from the
genomic variables themselves, making measures of statistical significance
artificially inflated when using conventional methods due to over-fitting. We
introduce a new approach called the jackstraw that allows one to accurately
identify genomic variables that are statistically significantly associated with
any subset or linear combination of principal components (PCs). The proposed
method can greatly simplify complex significance testing problems encountered
in genomics and can be utilized to identify the genomic variables significantly
associated with latent variables. Using simulation, we demonstrate that our
method attains accurate measures of statistical significance over a range of
relevant scenarios. We consider yeast cell-cycle gene expression data, and show
that the proposed method can be used to straightforwardly identify
statistically significant genes that are cell-cycle regulated. We also analyze
gene expression data from post-trauma patients, allowing the gene expression
data to provide a molecularly-driven phenotype. We find a greater enrichment
for inflammatory-related gene sets compared to using a clinically defined
phenotype. The proposed method provides a useful bridge between large-scale
quantifications of systematic variation and gene-level significance analyses.
Comment: 35 pages, 1 table, 6 main figures, 7 supplementary figures
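The permutation logic behind the jackstraw can be sketched from scratch. This is an illustrative toy, not the authors' implementation: the choice of squared correlation with PC1 as the statistic, the number of permuted rows `s`, and the simulated data are all assumptions.

```python
import numpy as np

# From-scratch jackstraw sketch (illustrative; NOT the authors' package).
rng = np.random.default_rng(0)

def pc1(M):
    """Leading principal component scores across samples (rows centered)."""
    Mc = M - M.mean(axis=1, keepdims=True)
    return np.linalg.svd(Mc, full_matrices=False)[2][0]

def r2_with(rows, v):
    """Squared Pearson correlation of each row with vector v."""
    rc = np.atleast_2d(rows)
    rc = rc - rc.mean(axis=1, keepdims=True)
    vc = v - v.mean()
    num = rc @ vc
    den = np.linalg.norm(rc, axis=1) * np.linalg.norm(vc) + 1e-12
    return (num / den) ** 2

def jackstraw_pvalues(X, n_perm=200, s=2):
    """Permute s rows at a time, recompute PC1, and pool the permuted rows'
    statistics as the null. Permuting only a few rows breaks their link to
    the PC while leaving the overall systematic variation intact, which is
    what avoids the over-fitting of conventional tests."""
    obs = r2_with(X, pc1(X))
    null = []
    for _ in range(n_perm):
        Xp = X.copy()
        idx = rng.choice(X.shape[0], size=s, replace=False)
        for i in idx:
            Xp[i] = rng.permutation(Xp[i])
        null.extend(r2_with(Xp[idx], pc1(Xp)))
    null = np.asarray(null)
    # empirical p-values with add-one smoothing to avoid exact zeros
    return np.array([(1 + (null >= o).sum()) / (1 + null.size) for o in obs])

# Toy data: 30 genes x 20 samples; the first 10 genes share a latent factor.
latent = rng.normal(size=20)
X = rng.normal(size=(30, 20))
X[:10] += 2.0 * latent
pvals = jackstraw_pvalues(X)
```

On this toy data the ten factor-driven genes receive much smaller p-values than the pure-noise genes, mirroring the cell-cycle and post-trauma analyses the abstract describes.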
Systems approaches and algorithms for discovery of combinatorial therapies
Effective therapy of complex diseases requires control of highly non-linear
complex networks that remain incompletely characterized. In particular, drug
intervention can be seen as control of signaling in cellular networks.
Identification of control parameters presents an extreme challenge due to the
combinatorial explosion of control possibilities in combination therapy and to
the incomplete knowledge of the systems biology of cells. In this review paper
we describe the main current and proposed approaches to the design of
combinatorial therapies, including the empirical methods used now by clinicians
and alternative approaches suggested recently by several authors. New
approaches for designing combinations arising from systems biology are
described. We discuss in special detail the design of algorithms that identify
optimal control parameters in cellular networks based on a quantitative
characterization of control landscapes, maximizing utilization of incomplete
knowledge of the state and structure of intracellular networks. The use of new
technology for high-throughput measurements is key to these new approaches to
combination therapy and essential for the characterization of control
landscapes and implementation of the algorithms. Combinatorial optimization in
medical therapy is also compared with the combinatorial optimization of
engineering and materials science and similarities and differences are
delineated.
Comment: 25 pages
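The combinatorial-explosion point can be made concrete with a toy exhaustive scorer. Every name and number below (the drugs, their effects, the synergy bonus) is invented for illustration; real combination spaces are far too large for exhaustive search, which is exactly why the algorithmic approaches reviewed here are needed.

```python
import itertools

def best_combination(drugs, k, response):
    """Score every k-subset of a drug panel and return the best one.
    Feasible only for tiny panels: the number of k-subsets grows
    combinatorially with panel size."""
    return max(itertools.combinations(drugs, k), key=response)

# Hypothetical single-drug effects and one pairwise synergy bonus.
effect = {'A': 0.3, 'B': 0.2, 'C': 0.5}
synergy = {frozenset('AB'): 0.4}

def response(combo):
    base = sum(effect[d] for d in combo)
    bonus = sum(v for pair, v in synergy.items() if pair <= set(combo))
    return base + bonus

best = best_combination(['A', 'B', 'C'], 2, response)
# A+B scores 0.3 + 0.2 + 0.4 = 0.9, beating A+C (0.8) and B+C (0.7)
```

The synergy term is the crux: because combined effects are not additive, the best pair cannot be found by ranking single drugs, and smarter search over the control landscape is required once exhaustive enumeration becomes infeasible.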
A hybrid algorithm for Bayesian network structure learning with application to multi-label learning
We present a novel hybrid algorithm for Bayesian network structure learning,
called H2PC. It first reconstructs the skeleton of a Bayesian network and then
performs a Bayesian-scoring greedy hill-climbing search to orient the edges.
The algorithm is based on divide-and-conquer constraint-based subroutines to
learn the local structure around a target variable. We conduct two series of
experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is
currently the most powerful state-of-the-art algorithm for Bayesian network
structure learning. First, we use eight well-known Bayesian network benchmarks
with various data sizes to assess the quality of the learned structure returned
by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in
terms of goodness of fit to new data and quality of the network structure with
respect to the true dependence structure of the data. Second, we investigate
H2PC's ability to solve the multi-label learning problem. We provide
theoretical results to characterize and identify graphically the so-called
minimal label powersets that appear as irreducible factors in the joint
distribution under the faithfulness condition. The multi-label learning problem
is then decomposed into a series of multi-class classification problems, where
each multi-class variable encodes a label powerset. H2PC is shown to compare
favorably to MMHC in terms of global classification accuracy over ten
multi-label data sets covering different application domains. Overall, our
experiments support the conclusions that local structural learning with H2PC in
the form of local neighborhood induction is a theoretically well-motivated and
empirically effective learning framework that is well suited to multi-label
learning. The source code (in R) of H2PC as well as all data sets used for the
empirical tests are publicly available.
Comment: arXiv admin note: text overlap with arXiv:1101.5184 by other authors
Causality, Information and Biological Computation: An algorithmic software approach to life, disease and the immune system
Biology has taken strong steps towards becoming a computer science aiming at
reprogramming nature after the realisation that nature herself has reprogrammed
organisms by harnessing the power of natural selection and the digital
prescriptive nature of replicating DNA. Here we further unpack ideas related to
computability, algorithmic information theory and software engineering, in the
context of the extent to which biology can be (re)programmed, and with how we
may go about doing so in a more systematic way with all the tools and concepts
offered by theoretical computer science in a translation exercise from
computing to molecular biology and back. These concepts provide a means to a
hierarchical organization thereby blurring previously clear-cut lines between
concepts like matter and life, or between tumour types that are otherwise
taken as different but may not, however, have a different cause. This does not diminish
the properties of life or make its components and functions less interesting.
On the contrary, this approach makes for a more encompassing and integrated
view of nature, one that subsumes observer and observed within the same system,
and can generate new perspectives and tools with which to view complex diseases
like cancer, approaching them afresh from a software-engineering viewpoint that
casts evolution in the role of programmer, cells as computing machines, DNA and
genes as instructions and computer programs, viruses as hacking devices, the
immune system as a software debugging tool, and diseases as an
information-theoretic battlefield where all these forces deploy. We show how
information theory and algorithmic programming may explain fundamental
mechanisms of life and death.
Comment: 30 pages, 8 figures. Invited chapter contribution to Information and
Causality: From Matter to Life, Sara I. Walker, Paul C.W. Davies and George
Ellis (eds.), Cambridge University Press
Data based identification and prediction of nonlinear and complex dynamical systems
We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme.
Peer reviewed. Postprint.
Political pressures and exchange rate stability in emerging market economies
This paper presents a political economy model of exchange rate policy. The theory is based on a common agency approach with rational expectations. Financial and exporter lobbies exert political pressures to influence the government's choice of exchange rate policy before shocks to the economy are realized. The model shows that political pressures affect exchange rate policy and create an over-commitment to exchange rate stability. This helps to rationalize the empirical evidence on fear of large currency swings that characterizes the exchange rate policy of many emerging market economies. Moreover, the model suggests that the effects of political pressures on the exchange rate are lower if the quality of institutions is higher. Empirical evidence for a large sample of emerging market economies is consistent with these findings.
Keywords: exporters and financial lobbies; exchange rate stability
Receptor uptake arrays for vitamin B12, siderophores and glycans shape bacterial communities
Molecular variants of vitamin B12, siderophores and glycans occur. To take up
variant forms, bacteria may express an array of receptors. The gut microbe
Bacteroides thetaiotaomicron has three different receptors to take up variants
of vitamin B12 and 88 receptors to take up various glycans. The design of
receptor arrays reflects key processes that shape cellular evolution.
Competition may focus each species on a subset of the available nutrient
diversity. Some gut bacteria can take up only a narrow range of carbohydrates,
whereas species such as B. thetaiotaomicron can digest many different complex
glycans. Comparison of different nutrients, habitats, and genomes provides an
opportunity to test hypotheses about the breadth of receptor arrays. Another
important process concerns fluctuations in nutrient availability. Such
fluctuations enhance the value of cellular sensors, which gain information
about environmental availability and adjust receptor deployment. Bacteria often
adjust receptor expression in response to fluctuations of particular
carbohydrate food sources. Some species may adjust expression of uptake
receptors for specific siderophores. How do cells use sensor information to
control the response to fluctuations? That question about regulatory wiring
relates to problems that arise in control theory and artificial intelligence.
Control theory clarifies how to analyze environmental fluctuations in relation
to the design of sensors and response systems. Recent advances in deep learning
studies of artificial intelligence focus on the architecture of regulatory
wiring and the ways in which complex control networks represent and classify
environmental states. I emphasize the similar design problems that arise in
cellular evolution, control theory, and artificial intelligence. I connect
those broad concepts to testable hypotheses for bacterial uptake of B12,
siderophores and glycans.
Comment: Added many new references, edited throughout
From Bounded Rationality to Behavioral Economics
The paper provides a brief overview of the "state of the art" in the theory of rational decision making since the 1950s, focusing especially on the evolutionary justification of rationality. It is argued that this justification, and more generally the economic methodology inherited from the Chicago school, becomes untenable once Kauffman's NK model is taken into account: if evolution is based on a trial-and-error search process, it generally leads to sub-optimal stable solutions, so the "as if" justification of perfect rationality proves to be a fallacious metaphor. The normative interpretation of decision-making theory is therefore questioned, and the two challenges to this approach, Simon's bounded rationality and Allais' criticism of expected utility theory, are discussed. On this ground it is shown that the cognitive characteristics of choice processes are becoming more and more important for explaining economic behavior and deviations from rationality. In particular, following Kahneman's Nobel Lecture, it is suggested that the distinction between two types of cognitive processes, the effortful process of deliberate reasoning on the one hand and the automatic process of unconscious intuition on the other, can provide a different map with which to explain a broad class of deviations from pure "olympian" rationality. This view requires re-establishing and revising connections between psychology and economics: an ongoing challenge to the normative approach to economic methodology.
Keywords: bounded rationality; behavioral economics; evolution; as-if rationality