Chains of infinite order, chains with memory of variable length, and maps of the interval
We show how to construct a topological Markov map of the interval whose
invariant probability measure is the stationary law of a given stochastic chain
of infinite order. In particular we characterize the maps corresponding to
stochastic chains with memory of variable length. The problem treated here is
the converse of the classical construction of the Gibbs formalism for Markov
expanding maps of the interval
Ensembles of probability estimation trees for customer churn prediction
Customer churn prediction is one of the most important elements of a company's Customer Relationship Management (CRM) strategy. In this study, two strategies are investigated to increase the lift performance of ensemble classification models, i.e. (i) using probability estimation trees (PETs) instead of standard decision trees as base classifiers; and (ii) implementing alternative fusion rules based on lift weights for the combination of ensemble members' outputs. Experiments are conducted for four popular ensemble strategies on five real-life churn data sets. In general, the results demonstrate how lift performance can be substantially improved by using alternative base classifiers and fusion rules. However, the effect varies for the different ensemble strategies. In particular, the results indicate an increase of lift performance of (i) Bagging by implementing C4.4 base classifiers, (ii) the Random Subspace Method (RSM) by using lift-weighted fusion rules, and (iii) AdaBoost by implementing both
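As a rough illustration of the lift measure this abstract refers to, the sketch below computes lift as the churn rate among the highest-scored customers divided by the overall churn rate. The function name `top_fraction_lift`, the default 10% cut-off, and the sample data are assumptions made for illustration; the abstract does not state which lift definition or fraction the study used.

```python
def top_fraction_lift(scores, labels, fraction=0.1):
    """Churn rate in the top-scored fraction divided by the overall churn rate.

    scores: predicted churn scores (higher means more likely to churn).
    labels: 1 for churners, 0 for non-churners.
    fraction: share of customers to target (hypothetical default of 10%).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no churners")
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Always target at least one customer.
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    return top_rate / overall_rate


# Made-up example: 10 customers, 2 churners, both ranked at the top.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
example_lift = top_fraction_lift(example_scores, example_labels, fraction=0.2)
```

A lift of 1 would mean the model targets churners no better than random selection; larger values mean the top-ranked group is richer in churners.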
A scalable expressive ensemble learning using Random Prism: a MapReduce approach
The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers which are affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism also does not scale well on large training data. This paper presents a thorough discussion of Random Prism and a recently proposed parallel version of it called Parallel Random Prism. Parallel Random Prism is based on the MapReduce programming paradigm. The paper provides, for the first time, novel theoretical analysis of the proposed technique and in-depth experimental study that show that Parallel Random Prism scales well on a large number of training examples, a large number of data features and a large number of processors. Expressiveness of decision rules that our technique produces makes it a natural choice for Big Data applications where informed decision making increases the user’s trust in the system
Healthcare-use for Major Infectious Disease Syndromes in an Informal Settlement in Nairobi, Kenya
A healthcare-use survey was conducted in the Kibera informal settlement in Nairobi, Kenya, in July 2005 to inform subsequent surveillance in the site for infectious diseases. Sets of standardized questionnaires were administered to 1,542 caretakers and heads of households with one or more child(ren) aged less than five years. The average household-size was 5.1 (range 1-15) persons. Most (90%) resided in a single room with monthly rents of US$ 4.50-7.00. Within the previous two weeks, 49% of children (n=1,378) aged less than five years (under-five children) and 18% of persons (n=1,139) aged ≥5 years experienced febrile, diarrhoeal or respiratory illnesses. The large majority (>75%) of illnesses were associated with healthcare-seeking. While licensed clinics were the most-frequently visited settings, kiosks, unlicensed care providers, and traditional healers were also frequently visited. Expense was cited most often (50%) as the reason for not seeking healthcare. Of those who sought healthcare, 34-44% of the first and/or the only visits were made with non-licensed care providers, potentially delaying opportunities for early optimal intervention. The proportions of patients accessing healthcare facilities were higher with diarrhoeal disease and fever (but not for respiratory diseases in under-five children) than those reported from a contemporaneous study conducted in a rural area in Kenya. The findings support community-based rather than facility-based surveillance in this setting to achieve objectives for comprehensive assessment of the burden of disease
Comparative performance of some popular ANN algorithms on benchmark and function approximation problems
We report an inter-comparison of some popular algorithms within the
artificial neural network domain (viz., Local search algorithms, global search
algorithms, higher order algorithms and the hybrid algorithms) by applying them
to the standard benchmarking problems like the IRIS data, XOR/N-Bit parity and
Two Spiral. Apart from giving a brief description of these algorithms, the
results obtained for the above benchmark problems are presented in the paper.
The results suggest that while Levenberg-Marquardt algorithm yields the lowest
RMS error for the N-bit Parity and the Two Spiral problems, Higher Order
Neurons algorithm gives the best results for the IRIS data problem. The best
results for the XOR problem are obtained with the Neuro Fuzzy algorithm. The
above algorithms were also applied for solving several regression problems such
as cos(x) and a few special functions like the Gamma function, the
complementary error function and the upper tail cumulative
$\chi^2$-distribution function. The results of these regression problems
indicate that, among all the ANN algorithms used in the present study,
Levenberg-Marquardt algorithm yields the best results. Keeping in view the
highly non-linear behaviour and the wide dynamic range of these functions, it
is suggested that these functions can be also considered as standard benchmark
problems for function approximation using artificial neural networks. Comment: 18 pages, 5 figures. Accepted in Pramana - Journal of Physics
Landmark detection in 2D bioimages for geometric morphometrics: a multi-resolution tree-based approach
The detection of anatomical landmarks in bioimages is a necessary but tedious step for geometric morphometrics studies in many research domains. We propose variants of a multi-resolution tree-based approach to speed up the detection of landmarks in bioimages. We extensively evaluate our method variants on three different datasets (cephalometric, zebrafish, and drosophila images). We identify the key method parameters (notably the multi-resolution) and report results with respect to human ground truths and existing methods. Our method achieves recognition performance competitive with existing approaches while being generic and fast. The algorithms are integrated in the open-source Cytomine software and we provide parameter configuration guidelines so that they can be easily exploited by end-users. Finally, datasets are readily available through a Cytomine server to foster future research
High prevalence of Rickettsia africae variants in Amblyomma variegatum ticks from domestic mammals in rural western Kenya: implications for human health
Tick-borne spotted fever group (SFG) rickettsioses are emerging human diseases caused by obligate intracellular Gram-negative bacteria of the genus Rickettsia. Despite being important causes of systemic febrile illnesses in travelers returning from sub-Saharan Africa, little is known about the reservoir hosts of these pathogens. We conducted surveys for rickettsiae in domestic animals and ticks in a rural setting in western Kenya. Of the 100 serum specimens tested from each species of domestic ruminant 43% of goats, 23% of sheep, and 1% of cattle had immunoglobulin G (IgG) antibodies to the SFG rickettsiae. None of these sera were positive for IgG against typhus group rickettsiae. We detected Rickettsia africae–genotype DNA in 92.6% of adult Amblyomma variegatum ticks collected from domestic ruminants, but found no evidence of the pathogen in blood specimens from cattle, goats, or sheep. Sequencing of a subset of 21 rickettsia-positive ticks revealed R. africae variants in 95.2% (20/21) of ticks tested. Our findings show a high prevalence of R. africae variants in A. variegatum ticks in western Kenya, which may represent a low disease risk for humans. This may provide a possible explanation for the lack of African tick-bite fever cases among febrile patients in Kenya
Nutritional Status of Under-five Children Living in an Informal Urban Settlement in Nairobi, Kenya
Malnutrition in sub-Saharan Africa contributes to high rates of childhood morbidity and mortality. However, little information on the nutritional status of children is available from informal settlements. During the period of post-election violence in Kenya during December 2007–March 2008, food shortages were widespread within informal settlements in Nairobi. To investigate whether food insecurity due to post-election violence resulted in high prevalence of acute and chronic malnutrition in children, a nutritional survey was undertaken among children aged 6-59 months within two villages in Kibera, where the Kenya Medical Research Institute/Centers for Disease Control and Prevention conducts population-based surveillance for infectious disease syndromes. During 25 March–4 April 2008, a structured questionnaire was administered to caregivers of 1,310 children identified through surveillance system databases to obtain information on household demographics, food availability, and child-feeding practices. Anthropometric measurements were recorded on all participating children. Indices were reported in z-scores and compared with the World Health Organization (WHO) 2005 reference population to determine the nutritional status of children. Data were analyzed using the Anthro software of WHO and the SAS. Stunting was found in 47.0% of the children; 11.8% were underweight, and 2.6% were wasted. Severe stunting was found in 23.4% of the children; severe underweight in 3.1%, and severe wasting in 0.6%. Children aged 36-47 months had the highest prevalence (58.0%) of stunting while the highest prevalence (4.1%) of wasting was in children aged 6-11 months. Boys were more stunted than girls (p<0.01), and older children were significantly (p<0.0001) stunted compared to younger children. In the third year of life, girls were more likely than boys to be wasted (p<0.01). 
The high prevalence of chronic malnutrition suggests that stunting is a sustained problem within this urban informal settlement, not specifically resulting from the relatively brief political crisis. The predominance of stunting in older children indicates failure in growth and development during the first two years of life. Food programmes in Kenya have traditionally focused on rural areas and refugee camps. The findings of the study suggest that tackling childhood stunting is a high priority, and there should be fostered efforts to ensure that malnutrition-prevention strategies include the urban poor
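The prevalence figures in this abstract come from comparing each child's anthropometric z-score against fixed cut-offs. A minimal sketch of that arithmetic follows, assuming the commonly used WHO convention of a z-score below -2 for stunting, underweight or wasting and below -3 for the severe forms. The abstract does not state the cut-offs, and the function name and sample values are hypothetical.

```python
def prevalence_below(z_scores, cutoff=-2.0):
    """Percentage of z-scores strictly below the cutoff.

    cutoff=-2.0 is the assumed threshold for moderate-or-severe cases;
    pass cutoff=-3.0 for severe cases only.
    """
    if not z_scores:
        raise ValueError("z_scores must not be empty")
    below = sum(1 for z in z_scores if z < cutoff)
    return 100.0 * below / len(z_scores)


# Made-up height-for-age z-scores for four children.
sample_haz = [-3.5, -2.5, -1.0, 0.2]
stunting_pct = prevalence_below(sample_haz)               # z < -2
severe_stunting_pct = prevalence_below(sample_haz, -3.0)  # z < -3
```

Under this convention the severe group is a subset of the overall group, which matches the way the abstract reports severe stunting (23.4%) alongside total stunting (47.0%).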
Gaussian Fluctuation in Random Matrices
Let $N(L)$ be the number of eigenvalues, in an interval of length $L$, of a
matrix chosen at random from the Gaussian Orthogonal, Unitary or Symplectic
ensembles of $\mathcal{N}$ by $\mathcal{N}$ matrices, in the limit $\mathcal{N}\to\infty$. We prove that $[N(L) - \langle N(L)\rangle]/\sqrt{\log L}$ has a Gaussian distribution when $L\to\infty$. This theorem, which
requires control of all the higher moments of the distribution, elucidates
numerical and exact results on chaotic quantum systems and on the statistics of
zeros of the Riemann zeta function. PACS nos. 05.45.+b, 03.65.-w. Comment: 13 pages