Efficient selection of globally optimal rules on large imbalanced data based on rule coverage relationship analysis
Copyright © SIAM. Rule-based anomaly and fraud detection systems often suffer from massive numbers of false alerts when screening huge volumes of enterprise transactions. A crucial and challenging problem is to effectively select a globally optimal rule set that can capture very rare anomalies dispersed among large-scale background transactions. Existing rule selection methods, which suffer significantly from complex rule interactions and overlap in large imbalanced data, often lead to a very high false positive rate. In this paper, we analyze the interactions and relationships between rules and their coverage of transactions, and propose a novel metric, Max Coverage Gain. Max Coverage Gain selects the optimal rule set by evaluating the contribution of each rule to overall performance, cutting out rules that are locally significant but globally redundant without any negative impact on recall. An effective algorithm, MCGminer, is then designed with a series of built-in mechanisms and pruning strategies to handle complex rule interactions and reduce the computational cost of identifying the globally optimal rule set. Substantial experiments on 13 UCI data sets and a real-time online banking transactional database demonstrate that MCGminer achieves significant improvements in accuracy, scalability, stability and efficiency on large imbalanced data compared with several state-of-the-art rule selection techniques.
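The abstract above does not define Max Coverage Gain, so the following is only a minimal sketch of the general idea it describes: keep rules that add coverage of anomalies and drop rules that are redundant given those already chosen, without lowering recall. The greedy scoring used here (new anomalies covered, ties broken by fewer new false alerts) is an assumption made for illustration, and the names `select_rules`, `recall`, and the toy rule set are hypothetical; none of this is the MCGminer algorithm.

```python
from typing import Dict, List, Set


def select_rules(coverage: Dict[str, Set[int]], anomalies: Set[int]) -> List[str]:
    """Greedily pick rules by marginal coverage of anomalies.

    coverage maps a rule name to the ids of the transactions it fires on.
    Illustrative heuristic only; NOT the paper's Max Coverage Gain metric.
    """
    selected: List[str] = []
    covered: Set[int] = set()
    remaining = dict(coverage)

    def score(name: str):
        # Marginal contribution of a rule given what is already covered.
        new = remaining[name] - covered
        new_true_alerts = len(new & anomalies)
        new_false_alerts = len(new - anomalies)
        return (new_true_alerts, -new_false_alerts)

    while remaining:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        if score(best)[0] == 0:
            break  # every remaining rule is redundant for recall
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


def recall(rules: List[str], coverage: Dict[str, Set[int]], anomalies: Set[int]) -> float:
    """Fraction of anomalies caught by at least one of the given rules."""
    hit = set().union(*(coverage[r] for r in rules))
    return len(hit & anomalies) / len(anomalies)


# Toy data (hypothetical): ids 1-4 are anomalies, ids >= 10 are normal transactions.
toy_anomalies = {1, 2, 3, 4}
toy_coverage = {
    "r1": {1, 2, 10, 11},
    "r2": {2, 3, 12},
    "r3": {1, 2, 3},
    "r4": {4, 13},
}
chosen = select_rules(toy_coverage, toy_anomalies)
```

On this toy input, rules `r1` and `r2` each catch anomalies on their own yet add nothing once `r3` is selected, which is the "locally significant but globally redundant" situation the abstract refers to.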
Interactions between landscape changes and host communities can regulate Echinococcus multilocularis transmission
An area close to the Qinghai-Tibet plateau region and subject to intensive deforestation contains a large focus of human alveolar echinococcosis, while sporadic human cases occur in the Doubs region of eastern France. The current review analyses and compares epidemiological and ecological results obtained in both regions. Analysis of rodent species assemblages within quantified rural landscapes in central China and eastern France shows significant associations among host species of the pathogenic helminth Echinococcus multilocularis, the prevalence of human alveolar echinococcosis, and the land area under shrubland or grassland. This suggests that, at the regional scale, landscape can affect human disease distribution through interaction with small mammal communities and their population dynamics. Lidicker's ROMPA hypothesis helps to explain this association and provides a novel explanation of how landscape changes may result in increased risk of a rodent-borne zoonotic disease
A novel process for preparing PZT thick films
2000-2001 > Academic research: refereed > Refereed conference paper | Version of Record | Published
Serosurvey of Coxiella burnetii (Q fever) in Dromedary Camels (Camelus dromedarius) in Laikipia County, Kenya
Dromedary camels (Camelus dromedarius) are an important protein source for people in semi-arid and arid regions of Africa. In Kenya, camel populations have grown dramatically in the past few decades resulting in the potential for increased disease transmission between humans and camels. An estimated four million Kenyans drink unpasteurized camel milk, which poses a disease risk. We evaluated the seroprevalence of a significant zoonotic pathogen, Coxiella burnetii (Q fever), among 334 camels from nine herds in Laikipia County, Kenya. Serum testing revealed 18.6% positive seroprevalence of Coxiella burnetii (n = 344). Increasing camel age was positively associated with C. burnetii seroprevalence (OR = 5.36). Our study confirmed that camels living in Laikipia County, Kenya, have been exposed to the zoonotic pathogen, C. burnetii. Further research to evaluate the role of camels in disease transmission to other livestock, wildlife and humans in Kenya should be conducted
Quantitative model for inferring dynamic regulation of the tumour suppressor gene p53
Background: The availability of various "omics" datasets opens the prospect of studying genome-wide genetic regulatory networks. However, one of the major challenges of using mathematical models to infer genetic regulation from microarray datasets is the lack of information on protein concentrations and activities. Most previous research was based on the assumption that the mRNA levels of a gene are consistent with its protein activities, though this is not always the case. Therefore, a more sophisticated modelling framework, together with corresponding inference methods, is needed to accurately estimate genetic regulation from "omics" datasets.
Results: This work developed a novel approach, based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. Using the p53 network as a test system, we applied the nonlinear model to estimate the activities of the transcription factor (TF) p53 from the expression levels of its target genes, and to identify whether p53 activates or inhibits each of its target genes. The predicted top 317 putative p53 target genes were supported by DNA sequence analysis. A comparison between our prediction and other published predictions of p53 targets suggests that most putative p53 targets may share a common depleted or enriched sequence signal in their upstream non-coding regions.
Conclusions: The proposed quantitative model can not only be used to infer the regulatory relationship between TF and its down-stream genes, but also be applied to estimate the protein activities of TF from the expression levels of its target genes
Anti-epileptic effect of Ganoderma lucidum polysaccharides by inhibition of intracellular calcium accumulation and stimulation of expression of CaMKII α in epileptic hippocampal neurons
Purpose: To investigate the mechanism of the anti-epileptic effect of Ganoderma lucidum polysaccharides (GLP), the changes in intracellular calcium and CaMK II α expression in a model of epileptic neurons were investigated.
Method: Primary hippocampal neurons were divided into: 1) Control group: neurons were cultured with Neurobasal medium for 3 hours; 2) Model group I: neurons were incubated with Mg²⁺-free medium for 3 hours; 3) Model group II: neurons were incubated with Mg²⁺-free medium for 3 hours, then cultured with the normal medium for a further 3 hours; 4) GLP group I: neurons were incubated with Mg²⁺-free medium containing GLP (0.375 mg/ml) for 3 hours; 5) GLP group II: neurons were incubated with Mg²⁺-free medium for 3 hours, then cultured with a normal culture medium containing GLP for a further 3 hours. CaMK II α protein expression was assessed by Western blot. Ca²⁺ turnover in neurons was assessed using Fluo-3/AM, which was added to the replacement medium, and was observed under a laser scanning confocal microscope.
Results: CaMK II α expression in the model groups was lower than in the control group; in the GLP groups, however, it was higher than in the model groups. Ca²⁺ fluorescence intensity in GLP group I was significantly lower than that in model group I after 30 seconds, while in GLP group II it was significantly reduced compared to model group II after 5 minutes.
Conclusion: GLP may inhibit calcium overload and promote CaMK II α expression to protect epileptic neurons
Promyelocytic leukemia nuclear bodies associate with transcriptionally active genomic regions
Cancer Research UK
Heterotic Sigma Models with N=2 Space-Time Supersymmetry
We study the non-linear sigma model realization of a heterotic vacuum with
N=2 space-time supersymmetry. We examine the requirements of (0,2) + (0,4)
world-sheet supersymmetry and show that a geometric vacuum must be described by
a principal two-torus bundle over a K3 manifold.
Comment: 20 pages, uses xy-pic; v3: typos corrected, reference added,
discussion of constraints on Hermitian form modified
SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods
In the last few years thousands of scientific papers have investigated
sentiment analysis, several startups that measure opinions on real data have
emerged and a number of innovative products related to this theme have been
developed. There are multiple methods for measuring sentiments, including
lexical-based and supervised machine learning methods. Despite the vast
interest in the theme and the wide popularity of some methods, it is unclear which
one is better for identifying the polarity (i.e., positive or negative) of a
message. Accordingly, there is a strong need to conduct a thorough
apples-to-apples comparison of sentiment analysis methods, as they are
used in practice, across multiple datasets originating from different data
sources. Such a comparison is key for understanding the potential limitations,
advantages, and disadvantages of popular methods. This article aims at filling
this gap by presenting a benchmark comparison of twenty-four popular sentiment
analysis methods (which we call the state-of-the-practice methods). Our
evaluation is based on a benchmark of eighteen labeled datasets, covering
messages posted on social networks, movie and product reviews, as well as
opinions and comments in news articles. Our results highlight the extent to
which the prediction performance of these methods varies considerably across
datasets. Aiming to boost the development of this research area, we release the
methods' code and the datasets used in this article, deploying them in a benchmark
system, which provides an open API for accessing and comparing sentence-level
sentiment analysis methods
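
The kind of comparison this abstract describes, several polarity classifiers scored on several labeled datasets, can be sketched as below. This is only an illustration under stated assumptions: the abstract does not name its evaluation metric, so macro-averaged F1 is an assumption here, and the two toy methods (`lexicon_method`, `always_pos`), the word lists, and the datasets are hypothetical stand-ins, not the methods, data, or API of the SentiBench system.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A dataset is a list of (message, gold label); labels are "pos" or "neg".
Dataset = List[Tuple[str, str]]
Method = Callable[[str], str]


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("pos", "neg")) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for lab in labels:
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def benchmark(methods: Dict[str, Method],
              datasets: Dict[str, Dataset]) -> Dict[Tuple[str, str], float]:
    """Score every method on every dataset with the same metric."""
    results = {}
    for m_name, method in methods.items():
        for d_name, data in datasets.items():
            gold = [label for _, label in data]
            pred = [method(text) for text, _ in data]
            results[(m_name, d_name)] = macro_f1(gold, pred)
    return results


# Hypothetical toy lexicon and methods, for illustration only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def lexicon_method(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score >= 0 else "neg"


def always_pos(text: str) -> str:
    return "pos"  # trivial baseline


toy_datasets: Dict[str, Dataset] = {
    "reviews": [("great movie", "pos"), ("awful plot", "neg"),
                ("love it", "pos"), ("bad acting", "neg")],
    "tweets": [("not good at all", "neg"), ("i hate mondays", "neg"),
               ("what a great day", "pos")],
}
toy_methods: Dict[str, Method] = {"lexicon": lexicon_method, "always_pos": always_pos}
results = benchmark(toy_methods, toy_datasets)
```

Even on this toy input the lexicon method scores differently on the two datasets, because it ignores the negation in "not good at all"; the abstract reports variation of this kind across its real datasets.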
