Identifying network communities with a high resolution
Community structure is an important property of complex networks. An
automatic discovery of such structure is a fundamental task in many
disciplines, including sociology, biology, engineering, and computer science.
Recently, several community discovery algorithms have been proposed based on
the optimization of a quantity called modularity (Q). However, the problem of
modularity optimization is NP-hard, and the existing approaches often suffer
from prohibitively long running time or poor quality. Furthermore, it has been
recently pointed out that algorithms based on optimizing Q will have a
resolution limit, i.e., communities below a certain scale may not be detected.
In this research, we first propose an efficient heuristic algorithm, Qcut,
which combines spectral graph partitioning and local search to optimize Q.
Using both synthetic and real networks, we show that Qcut can find higher
modularities and is more scalable than the existing algorithms. Furthermore,
using Qcut as an essential component, we propose a recursive algorithm, HQcut,
to solve the resolution limit problem. We show that HQcut can successfully
detect communities at a much finer scale and with a higher accuracy than the
existing algorithms. Finally, we apply Qcut and HQcut to study a
protein-protein interaction network, and show that the combination of the two
algorithms can reveal interesting biological results that may be otherwise
undetectable.
Comment: 14 pages, 5 figures. 1 supplemental file at http://cic.cs.wustl.edu/qcut/supplemental.pd
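What "optimizing Q" means can be illustrated with off-the-shelf tools. The sketch below uses networkx's built-in greedy modularity optimizer on a toy two-community graph; it is only a minimal illustration of modularity-based community detection, not the Qcut/HQcut algorithms described above.

```python
# Minimal sketch of modularity-based community detection (not Qcut/HQcut):
# find communities by greedily optimizing Q, then report the resulting modularity.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Toy network with two obvious communities: two 5-cliques joined by a single edge.
G = nx.barbell_graph(5, 0)

communities = greedy_modularity_communities(G)
Q = modularity(G, communities)

print("communities:", [sorted(c) for c in communities])
print("modularity Q = %.3f" % Q)
```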
Multifactorial mortality in bongos and other wild ungulates in the north of the Congo Republic
Wildlife mortality involving bongos, Tragelaphus eurycerus, and other
ungulates was investigated in the north of the Congo Republic in 1997.
Four bongos, one forest buffalo, Syncerus caffer nanus, and one domestic
sheep were examined and sampled. Although an outbreak of rinderpest had
been suspected, it was found that the animals, which had been weakened
by an Elaeophora sagitta infection and possibly also by adverse
climatic conditions, had been exsanguinated and driven to exhaustion by
an unusual plague of Stomoxys omega.
Inhibition in multiclass classification
The role of inhibition is investigated in a multiclass support vector machine formalism inspired by the brain structure of insects. The so-called mushroom bodies have a set of output neurons, or classification functions,
that compete with each other to encode a particular input. Strongly active output neurons depress or inhibit the remaining outputs without knowing which is correct or incorrect. Accordingly, we propose to use a
classification function that embodies unselective inhibition and train it in the large margin classifier framework. Inhibition leads to more robust classifiers in the sense that they perform better on larger areas of appropriate hyperparameters when assessed with leave-one-out strategies. We also show that the classifier with inhibition is a tight bound to probabilistic exponential models and is Bayes consistent for 3-class problems.
These properties make this approach useful for data sets with a limited number of labeled examples. For larger data sets, there is no significant advantage over other multiclass SVM approaches.
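As a rough illustration of the read-out rule described above, the sketch below applies an unselective-inhibition step to the outputs of ordinary one-vs-rest linear SVMs: outputs above an activity threshold depress all competing outputs, regardless of which is correct. The threshold and inhibition strength are assumptions for illustration; this is not the paper's trained large-margin formulation.

```python
# Hedged sketch: unselective inhibition applied at read-out time to the scores of
# standard one-vs-rest linear SVMs (illustrative only; parameters are assumed).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
clf = LinearSVC().fit(X, y)                      # one-vs-rest by default
scores = clf.decision_function(X)                # shape (n_samples, 3)

theta, alpha = 0.0, 0.3                          # activity threshold and inhibition strength (assumed)
active = np.clip(scores - theta, 0.0, None)      # only strongly active outputs inhibit
# Each output is depressed by the supra-threshold activity of the other outputs.
inhibited = scores - alpha * (active.sum(axis=1, keepdims=True) - active)

pred = inhibited.argmax(axis=1)
print("accuracy with inhibition at read-out: %.3f" % (pred == y).mean())
```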
CSNL: A cost-sensitive non-linear decision tree algorithm
This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes and utilizes discriminant analysis to construct nonlinear decision trees that take account of costs of misclassification.
The performance of the algorithm is evaluated by applying it to seventeen datasets, and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms on more than twelve of the datasets and is considerably faster. The use of bagging with CSNL further enhances its performance, showing the significant benefits of using nonlinear decision nodes.
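The two ingredients the abstract combines can be sketched with standard components: a discriminant-analysis node in place of an axis-parallel test, and a misclassification-cost matrix used to choose the cheapest label. The cost values and single-node setup below are assumptions for illustration; this is not the CSNL induction procedure itself.

```python
# Sketch of (1) an oblique discriminant-analysis decision node and
# (2) minimum expected-cost prediction from a misclassification-cost matrix.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_breast_cancer(return_X_y=True)

# Oblique "decision node": a linear discriminant over all features at once,
# rather than a threshold on a single attribute.
node = LinearDiscriminantAnalysis().fit(X, y)
posteriors = node.predict_proba(X)               # shape (n_samples, 2)

# cost[i, j] = cost of predicting class j when the true class is i
# (assumed values: misclassifying class 0 is ten times worse than class 1).
cost = np.array([[0.0, 10.0],
                 [1.0,  0.0]])

expected_cost = posteriors @ cost                # expected cost of each possible label
pred = expected_cost.argmin(axis=1)              # pick the cheapest label
print("cost-sensitive error rate: %.3f" % (pred != y).mean())
```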
Inducing safer oblique trees without costs
Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the
distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification.
Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety.
This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best-known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1), and an algorithm that utilizes robust linear programming.
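A minimal sketch of this safety bias with an oblique (LDA) split is given below. The specific rule used here, shifting the decision threshold until every training example of the critical class is flagged, is an illustrative assumption rather than the paper's exact modification, and the data are hypothetical.

```python
# Hedged sketch: an oblique LDA split whose threshold is shifted so the tree
# errs on the side of safety for the critical class (illustrative rule only).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def safe_oblique_split(X, y, critical_class=1):
    """Fit an oblique discriminant and return it with a safety-biased threshold."""
    lda = LinearDiscriminantAnalysis().fit(X, y)
    scores = lda.decision_function(X)            # signed distance to the LDA hyperplane
    # Shift the cut so every training example of the critical class is flagged.
    threshold = scores[y == critical_class].min()
    return lda, threshold

# Usage with toy data (hypothetical): class 1 is the unsafe/critical condition.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

lda, threshold = safe_oblique_split(X, y)
flagged = lda.decision_function(X) >= threshold  # predicted unsafe
print("critical training examples missed:", int(((~flagged) & (y == 1)).sum()))
```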
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a taxonomy and a historical timeline of how the field has developed, and should serve as a useful reference point for future research in this field.
Continent-Wide Survey Reveals Massive Decline in African Savannah Elephants
African elephants (Loxodonta africana) are imperiled by poaching and habitat loss. Despite global attention to the plight of elephants, their population sizes and trends are uncertain or unknown over much of Africa. To conserve this iconic species, conservationists need timely, accurate data on elephant populations. Here, we report the results of the Great Elephant Census (GEC), the first continent-wide, standardized survey of African savannah elephants. We also provide the first quantitative model of elephant population trends across Africa. We estimated a population of 352,271 savannah elephants on study sites in 18 countries, representing approximately 93% of all savannah elephants in those countries. Elephant populations in survey areas with historical data decreased by an estimated 144,000 from 2007 to 2014, and populations are currently shrinking by 8% per year continent-wide, primarily due to poaching. Though 84% of elephants occurred in protected areas, many protected areas had carcass ratios that indicated high levels of elephant mortality. Results of the GEC show the necessity of action to end the African elephants’ downward trajectory by preventing poaching and protecting habitat.
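The reported rate of decline can be illustrated with a simple compounding calculation applied to the abstract's point estimate. This back-of-the-envelope projection is not the quantitative trend model used in the study.

```python
# Simple compounding illustration of an 8% annual decline applied to the
# GEC point estimate of 352,271 savannah elephants (not the study's trend model).
population = 352_271
rate = 0.08
for year in range(1, 10):
    population *= (1 - rate)
    print(f"after year {year}: ~{population:,.0f} elephants")
# At 8% per year the population roughly halves in about nine years (0.92**9 ≈ 0.47).
```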