8,676 research outputs found
Understanding Urban Demand for Wild Meat in Vietnam: Implications for Conservation Actions
Vietnam is a significant consumer of wildlife, particularly wild meat, in urban restaurant settings. To meet this demand, poaching of wildlife is widespread, threatening regional and international biodiversity. Previous interventions to tackle illegal and potentially unsustainable consumption of wild meat in Vietnam have generally focused on limiting supply. While critical, they have been impeded by a lack of resources, the presence of increasingly organised criminal networks and corruption. Attention is, therefore, turning to the consumer, but a paucity of research investigating consumer demand for wild meat will impede the creation of effective consumer-centred interventions. Here we used a mixed-methods research approach comprising a hypothetical choice modelling survey and qualitative interviews to explore the drivers of wild meat consumption and consumer preferences among residents of Ho Chi Minh City, Vietnam. Our findings indicate that demand for wild meat is heterogeneous and highly context specific. Wild-sourced, rare, and expensive wild meat-types are eaten by those situated towards the top of the societal hierarchy to convey wealth and status and are commonly consumed in lucrative business contexts. Cheaper, legal and farmed substitutes for wild-sourced meats are also consumed, but typically in more casual consumption or social drinking settings. We explore the implications of our results for current conservation interventions in Vietnam that attempt to tackle illegal and potentially unsustainable trade in and consumption of wild meat and detail how our research informs future consumer-centric conservation actions
A Winnow-Based Approach to Context-Sensitive Spelling Correction
A large class of machine-learning problems in natural language require the
characterization of linguistic context. Two characteristic properties of such
problems are that their feature space is of very high dimensionality, and their
target concepts refer to only a small subset of the features in the space.
Under such conditions, multiplicative weight-update algorithms such as Winnow
have been shown to have exceptionally good theoretical properties. We present
an algorithm combining variants of Winnow and weighted-majority voting, and
apply it to a problem in the aforementioned class: context-sensitive spelling
correction. This is the task of fixing spelling errors that happen to result in
valid words, such as substituting "to" for "too", "casual" for "causal", etc.
We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a
statistics-based method representing the state of the art for this task. We
find: (1) When run with a full (unpruned) set of features, WinSpell achieves
accuracies significantly higher than BaySpell was able to achieve in either the
pruned or unpruned condition; (2) When compared with other systems in the
literature, WinSpell exhibits the highest performance; (3) The primary reason
that WinSpell outperforms BaySpell is that WinSpell learns a better linear
separator; (4) When run on a test set drawn from a different corpus than the
training set was drawn from, WinSpell is better able than BaySpell to adapt,
using a strategy we will present that combines supervised learning on the
training set with unsupervised learning on the (noisy) test set.Comment: To appear in Machine Learning, Special Issue on Natural Language
Learning, 1999. 25 page
Routes for breaching and protecting genetic privacy
We are entering the era of ubiquitous genetic information for research,
clinical care, and personal curiosity. Sharing these datasets is vital for
rapid progress in understanding the genetic basis of human diseases. However,
one growing concern is the ability to protect the genetic privacy of the data
originators. Here, we technically map threats to genetic privacy and discuss
potential mitigation strategies for privacy-preserving dissemination of genetic
data.Comment: Draft for comment
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Learning from a Class Imbalanced Public Health Dataset: a Cost-based Comparison of Classifier Performance
Public health care systems routinely collect health-related data from the population. This data can be analyzed using data mining techniques to find novel, interesting patterns, which could help formulate effective public health policies and interventions. The occurrence of chronic illness is rare in the population and the effect of this class imbalance, on the performance of various classifiers was studied. The objective of this work is to identify the best classifiers for class imbalanced health datasets through a cost-based comparison of classifier performance. The popular, open-source data mining tool WEKA, was used to build a variety of core classifiers as well as classifier ensembles, to evaluate the classifiers’ performance. The unequal misclassification costs were represented in a cost matrix, and cost-benefit analysis was also performed.  In another experiment, various sampling methods such as under-sampling, over-sampling, and SMOTE was performed to balance the class distribution in the dataset, and the costs were compared. The Bayesian classifiers performed well with a high recall, low number of false negatives and were not affected by the class imbalance. Results confirm that total cost of Bayesian classifiers can be further reduced using cost-sensitive learning methods. Classifiers built using the random under-sampled dataset showed a dramatic drop in costs and high classification accuracy
- …