
34,129 research outputs found

Population-aware Hierarchical Bayesian Domain Adaptation

Author: Chunara Rumi
Mhasawade Vishwali
Rehman Nabeel Abdur
Publication date: 20/11/2018

Population attributes are essential in health for understanding who the data represents and for precision medicine efforts. Even within disease infection labels, patients can exhibit significant variability; "fever" may mean something different when reported in a doctor's office versus from an online app, precluding directly learning across different datasets for the same prediction task. This problem falls into the domain adaptation paradigm. However, research in this area has to date not considered who generates the data; symptoms reported by a woman versus a man, for example, could also have different implications. We propose a novel population-aware domain adaptation approach by formulating the domain adaptation task as a multi-source hierarchical Bayesian framework. The model improves prediction in the case of largely unlabelled target data by harnessing both domain and population invariant information. Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.0721

arXiv.org e-Print Archive

Adaptation of WASH Services Delivery to Climate Change and Other Sources of Risk and Uncertainty

Author: A. J. James
Charles Batchelor
Stef Smits
Publication venue: IRC International Water and Sanitation Center
Publication date: 07/07/2011

This report urges WASH sector practitioners to take more seriously the threat of climate change and the consequences it could have on their work. By considering climate change within a risk and uncertainty framework, the field can use the multitude of approaches laid out here to adequately protect itself against a range of direct and indirect impacts. Eleven methods and tools for this specific type of risk management are described, including practical advice on how to implement them successfully.

Hybrid Recommender Systems: A Systematic Literature Review

Author: Morisio Maurizio
Çano Erion
Publication venue: 'IOS Press'
Publication date: 12/01/2019

Recommender systems are software tools used to generate and provide suggestions for items and other entities to the users by exploiting various strategies. Hybrid recommender systems combine two or more recommendation strategies in different ways to benefit from their complementary advantages. This systematic literature review presents the state of the art in hybrid recommender systems of the last decade. It is the first quantitative review work completely focused on hybrid recommenders. We address the most relevant problems considered and present the associated data mining and recommendation techniques used to overcome them. We also explore the hybridization classes each hybrid recommender belongs to, the application domains, the evaluation process and proposed future research directions. Based on our findings, most of the studies combine collaborative filtering with another technique, often in a weighted way. Also, cold-start and data sparsity are the two traditional and top problems, addressed in 23 and 22 studies respectively, while movies and movie datasets are still widely used by most of the authors. As most of the studies are evaluated by comparisons with similar methods using accuracy metrics, providing more credible and user-oriented evaluations remains a typical challenge. Besides this, newer challenges were also identified, such as responding to the variation of user context, evolving user tastes or providing cross-domain recommendations. Being a hot topic, hybrid recommenders represent a good basis with which to respond accordingly by exploring newer opportunities such as contextualizing recommendations, involving parallel hybrid algorithms, processing larger datasets, etc. Comment: 38 pages, 9 figures, 14 tables. The final authenticated version is available online at https://content.iospress.com/articles/intelligent-data-analysis/ida16320

arXiv.org e-Print Archive

"In vivo" spam filtering: A challenge problem for data mining

Author: Fawcett Tom
Publication date: 01/01/2003

Spam, also known as Unsolicited Commercial Email (UCE), is the bane of email communication. Many data mining researchers have addressed the problem of detecting spam, generally by treating it as a static text classification problem. True in vivo spam filtering has characteristics that make it a rich and challenging domain for data mining. Indeed, real-world datasets with these characteristics are typically difficult to acquire and to share. This paper demonstrates some of these characteristics and argues that researchers should pursue in vivo spam filtering as an accessible domain for investigating them.

arXiv.org e-Print Archive

CiteSeerX

A decision support methodology to enhance the competitiveness of the Turkish automotive industry

Author: Aktaş Emel
Kabak Özgür
Önsel Şule
Özaydın Özay
Ülengin Füsun
Publication venue: 'Elsevier BV'
Publication date: 01/05/2014

This is the post-print (final draft post-refereeing) version of the article. Copyright © 2013 Elsevier B.V. All rights reserved. Three levels of competitiveness affect the success of business enterprises in a globally competitive environment: the competitiveness of the company, the competitiveness of the industry in which the company operates and the competitiveness of the country where the business is located. This study analyses the competitiveness of the automotive industry in association with the national competitiveness perspective using a methodology based on Bayesian Causal Networks. First, we structure the competitiveness problem of the automotive industry through a synthesis of expert knowledge in the light of the World Economic Forum’s competitiveness indicators. Second, we model the relationships among the variables identified in the problem structuring stage and analyse these relationships using a Bayesian Causal Network. Third, we develop policy suggestions under various scenarios to enhance the national competitive advantages of the automotive industry. We present an analysis of the Turkish automotive industry as a case study. It is possible to generalise the policy suggestions developed for the case of the Turkish automotive industry to the automotive industries in other developing countries where country and industry competitiveness levels are similar to those of Turkey.

CiteSeerX

Dogus University Institutional Repository

Brunel University Research Archive

The Application of Data Mining to Build Classification Model for Predicting Graduate Employment

Author: Jantawan Bangsuk
Tsai Cheng-Fa
Publication date: 26/12/2013

Data mining has been applied in various areas because of its ability to rapidly analyze vast amounts of data. This study builds the Graduates Employment Model using a classification task in data mining and compares several data-mining approaches, such as the Bayesian method and the Tree method. The Bayesian method includes 5 algorithms: AODE, BayesNet, HNB, NaiveBayes, and WAODE. The Tree method includes 5 algorithms: BFTree, NBTree, REPTree, ID3, and C4.5. The experiment uses a classification task in WEKA, and we compare the results of each algorithm, where several classification models were generated. To validate the generated model, the experiments were conducted using real data collected from graduate profiles at Maejo University in Thailand. The model is intended to be used for predicting whether a graduate was employed, unemployed, or in an undetermined situation.

arXiv.org e-Print Archive

The future of statistical disclosure control

Author: Domingo-Ferrer Josep
Elliot Mark
Publication date: 21/12/2018

Statistical disclosure control (SDC) was not created in a single seminal paper nor following the invention of a new mathematical technique; rather, it developed slowly in response to the practical challenges faced by data practitioners based at national statistical institutes (NSIs). SDC's subsequent emergence as a specialised academic field was an outcome of three interrelated socio-technical changes: (i) the advent of accessible computing as a research tool in the 1980s meant that it became possible - and then increasingly easy - for researchers to process larger quantities of data automatically; this naturally increased demand for such data; (ii) it became possible for data holders to process and disseminate detailed data as digital files; and (iii) the number of organisations holding data about individuals proliferated. This also meant the number of potential adversaries with the resources to attack any given dataset increased exponentially. In this article, we describe the state of the art for SDC and then discuss the core issues and future challenges. In particular, we touch on SDC and big data, on SDC and machine learning, and on SDC and anti-discrimination. Comment: A contributing article to the National Statistician's Quality Review into Privacy and Data Confidentiality Method

arXiv.org e-Print Archive

Uncertainty Aware AI ML: Why and How

Author: Cerutti Federico
Kaplan Lance
Preece Alun
Sensoy Murat
Sullivan Paul
Publication date: 20/09/2018

This paper argues the need for research to realize uncertainty-aware artificial intelligence and machine learning (AI&ML) systems for decision support by describing a number of motivating scenarios. Furthermore, the paper defines uncertainty-awareness and lays out the challenges along with surveying some promising research directions. A theoretical demonstration illustrates how two emerging uncertainty-aware ML and AI technologies could be integrated and be of value for a route planning operation. Comment: Presented at AAAI FSS-18: Artificial Intelligence in Government and Public Sector, Arlington, Virginia, US

arXiv.org e-Print Archive

How scientific research changes the Vietnamese higher education landscape: Evidence from social sciences and humanities between 2008 and 2019

Author: Dau The-Tung
Ho Manh-Toan
Nguyen Thanh-Hung
Nguyen Thi-Huyen-Trang
Nguyen Thi-Song-Ha
Tran Trung
Publication date: 01/01/2020

Background: In the context of globalization, Vietnamese universities, whose primary function is teaching, need to improve research performance. Methods: Based on SSHPA data, an exclusive database of Vietnamese social sciences and humanities researchers’ productivity, for the 2008-2019 period, this study analyzes the research output of Vietnamese universities in the field of social sciences and humanities. Results: Vietnamese universities have been steadily producing a high volume of publications in the 2008-2019 period, with a peak of 598 articles in 2019. Moreover, many private universities and institutions are also joining the publication race, pushing competitiveness in the country. Conclusions: Solutions to improve both quantity and quality of Vietnamese universities’ research practice in the context of the industrial revolution 4.0 could be applying international criteria in Vietnamese higher education, developing scientific and critical thinking for general and STEM education, and promoting science communication.

PhilPapers

Predictive Situation Awareness for Ebola Virus Disease using a Collective Intelligence Multi-Model Integration Platform: Bayes Cloud

Author: Ha Jubyung
Matsumoto Shou
Park Cheol Young
Park YoungWon
Publication date: 04/05/2019

Humanity has been facing a plethora of challenges associated with infectious diseases, which kill more than 6 million people a year. Although continuous efforts have been applied to relieve the potential damages from such unfortunate events, it is unquestionable that there are many persisting challenges yet to overcome. One related issue we particularly address here is the assessment and prediction of such epidemics. In this field of study, traditional and ad-hoc models frequently fail to provide proper predictive situation awareness (PSAW), characterized by understanding the current situation and predicting future situations. Comprehensive PSAW for infectious disease can support decision making and help to hinder disease spread. In this paper, we develop a computing system platform focusing on collective intelligence causal modeling, in order to support PSAW in the domain of infectious disease. Analyses of global epidemics require integration of multiple different data and models, which can originate from multiple independent researchers. These models should be integrated to accurately assess and predict the infectious disease from a holistic view. The system shall provide three main functions: (1) collaborative causal modeling, (2) causal model integration, and (3) causal model reasoning. These functions are supported by subject-matter experts and artificial intelligence (AI), with uncertainty treatment. Subject-matter experts, as collective intelligence, develop causal models and integrate them as one joint causal model. The integrated causal model shall be used to reason about: (1) the past, regarding how the causal factors have occurred; (2) the present, regarding how the spread is going now; and (3) the future, regarding how it will proceed. Finally, we introduce one use case of predictive situation awareness for the Ebola virus disease.

arXiv.org e-Print Archive