1,793 research outputs found

Combinatorial algorithm for counting small induced graphs and orbits

Author: Demšar Janez
Hočevar Tomaž
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 25/01/2016

Graphlet analysis is an approach to network analysis that is particularly popular in bioinformatics. We show how to set up a system of linear equations that relate the orbit counts and can be used in an algorithm that is significantly faster than the existing approaches based on direct enumeration of graphlets. The algorithm requires the existence of a vertex with certain properties; we show that such a vertex exists for graphlets of arbitrary size, except for complete graphs and C_4, which are treated separately. Empirical analysis of running time agrees with the theoretical results.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Attribute Interactions in Medical Data Analysis

Author: Bratko Ivan
Demšar Janez
Jakulin Aleks
Smrke Dragica
Zupan Blaz
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

There is much empirical evidence about the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes conditional independence between attributes. In classification, such classifiers sum up the pieces of class-related evidence from individual attributes, independently of other attributes. The performance, however, deteriorates significantly when the “interactions” between attributes become critical. We propose an approach to handling attribute interactions within the framework of “voting” classifiers, such as NBC. We propose an operational test for detecting interactions in learning data and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes; it may lead to improved classifier performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty.
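The "voting" view of naive Bayes mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code and does not include their interaction test; it only shows how each attribute contributes an independent log-probability "vote" that is summed per class. The function names (`train`, `votes`, `classify`), the Laplace smoothing, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count classes and per-attribute value frequencies (hypothetical helper)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    values = defaultdict(set)             # attribute index -> distinct values seen
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values

def votes(model, row, c):
    """One log-likelihood 'vote' per attribute, with Laplace smoothing (an assumption)."""
    class_counts, value_counts, values = model
    out = []
    for i, v in enumerate(row):
        num = value_counts[(i, c)][v] + 1
        den = class_counts[c] + len(values[i])
        out.append(math.log(num / den))
    return out

def classify(model, row):
    """Pick the class with the largest log prior plus summed attribute votes."""
    class_counts, _, _ = model
    n = sum(class_counts.values())
    def score(c):
        return math.log(class_counts[c] / n) + sum(votes(model, row, c))
    return max(class_counts, key=score)

# Toy data, invented purely for illustration.
rows = [("yes", "old"), ("yes", "young"), ("no", "old"), ("no", "young")]
labels = ["bad", "bad", "good", "good"]
model = train(rows, labels)
prediction = classify(model, ("yes", "old"))
```

Because each vote is computed from a single attribute, two attributes that are informative only in combination cannot be captured by this sum, which is the situation the abstract refers to as an interaction.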

CiteSeerX

Crossref

ePrints.FRI

Seed selection for information cascade in multilayer networks

Author: E Omodei
F Erlandsson
J Demšar
M Kitsak
M Salehi
MJ Zaki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/10/2017

Information spreading is an interesting field in the domain of online social media. In this work, we investigate how different seed selection strategies affect spreading processes simulated with the independent cascade model on eighteen multilayer social networks. Fifteen networks are built from user interaction data extracted from Facebook public pages, and three are multilayer networks downloaded from a public repository (two of them being Twitter networks). The results indicate that various state-of-the-art seed selection strategies for single-layer networks, such as K-Shell or VoteRank, do not perform as well on multilayer networks and are outperformed by Degree Centrality.
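As a rough illustration of the two ingredients named in this abstract, the sketch below picks seeds by degree and runs one independent cascade simulation. It is not the authors' implementation: it assumes a single-layer, undirected graph stored as an adjacency dictionary and a uniform activation probability `p`, whereas the paper works with multilayer networks. The names `degree_seeds` and `independent_cascade` are hypothetical.

```python
import random

def degree_seeds(adj, k):
    """Return the k highest-degree nodes (ties broken by node id)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]

def independent_cascade(adj, seeds, p, rng):
    """Simulate one cascade: each newly activated node gets a single
    chance to activate each inactive neighbour with probability p."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

# Toy undirected graph, invented for illustration.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
seeds = degree_seeds(adj, 1)
spread = independent_cascade(adj, seeds, 0.5, random.Random(0))
```

In practice the spread of a seed set would be estimated by averaging the size of `spread` over many simulations with different random seeds.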

arXiv.org e-Print Archive

Blekinge Institute of Technology

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

GenePath: a System for Automated Construction of Genetic Networks from Mutant Data

Author: Bratko Ivan
Demšar Janez
Halter John
Juvan Peter
Kuspa Adam
Shaulsky Gad
Zupan Blaz
Publication date: 01/01/2003

Motivation: Genetic pathways are often used in the analysis of biological phenomena. In classical genetics, they are constructed manually from experimental data on mutants. The field lacks formalism to guide such analysis, and accounting for all the data becomes complicated when large amounts of data are considered. Results: We have developed GenePath, an intelligent assistant that mimics expert geneticists in the analysis of genetic data. GenePath employs expert-defined patterns to uncover gene relations from the data, and uses these relations as constraints that guide the search for a plausible genetic network. GenePath provides formalism to genetic data analysis, facilitates the consideration of all the available data in a consistent and systematic manner, and aids in the examination of the large number of possible consequences of a planned experiment. It also provides an explanation mechanism that traces back every finding to the pertinent data. GenePath was successfully tested on several genetic problems. Availability: GenePath can be accessed at http://genepath.org. Supplementary information: Supplementary material is available at http://genepath.org/bi-supp

ePrints.FRI

Web-enabled knowledge-based analysis of genetic data

Author: Bratko Ivan
Demšar Janez
Halter John A.
Juvan Peter
Kuspa Adam
Shaulsky Gad
Zupan Blaz
Publication venue: Springer-Verlag Heidelberg
Publication date: 01/01/2001

We present a web-based implementation of GenePath, an intelligent assistant tool for data analysis in functional genomics. GenePath considers mutant data and uses expert-defined patterns to find gene-to-gene or gene-to-outcome relations. It presents the results of analysis as genetic networks, wherein a set of genes have various influences on one another and on a biological outcome. In the paper, we particularly focus on its web-based interface and explanation mechanisms.

CiteSeerX

Crossref

ePrints.FRI

Sequential Symbolic Regression with Genetic Programming

Author: D White
GY Lee
J Demšar
JA Walker
L Vanneschi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested on eight polynomial functions and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and shows no statistically significant difference from GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators.

Crossref

Kent Academic Repository

Competitive funding of research activity (Tekmovalno financiranje raziskovalne dejavnosti)

Author: Bervar Aleš
Demšar Franci
Publication date: 15/10/2013

Repository of University of Primorska

Automatic detection of potentially illegal online sales of elephant ivory via data mining

Author: Broad
Clark
Demšar
Haken
McNeill
O’Brien
Roe
Roe
Shepherd
Van Balen
Wittemyer
Publication venue: 'PeerJ'
Publication date: 01/07/2015

In this work, we developed an automated system to detect potentially illegal elephant ivory items for sale on eBay. Two law enforcement experts, with specific knowledge of elephant ivory identification, manually classified items on sale in the Antiques section of eBay UK over an 8-week period. This set the “Gold Standard” that we aim to emulate using data mining. We achieved close to 93% accuracy with less data than the experts, as we relied entirely on metadata and did not employ item descriptions or associated images, thus proving the potential and generality of our approach. The reported accuracy may be improved with the addition of text mining techniques for the analysis of the item description, and by applying image classification for the detection of Schreger lines, indicative of elephant ivory. However, any solution relying on images or text description could not be employed on other illegal wildlife markets where pictures can be missing or misleading and text absent (e.g., Instagram). In our setting, we gave human experts all available information while only using minimal information for our analysis. Despite this, we succeeded in achieving a very high accuracy. This work is an important first step in speeding up the laborious, tedious and expensive task of expert discovery of illegal trade over the internet. It will also allow for faster reporting to law enforcement and better accountability. We hope this will also contribute to reducing poaching, by making this illegal trade harder and riskier for those involved.

Crossref

Directory of Open Access Journals

Kent Academic Repository

Machine learning for content based image retrieving

Author: Demšar Janez
Solina Franc

ePrints.FRI