Phytoplankton Hotspot Prediction With an Unsupervised Spatial Community Model

Author: Dudek Gregory
Girdhar Yogesh
Kalmbach Arnold
Sosik Heidi M.
Publication date: 21/03/2017

Many interesting natural phenomena are sparsely distributed and discrete. Locating the hotspots of such sparsely distributed phenomena is often difficult because their density gradient is likely to be very noisy. We present a novel approach to this search problem, where we model the co-occurrence relations between a robot's observations with a Bayesian nonparametric topic model. This approach makes it possible to produce a robust estimate of the spatial distribution of the target, even in the absence of direct target observations. We apply the proposed approach to the problem of finding the spatial locations of the hotspots of a specific phytoplankton taxon in the ocean. We use classified image data from Imaging FlowCytobot (IFCB), which automatically measures individual microscopic cells and colonies of cells. Given these individual taxon-specific observations, we learn a phytoplankton community model that characterizes the co-occurrence relations between taxa. We present experiments with simulated robot missions drawn from real observation data collected during a research cruise traversing the US Atlantic coast. Our results show that the proposed approach outperforms nearest neighbor and k-means based methods for predicting the spatial distribution of hotspots from in-situ observations. Comment: To appear in ICRA 2017, Singapore.

arXiv.org e-Print Archive

Crossref

Transcription Factor-DNA Binding Via Machine Learning Ensembles

Author: DeLisi Charles
Fan Yue
Kon Mark
Publication date: 09/05/2018

We present ensemble methods in a machine learning (ML) framework combining predictions from five known motif/binding site exploration algorithms. For a given TF the ensemble starts with position weight matrices (PWMs) for the motif, collected from the component algorithms. Using dimension reduction, we identify significant PWM-based subspaces for analysis. Within each subspace a machine classifier is built for identifying the TF's gene (promoter) targets (Problem 1). These PWM-based subspaces form an ML-based sequence analysis tool. Problem 2 (finding binding motifs) is solved by agglomerating k-mer (string) feature PWM-based subspaces that stand out in identifying gene targets. We approach Problem 3 (binding sites) with a novel machine learning approach that uses promoter string features and ML importance scores in a classification algorithm locating binding sites across the genome. For target gene identification this method improves performance (measured by the F1 score) by about 10 percentage points over (a) the motif scanning method and (b) the coexpression-based association method. Top motif outperformed 5 component algorithms as well as two other common algorithms (BEST and DEME). For identifying individual binding sites on a benchmark cross species database (Tompa et al., 2005) we match the best performer without much human intervention. It also improved the performance on mammalian TFs. The ensemble can integrate orthogonal information from different weak learners (potentially using entirely different types of features) into a machine learner that can perform consistently better for more TFs. The TF gene target identification component (Problem 1 above) is useful in constructing a transcriptional regulatory network from known TF-target associations. The ensemble is easily extendable to include more tools as well as future PWM-based information. Comment: 33 pages.
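The abstract above reports target-gene identification quality as an F1 score. As a reminder of what that metric measures, here is a minimal sketch that computes F1 from a predicted and a known set of target genes. The function name and the gene identifiers are hypothetical and are not taken from the paper.

```python
def f1_score(predicted, actual):
    """Harmonic mean of precision and recall for two sets of gene identifiers."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        # No overlap (or an empty set): precision and recall are both zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gene identifiers, for illustration only.
predicted_targets = {"g1", "g2", "g3", "g4"}
known_targets = {"g1", "g2", "g5"}
score = f1_score(predicted_targets, known_targets)
```

An improvement of "about 10 percentage points" in this metric would correspond to a change such as 0.57 to 0.67 on the scale returned here.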

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Four dimensions characterize comprehensive trait judgments of faces

Author: Adolphs Ralph
Keleş Ümit
Lin Chujun
Publication date: 02/10/2019

People readily attribute many traits to faces: some look beautiful, some competent, some aggressive. These snap judgments have important consequences in real life, ranging from success in political elections to decisions in courtroom sentencing. Modern psychological theories argue that the hundreds of different words people use to describe others from their faces are well captured by only two or three dimensions, such as valence and dominance, a highly influential framework that has been the basis for numerous studies in social and developmental psychology, social neuroscience, and in engineering applications. However, all prior work has used only a small number of words (12 to 18) to derive underlying dimensions, limiting conclusions to date. Here we employed deep neural networks to select a comprehensive set of 100 words that are representative of the trait words people use to describe faces, and to select a set of 100 faces. In two large-scale, preregistered studies we asked participants to rate the 100 faces on the 100 words (obtaining 2,850,000 ratings from 1,710 participants), and discovered a novel set of four psychological dimensions that best explain trait judgments of faces: warmth, competence, femininity, and youth. We reproduced these four dimensions across different regions around the world, in both aggregated and individual-level data. These results provide a new and most comprehensive characterization of face judgments, and reconcile prior work on face perception with work in social cognition and personality psychology.

Directory of Open Access Journals

PubMed Central

Caltech Authors

Determining citizens’ opinions about stories in the news media: analysing Google, Facebook and Twitter

Author: Fernandez Miriam
Geana Ruxandra
Sizov Sergej
Taylor Steve
Walland Paul
Wandhöfer Timo
Weichselbaum Robert
Publication date: 01/01/2012

We describe a method whereby a governmental policy maker can discover citizens’ reaction to news stories. This is particularly relevant in the political world, where governments’ policy statements are reported by the news media and discussed by citizens. The work here addresses two main questions: whereabouts are citizens discussing a news story, and what are they saying? Our strategy to answer the first question is to find news articles pertaining to the policy statements, then perform internet searches for references to the news articles’ headlines and URLs. We have created a software tool that schedules repeating Google searches for the news articles and collects the results in a database, enabling the user to aggregate and analyse them to produce ranked tables of sites that reference the news articles. Using data mining techniques we can analyse data so that the resultant ranking reflects an overall aggregate score, taking into account multiple datasets, and this shows the most relevant places on the internet where the story is discussed. To answer the second question, we introduce the WeGov toolbox as a tool for analysing citizens’ comments and behaviour pertaining to news stories. We first use the tool for identifying social network discussions, using different strategies for Facebook and Twitter. We apply different analysis components to analyse the data to distil the essence of the social network users’ comments, to determine influential users and identify important comments.

Open Research Online (The Open University)

SLIM : Scalable Linkage of Mobility Data

Author: Atluri Gowtham
Basik Fuat
Corless Robert
E.
Goga Oana
Kieu Tung
Reynolds Douglas A.
Sharma Vishal
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2020

We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. In this paper, we first propose a mobility based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. In the experimental evaluation, SLIM outperforms the two existing state-of-the-art approaches in terms of precision and recall. Moreover, the LSH-based approach brings two to four orders of magnitude speedup.
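To illustrate how locality-sensitive hashing can shrink the set of candidate pairs before matching, the following is a generic MinHash-with-banding sketch. It is not the SLIM algorithm. The representation of each entity as a set of "cell@time-window" tokens, the function names, and the parameter values are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature of a non-empty token set (one minimum per seed)."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def lsh_candidate_pairs(entities, num_hashes=32, bands=8):
    """Return entity pairs that share at least one LSH band bucket.

    `entities` maps an entity name to a non-empty set of tokens
    (assumed here to be "cell@time-window" strings).
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for name, tokens in entities.items():
        signature = minhash_signature(tokens, num_hashes)
        for band in range(bands):
            key = (band, tuple(signature[band * rows:(band + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for members in buckets.values():
        # Only entities that collide in some band are compared later.
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Hypothetical data: "a" and "b" visit the same cells in the same time windows.
entities = {
    "a": {"c1@t1", "c2@t2", "c3@t3"},
    "b": {"c1@t1", "c2@t2", "c3@t3"},
    "c": {"c9@t7", "c8@t5", "c7@t4"},
}
candidates = lsh_candidate_pairs(entities)
```

Only pairs that land in a common bucket are passed to the more expensive similarity computation, which is where the reduction in work comes from. Fewer rows per band make the filter more permissive, and more rows make it stricter.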

arXiv.org e-Print Archive

Crossref

Bilkent University Institutional Repository

Warwick Research Archives Portal Repository

Point triangulation through polyhedron collapse using the l∞ norm

Author: Donné Simon
Goossens Bart
Philips Wilfried
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

Multi-camera triangulation of feature points based on a minimisation of the overall l2 reprojection error can get stuck in suboptimal local minima or require slow global optimisation. For this reason, researchers have proposed optimising the l∞ norm of the l2 single view reprojection errors, which avoids the problem of local minima entirely. In this paper we present a novel method for l∞ triangulation that minimises the l∞ norm of the l∞ reprojection errors: this apparently small difference leads to a much faster but equally accurate solution which is related to the MLE under the assumption of uniform noise. The proposed method adopts a new optimisation strategy based on solving simple quadratic equations. This stands in contrast with the fastest existing methods, which solve a sequence of more complex auxiliary Linear Programming or Second Order Cone Problems. The proposed algorithm performs well: for triangulation, it achieves the same accuracy as existing techniques while executing faster and being straightforward to implement.
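The three objectives contrasted in this abstract can be written side by side. The notation is an assumption made for illustration and is not taken from the paper: $X$ is the 3D point, $\pi_i$ is the projection into view $i$, $x_i$ is the observed image point, and $e_i(X) = \pi_i(X) - x_i \in \mathbb{R}^2$ is the per-view reprojection residual. The first line uses the common sum-of-squares form of the overall l2 error.

```latex
\begin{align*}
\text{overall } \ell_2 \text{ error:} \quad
  & \min_{X} \; \sum_{i} \lVert e_i(X) \rVert_2^2 \\
\ell_\infty \text{ norm of } \ell_2 \text{ errors:} \quad
  & \min_{X} \; \max_{i} \, \lVert e_i(X) \rVert_2 \\
\ell_\infty \text{ norm of } \ell_\infty \text{ errors:} \quad
  & \min_{X} \; \max_{i} \, \lVert e_i(X) \rVert_\infty
\end{align*}
```

In the last form, $\lVert e_i(X) \rVert_\infty$ is the larger of the absolute horizontal and vertical residuals in view $i$, so the objective bounds every image coordinate error by the same value.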

Crossref

Ghent University Academic Bibliography