Search CORE

30 research outputs found

Temporal cascade model for analyzing spread in evolving networks

Author: Demirci Gunduz Vehbi
Ferhatosmanoglu Hakan
Haldar Aparajita
Oakley Joe
Wang Shuang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/01/2023

Current approaches for modeling propagation in networks (e.g., of diseases, computer viruses, rumors) cannot adequately capture temporal properties such as order/duration of evolving connections or dynamic likelihoods of propagation along connections. Temporal models on evolving networks are crucial in applications that need to analyze dynamic spread. For example, a disease-spreading virus has varying transmissibility based on interactions between individuals occurring with different frequency, proximity, and venue population density. Similarly, propagation of information having a limited active period, such as rumors, depends on the temporal dynamics of social interactions. To capture such behaviors, we first develop the Temporal Independent Cascade (T-IC) model with a spread function that efficiently utilizes a hypergraph-based sampling strategy and dynamic propagation probabilities. We prove this function to be submodular, with guarantees of approximation quality. This enables scalable analysis on highly granular temporal networks where other models struggle, such as when the spread across connections exhibits arbitrary temporally evolving patterns. We then introduce the notion of ‘reverse spread’ using the proposed T-IC processes, and develop novel solutions to identify both sentinel/detector nodes and highly susceptible nodes. Extensive analysis on real-world datasets shows that the proposed approach significantly outperforms the alternatives in modeling both if and how spread occurs, by considering evolving network topology alongside granular contact/interaction information. Our approach has numerous applications, such as virus/rumor/influence tracking. Utilizing T-IC, we explore vital challenges of monitoring the impact of various intervention strategies over real spatio-temporal contact networks, where we show our approach to be highly effective.
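As a rough illustration of the kind of process the abstract above describes, the following Python sketch simulates a plain independent-cascade spread over time-stamped contacts, each carrying its own propagation probability. It is not the authors' T-IC model or their hypergraph-based sampling strategy: the function names, the contact tuple layout `(time, source, target, probability)`, and the Monte Carlo spread estimate are all assumptions made for illustration.

```python
import random


def simulate_temporal_cascade(contacts, seeds, rng):
    """Run one cascade over time-stamped contacts (hypothetical layout).

    contacts: iterable of (time, source, target, probability) tuples.
    seeds: iterable of nodes that are active before the first contact.
    rng: a random.Random instance, passed in for reproducibility.
    Returns the set of nodes active after every contact has been processed.
    """
    active = set(seeds)
    # Process contacts in chronological order so that a node can only
    # pass the spread on through contacts that occur after it became active.
    for _time, source, target, prob in sorted(contacts, key=lambda c: c[0]):
        if source in active and target not in active and rng.random() < prob:
            active.add(target)
    return active


def estimate_spread(contacts, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(simulate_temporal_cascade(contacts, seeds, rng))
    return total / runs
```

In this simplified sketch every contact is an independent transmission attempt, so the ordering of contacts matters: a contact from `b` to `c` that happens before `b` is reached cannot spread anything.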

Warwick Research Archives Portal Repository

Changing the focus: worker-centric optimization in human-in-the-loop computations

Author: Esfandiari Mohammadreza
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2020

A myriad of emerging applications, from simple to complex, involve human cognizance in the computation loop. Using the wisdom of human workers, researchers have solved a variety of problems, ranging from “micro-tasks” such as CAPTCHA recognition, sentiment analysis, image categorization, and query processing to “complex tasks” that are often collaborative, such as classifying craters on planetary surfaces, discovering new galaxies (Galaxyzoo), and performing text translation. The current view of “humans-in-the-loop” tends to see humans as machines, robots, or low-level agents used or exploited in the service of broader computation goals. This dissertation is developed to shift the focus back to humans, and study different data analytics problems, by recognizing characteristics of the human workers, and how to incorporate those in a principled fashion inside the computation loop. The first contribution of this dissertation is to propose an optimization framework and a real-world system to personalize worker’s behavior by developing a worker model and using that to better understand and estimate task completion time. The framework judiciously frames questions and solicits worker feedback on those to update the worker model. Next, improving workers’ skills through peer interaction during collaborative task completion is studied. A suite of optimization problems is identified in that context, considering collaborativeness between the members, as it plays a major role in peer learning. Finally, a “diversified” sequence of work sessions for human workers is designed to improve worker satisfaction and engagement while completing tasks.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Generalised Bayesian matrix factorisation models

Author: Mohamed Shakir
Publication venue: University of Cambridge
Publication date: 15/03/2011

Factor analysis and related models for probabilistic matrix factorisation are of central importance to the unsupervised analysis of data, with a colourful history more than a century long. Probabilistic models for matrix factorisation allow us to explore the underlying structure in data, and have relevance in a vast number of application areas including collaborative filtering, source separation, missing data imputation, gene expression analysis, information retrieval, computational finance and computer vision, amongst others. This thesis develops generalisations of matrix factorisation models that advance our understanding and enhance the applicability of this important class of models. The generalisation of models for matrix factorisation focuses on three concerns: widening the applicability of latent variable models to the diverse types of data that are currently available; considering alternative structural forms in the underlying representations that are inferred; and including higher order data structures into the matrix factorisation framework. These three issues reflect the reality of modern data analysis and we develop new models that allow for a principled exploration and use of data in these settings. We place emphasis on Bayesian approaches to learning and the advantages that come with the Bayesian methodology. Our port of departure is a generalisation of latent variable models to members of the exponential family of distributions. This generalisation allows for the analysis of data that may be real-valued, binary, counts, non-negative or a heterogeneous set of these data types. The model unifies various existing models and constructs for unsupervised settings, the complementary framework to the generalised linear models in regression. Moving to structural considerations, we develop Bayesian methods for learning sparse latent representations. 
We define ideas of weakly and strongly sparse vectors and investigate the classes of prior distributions that give rise to these forms of sparsity, namely the scale-mixture of Gaussians and the spike-and-slab distribution. Based on these sparsity favouring priors, we develop and compare methods for sparse matrix factorisation and present the first comparison of these sparse learning approaches. As a second structural consideration, we develop models with the ability to generate correlated binary vectors. Moment-matching is used to allow binary data with specified correlation to be generated, based on dichotomisation of the Gaussian distribution. We then develop a novel and simple method for binary PCA based on Gaussian dichotomisation. The third generalisation considers the extension of matrix factorisation models to multi-dimensional arrays of data that are increasingly prevalent. We develop the first Bayesian model for non-negative tensor factorisation and explore the relationship between this model and the previously described models for matrix factorisation. Supported by a Commonwealth Scholarship awarded by the Commonwealth Scholarship and Fellowship Programme (CSFP) [Award number ZACS-2207-363]. Supported by award from the National Research Foundation, South Africa (NRF) [Award number SFH2007072200001

Apollo (Cambridge)

Bottom-up Object Segmentation for Visual Recognition

Author: Carreira João Luís da Silva
Publication venue: Universitäts- und Landesbibliothek Bonn

Automatic recognition and segmentation of objects in images is a central open problem in computer vision. Most previous approaches have pursued either sliding-window object detection or dense classification of overlapping local image patches. In contrast, the framework introduced in this thesis attempts to identify the spatial extent of objects prior to recognition, using bottom-up computational processes and mid-level selection cues. After a set of plausible object hypotheses is identified, a sequential recognition process is executed, based on continuous estimates of the spatial overlap between the image segment hypotheses and each putative class. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge of the properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. It is shown that CPMC significantly outperforms the state of the art for low-level segmentation in the PASCAL VOC 2009 and 2010 datasets. Results beyond the current state of the art for image classification, object detection and semantic segmentation are also demonstrated on a number of challenging datasets including Caltech-101, ETHZ-Shape as well as PASCAL VOC 2009-11. These results suggest that a greater emphasis on grouping and image organization may be valuable for making progress in high-level tasks such as object recognition and scene understanding.

bonndoc – Der Publikationsserver der Universität Bonn

Biomedical Data Analysis with Prior Knowledge : Modeling and Learning

Author: Lou Xinghua
Publication date: 01/01/2011

Modern research in biology and medicine is experiencing a data explosion in quantity and particularly in complexity. Efficient and accurate processing of these datasets demands state-of-the-art computational methods such as probabilistic graphical models, graph-based image analysis and many inference/optimization algorithms. However, the underlying complexity of biomedical experiments rules out direct out-of-the-box applications of these methods and requires novel formulation and enhancement to make them amenable to specific problems. This thesis explores novel approaches for incorporating prior knowledge into the data analysis workflow that leads to quantitative and meaningful interpretation of the datasets and also allows for sufficient user involvement. As discussed in Chapter 1, depending on the complexity of the prior knowledge, these approaches can be categorized as constrained modeling and learning. The first part of the thesis focuses on constrained modeling where the prior is normally explicitly represented as additional potential terms in the problem formulation. These terms prevent or discourage the downstream optimization of the formulation from yielding solutions that contradict the prior knowledge. In Chapter 2, we present a robust method for estimating and tracking the deuterium incorporation in the time-resolved hydrogen exchange (HX) mass spectrometry (MS) experiments with priors such as sparsity and sequential ordering. In Chapter 3, we introduce how to extend a classic Markov random field (MRF) model with a shape prior for cell nucleus segmentation. The second part of the thesis explores learning which addresses problems where the prior varies between different datasets or is too difficult to express explicitly. In this case, the prior is first abstracted as a parametric model and then its optimum parametrization is estimated from a training set using machine learning techniques. 
In Chapter 4, we extend the popular Rand Index in a cost-sensitive fashion and the problem-specific costs can be learned from manual scorings. This set of approaches becomes more interesting when the input/output becomes structured, such as matrices or graphs. In Chapter 5, we present structured learning for cell tracking, a novel approach that learns optimum parameters automatically from a training set and allows for the use of a richer set of features which in turn affords improved tracking performance. Finally, conclusions and outlook are provided in Chapter 6.
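For readers unfamiliar with the Rand Index mentioned in the abstract above, the following Python sketch computes the standard, unweighted index between two labelings of the same items. It is only the textbook definition (the fraction of item pairs on which the two labelings agree about "same group" versus "different group"). The thesis's cost-sensitive extension and its learned costs are not reproduced here, and the function name `rand_index` is an assumption.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Standard Rand Index between two labelings of the same items.

    Counts the pairs of items that both labelings treat consistently
    (grouped together in both, or separated in both) and divides by
    the total number of pairs.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        # With fewer than two items there is nothing to disagree about.
        return 1.0
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

Because only pairwise co-membership is compared, the index is unaffected by renaming the groups: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` treats the two labelings as identical.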

Heidelberger Dokumentenserver

A Markov Random Field Based Approach to 3D Mosaicing and Registration Applied to Ultrasound Simulation

Author: Kutarnia Jason Francis
Publication venue: Digital WPI
Publication date: 27/08/2014

A novel Markov Random Field (MRF) based method for the mosaicing of 3D ultrasound volumes is presented in this dissertation. The motivation for this work is the production of training volumes for an affordable ultrasound simulator, which offers a low-cost/portable training solution for new users of diagnostic ultrasound, by providing the scanning experience essential for developing the necessary psycho-motor skills. It also has the potential for introducing ultrasound instruction into medical education curriculums. The interest in ultrasound training stems in part from the widespread adoption of point-of-care scanners, i.e. low cost portable ultrasound scanning systems in the medical community. This work develops a novel approach for producing 3D composite image volumes and validates the approach using clinically acquired fetal images from the obstetrics department at the University of Massachusetts Medical School (UMMS). Results using the Visible Human Female dataset as well as an abdominal trauma phantom are also presented. The process is broken down into five distinct steps, which include individual 3D volume acquisition, rigid registration, calculation of a mosaicing function, group-wise non-rigid registration, and finally blending. Each of these steps, common in medical image processing, has been investigated in the context of ultrasound mosaicing and has resulted in improved algorithms. Rigid and non-rigid registration methods are analyzed in a probabilistic framework and their sensitivity to ultrasound shadowing artifacts is studied. The group-wise non-rigid registration problem is initially formulated as a maximum likelihood estimation, where the joint probability density function is comprised of the partially overlapping ultrasound image volumes. This expression is simplified using a block-matching methodology and the resulting discrete registration energy is shown to be equivalent to a Markov Random Field. 
Graph-based methods common in computer vision are then used for optimization, resulting in a set of transformations that bring the overlapping volumes into alignment. This optimization is parallelized using a fusion approach, where the registration problem is divided into 8 independent sub-problems whose solutions are fused together at the end of each iteration. This method provided a speedup factor of 3.91 over the single-threaded approach with no noticeable reduction in accuracy during our simulations. Furthermore, the registration problem is simplified by introducing a mosaicing function, which partitions the composite volume into regions filled with data from unique partially overlapping source volumes. This mosaicing function attempts to minimize intensity and gradient differences between adjacent sources in the composite volume. Experimental results demonstrating the performance of the group-wise registration algorithm are also presented. This algorithm is initially tested on deformed abdominal image volumes generated using a finite element model of the Visible Human Female to show the accuracy of its calculated displacement fields. In addition, the algorithm is evaluated using real ultrasound data from an abdominal phantom. Finally, composite obstetrics image volumes are constructed using clinical scans of pregnant subjects, where fetal movement makes registration/mosaicing especially difficult. Our solution to blending, which is the final step of the mosaicing process, is also discussed. The trainee will have a better experience if the volume boundaries are visually seamless, and this usually requires some blending prior to stitching. Also, regions of the volume where no data was collected during scanning should have an ultrasound-like appearance before being displayed in the simulator. This ensures the trainee's visual experience isn't degraded by unrealistic images. A discrete Poisson approach has been adapted to accomplish these tasks. 
Following this, we will describe how a 4D fetal heart image volume can be constructed from swept 2D ultrasound. A 4D probe, such as the Philips X6-1 xMATRIX Array, would make this task simpler as it can acquire 3D ultrasound volumes of the fetal heart in real-time; however, probes such as these aren't widespread yet. Once the theory has been introduced, we will describe the clinical component of this dissertation. For the purpose of acquiring actual clinical ultrasound data, from which training datasets were produced, 11 pregnant subjects were scanned by experienced sonographers at the UMMS following an approved IRB protocol. First, we will discuss the software/hardware configuration that was used to conduct these scans, which included some custom mechanical design. With the data collected using this arrangement we generated seamless 3D fetal mosaics, that is, the training datasets, loaded them into our ultrasound training simulator, and then subsequently had them evaluated by the sonographers at the UMMS for accuracy. These mosaics were constructed from the raw scan data using the techniques previously introduced. Specific training objectives were established based on the input from our collaborators in the obstetrics sonography group. Important fetal measurements are reviewed, which form the basis for training in obstetrics ultrasound. Finally, clinical images demonstrating the sonographer making fetal measurements in practice, which were acquired directly by the Philips iU22 ultrasound machine from one of our 11 subjects, are compared with screenshots of corresponding images produced by our simulator.

DigitalCommons@WPI

A Markov Random Field Based Approach to 3D Mosaicing and Registration Applied to Ultrasound Simulation

Author: Kutarnia Jason Francis
Publication venue: Digital WPI
Publication date: 15/07/2011

DigitalCommons@WPI

University of Maryland University College: UMUC Digital Repository

Exploring Diversity and Fairness in Machine Learning

Author: Schumann Candice
Publication date: 01/01/2020

With algorithms, artificial intelligence, and machine learning becoming ubiquitous in our society, we need to start thinking about the implications and ethical concerns of new machine learning models. In fact, two types of biases that impact machine learning models are social injustice bias (bias created by society) and measurement bias (bias created by unbalanced sampling). Biases against groups of individuals found in machine learning models can be mitigated through the use of diversity and fairness constraints. This dissertation introduces models to help humans make decisions by enforcing diversity and fairness constraints. This work starts with a call to action. Bias is rife in hiring, and since algorithms are being used in multiple companies to filter applicants, we need to pay special attention to this application. Inspired by this hiring application, I introduce new multi-armed bandit frameworks to help assign human resources in the hiring process while enforcing diversity through a submodular utility function. These frameworks increase diversity while using fewer resources compared to original admission decisions of the Computer Science graduate program at the University of Maryland. Moving outside of hiring, I present a contextual multi-armed bandit algorithm that enforces group fairness by learning a societal bias term and correcting for it. This algorithm is tested on two real-world datasets and shows marked improvement over other in-use algorithms. Additionally, I take a look at fairness in traditional machine learning domain adaptation. I provide the first theoretical analysis of this setting and test the resulting model on two real-world datasets. Finally, I explore extensions to my core work, delving into suicidality, comprehension of fairness definitions, and student evaluations.

Digital Repository at the University of Maryland

Probabilistic Models for Joint Segmentation, Detection and Tracking

Author: Sixta Tomáš
Publication venue: Czech Technical University in Prague. Computing and Information Centre.

Migration of cells and subcellular particles plays a crucial role in many processes in living organisms. Despite its importance, systematic research on cell motility has only been possible in the last two decades, owing to the rapid development of non-invasive imaging techniques and digital cameras. Modern imaging systems make it possible to study large populations with thousands of cells. Manual analysis of the acquired data is infeasible, because in order to gain insight into the underlying biochemical processes it is sometimes necessary to determine the shape, velocity and other characteristics of individual cells. Thus there is a high demand for automatic methods.

Digital Library of the Czech Technical University in Prague