Geosocial Graph-Based Community Detection
We apply spectral clustering and multislice modularity optimization to a Los
Angeles Police Department field interview card data set. To detect communities
(i.e., cohesive groups of vertices), we use both geographic and social
information about stops involving street gang members in the LAPD district of
Hollenbeck. We then compare the algorithmically detected communities with known
gang identifications and argue that discrepancies are due to sparsity of social
connections in the data as well as complex underlying sociological factors that
blur distinctions between communities.
Comment: 5 pages, 4 figures. Workshop paper for the IEEE International
Conference on Data Mining 2012: Workshop on Social Media Analysis and Minin
Multislice Modularity Optimization in Community Detection and Image Segmentation
Because networks can be used to represent many complex systems, they have
attracted considerable attention in physics, computer science, sociology, and
many other disciplines. One of the most important areas of network science is
the algorithmic detection of cohesive groups (i.e., "communities") of nodes. In
this paper, we algorithmically detect communities in social networks and image
data by optimizing multislice modularity. A key advantage of modularity
optimization is that it does not require prior knowledge of the number or sizes
of communities, and it is capable of finding network partitions that are
composed of communities of different sizes. By optimizing multislice modularity
and subsequently calculating diagnostics on the resulting network partitions,
it is thereby possible to obtain information about network structure across
multiple system scales. We illustrate this method on data from both social
networks and images, and we find that optimization of multislice modularity
performs well on these two tasks without the need for extensive
problem-specific adaptation. However, improving the computational speed of this
method remains a challenging open problem.
Comment: 3 pages, 2 figures, to appear in IEEE International Conference on
Data Mining PhD forum conference proceeding
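As a rough illustration of the quantity being optimized, the sketch below evaluates ordinary single-slice modularity, Q = (1/2m) * sum_ij (A_ij - gamma * k_i * k_j / 2m) * delta(c_i, c_j), for a given partition. This is an assumption-laden simplification and not the paper's implementation: the multislice version adds coupling terms between slices, which are omitted here, and the function name, the resolution parameter `gamma`, and the toy graph are all hypothetical.

```python
def modularity(adj, labels, gamma=1.0):
    """Single-slice modularity of a partition (illustrative sketch only).

    adj    -- symmetric adjacency matrix as a list of lists
    labels -- labels[i] is the community assigned to node i
    gamma  -- resolution parameter (assumed name; 1.0 gives standard modularity)
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected under the null model
                q += adj[i][j] - gamma * degrees[i] * degrees[j] / two_m
    return q / two_m


# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

q_split = modularity(two_triangles, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(two_triangles, [0, 0, 0, 0, 0, 0])  # everything in one community
```

On this toy graph the partition into the two triangles scores higher than the single-community partition, which is the kind of comparison a modularity optimizer makes without being told the number of communities in advance.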
Criminal Group Dynamics and Network Methods
Value – Network methods provide a means to revisit and extend theories of crime and delinquency with a focus on social structure. The unique affinity between group dynamics and network methods highlights immense opportunities for expanding the knowledge of collective trajectories
Variational methods for geometric statistical inference
Estimating multiple geometric shapes, such as tracks or surfaces, creates significant mathematical challenges, particularly in the presence of unknown data association. Problems of this type pose two major challenges. The first is that the object of interest is typically infinite dimensional whilst the data are finite dimensional; as a result, the inverse problem is ill-posed without regularization. The second is that the data association makes the likelihood function highly oscillatory.
The focus of this thesis is on techniques to validate approaches to estimation problems in geometric statistical inference. We use convergence in the large data limit as an indicator of the robustness of the methodology. One particular advantage of our approach is that we can prove convergence under modest conditions on the data generating process. This allows one to apply the theory where very little is known about the data, which indicates robustness in applications to real-world problems.
The results of this thesis therefore concern the asymptotics for a selection of statistical inference problems. We construct our estimates as the minimizer of an appropriate functional and look at what happens in the large data limit. In each case we show that our estimates converge to a minimizer of a limiting functional. In certain cases we also give rates of convergence.
The emphasis is on problems which contain a data association or classification component. More precisely, we study a generalized version of the k-means method that combines data association with spline smoothing and is suitable for estimating multiple trajectories from unlabeled data. Another problem considered is a graphical approach to estimating the labeling of data points. Our approach uses minimizers of the Ginzburg-Landau functional on a suitably defined graph.
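To make the graph-based labeling idea concrete, the sketch below evaluates one commonly used form of a graph Ginzburg-Landau functional, GL_eps(u) = (eps/2) * sum_{i<j} w_ij (u_i - u_j)^2 + (1/(4*eps)) * sum_i (u_i^2 - 1)^2. The scaling, the double-well potential, the function name and the toy graph are assumptions chosen for illustration; the thesis may use a different normalization and graph construction.

```python
def ginzburg_landau(weights, u, eps):
    """Graph Ginzburg-Landau energy of a labeling u (illustrative sketch only).

    weights -- symmetric edge-weight matrix as a list of lists
    u       -- real-valued label per node; values near -1 or +1 mark the two classes
    eps     -- interface parameter (assumed scaling, see lead-in)
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = len(u)
    # Dirichlet term: penalizes label differences across heavily weighted edges.
    dirichlet = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dirichlet += weights[i][j] * (u[i] - u[j]) ** 2
    # Double-well term: pushes each label towards -1 or +1.
    well = sum((ui * ui - 1.0) ** 2 for ui in u)
    return 0.5 * eps * dirichlet + well / (4.0 * eps)


# Hypothetical toy graph: a path 0-1-2-3 with unit edge weights.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

e_cut = ginzburg_landau(path, [-1.0, -1.0, 1.0, 1.0], eps=0.5)     # one edge is cut
e_const = ginzburg_landau(path, [1.0, 1.0, 1.0, 1.0], eps=0.5)     # constant labeling
e_unlabeled = ginzburg_landau(path, [0.0, 0.0, 0.0, 0.0], eps=0.5)  # no class chosen
```

A labeling that takes values in {-1, +1} pays only for the edges it cuts, while an undecided labeling (all zeros) pays the double-well penalty at every node. A minimizer under a constraint, such as a few known labels, therefore tends towards a partition with a small cut.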
In order to study these problems we use variational techniques and, in particular, Γ-convergence. This is the natural framework to use for studying sequences of minimization problems. A key advantage of this approach is that it allows us to deal with infinite dimensional and highly oscillatory functionals.