1,310 research outputs found
Design, Analysis and Engineering of Algorithms for Closeness Centrality
Identifying the central vertices of large networks is fundamental in many applications. Closeness centrality is one of the most popular centrality measures. Its exact computation, although achievable in polynomial time, is infeasible in practice on large graphs. The aim of this work is the design and analysis of new, efficient approaches for estimating closeness cen
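As a rough illustration of the trade-off this abstract describes, the sketch below computes exact closeness centrality with one breadth-first search per vertex and contrasts it with a simple pivot-sampling estimate. It assumes a connected, undirected, unweighted graph stored as an adjacency dictionary. The function names and the sampling scheme are illustrative assumptions, not the method developed in the work itself.

```python
from collections import deque
import random


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def exact_closeness(adj):
    """Exact closeness (n - 1) / sum of distances: one BFS per vertex."""
    n = len(adj)
    result = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        result[v] = (n - 1) / total if total > 0 else 0.0
    return result


def sampled_closeness(adj, k, seed=0):
    """Estimate closeness from k random pivots (hypothetical helper).

    Runs only k BFS traversals and scales the partial distance sums
    by n / k to approximate the full sum of distances per vertex.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    n = len(nodes)
    pivots = rng.sample(nodes, k)
    sums = {v: 0 for v in nodes}
    for p in pivots:
        for v, d in bfs_distances(adj, p).items():
            sums[v] += d
    return {v: (n - 1) * k / (n * s) if s > 0 else 0.0
            for v, s in sums.items()}


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 - 4; the middle vertex is the most central.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(exact_closeness(path))
    print(sampled_closeness(path, k=3))
```

The exact version needs as many traversals as there are vertices, which is what becomes impractical on large graphs. The sampled version trades accuracy for a fixed number of traversals.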
Degree of Scaffolding: Learning Objective Metadata: A Prototype Learning System Design for Integrating GIS into a Civil Engineering Curriculum
Digital media and networking offer great potential as tools for enhancing classroom learning environments, both local and distant. One concept and related technological tool that can facilitate the effective application and distribution of digital educational resources is learning objects in combination with the SCORM (Sharable Content Object Reference Model) compliance framework. Progressive scaffolding is a learning design approach for educational systems that provides flexible guidance to students. We are in the process of utilizing this approach within a SCORM framework in the form of a multi-level instructional design. The associated metadata required by SCORM will describe the degree of scaffolding. This paper will discuss progressive scaffolding as it relates to SCORM-compliant learning objects, within the context of the design of an application for integrating Geographic Information Systems (GIS) into the civil engineering curriculum at the University of Missouri - Rolla
Analyze Large Multidimensional Datasets Using Algebraic Topology
This paper presents an efficient algorithm for extracting knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron). Conceptually, association rule searching thus becomes a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique (new to computer science) that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with the framework
Evolving Creativity : An Analysis of the Creative Method in elBulli Restaurant
In this article we present an analysis of the creative
method developed in the restaurant elBulli (www.elbulli.com)
over the period 1987-2005. elBulli has been named Best Restaurant
in the World five times by Restaurant Magazine,
and media, professionals and scientists have recognized the
global impact of its work in the food industry over the last two
decades. This impact is closely connected to the model of
evolving creativity that the elBulli team has implemented and
refined over the years. We combine the qualitative study of
documents produced by elBulli restaurant with network
analysis in order to represent a model of evolving creativity
that can be applied to other domains and industries. Junta de Andalucía TIC-606
Compressing DNA sequence databases with coil
Background: Publicly available DNA sequence databases such as GenBank are large, and are
growing at an exponential rate. The sheer volume of data being dealt with presents serious storage
and data communications problems. Currently, sequence data is usually kept in large "flat files,"
which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which
rarely achieves good compression ratios. While much research has been done on compressing
individual DNA sequences, surprisingly little has focused on the compression of entire databases
of such sequences. In this study we introduce the sequence database compression software coil.
Results: We have designed and implemented a portable software package, coil, for compressing
and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared
towards achieving high compression ratios at the expense of execution time and memory usage
during compression – the compression time represents a "one-off investment" whose cost is
quickly amortised if the resulting compressed file is transmitted many times. Decompression
requires little memory and is extremely fast. We demonstrate a 5% improvement in compression
ratio over state-of-the-art general-purpose compression tools for a large GenBank database file
containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental
additions to a sequence database.
Conclusion: coil presents a compelling alternative to conventional compression of flat files for the
storage and distribution of DNA sequence databases having a narrow distribution of sequence
lengths, such as EST data. Increasing compression levels for databases having a wide distribution of
sequence lengths is a direction for future work
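For context on the baseline this abstract compares against, the following minimal sketch measures the gzip compression ratio of a flat file of DNA records. The records are synthetic random sequences generated purely for illustration. The helper names are hypothetical, and nothing here reproduces coil or its edit-tree coding.

```python
import gzip
import random


def synthetic_flat_file(num_records, length, seed=0):
    """Build a FASTA-like flat file of random sequences (synthetic data)."""
    rng = random.Random(seed)
    lines = []
    for i in range(num_records):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        lines.append(f">seq{i}\n{seq}\n")
    return "".join(lines).encode("ascii")


def gzip_ratio(data, level=9):
    """Return original size divided by gzip-compressed size."""
    compressed = gzip.compress(data, compresslevel=level)
    return len(data) / len(compressed)


if __name__ == "__main__":
    flat = synthetic_flat_file(num_records=200, length=400)
    print(f"gzip compression ratio: {gzip_ratio(flat):.2f}")
```

Random sequences share no similarity, so this shows only what a general-purpose compressor gains from the four-letter alphabet. Real EST databases contain many near-duplicate sequences, which is the redundancy a database-level approach aims to exploit.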
Neural Face Editing with Intrinsic Image Disentangling
Traditional face editing methods often require a number of sophisticated and
task specific algorithms to be applied one after the other --- a process that
is tedious, fragile, and computationally intensive. In this paper, we propose
an end-to-end generative adversarial network that infers a face-specific
disentangled representation of intrinsic face properties, including shape (i.e.
normals), albedo, and lighting, and an alpha matte. We show that this network
can be trained on "in-the-wild" images by incorporating an in-network
physically-based image formation module and appropriate loss functions. Our
disentangling latent representation allows for semantically relevant edits,
where one aspect of facial appearance can be manipulated while keeping
orthogonal properties fixed, and we demonstrate its use for a number of facial
editing applications. Comment: CVPR 2017 ora
Unsupervised Discovery and Representation of Subspace Trends in Massive Biomedical Datasets
The goal of this dissertation is to develop unsupervised algorithms for discovering previously unknown subspace trends in massive multivariate biomedical data sets without the benefit of prior information. A subspace trend is a sustained pattern of gradual/progressive changes within an unknown subset of feature dimensions. A fundamental challenge to subspace trend discovery is the presence of irrelevant data dimensions, noise, outliers, and confusion from multiple subspace trends driven by independent factors that are mixed in with each other. These factors can obscure the trends in traditional dimension reduction and projection based data visualizations. To overcome these limitations, we propose a novel graph-theoretic neighborhood similarity measure for sensing concordant progressive changes across data dimensions. Using this measure, we present an unsupervised algorithm for trend-relevant feature selection and visualization. Additionally, we propose to use an efficient online density-based representation to make the algorithm scalable for massive datasets.
The representation not only assists in trend discovery, but also in cluster detection including rare populations. Our method has been successfully applied to diverse synthetic and real-world biomedical datasets, such as gene expression microarray and arbor morphology of neurons and microglia in brain tissue. Derived representations revealed biologically meaningful hidden subspace trend(s) that were obscured by irrelevant features and noise. Although our applications are mostly from the biomedical domain, the proposed algorithm is broadly applicable to exploratory analysis of high-dimensional data including visualization, hypothesis generation, knowledge discovery, and prediction in diverse other applications. Electrical and Computer Engineering, Department o
Doctor of Philosophy
Dissertation. A broad range of applications capture dynamic data at an unprecedented scale. Independent of the application area, finding intuitive ways to understand the dynamic aspects of these increasingly large data sets remains an interesting and, to some extent, unsolved research problem. Generically, dynamic data sets can be described by some, often hierarchical, notion of feature of interest that exists at each moment in time, and those features evolve across time. Consequently, exploring the evolution of these features is considered to be one natural way of studying these data sets. Usually, this process entails the ability to: 1) define and extract features from each time step in the data set; 2) find their correspondences over time; and 3) analyze their evolution across time. However, due to the large data sizes, visualizing the evolution of features in a comprehensible manner and performing interactive changes are challenging. Furthermore, feature evolution details are often unmanageably large and complex, making it difficult to identify the temporal trends in the underlying data. Additionally, many existing approaches develop these components in a specialized and standalone manner, thus failing to address the general task of understanding feature evolution across time. This dissertation demonstrates that interactive exploration of feature evolution can be achieved in a non-domain-specific manner so that it can be applied across a wide variety of application domains. In particular, a novel generic visualization and analysis environment that couples a multiresolution unified spatiotemporal representation of features with progressive layout and visualization strategies for studying the feature evolution across time is introduced. This flexible framework enables on-the-fly changes to feature definitions, their correspondences, and other arbitrary attributes while providing an interactive view of the resulting feature evolution details.
Furthermore, to reduce the visual complexity within the feature evolution details, several subselection-based and localized, per-feature parameter value-based strategies are also enabled. The utility and generality of this framework are demonstrated by using several large-scale dynamic data sets