Bipartite graph partitioning and data clustering
Many data types arising from data mining applications can be modeled as
bipartite graphs; examples include terms and documents in a text corpus,
customers and purchased items in market basket analysis, and reviewers and
movies in a movie recommender system. In this paper, we propose a new data
clustering method based on partitioning the underlying bipartite graph. The
partition is constructed by minimizing a normalized sum of edge weights between
unmatched pairs of vertices of the bipartite graph. We show that an approximate
solution to the minimization problem can be obtained by computing a partial
singular value decomposition (SVD) of the associated edge weight matrix of the
bipartite graph. We point out the connection of our clustering algorithm to
correspondence analysis used in multivariate analysis. We also briefly discuss
the issue of assigning data objects to multiple clusters. In the experimental
results, we apply our clustering algorithm to the problem of document
clustering to illustrate its effectiveness and efficiency.
Comment: Proceedings of ACM CIKM 2001, the Tenth International Conference on
Information and Knowledge Management, 200
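The idea of partitioning a bipartite graph through singular vectors of its scaled edge weight matrix can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bipartition` is hypothetical, a plain power iteration stands in for a partial SVD solver, all row and column sums are assumed to be positive, and splitting at zero is just one simple choice of cut point.

```python
import math
import random


def bipartition(W, iters=200, seed=0):
    """Split the rows and columns of a nonnegative weight matrix into two groups.

    W[i][j] is the edge weight between row vertex i and column vertex j.
    Returns (row_labels, col_labels), each a list of 0/1 labels.
    """
    m, n = len(W), len(W[0])
    # Vertex degrees on both sides of the bipartite graph (assumed > 0).
    dr = [sum(row) for row in W]
    dc = [sum(W[i][j] for i in range(m)) for j in range(n)]
    # Scaled matrix A = D_r^{-1/2} W D_c^{-1/2}.
    A = [[W[i][j] / math.sqrt(dr[i] * dc[j]) for j in range(n)] for i in range(m)]

    def times_A(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]

    def times_At(u):
        return [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]

    # The leading right singular vector of A is known in closed form:
    # it is proportional to sqrt(dc) and carries no cluster information.
    total = sum(dc)
    v1 = [math.sqrt(d / total) for d in dc]

    # Power iteration on A^T A, projecting out v1 at every step, converges
    # to the second right singular vector (stand-in for a partial SVD).
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = times_At(times_A(v))
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    u = times_A(v)  # matching left singular vector, up to a positive scale

    # Rescaling by D^{-1/2} does not change signs, so split at zero directly.
    row_labels = [1 if x > 0 else 0 for x in u]
    col_labels = [1 if x > 0 else 0 for x in v]
    return row_labels, col_labels


# Toy document-term matrix with two weakly coupled blocks.
W = [[5, 4, 0, 0],
     [4, 5, 1, 0],
     [0, 1, 5, 4],
     [0, 0, 4, 5]]
rows, cols = bipartition(W)
```

On the toy matrix the first two rows land in one group and the last two in the other, together with their matching columns; which group receives label 0 or 1 is arbitrary, because a singular vector is only defined up to sign.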
Monte Carlo Methods for Top-k Personalized PageRank Lists and Name Disambiguation
We study the problem of quick detection of top-k Personalized PageRank lists.
This problem has a number of important applications such as finding local cuts
in large graphs, estimation of similarity distance and name disambiguation. In
particular, we apply our results to construct efficient algorithms for the
person name disambiguation problem. We argue that when finding top-k
Personalized PageRank lists, two observations are important. Firstly, it is
crucial that we quickly detect the top-k most important neighbours of a node,
while the exact order in the top-k list and the exact values of PageRank
are far less crucial. Secondly, a small number of wrong elements in top-k
lists does not really degrade the quality of the lists, and tolerating them can
lead to significant computational savings. Based on these two key observations, we
propose Monte Carlo methods for fast detection of top-k Personalized PageRank
lists. We provide performance evaluation of the proposed methods and supply
stopping criteria. Then, we apply the methods to the person name disambiguation
problem. The developed algorithm for the person name disambiguation problem
achieved second place in the WePS 2010 competition.
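A Monte Carlo estimate of a top-k Personalized PageRank list can be sketched with random walks that start at the seed node and terminate with a fixed probability at each step. The sketch below is illustrative only: the function name `mc_topk_ppr`, the adjacency-list format, the defaults `walks=2000` and `damping=0.85`, and the handling of dangling nodes are all assumptions, and the stopping criteria mentioned in the abstract are not reproduced.

```python
import random
from collections import Counter


def mc_topk_ppr(graph, source, k, walks=2000, damping=0.85, seed=0):
    """Estimate the top-k Personalized PageRank neighbours of `source`.

    graph maps each node to a list of its out-neighbours.
    Visit counts over all walks serve as unnormalized PageRank estimates.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(walks):
        node = source
        while True:
            visits[node] += 1
            # End the walk with probability 1 - damping, or at a dangling
            # node (assumption: a dead end simply terminates the walk).
            if rng.random() > damping or not graph[node]:
                break
            node = rng.choice(graph[node])
    # Rank the other nodes; the seed itself is not reported.
    visits.pop(source, None)
    return [node for node, _ in visits.most_common(k)]


# Toy graph: node "d" cannot be reached from the seed "a".
graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"], "d": ["a"]}
top2 = mc_topk_ppr(graph, "a", k=2)
```

Only the ranking of the counts is used, which matches the observation that the exact PageRank values matter less than finding the right members of the list.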
Building Mini-Categories in Product Networks
We constructed a product network based on the sales data collected and
provided by a Fortune 500 speciality retailer. The structure of the network is
dominated by small isolated components, dense clique-based communities, and
sparse stars, linear chains, and pendants. We used the identified structural
elements (tiles) to organize products into mini-categories -- compact
collections of potentially complementary and substitute items. The
mini-categories extend the traditional hierarchy of retail products (group -
class - subcategory) and may serve as building blocks towards exploration of
consumer projects and long-term customer behavior.
Comment: Accepted to CompleNet, March 2015, NYC, NY, USA; 12 pages, 4 figures
The boundaries of dipole graphs and the complete bipartite graphs K_{2,n}
We study the Seifert surfaces of a link by relating the embeddings of graphs
by using induced graphs. As applications, we prove that every link is the
boundary of an oriented surface which is obtained from a graph embedding of a
complete bipartite graph K_{2,n}, where all voltage assignments on the edges
of K_{2,n} are 0. We also provide an algorithm to construct such a graph
diagram of a given link and demonstrate the algorithm by dealing with the links
and .
Comment: 14 pages, 12 figures
An agent-based model for mRNA export through the nuclear pore complex.
mRNA export from the nucleus is an essential step in the expression of every protein-coding gene in eukaryotes, but many aspects of this process remain poorly understood. The density of export receptors that must bind an mRNA to ensure export, as well as how receptor distribution affects transport dynamics, is not known. It is also unclear whether the rate-limiting step for transport occurs at the nuclear basket, in the central channel, or on the cytoplasmic face of the nuclear pore complex. Using previously published biophysical and biochemical parameters of mRNA export, we implemented a three-dimensional, coarse-grained, agent-based model of mRNA export in the nanosecond regime to gain insight into these issues. On running the model, we observed that mRNA export is sensitive to the number and distribution of transport receptors coating the mRNA and that there is a rate-limiting step in the nuclear basket that is potentially associated with the mRNA reconfiguring itself to thread into the central channel. Of note, our results also suggest that using a single location-monitoring mRNA label may be insufficient to correctly capture the time regime of mRNA threading through the pore and subsequent transport. This has implications for future experimental design to study mRNA transport dynamics.
Designing identity of a new material: a new product design approach
The present study is practice-based design research grounded in the industrial development of a new concrete. It focuses on developing a specific identity for a new material and aims to demonstrate that product design can be used as a strategy to create that identity and thus differentiate the material from existing ones. To design a material-specific identity into new products, we need to understand how shaped materials are perceived. We therefore conducted an exploratory study of material recognition in products. We identified two types of products: “messenger” products, whose shapes are characteristic of the material, and “wrong messenger” products, which imitate other well-known materials. The results of a questionnaire on material recognition show that how easily a material is identified depends on the product: whether its shape is familiar or new, whether it is an imitation or a material-specific shape, and whether the material is well known or new. We distinguish two types of shapes: on the one hand, familiar and typical shapes make material recognition easier and more certain; on the other hand, new shapes leave people less certain of what the product is made of but more amazed. Designing amazing new shapes can therefore be used as a differentiation strategy to create the specific sensory identity of each new material. This means that the product can be a genuinely useful medium for communicating about a new material, beyond traditional material samples.
Keywords:
New Material; Sensory Identity; Product Design
