Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis
The widespread use of multi-sensor technology and the emergence of big
datasets have highlighted the limitations of standard flat-view matrix models
and the necessity to move towards more versatile data analysis tools. We show
that higher-order tensors (i.e., multiway arrays) enable such a fundamental
paradigm shift towards models that are essentially polynomial and whose
uniqueness, unlike that of matrix methods, is guaranteed under very mild and
natural conditions. Benefiting from the power of multilinear algebra as their
mathematical
backbone, data analysis techniques using tensor decompositions are shown to
have great flexibility in the choice of constraints that match data properties,
and to find more general latent components in the data than matrix-based
methods. A comprehensive introduction to tensor decompositions is provided from
a signal processing perspective, starting from the algebraic foundations, via
basic Canonical Polyadic and Tucker models, through to advanced cause-effect
and multi-view data analysis schemes. We show that tensor decompositions enable
natural generalizations of some commonly used signal processing paradigms, such
as canonical correlation and subspace techniques, signal separation, linear
regression, feature extraction and classification. We also cover computational
aspects, and point out how ideas from compressed sensing and scientific
computing may be used for addressing the otherwise unmanageable storage and
manipulation problems associated with big datasets. The concepts are supported
by illustrative real-world case studies illuminating the benefits of the tensor
framework as an efficient and promising tool for modern signal processing, data
analysis and machine learning applications; these benefits also extend to
vector/matrix data through tensorization.
Keywords: ICA, NMF, CPD, Tucker decomposition, HOSVD, tensor networks, Tensor Train
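As a minimal illustration of the Canonical Polyadic model mentioned above, the sketch below (plain Python; the factor names `a`, `b`, `c` are invented for illustration) builds a third-order tensor as a sum of rank-one outer products, which is exactly the "essentially polynomial" structure the abstract refers to:

```python
# Minimal sketch of the Canonical Polyadic (CP) model: a third-order
# tensor T is a sum of R rank-one terms,
#   T[i][j][k] = sum_r a[r][i] * b[r][j] * c[r][k].
# Factor names and values are illustrative, not taken from the paper.

def cp_tensor(a, b, c):
    """Build a dense I x J x K tensor from R factor vectors per mode."""
    R, I, J, K = len(a), len(a[0]), len(b[0]), len(c[0])
    T = [[[0.0] * K for _ in range(J)] for _ in range(I)]
    for r in range(R):
        for i in range(I):
            for j in range(J):
                for k in range(K):
                    T[i][j][k] += a[r][i] * b[r][j] * c[r][k]
    return T

# Rank-2 example: two rank-one components.
a = [[1.0, 2.0], [0.5, -1.0]]           # mode-1 factors, I = 2
b = [[1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]  # mode-2 factors, J = 3
c = [[1.0, 3.0], [1.0, 1.0]]            # mode-3 factors, K = 2

T = cp_tensor(a, b, c)  # each entry is a polynomial in the factor entries
```

Fitting such factors to a given tensor (e.g. by alternating least squares) is what CPD algorithms do; the uniqueness guarantees discussed in the abstract concern the recovered factors.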
On Optimal Top-K String Retrieval
Let $\mathcal{D} = \{d_1, d_2, \dots, d_D\}$ be a given set of $D$ (string)
documents of total length $n$. The top-$k$ document retrieval problem is to
index $\mathcal{D}$ such that when a pattern $P$ of length $p$, and a
parameter $k$ come as a query, the index returns the $k$ most relevant
documents to the pattern $P$. Hon et al. \cite{HSV09} gave the first linear
space framework to solve this problem in $O(p + k\log k)$ time. This was
improved by Navarro and Nekrich \cite{NN12} to $O(p + k)$. These results are
powerful enough to support arbitrary relevance functions like frequency,
proximity, PageRank, etc. In many applications like desktop or email search,
the data resides on disk and hence disk-bound indexes are needed. Despite
continued progress on this problem in terms of theoretical, practical and
compression aspects, any non-trivial bounds in the external memory model have
so far been elusive. Internal memory (or RAM) solutions to this problem
decompose the problem into logarithmically many subproblems and thus incur an
additive logarithmic factor. In external memory, these approaches lead to
$O(\log n)$ I/Os instead of the optimal $O(\log_B n)$ I/O term, where $B$ is
the block size. We re-interpret the problem, independent of $k$, as interval
stabbing with priority over a tree-shaped structure. This leads us to a linear
space index in external memory supporting top-$k$ queries (with unsorted
outputs) in near-optimal I/Os for any constant $h$. Then we get an
$O(n\log^* n)$ space index with optimal I/Os. Comment: 3 figures
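To pin down the problem's semantics (not the paper's index), a brute-force reference in plain Python is sketched below, using pattern frequency as the relevance function; the document set and helper names are invented for illustration, and every query scans all documents, which is exactly the cost the indexes above avoid:

```python
# Brute-force top-k document retrieval: relevance = pattern frequency.
# This clarifies the query semantics only; a real index answers queries
# without touching every document.

def count_occurrences(doc, pattern):
    """Number of (possibly overlapping) occurrences of pattern in doc."""
    count, start = 0, 0
    while True:
        pos = doc.find(pattern, start)
        if pos == -1:
            return count
        count += 1
        start = pos + 1

def top_k_docs(docs, pattern, k):
    """Ids of the k documents with the most occurrences of pattern."""
    scored = [(count_occurrences(d, pattern), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda t: (-t[0], t[1]))   # most relevant first
    return [i for score, i in scored[:k] if score > 0]

docs = ["abracadabra", "banana", "cabana", "abc"]
```

For example, `top_k_docs(docs, "ab", 2)` ranks document 0 first, since "ab" occurs twice in "abracadabra".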
StuCoSReC
Eleven papers were presented at this conference, covering several topics in computer science. All papers were reviewed by two international reviewers and accepted for oral presentation, which confirms the good work carried out with the authors at their research institutions. The content of the papers is presented in three sections covering different areas of computer science and even robotics.
Exploiting Social Networks for Recommendation in Online Image Sharing Systems
This thesis aims to demonstrate the distinct and so far little explored value, for image recommendation, of knowledge derived from social interaction data within large web-scale image sharing systems like Flickr, Picasa Web, Facebook and others. I have shown how such systems can be significantly improved through personalisation that takes into account the social context of users: modelling their interactions by mining data, then building and evaluating systems that incorporate this information. These improvements allow users to search and browse large online image collections more quickly and to find results that more accurately match their personal information needs when compared to existing methods.
Traditional information retrieval and recommendation datasets are contrived to provide stable baselines for researchers to compare against but they rarely accurately reflect the media systems users tend to encounter online. The online photo sharing site Flickr provides rich and varied data that can be used by researchers to analyse and understand users’ interactions with images and with each other. I analyse such data by modelling the connections between users as multigraphs and exploiting the resultant topologies to produce features that can be used to train recommender systems based on machine learnt classifiers.
The core contributions of this work include insight into the nature of very large-scale online photo collections and the communities that form around them, as well as the dynamic nature of the interactions users have with their media. I do this through the rigorous evaluation of both a probabilistic tag recommendation system and a machine learnt classifier trained to mimic user decisions regarding image preference. These implementations focus on treating the user as both a unique individual and a member of potentially many explicit and implicit communities. I also explore the validity of the Flickr ‘Favourite’ feedback label as a proxy for user preference, which is particularly important when considering other analogous media systems to which my findings transfer. My conclusions highlight how vital both social context information and the understanding of user behaviour are for online image sharing systems.
In the field of information retrieval, the diverse nature of users is often forgotten in the hunt for increases in esoteric performance metrics. This thesis places them back at the centre of the problem of multimedia information retrieval and shows how their variety and uniqueness are valuable traits that can be exploited to augment and improve the experience of browsing and searching shared online image collections.
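The multigraph modelling described in the abstract above can be sketched as follows (plain Python; the interaction types and user names are invented for illustration, not taken from the thesis): users are nodes, each interaction is a typed edge, parallel edges are kept, and per-type edge counts for a user pair become features for a downstream classifier.

```python
from collections import defaultdict

# Sketch of modelling user connections as a multigraph: parallel edges
# between the same pair of users are retained, one per interaction, each
# labelled with its type (comment, favourite, ...). Interaction types
# and user ids here are invented for illustration.

class InteractionMultigraph:
    def __init__(self):
        # (user_a, user_b) -> list of interaction-type labels
        self.edges = defaultdict(list)

    def add_interaction(self, u, v, kind):
        key = tuple(sorted((u, v)))   # undirected pair
        self.edges[key].append(kind)

    def pair_features(self, u, v, kinds):
        """Per-type interaction counts for a user pair -> feature vector."""
        labels = self.edges.get(tuple(sorted((u, v))), [])
        return [labels.count(k) for k in kinds]

g = InteractionMultigraph()
g.add_interaction("alice", "bob", "comment")
g.add_interaction("alice", "bob", "comment")
g.add_interaction("bob", "alice", "favourite")
features = g.pair_features("alice", "bob", ["comment", "favourite", "group"])
```

Keeping parallel edges (rather than collapsing to a simple graph) is what preserves interaction frequency, which a machine learnt classifier can then weigh per interaction type.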
The Catalog Problem: Deep Learning Methods for Transforming Sets into Sequences of Clusters
The titular Catalog Problem refers to predicting a varying number of ordered clusters from sets of any cardinality. This task arises in many diverse areas, ranging from medical triage, through multi-channel signal analysis for petroleum exploration, to product catalog structure prediction. This thesis focuses on the latter, which exemplifies a number of challenges inherent to ordered clustering, including learning variable cluster constraints, exhibiting relational reasoning and managing combinatorial complexity. All of these present unique challenges for neural networks, combining elements of set representation, neural clustering and permutation learning.
In order to approach the Catalog Problem, a curated dataset of over ten thousand real-world product catalogs consisting of more than one million product offers is provided. Additionally, a library for generating simpler, synthetic catalog structures is presented. These and other datasets form the foundation of the included work, allowing for a quantitative comparison of the proposed methods’ ability to address the underlying challenge. In particular, the synthetic datasets enable an assessment of the models’ capacity to learn higher-order compositional and structural rules.
Two novel neural methods are proposed to tackle the Catalog Problem: a set encoding module designed to enhance the network’s ability to condition the prediction on the entirety of the input set, and a larger architecture for inferring an input-dependent number of diverse, ordered partitional clusters with an added cardinality prediction module. Both result in improved performance on the presented datasets, with the latter being the only neural method fulfilling all requirements inherent to addressing the Catalog Problem.
Deep filter banks for texture recognition, description, and segmentation
Visual textures have played a key role in image understanding because they
convey important semantics of images, and because texture representations that
pool local image descriptors in an orderless manner have had a tremendous
impact in diverse applications. In this paper we make several contributions to
texture understanding. First, instead of focusing on texture instance and
material category recognition, we propose a human-interpretable vocabulary of
texture attributes to describe common texture patterns, complemented by a new
describable texture dataset for benchmarking. Second, we look at the problem of
recognizing materials and texture attributes in realistic imaging conditions,
including when textures appear in clutter, developing corresponding benchmarks
on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic
texture representations, including bag-of-visual-words and the Fisher vectors,
in the context of deep learning and show that these have excellent efficiency
and generalization properties if the convolutional layers of a deep model are
used as filter banks. We obtain in this manner state-of-the-art performance in
numerous datasets well beyond textures, an efficient method to apply deep
features to image regions, as well as benefit in transferring features from one
domain to another. Comment: 29 pages; 13 figures; 8 tables
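The idea of treating convolutional layers as filter banks whose local responses are pooled in an orderless manner can be sketched as follows (plain Python; the tiny 2x2 filters and toy image are invented for illustration, and mean pooling stands in for the Fisher-vector encoding used in the paper):

```python
# Orderless pooling of filter-bank responses: convolve an image with a
# small bank of filters, then discard spatial layout by averaging each
# filter's responses over all positions. In the paper the "filters" are
# the convolutional layers of a deep network and the orderless encoder
# is a Fisher vector rather than a plain mean; this sketch only shows
# the pooling structure.

def convolve_valid(img, filt):
    """2-D valid cross-correlation of img with filt (nested lists)."""
    fh, fw = len(filt), len(filt[0])
    out = []
    for i in range(len(img) - fh + 1):
        row = []
        for j in range(len(img[0]) - fw + 1):
            row.append(sum(img[i + di][j + dj] * filt[di][dj]
                           for di in range(fh) for dj in range(fw)))
        out.append(row)
    return out

def orderless_pool(img, bank):
    """One mean-pooled response per filter: position information is lost."""
    feats = []
    for filt in bank:
        resp = convolve_valid(img, filt)
        vals = [v for row in resp for v in row]
        feats.append(sum(vals) / len(vals))
    return feats

bank = [
    [[1, 0], [0, -1]],   # diagonal difference filter
    [[1, 1], [1, 1]],    # local sum filter
]
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
descriptor = orderless_pool(img, bank)
```

Because the pooling averages over positions, the resulting descriptor is invariant to where in the image a pattern occurs, which is the property that makes such representations effective for texture.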