1,073 research outputs found

Median K-flats for hybrid linear modeling with many outliers

Authors: Lerman Gilad; Szlam Arthur; Zhang Teng
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/09/2009

We describe the Median K-Flats (MKF) algorithm, a simple online method for hybrid linear modeling, i.e., for approximating data by a mixture of flats. This algorithm simultaneously partitions the data into clusters while finding their corresponding best approximating l1 d-flats, so that the cumulative l1 error is minimized. The current implementation restricts d-flats to be d-dimensional linear subspaces. It requires a negligible amount of storage, and its complexity, when modeling data consisting of N points in D-dimensional Euclidean space with K d-dimensional linear subspaces, is of order O(n K d D + n d^2 D), where n is the number of iterations required for convergence (empirically on the order of 10^4). Since it is an online algorithm, data can be supplied to it incrementally and it can incrementally produce the corresponding output. The performance of the algorithm is carefully evaluated using synthetic and real data.
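The online update described in this abstract can be sketched as a stochastic gradient step on the distance from one sample to its nearest flat. The sketch below is not the authors' implementation: it assumes the simplest case d = 1 (lines through the origin, each stored as a unit vector), and the names `mkf_step`, `fit`, `dt`, and `n_iter` are hypothetical.

```python
import math
import random

def residual(x, v):
    """Return (r, c): r is the part of x orthogonal to unit vector v, c = v.x."""
    c = sum(a * b for a, b in zip(x, v))
    return [a - c * b for a, b in zip(x, v)], c

def distance(x, v):
    """Euclidean distance from point x to the line spanned by unit vector v."""
    r, _ = residual(x, v)
    return math.sqrt(sum(a * a for a in r))

def nearest_flat(x, flats):
    """Index of the flat closest to x (the cluster assignment)."""
    return min(range(len(flats)), key=lambda i: distance(x, flats[i]))

def l1_error(points, flats):
    """Cumulative l1 error: sum of unsquared distances to the nearest flat."""
    return sum(distance(x, flats[nearest_flat(x, flats)]) for x in points)

def mkf_step(x, flats, dt):
    """One online update: move the nearest flat toward x, then renormalize."""
    i = nearest_flat(x, flats)
    v = flats[i]
    r, c = residual(x, v)
    norm_r = math.sqrt(sum(a * a for a in r))
    if norm_r > 1e-12:  # the distance is not differentiable at zero residual
        # For unit v, the gradient of |x - (v.x) v| with respect to v is -c * r / |r|.
        v = [b + dt * c * a / norm_r for a, b in zip(r, v)]
        norm_v = math.sqrt(sum(b * b for b in v))
        flats[i] = [b / norm_v for b in v]
    return i

def fit(points, k, n_iter=10000, dt=0.01, seed=0):
    """Assumed driver loop: random unit-vector start, one sample per iteration."""
    rng = random.Random(seed)
    dim = len(points[0])
    flats = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        n = math.sqrt(sum(b * b for b in v))
        flats.append([b / n for b in v])
    for _ in range(n_iter):
        mkf_step(rng.choice(points), flats, dt)
    return flats
```

Each step touches a single sample and stores only the K flats, which is consistent with the small storage footprint the abstract mentions. The result depends on the random initialization, as with other K-flats style methods.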

arXiv.org e-Print Archive

Crossref

Time-causal and time-recursive spatio-temporal receptive fields

Author: Lindeberg Tony
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, based on a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision. We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain much faster temporal response properties (shorter temporal delays) compared to a uniform distribution. Specifically, these kernels converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales, thereby allowing for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter. 
We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.
Comment: 39 pages, 12 figures, 5 tables in Journal of Mathematical Imaging and Vision, published online Dec 201
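A minimal sketch of the temporal part of this model, assuming a unit sampling interval: first-order recursive filters coupled in cascade, with the intermediate temporal scale levels placed logarithmically. The function names and the distribution parameter `c` are illustrative assumptions. Each time constant `mu` is chosen so that the stage contributes variance `mu**2 + mu`, which is the variance of a discrete first-order recursive filter.

```python
import math

def log_scale_levels(tau_max, n_levels, c=2.0):
    """Temporal scale (variance) levels tau_k = c**(2*(k - K)) * tau_max, k = 1..K."""
    return [c ** (2 * (k - n_levels)) * tau_max for k in range(1, n_levels + 1)]

def time_constants(levels):
    """Solve mu**2 + mu = tau_k - tau_(k-1) for each stage of the cascade."""
    mus, prev = [], 0.0
    for tau in levels:
        delta = tau - prev
        mus.append((math.sqrt(1.0 + 4.0 * delta) - 1.0) / 2.0)
        prev = tau
    return mus

def cascade_filter(signal, mus):
    """Apply the first-order integrators in sequence.

    Each stage only reads the current input and its own previous output,
    so the filter is time-causal and time-recursive (no buffer of past input).
    """
    out = list(signal)
    for mu in mus:
        state = 0.0
        smoothed = []
        for sample in out:
            state += (sample - state) / (1.0 + mu)
            smoothed.append(state)
        out = smoothed
    return out
```

Feeding a unit impulse through `cascade_filter` gives the composed temporal kernel; its variance equals the last scale level because variances add under cascade coupling.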

arXiv.org e-Print Archive

Publikationer från KTH

Crossref

Springer - Publisher Connector

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Gradient Domain Diffusion Models for Image Synthesis

Author: Gong Yuanhao
Publication date: 04/09/2023

Diffusion models are increasingly popular in generative image and video synthesis. However, due to the diffusion process, they require a large number of steps to converge. To tackle this issue, we propose to perform the diffusion process in the gradient domain, where convergence is faster, for two reasons. First, thanks to the Poisson equation, the gradient domain is mathematically equivalent to the original image domain. Therefore, each diffusion step in the image domain has a unique corresponding gradient domain representation. Second, the gradient domain is much sparser than the image domain. As a result, gradient domain diffusion models converge faster. Several numerical experiments confirm that the gradient domain diffusion models are more efficient than the original diffusion models. The proposed method can be applied in a wide range of applications such as image processing, computer vision and machine learning tasks.
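The two properties this abstract relies on, invertibility and sparsity of the gradient representation, can be illustrated with a one-dimensional toy example. This is only a sketch under simplifying assumptions: in 1D, recovering a signal from its gradient reduces to a cumulative sum given one boundary value, whereas images require solving a 2D Poisson equation. It does not implement the diffusion model itself, and all function names are hypothetical.

```python
def to_gradient(signal):
    """Forward differences, plus the first sample as the boundary value."""
    return signal[0], [b - a for a, b in zip(signal, signal[1:])]

def from_gradient(first, grad):
    """Integrate the gradient back to the signal (1D analogue of a Poisson solve)."""
    out = [first]
    for g in grad:
        out.append(out[-1] + g)
    return out

def sparsity(values, tol=1e-12):
    """Fraction of entries that are (numerically) zero."""
    return sum(1 for v in values if abs(v) <= tol) / len(values)

# A piecewise-constant signal: dense in the signal domain, sparse in the gradient domain.
signal = [3, 3, 3, 7, 7, 7, 7, 2, 2]
first, grad = to_gradient(signal)
restored = from_gradient(first, grad)
```

Here `restored` equals `signal`, and most entries of `grad` are zero while no entry of `signal` is.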

arXiv.org e-Print Archive

PLMM: Personal Large Models on Mobile Devices

Author: Gong Yuanhao
Publication date: 26/09/2023

Inspired by Federated Learning, we propose personal large models that are distilled from traditional large language models but are more adaptive to local users' personal information such as education background and hobbies. We classify large language models into three levels: the personal level, the expert level and the traditional level. The personal level models adapt to users' personal information. They encrypt the users' input and protect their privacy. The expert level models focus on merging specific knowledge such as finance, IT and art. The traditional models focus on universal knowledge discovery and on upgrading the expert models. In this classification, the personal models interact directly with the user. Within the whole system, the personal models hold users' (encrypted) personal information. Moreover, such models must be small enough to run on personal computers or mobile devices. Finally, they must also respond in real time for a better user experience and produce high quality results. The proposed personal large models can be applied in a wide range of applications such as language and vision tasks.
Comment: arXiv admin note: substantial text overlap with arXiv:2307.1322

arXiv.org e-Print Archive

Dynamic Large Language Models on Blockchains

Author: Gong Yuanhao
Publication date: 19/07/2023

Training and deploying large language models requires a large amount of computational resources because the language models contain billions of parameters and the text has thousands of tokens. Another problem is that large language models are static: they are fixed after the training process. To tackle these issues, we propose to train and deploy a dynamic large language model on blockchains, which have high computation performance and are distributed across a network of computers. A blockchain is a secure, decentralized, and transparent system that allows for the creation of a tamper-proof ledger for transactions without the need for intermediaries. The dynamic large language models can continuously learn from user input after the training process. Our method provides a new way to develop large language models and also sheds light on next-generation artificial intelligence systems.

arXiv.org e-Print Archive

Multilevel Large Language Models for Everyone

Author: Gong Yuanhao
Publication date: 24/07/2023

Large language models have made significant progress in the past few years. However, they are either generic or field specific, splitting the community into different groups. In this paper, we unify these large language models into a larger map, where the generic and specific models are linked together and can improve each other, based on the user's personal input and information from the internet. The idea of linking several large language models together is inspired by the functionality of the human brain. Specific regions of the brain cortex are dedicated to certain low level functions, and these regions can work jointly to achieve more complex high level functions. This behavior of the human brain cortex suggests a design for multilevel large language models that contain global level, field level and user level models. The user level models run on local machines to achieve efficient response and protect the user's privacy. Such multilevel models reduce some redundancy and perform better than single level models. The proposed multilevel idea can be applied in various applications, such as natural language processing, computer vision tasks, professional assistants, business and healthcare.

arXiv.org e-Print Archive

A distributed model for computing local 3D mesh descriptors based on k-rings (Un modelo distribuido para calcular descriptores locales de malla 3D basados en k-rings)

Authors: Guzmán Leonardo; Hurtado Jan; Márquez Alejandra; Suni-Lopez Franci
Publication venue: Universidad de Lima
Publication date: 29/07/2022

To facilitate 3D object processing, it is common to use high-level representations such as local descriptors, which are usually computed over defined neighborhoods. K-rings, a technique for defining such neighborhoods, is widely used by several methods. In this work, we propose a model for the distributed computation of local descriptors over 3D triangular meshes, using the concept of k-rings. In our experiments, we measure the performance of our model on huge meshes, evaluating the speedup, the scalability, and the descriptor computation time. We show the optimal configuration of our model for the cluster we implemented and the linear growth of computation time with respect to the mesh size and the number of rings. We used the Harris response, which describes the saliency of the object, for our tests.
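As a point of reference for the k-ring concept in this abstract, the sketch below computes the rings around one vertex on a single machine with a breadth-first traversal of the vertex adjacency. It is not the distributed model the paper proposes; the triangle-list input format and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def vertex_adjacency(triangles):
    """Map each vertex index to the set of vertices it shares an edge with."""
    adj = defaultdict(set)
    for a, b, c in triangles:
        adj[a].update((b, c))
        adj[b].update((a, c))
        adj[c].update((a, b))
    return adj

def k_rings(triangles, seed, k):
    """Return [ring_0, ring_1, ..., ring_k], where ring_i holds the vertices
    whose edge distance from the seed vertex is exactly i."""
    adj = vertex_adjacency(triangles)
    visited = {seed}
    rings = [{seed}]
    for _ in range(k):
        frontier = set()
        for v in rings[-1]:
            frontier |= adj[v] - visited
        visited |= frontier
        rings.append(frontier)
    return rings

# A small triangle strip over vertices 0..5.
strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
rings = k_rings(strip, seed=0, k=3)
```

A local descriptor for the seed vertex would then be computed from the union of these rings; the work per vertex grows with the number of rings.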

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Portal de Revistas Ulima (Universidad de Lima)