Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed problems for machine learning methods. There is a hypothesis that using the Manhattan distance, or even fractional quasinorms lp (for p less than 1), can help to overcome the curse of dimensionality in classification problems. In this study, we systematically test this hypothesis. We confirm that fractional quasinorms have a greater relative contrast or coefficient of variation than the Euclidean norm l2, but we also demonstrate that distance concentration shows qualitatively the same behaviour for all tested norms and quasinorms, and that the difference between them decays as the dimension tends to infinity. Evaluating the classification quality of kNN under different norms and quasinorms shows that a greater relative contrast does not imply better classifier performance, and the worst performance on different databases was produced by different norms (quasinorms). A systematic comparison shows that the difference in performance of kNN based on lp for p = 2, 1, and 0.5 is statistically insignificant.
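As a minimal sketch of the kind of measurement this abstract describes (the uniform data, sample sizes, and parameter grid here are illustrative assumptions, not the study's exact protocol), one can compare the relative contrast and coefficient of variation of lp distances as the dimension grows:

```python
import numpy as np

def lp_dist(X, q, p):
    # Minkowski distance between each row of X and query q;
    # for p < 1 this is a quasinorm rather than a norm.
    return np.sum(np.abs(X - q) ** p, axis=1) ** (1.0 / p)

rng = np.random.default_rng(0)
n = 1000
for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(n, d))
    q = rng.uniform(size=d)
    for p in (0.5, 1.0, 2.0):
        dist = lp_dist(X, q, p)
        rc = (dist.max() - dist.min()) / dist.min()  # relative contrast
        cv = dist.std() / dist.mean()                # coefficient of variation
        print(f"d={d:4d}  p={p:3.1f}  RC={rc:7.3f}  CV={cv:6.4f}")
```

For small p the contrast is indeed larger at any fixed dimension, but both measures shrink toward zero for every p as d grows, which is the concentration behaviour the abstract reports.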
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Retrieval pipelines commonly rely on a term-based search to obtain candidate
records, which are subsequently re-ranked. Some candidates are missed by this
approach, e.g., due to a vocabulary mismatch. We address this issue by
replacing the term-based search with a generic k-NN retrieval algorithm, where
a similarity function can take into account subtle term associations. While an
exact brute-force k-NN search using this similarity function is slow, we
demonstrate that an approximate algorithm can be nearly two orders of magnitude
faster at the expense of only a small loss in accuracy. A retrieval pipeline
using an approximate k-NN search can be more effective and efficient than the
term-based pipeline. This opens up new possibilities for designing effective
retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.
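The abstract's similarity function is learned and domain-specific, but the exact-versus-approximate trade-off it describes can be sketched with an off-the-shelf ANN index. Below is a minimal illustration using cosine similarity and the HNSW method from the nmslib library; the random vectors and index parameters are assumptions for illustration, not the paper's setup:

```python
import numpy as np
import nmslib  # pip install nmslib

rng = np.random.default_rng(0)
docs = rng.normal(size=(100_000, 128)).astype(np.float32)  # stand-in document vectors
query = rng.normal(size=128).astype(np.float32)

# Exact brute-force k-NN under cosine similarity: O(n*d) work per query.
sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
exact_top10 = set(np.argsort(-sims)[:10])

# Approximate k-NN via an HNSW graph: fast queries after a one-time index build.
index = nmslib.init(method="hnsw", space="cosinesimil")
index.addDataPointBatch(docs)
index.createIndex({"M": 16, "efConstruction": 200})
approx_ids, _ = index.knnQuery(query, k=10)

print("recall@10:", len(exact_top10 & set(approx_ids)) / 10)
```

The speed/accuracy balance is tuned through index parameters such as M and efConstruction: a denser graph costs more to build but recovers more of the exact top-k.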
Dimensionality's blessing: Clustering images by underlying distribution
Distances between high-dimensional vectors tend to a constant. This is typically considered a negative "contrast-loss" phenomenon that hinders clustering and other machine learning techniques. We reinterpret "contrast-loss" as a blessing. Re-deriving "contrast-loss" using the law of large numbers, we show it results in a distribution's instances concentrating on a thin "hyper-shell". The hollow center means that apparently chaotically overlapping distributions are actually intrinsically separable. We use this to develop distribution-clustering, an elegant algorithm for grouping data points by their (unknown) underlying distribution. Distribution-clustering creates notably clean clusters from raw unlabeled data, estimates the number of clusters by itself, and is inherently robust to "outliers", which form their own clusters. This enables trawling for patterns in unorganized data and may be the key to enabling machine intelligence.
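The "thin hyper-shell" effect is easy to reproduce: for a standard d-dimensional Gaussian, point norms concentrate around sqrt(d) with a relative spread that vanishes as d grows. A minimal sketch (the Gaussian data and sample size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
for d in (2, 10, 100, 1000, 10_000):
    X = rng.normal(size=(n, d))    # sample from a standard d-dimensional Gaussian
    r = np.linalg.norm(X, axis=1)  # distance of each point from the distribution mean
    # The shell radius grows like sqrt(d) while its relative thickness shrinks.
    print(f"d={d:6d}  mean radius={r.mean():8.2f}  std/mean={r.std() / r.mean():.4f}")
```

Two distributions with coincident centers but different scales therefore occupy distinct shells in high dimension, which is the kind of separability that distribution-clustering exploits.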
Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks satisfy these conditions as a special case, though weight sharing is not the main reason for their exponential advantage.
LDA-Based Industry Classification
Industry classification is a crucial step in financial analysis. However, existing industry classification schemes have several limitations. To overcome these limitations, we propose an industry classification methodology based on business commonalities, using the topic features learned by Latent Dirichlet Allocation (LDA) from firms' business descriptions. Two types of classification, firm-centric and industry-centric, were explored. Preliminary evaluation results showed the effectiveness of our method.
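A minimal sketch of the pipeline the abstract outlines, using scikit-learn's LDA implementation (the toy descriptions, topic count, and clustering step are illustrative assumptions, not the paper's exact method):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical business descriptions; real inputs would come from firms' filings.
descriptions = [
    "We design and manufacture semiconductor chips for mobile devices.",
    "Our bank provides retail lending and wealth management services.",
    "We fabricate integrated circuits and license processor designs.",
    "The company offers consumer loans, deposits, and insurance products.",
]

counts = CountVectorizer(stop_words="english").fit_transform(descriptions)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

# Firm-centric classification: rank each firm's peers by topic-mixture similarity.
print(cosine_similarity(topics).round(2))

# Industry-centric classification: partition firms by clustering topic mixtures.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(topics))
```

Firms with similar topic mixtures, i.e., similar business commonalities, end up as close peers or in the same cluster regardless of how a legacy scheme would label them.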
Detecting the ultra low dimensionality of real networks
Reducing dimensional redundancy to find simplifying patterns in high-dimensional datasets and complex networks has become a major endeavor in many scientific fields. However, detecting the dimensionality of their latent space is challenging but necessary to generate efficient embeddings to be used in a multitude of downstream tasks. Here, we propose a method to infer the dimensionality of networks without the need for any a priori spatial embedding. Due to the ability of hyperbolic geometry to capture the complex connectivity of real networks, we detect ultra-low dimensionality, far below values reported using other approaches. We applied our method to real networks from different domains and found unexpected regularities, including: tissue-specific biomolecular networks being extremely low dimensional; brain connectomes being close to the three dimensions of their anatomical embedding; and social networks and the Internet requiring slightly higher dimensionality. Beyond paving the way towards an ultra-efficient dimensional reduction, our findings help address fundamental issues that hinge on dimensionality, such as universality in critical behavior.