Analysis and prediction of time series using Case-Based Reasoning technology.
This thesis describes a promising approach in which the problem of time series analysis and prediction is solved using Case Based Reasoning (CBR) technology. The foundations and main concepts of this technology are described in detail. Furthermore, a comparative study of different approaches to time series analysis is given, with particular attention to prediction. As the main contribution of the thesis, the system CuBaGe (Curve Base Generator) was developed: a robust and general architecture for curve representation and for indexing time series databases, based on CBR technology. It implements an original representation of time series together with a corresponding, likewise original, similarity measure modelled for that kind of curve representation. The robustness and generality of the system are illustrated by a real-world application in the domain of financial forecasting, which shows that the architecture can be employed equally well in conventional time series (where all values are known) and in some non-standard time series (sparse, vague, non-equidistant). Dealing with non-standard time series is the main advantage of the presented architecture.
Local Intrinsic Dimensionality Measures for Graphs, with Applications to Graph Embeddings
The notion of local intrinsic dimensionality (LID) is an important
advancement in data dimensionality analysis, with applications in data mining,
machine learning and similarity search problems. Existing distance-based LID
estimators were designed for tabular datasets encompassing data points
represented as vectors in a Euclidean space. After discussing their limitations
for graph-structured data considering graph embeddings and graph distances, we
propose NC-LID, a novel LID-related measure for quantifying the discriminatory
power of the shortest-path distance with respect to natural communities of
nodes as their intrinsic localities. It is shown how this measure can be used
to design LID-aware graph embedding algorithms by formulating two LID-elastic
variants of node2vec with personalized hyperparameters that are adjusted
according to NC-LID values. Our empirical analysis of NC-LID on a large number
of real-world graphs shows that this measure is able to point to nodes with
high link reconstruction errors in node2vec embeddings better than node
centrality metrics. The experimental evaluation also shows that the proposed
LID-elastic node2vec extensions improve node2vec by better preserving graph
structure in generated embeddings.
The Influence of Global Constraints on Similarity Measures for Time-Series Databases
A time series consists of a series of values or events obtained over repeated
measurements in time. Analysis of time series represents an important tool in
many application areas, such as stock market analysis, process and quality
control, observation of natural phenomena, medical treatments, etc. A vital
component in many types of time-series analysis is the choice of an appropriate
distance/similarity measure. Numerous measures have been proposed to date, with
the most successful ones based on dynamic programming. Because these measures
have quadratic time complexity, however, global constraints are often employed to limit the search
space in the matrix during the dynamic programming procedure, in order to speed
up computation. Furthermore, it has been reported that such constrained
measures can also achieve better accuracy. In this paper, we investigate two
representative time-series distance/similarity measures based on dynamic
programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS),
and the effects of global constraints on them. Through extensive experiments on
a large number of time-series data sets, we demonstrate how global constraints
can significantly reduce the computation time of DTW and LCS. We also show
that, if the constraint parameter is tight enough (less than 10-15% of
time-series length), the constrained measure becomes significantly different
from its unconstrained counterpart, in the sense of producing qualitatively
different 1-nearest neighbor graphs. This observation explains the potential
for accuracy gains when using constrained measures, highlighting the need for
careful tuning of constraint parameters in order to achieve a good trade-off
between speed and accuracy.
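To make the role of a global constraint concrete, the following is a minimal sketch of DTW restricted to a band around the matrix diagonal (a Sakoe-Chiba-style band). It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `dtw_distance`, the absolute-difference point cost, and the `window` parameter given in matrix cells (rather than as a percentage of series length) are all choices made here for the example.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW distance between numeric sequences a and b.

    window is the largest allowed |i - j| between aligned indices
    (hypothetical parameter name); None means no global constraint.
    """
    n, m = len(a), len(b)
    if window is None:
        w = max(n, m)
    else:
        # The band must be wide enough to reach the final cell (n, m).
        w = max(window, abs(n - m))

    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefixes align at zero cost
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        # Only cells inside the band are computed; the rest stay infinite.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


a = [0, 0, 1, 2, 1, 0]
b = [0, 1, 2, 1, 0, 0]
print(dtw_distance(a, b))            # 0.0: warping absorbs the one-step shift
print(dtw_distance(a, b, window=0))  # 4.0: no warping, point-by-point cost
```

With a band of half-width w the inner loop visits on the order of n·w cells instead of n·m, which is where the speed-up comes from. The example also shows that a very tight band can change the distance itself, not only the running time.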
Modules for evaluating ASCAPE machine learning models
This repository contains modules for evaluating ASCAPE machine learning models:
evalm.py - classes for evaluating classification and regression models
loader.py - loader for folded CSV datasets that do not contain missing values
skie.py - inference routines for scikit-learn models
tfnnie.py - inference routines for tensorflow neural network models
eval_all.py - script to evaluate all trained models
The Influence of Global Constraints on DTW and LCS Similarity Measures for Time-Series Databases
Analysis of time series represents an important tool in many application areas. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Because these measures have quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constraints can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10-15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbour graphs. This observation highlights the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
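For the LCS measure, a global constraint can be sketched in the same spirit: two points may be matched only if their indices differ by at most a fixed amount. The code below is a hedged illustration, not the paper's implementation. The function name `lcs_similarity`, the matching threshold `epsilon` for real-valued points, the `window` parameter in matrix cells, and the normalisation by the shorter series length are all assumptions made for this example.

```python
def lcs_similarity(a, b, epsilon=0.5, window=None):
    """Normalised LCS similarity between numeric sequences a and b.

    Two points match when they differ by at most epsilon. window is the
    largest allowed |i - j| for a match (hypothetical parameter name);
    None means no global constraint.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    # The band must be wide enough to reach the final cell (n, m).
    w = max(n, m) if window is None else max(window, abs(n - m))

    prev = [0] * (m + 1)
    for i in range(1, n + 1):
        curr = [0] * (m + 1)
        # Only cells inside the band are computed.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            if abs(a[i - 1] - b[j - 1]) <= epsilon:
                curr[j] = prev[j - 1] + 1
            else:
                # prev[j - 1] always lies inside the band, so including it
                # keeps the value correct when a neighbour was not computed.
                curr[j] = max(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m] / min(n, m)


a = [0, 0, 1, 2, 1, 0]
b = [0, 1, 2, 1, 0, 0]
print(lcs_similarity(a, b, epsilon=0.1))            # 5 of 6 points matched
print(lcs_similarity(a, b, epsilon=0.1, window=0))  # 2 of 6 points matched
```

On this toy pair the unconstrained measure matches five points, while the tightest band matches only the two points that coincide index by index. This illustrates how a tight constraint can turn the measure into a qualitatively different one.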
Achievement motive and locus of control as motivation factors of European identity
The research presented in this paper is a part of the project "Condition, Factors and Development of European Identity in Serbia and Montenegro", which is financed by the Ministry for Science and Protection of Environment of the Republic of Serbia. The goal of this research is to determine the relation between the achievement motive and locus of control on the one hand, and European identity on the other. These motivational factors were selected as indicators of the active orientation of persons, which is a psychological prerequisite for the development of consciousness of the people and the society as a whole. The research was carried out during 2003 on a sample of 2635 subjects from four regions of Serbia and Montenegro, of both sexes and different levels of education, aged 18 to 43. The EUROID2002 scale was applied to measure European and national identity. The achievement motive was measured with the MOP2002 scale, and locus of control with LOKKON2002. All these instruments were created at the Department of Psychology in Novi Sad. Canonical correlation analysis was used to process the data. The results point to the existence of two statistically significant regularities in the relation between the achievement motive and European identity. The achievement of a goal, experienced as a source of pleasure and accompanied by the need to compete with others, was related to two factors of social identity, namely the need to preserve national identity and exclusive national attachment. Thus, the results indicate that persons for whom competition with others is a significant goal also express a higher degree of nationalism. Furthermore, the results indicate that the degree of nationalism increases with the degree of persistence in competition with others. In the relation between locus of control and European identity, three statistically significant regularities were obtained. The more pronounced the faith in the power of destiny, the more pronounced the need to preserve national identity and exclusive national attachment. A stronger internal locus of control is related to greater openness to technological progress. Furthermore, the results indicate that those who in achievement situations attach great importance to the activities of other people also have a more pronounced pro-European orientation.
Relation between European and national identity and rigidity as a personality trait
The research presented in this paper is a part of the project "Condition, Factors and Development of European Identity in Serbia and Montenegro", which started in 2002 with the financial support of the Ministry for Science and Protection of Environment of the Republic of Serbia. The research studied the relation between certain kinds of social identity (European, i.e. national identity) on the one hand, and the degree of rigidity of the subjects on the other. The sample consisted of 2685 inhabitants of Serbia and Montenegro, of both sexes, different levels of education and 18 to 43 years old. European, i.e. national identity was measured with the questionnaire EUROID2002, which consisted of 36 items related to different aspects of social identity. Factor analysis singled out five factors of social identity: pro-European orientation, advocating the preservation of national identity, confronting traditional values and technological civilization, globalization as a threatening factor for small and poor nations, and exclusive national attachment. Rigidity of the subjects was determined with the application of the RG-2 questionnaire which consists of 30 items related to the rigidity of thought in various life situations. Factor analysis singled out two factors of rigidity: rigidity toward oneself and others, as well as rigidity in one's life habits. The relation between social identity and rigidity was determined with the technique of canonical analysis. Two significant canonical roots were singled out: the first canonical root includes two aspects of rigidity which were positively related to the factors implying a strongly pronounced national identity (advocating the preservation of national identity and exclusive national attachment). The second canonical root indicates a positive relation between rigidity in life habits and traditional standpoints and fear of globalization.
Sentiment prediction based on analysis of customers assessments in food serving businesses
Human activities and behaviour in different domains are usually influenced by other people's actions and opinions. There is a growing research interest in sentiment analysis, evaluation and prediction. Content from web sources and social media is frequently used when people want to see others' opinions about different things. Our research is focused on ML-based sentiment analysis of food services review data. A comparison of several regression models with regard to predicting customer satisfaction with restaurants and food services is presented. The experimental data, collected from food serving businesses located in the Shanghai Lujiazui Commercial Zone, include keywords extracted from the customers' written reviews. Additionally, the data are spatially labelled, which makes it possible to conduct separate analyses for different geographical regions. In conclusion, the keywords extracted from the customers' reviews were suitable for the prediction of three observed satisfaction criteria: food taste, service, and environment.