16 research outputs found

    Analysis and Prediction of Time Series Flow Using "Case-Based Reasoning" Technology (original title: Analiza i predviđanje toka vremenskih serija pomoću "Case-Based Reasoning" tehnologije)

    This doctoral thesis describes a promising approach to time series analysis and prediction based on Case-Based Reasoning (CBR) technology. The foundations and main concepts of this technology are described in detail. A comparative study of different approaches to time series analysis is also given, with particular attention to prediction. The main contribution of the thesis is the system CuBaGe (Curve Base Generator), a robust and general CBR-based architecture for curve representation and for indexing time series databases, which implements an original time series representation together with a corresponding, also original, similarity measure. The robustness and generality of the system are illustrated by a real application in financial forecasting, where the architecture is shown to work equally well on conventional time series (where all values are known) and on some non-standard time series (sparse, vague, non-equidistant). Handling non-standard time series is the main advantage of the presented architecture.

    Local Intrinsic Dimensionality Measures for Graphs, with Applications to Graph Embeddings

    The notion of local intrinsic dimensionality (LID) is an important advancement in data dimensionality analysis, with applications in data mining, machine learning and similarity search problems. Existing distance-based LID estimators were designed for tabular datasets encompassing data points represented as vectors in a Euclidean space. After discussing their limitations for graph-structured data considering graph embeddings and graph distances, we propose NC-LID, a novel LID-related measure for quantifying the discriminatory power of the shortest-path distance with respect to natural communities of nodes as their intrinsic localities. It is shown how this measure can be used to design LID-aware graph embedding algorithms by formulating two LID-elastic variants of node2vec with personalized hyperparameters that are adjusted according to NC-LID values. Our empirical analysis of NC-LID on a large number of real-world graphs shows that this measure is able to point to nodes with high link reconstruction errors in node2vec embeddings better than node centrality metrics. The experimental evaluation also shows that the proposed LID-elastic node2vec extensions improve node2vec by better preserving graph structure in generated embeddings.

    The Influence of Global Constraints on Similarity Measures for Time-Series Databases

    A time series consists of a series of values or events obtained over repeated measurements in time. Analysis of time series represents an important tool in many application areas, such as stock market analysis, process and quality control, observation of natural phenomena, medical treatments, etc. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Because these measures have quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. Furthermore, it has been reported that such constrained measures can also achieve better accuracy. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constraints can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10-15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbor graphs. This observation explains the potential for accuracy gains when using constrained measures, highlighting the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
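
    The effect of a global constraint on DTW can be illustrated with a minimal sketch. It assumes a Sakoe-Chiba band, which is one common form of global constraint; the abstract does not say which constraint the authors used. The function name `dtw_distance`, the `window` parameter and the absolute-difference local cost are illustrative choices, not taken from the paper.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW distance between two numeric sequences.

    window is the largest allowed |i - j| between aligned positions
    (an assumed Sakoe-Chiba band half-width). None means unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must span the length difference, otherwise no warping
    # path can reach the final cell of the matrix.
    window = max(window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are visited; this is the speed-up.
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


if __name__ == "__main__":
    a = [0, 0, 1, 2, 1, 0]
    b = [0, 1, 2, 1, 0, 0]
    print(dtw_distance(a, b))            # unconstrained
    print(dtw_distance(a, b, window=0))  # diagonal only
```

    Here `window` is an absolute number of positions; a constraint stated as a percentage of the series length, as in the abstract, would first be converted, e.g. 10% of a series of length 100 gives `window=10`. The number of visited cells drops from n*m to roughly n*(2*window + 1).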

    Modules for evaluating ASCAPE machine learning models

    This repository contains modules for evaluating ASCAPE machine learning models:
    evalm.py - classes for evaluating classification and regression models
    loader.py - loader for folded CSV datasets that do not contain missing values
    skie.py - inference routines for scikit-learn models
    tfnnie.py - inference routines for tensorflow neural network models
    eval_all.py - script to evaluate all trained models

    The Influence of Global Constraints on DTW and LCS Similarity Measures for Time-Series Databases

    Analysis of time series represents an important tool in many application areas. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Because these measures have quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constraints can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10–15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbour graphs. This observation highlights the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
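
    As a rough illustration of a globally constrained LCS measure, the sketch below counts matching elements while only allowing matches between positions that are at most `window` apart. This is an assumed formulation for numeric series: the name `lcs_similarity`, the matching tolerance `epsilon`, the `window` parameter and the widening of the band to cover unequal lengths are all illustrative choices, not details given in the abstract.

```python
def lcs_similarity(a, b, epsilon=0.0, window=None):
    """Length of the longest common subsequence of two numeric sequences.

    Two elements match when they differ by at most epsilon. Matches are
    only allowed between positions i and j with |i - j| <= window
    (an assumed band-style global constraint). None means unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # Simplification: widen the band so the final cell lies inside it.
    window = max(window, abs(n - m))

    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        # Cells outside the band are never visited and stay at 0.
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            if abs(a[i - 1] - b[j - 1]) <= epsilon:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # The diagonal neighbour is included so that the value
                # still propagates when the band is a single diagonal.
                table[i][j] = max(table[i - 1][j],
                                  table[i][j - 1],
                                  table[i - 1][j - 1])
    return table[n][m]


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [9, 1, 2, 3, 4]
    print(lcs_similarity(a, b))            # unconstrained
    print(lcs_similarity(a, b, window=0))  # diagonal only
```

    In this example the common subsequence is shifted by one position, so a band of width zero misses it entirely while a band of width one recovers it. This is a small instance of a tight constraint producing a qualitatively different result from the unconstrained measure.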

    Achievement motive and locus of control as motivation factors of European identity

    The research presented in this paper is a part of the project "Condition, Factors and Development of European Identity in Serbia and Montenegro", which is financed by the Ministry for Science and Protection of Environment of the Republic of Serbia. The goal of this research is to determine the relation between the achievement motive and locus of control on the one hand, and European identity on the other. These motivational factors were selected as indicators of the active orientation of persons, which is a psychological prerequisite for the development of consciousness of the people and the society as a whole. The research was carried out during 2003 on a sample of 2635 subjects from four regions of Serbia and Montenegro, of both sexes, with different levels of education, and 18 to 43 years old. The EUROID2002 scale was used to measure European and national identity. The achievement motive was measured with the MOP2002 scale, and locus of control with LOKKON2002. All these instruments were created at the Department of Psychology in Novi Sad. Canonical correlation analysis was used to process the data. The results point to the existence of two statistically significant regularities in the relation between the achievement motive and European identity. The achievement of a goal, experienced as a source of pleasure and followed by the need to compete with others, was related to two factors of social identity, namely the need to preserve national identity and exclusive national attachment. Thus, the results indicate that persons for whom competition with others is a significant goal also express a higher degree of nationalism. Furthermore, the results indicate that the degree of nationalism rises with the degree of persistence in competition with others. In the relation between the locus of control and European identity, three statistically significant regularities were obtained. The more pronounced the belief in the power of destiny, the more pronounced the need to preserve national identity and exclusive national attachment. A stronger internal locus of control is related to greater openness to technological progress. Furthermore, the results indicate that those who in achievement situations attach great importance to the activities of other people also have a more pronounced pro-European orientation.

    Relation between European and national identity and rigidity as a personality trait

    The research presented in this paper is a part of the project “Condition, Factors and Development of European Identity in Serbia and Montenegro”, which started in 2002 with the financial support of the Ministry for Science and Protection of Environment of the Republic of Serbia. The research studied the relation between certain kinds of social identity (European, i.e. national identity) on the one hand, and the degree of rigidity of the subjects on the other. The sample consisted of 2685 inhabitants of Serbia and Montenegro, of both sexes, different levels of education and 18 to 43 years old. European, i.e. national identity was measured with the questionnaire EUROID2002, which consisted of 36 items related to different aspects of social identity. Factor analysis singled out five factors of social identity: pro-European orientation, advocating the preservation of national identity, confronting traditional values and technological civilization, globalization as a threatening factor for small and poor nations, and exclusive national attachment. Rigidity of the subjects was determined with the application of the RG-2 questionnaire which consists of 30 items related to the rigidity of thought in various life situations. Factor analysis singled out two factors of rigidity: rigidity toward oneself and others, as well as rigidity in one’s life habits. The relation between social identity and rigidity was determined with the technique of canonical analysis. Two significant canonical roots were singled out: the first canonical root includes two aspects of rigidity which were positively related to the factors implying a strongly pronounced national identity (advocating the preservation of national identity and exclusive national attachment). The second canonical root indicates a positive relation between rigidity in life habits and traditional standpoints and fear of globalization.

    Sentiment prediction based on analysis of customers assessments in food serving businesses

    Human activities and behaviour in different domains are usually influenced by other people’s actions and opinions. There is a growing research interest in sentiment analysis, evaluation and prediction. Content from web sources and social media is frequently used when people want to see others’ opinions about different things. Our research is focused on ML-based sentiment analysis of food service review data. A comparison of several regression models with regard to predicting customer satisfaction with restaurants and food services is presented. The experimental data, collected from food serving businesses located in the Shanghai Lujiazui Commercial Zone, include keywords extracted from the customers’ written reviews. Additionally, the data are spatially labelled, which enables separate analyses for different geographical regions. In conclusion, the keywords extracted from the customers’ reviews were suitable for the prediction of three observed satisfaction criteria: food taste, service, and environment.