
    Triangular Fuzzy Time Series for Two Factors High-order based on Interval Variations

    Fuzzy time series (FTS), first introduced by Song and Chissom, has been developed to forecast data such as enrollment figures, stock indices, and air pollution. When forecasting FTS data, several authors define the universe of discourse using coefficient values, substituting any integer or real number. This study focuses on interval variation in order to obtain a better evaluation. Coefficient values are analyzed and compared across unequal and equal partition intervals, with base and triangular fuzzy membership functions applied in a two-factor high-order model. The study is implemented on the Shen-hu stock index data; the models are evaluated by the average forecasting error rate (AFER) and compared with existing methods, yielding an AFER of 0.28% on the Shen-hu daily data. Based on this result, the research can serve as a reference for determining better intervals and membership degree values in fuzzy time series.
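    The two ingredients named above can be sketched generically; the triangular membership below is the standard textbook form and AFER the usual mean relative error, not necessarily the authors' exact variants:

```python
def triangular_membership(x, a, b, c):
    """Standard triangular fuzzy membership: 0 outside [a, c], peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def afer(actual, forecast):
    """Average Forecasting Error Rate, in percent: mean of |A - F| / A."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)
```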

    The cross-association relation based on intervals ratio in fuzzy time series

    The fuzzy time series (FTS) is a forecasting model based on linguistic values, developed in recent years as existing methods proved insufficiently accurate. This research modifies the determination and partitioning of the universe of discourse, the fuzzy logic relationships (FLR), and the handling of historical data variation, using an intervals ratio, a cross-association relationship, and Indonesian rubber production data, respectively. The modified steps start with the intervals ratio to partition the determined universe of discourse. The triangular fuzzy sets are then built, allowing fuzzification. After this, the FLR are built based on the cross-association relationship, leading to defuzzification. The average forecasting error rate (AFER) was used to compare the modified results with the existing methods, and simulations were conducted using Indonesian rubber production data from 2000-2020. With an AFER of 4.77% < 10%, the modification has a smaller error than previous methods, indicating very good forecasting criteria. In addition, the coefficient values D1 and D2 were obtained automatically from the intervals ratio algorithm. Future work will modify the partitioning of the universe of discourse using frequency density to eliminate unused partition intervals.
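    The first steps of the FTS pipeline above (universe of discourse, partitioning, fuzzification) can be sketched as follows; the equal-width partition here is a stand-in for the paper's intervals-ratio scheme, which is not reproduced:

```python
import numpy as np

def universe_of_discourse(data, d1, d2):
    # U = [min - D1, max + D2]; D1 and D2 pad the observed range
    return min(data) - d1, max(data) + d2

def equal_partitions(lo, hi, n):
    # n equal-width intervals over U (stand-in for the intervals-ratio scheme)
    edges = np.linspace(lo, hi, n + 1)
    return list(zip(edges[:-1], edges[1:]))

def fuzzify(x, intervals):
    # assign x the index of the first interval containing it
    for i, (a, b) in enumerate(intervals):
        if a <= x <= b:
            return i
    return None
```

A value is thus mapped to a linguistic label (its interval index) before the fuzzy logic relationships are built.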

    A Hybrid Fuzzy Time Series Technique for Forecasting Univariate Data

    In this paper, a hybrid forecasting technique that integrates Cat Swarm Optimization Clustering (CSO-C) and Particle Swarm Optimization (PSO) with fuzzy time series (FTS) forecasting is presented. Among the three stages of FTS, CSO-C is applied at the fuzzification stage, where its data-classification capability is used to divide the universe of discourse into unequal parts. Disambiguated fuzzy relationships are then obtained using the Fuzzy Set Group (FSG). In the final stage, PSO is adopted for optimization, tuning the weights assigned to the fuzzy sets in each rule, where a rule is a fuzzy logical relationship induced from the FSG. The forecasting results show that the proposed method outperforms other existing methods, using RMSE and MAPE as performance metrics.
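    A generic particle swarm optimizer of the kind used in the final stage is sketched below; the sphere function in the usage example is illustrative only, while the paper's actual objective (scoring weighted fuzzy-rule forecasts by RMSE) is not reproduced here:

```python
import random

def pso(objective, dim, n_particles=20, iters=150, w=0.7, c1=1.5, c2=1.5):
    """Minimize `objective` over R^dim with a standard global-best PSO."""
    pos = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal bests
    pbest_val = [objective(p) for p in pos]
    gbest = pbest[min(range(n_particles), key=lambda i: pbest_val[i])][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # inertia + cognitive pull to pbest + social pull to gbest
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < objective(gbest):
                    gbest = pos[i][:]
    return gbest

# Illustrative usage: minimize the sphere function
best = pso(lambda p: sum(x * x for x in p), dim=2)
```

In the paper's setting, the particle position would encode the rule weights, and the objective would be the RMSE of the resulting forecasts.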

    Evolving granular systems (Sistemas granulares evolutivos)

    Advisor: Fernando Antonio Campos Gomide. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets that in principle have no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. In particular, information granulation is a natural technique for dispensing with unnecessary details and emphasizing the transparency, interpretability, and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is used to represent. 
Of particular concern to this work are numerical, interval, and fuzzy types of granular data, and interval, fuzzy, and neuro-fuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build the model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is meaningful because it avoids redesigning and retraining models whenever the environment changes. Application examples in classification, function approximation, time-series prediction, and control using real and synthetic data illustrate the usefulness of the proposed granular approaches and framework. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light on the role of interval, fuzzy, and neuro-fuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summaries of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streams. (Doctorate in Automation; Doctor of Electrical Engineering.)
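    The per-sample incremental learning described in the abstract can be illustrated with a deliberately minimal single-pass interval granulation; the `margin` parameter and the absorb-or-create rule below are simplifications for illustration, not the thesis's actual algorithms:

```python
class IntervalGranule:
    """A 1-D interval granule [lo, hi] that expands to cover assigned samples."""
    def __init__(self, x):
        self.lo = self.hi = x

    def contains(self, x, margin):
        return self.lo - margin <= x <= self.hi + margin

    def absorb(self, x):
        self.lo, self.hi = min(self.lo, x), max(self.hi, x)

def learn_granules(stream, margin):
    """Single pass over the stream: absorb each sample into a nearby granule,
    or create a new granule when none is close enough (no retraining)."""
    granules = []
    for x in stream:
        for g in granules:
            if g.contains(x, margin):
                g.absorb(x)
                break
        else:
            granules.append(IntervalGranule(x))
    return granules
```

The point of the sketch is the paradigm, not the specific rule: structure grows only when the data demand it, so the model never has to be rebuilt from scratch when the environment changes.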

    A NEW HYBRID FUZZY TIME SERIES FORECASTING MODEL BASED ON COMBINING FUZZY C-MEANS CLUSTERING AND PARTICLE SWARM OPTIMIZATION

    The fuzzy time series (FTS) model is an effective tool for identifying factors in complex and uncertain processes, and it is now widely used in many forecasting problems. However, three issues remain in FTS models: establishing effective fuzzy relationship groups, finding the proper length of each interval, and building the defuzzification rule. Therefore, in this paper, a novel FTS forecasting model based on fuzzy C-means (FCM) clustering and particle swarm optimization (PSO) is developed to enhance forecasting accuracy. First, FCM clustering is used to divide the historical data into intervals of different lengths. After generating the intervals, the historical data are fuzzified into fuzzy sets. Fuzzy relationship groups are then established based on the appearance history of the fuzzy sets on the right-hand side of the fuzzy logical relationships, in order to calculate the forecasting output. Finally, the proposed model is combined with the PSO algorithm to adjust interval lengths and find proper intervals in the universe of discourse for the best forecasting accuracy. To verify its effectiveness, three numerical datasets (enrollment data of the University of Alabama, the Taiwan Futures Exchange (TAIFEX) data, and yearly deaths in car road accidents in Belgium) are used to illustrate the proposed model. The experimental results indicate that the proposed model outperforms existing forecasting models in terms of accuracy for both first-order and high-order FTS.
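    A minimal sketch of the FCM-based interval generation step, assuming the common construction of placing interval bounds at midpoints between adjacent sorted cluster centers (the paper's exact construction may differ):

```python
import numpy as np

def fcm(data, c, m=2.0, iters=100, tol=1e-5):
    """Minimal fuzzy C-means on 1-D data; returns sorted cluster centers."""
    x = np.asarray(data, dtype=float)
    rng = np.random.default_rng(0)
    u = rng.random((c, len(x)))
    u /= u.sum(axis=0)                                  # membership columns sum to 1
    p = 2.0 / (m - 1.0)
    for _ in range(iters):
        um = u ** m
        centers = (um @ x) / um.sum(axis=1)             # weighted means
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        new_u = 1.0 / (dist ** p * (1.0 / dist ** p).sum(axis=0))
        if np.abs(new_u - u).max() < tol:
            u = new_u
            break
        u = new_u
    return np.sort(centers)

def intervals_from_centers(centers, lo, hi):
    """Interval bounds at midpoints between adjacent cluster centers."""
    mids = (centers[:-1] + centers[1:]) / 2
    edges = np.concatenate(([lo], mids, [hi]))
    return list(zip(edges[:-1], edges[1:]))
```

Because dense regions of the data attract more centers, the resulting intervals are shorter where observations cluster, which is the motivation for using FCM instead of equal-width partitioning.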

    Hierarchical Clustering of Time Series Based on Linear Information Granules

    Time series clustering is one of the main tasks in time series data mining. In this paper, a new time series clustering algorithm based on linear information granules is proposed. First, we improve the identification of fluctuation points using a threshold set; these points represent the main trend information of the original time series. Then, using the fluctuation points as segmentation nodes, we segment the original time series into several information granules, each represented by a linear function. With this information granulation, a granular time series consisting of several linear information granules replaces the original one. To cluster time series, we then propose a segmented matching distance measure based on linear information granules (LIG_SMD) to calculate the distance between every two granular time series, and apply hierarchical clustering based on this new distance (LIG_SMD_HC) to obtain clustering results. Finally, experiments on several public and real-world time series datasets examine the effectiveness of the proposed algorithm, with Euclidean distance based hierarchical clustering (ED_HC) and Dynamic Time Warping distance based hierarchical clustering (DTW_HC) as the compared algorithms. Our results show that LIG_SMD_HC outperforms ED_HC and DTW_HC in terms of F-Measure and Accuracy.
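    The granulation step described above can be sketched as follows; the simple threshold-on-differences rule for fluctuation points is an illustrative stand-in for the paper's improved identification method:

```python
import numpy as np

def fluctuation_points(series, threshold):
    """Indices where the absolute step change exceeds the threshold
    (simplified rule), always including the first and last index."""
    pts = [0]
    for i in range(1, len(series)):
        if abs(series[i] - series[i - 1]) > threshold:
            pts.append(i)
    if pts[-1] != len(series) - 1:
        pts.append(len(series) - 1)
    return pts

def linear_granules(series, threshold):
    """Fit a line to each segment between fluctuation points; each granule
    is a (slope, intercept, length) triple replacing the raw values."""
    pts = fluctuation_points(series, threshold)
    granules = []
    for a, b in zip(pts[:-1], pts[1:]):
        xs = np.arange(a, b + 1)
        slope, intercept = np.polyfit(xs, series[a:b + 1], 1)
        granules.append((slope, intercept, b - a + 1))
    return granules
```

A segmented matching distance such as LIG_SMD would then compare two series granule by granule instead of point by point, which is what makes the subsequent hierarchical clustering cheaper than DTW.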

    Low-latency, query-driven analytics over voluminous multidimensional, spatiotemporal datasets

    2017 Summer. Includes bibliographical references. Ubiquitous data collection from sources such as remote sensing equipment, networked observational devices, location-based services, and sales tracking has led to the accumulation of voluminous datasets; IDC projects that by 2020 we will generate 40 zettabytes of data per year, while Gartner and ABI estimate 20-35 billion new devices will be connected to the Internet in the same time frame. The storage and processing requirements of these datasets far exceed the capabilities of modern computing hardware, which has led to the development of distributed storage frameworks that can scale out by assimilating more computing resources as necessary. While challenging in its own right, storing and managing voluminous datasets is only the precursor to a broader field of study: extracting knowledge, insights, and relationships from the underlying datasets. The basic building block of this knowledge discovery process is analytic queries, encompassing both query instrumentation and evaluation. This dissertation is centered around query-driven exploratory and predictive analytics over voluminous, multidimensional datasets. Both of these types of analysis represent a higher-level abstraction over classical query models; rather than indexing every discrete value for subsequent retrieval, our framework autonomously learns the relationships and interactions between dimensions in the dataset (including time series and geospatial aspects), and makes the information readily available to users. This functionality includes statistical synopses, correlation analysis, hypothesis testing, probabilistic structures, and predictive models that not only enable the discovery of nuanced relationships between dimensions, but also allow future events and trends to be predicted. 
This requires specialized data structures and partitioning algorithms, along with adaptive reductions in the search space and management of the inherent trade-off between timeliness and accuracy. The algorithms presented in this dissertation were evaluated empirically on real-world geospatial time-series datasets in a production environment, and are broadly applicable across other storage frameworks.

    Fuzzy Time Series and the Average-Based Length Algorithm for Predicting Indonesian Migrant Workers (Pekerja Migran Indonesia)

    The number of Indonesian migrant workers (Pekerja Migran Indonesia, PMI) in the Government to Government (G to G) program with Japan, in the fields of nurses and care workers, fluctuated from 2008 to 2018. To analyze these fluctuations by measuring the current number of PMI and predicting future conditions, a prediction model is needed. In this study, a fuzzy time series model is applied using the average-based length algorithm. Determining an effective interval length influences the prediction results and can substantially increase accuracy in fuzzy time series. The predictions for the G to G Japan PMI program for 2019 were 43.3 for the nurse field, 300 for the care worker field, and 325 overall. The prediction performance, measured by the Mean Absolute Percentage Error (MAPE), was 24.27% for the nurse field (accuracy in the 20–50% range, the "reasonable" criterion), 11.29% for the care worker field (10–20% range, the "good" criterion), and 8.41% overall (MAPE < 10%, the "very good" criterion). These predictions can support management decisions on policies for the preparation, planning, scheduling, placement, and protection of future PMI candidates, thereby improving the quality of human resources in providing the best service to prospective PMI candidates in the G to G Japan program.
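    The two quantitative ingredients above can be sketched as follows; the average-based length rule follows Huarng's commonly published formulation (half the mean absolute first difference, rounded to its order-of-magnitude base), which may differ in detail from the authors' implementation:

```python
import math

def average_based_length(data):
    """Average-based interval length: half the mean absolute first
    difference, rounded to its base (..., 0.1, 1, 10, 100, ...)."""
    diffs = [abs(b - a) for a, b in zip(data, data[1:])]
    half_avg = sum(diffs) / len(diffs) / 2
    base = 10 ** math.floor(math.log10(half_avg))
    return round(half_avg / base) * base

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)
```

The MAPE bands cited in the abstract (< 10% "very good", 10–20% "good", 20–50% "reasonable") are then applied directly to the output of `mape`.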

    Rough Set Applied to Air Pollution: A New Approach to Manage Pollutions in High Risk Rate Industrial Areas

    This study presents a rough set application, combining ideas from the classical rough set approach, based on the indiscernibility relation, and the dominance-based rough set approach (DRSA), for managing air micro-pollution in an industrial site with a high environmental risk rate: the industrial area of Syracuse in the south of Italy (Sicily). This data analysis tool has been applied with considerable success to decision problems in various fields, since it can deal with both quantitative and qualitative data, and its results are expressed as decision rules understandable by the decision-maker. In this chapter, some issues related to the multi-attribute sorting (i.e., preference-ordered classification) of air pollution risk are presented, considering meteorological variables, both qualitative and quantitative, as the attributes and criteria describing the objects (pollution occurrences) to be classified, that is, different levels of sulfur oxides (SOx), nitrogen oxides (NOx), and methane (CH4) as pollution indicators. The most significant results of this application are presented and discussed: examples of 'if, ..., then' decision rules; attribute relevance as an output of the data analysis, including exchangeable or indispensable attributes/criteria; and the qualitative substitution effect and interaction between them.
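    The classical rough set side of the approach rests on the indiscernibility relation; a minimal sketch on hypothetical meteorological attributes (the attribute names and values below are invented for illustration):

```python
from collections import defaultdict

def indiscernibility_classes(objects, attributes):
    """Partition objects into equivalence classes of the indiscernibility
    relation: two objects are indiscernible if they agree on all the
    chosen attributes (classical rough set theory)."""
    classes = defaultdict(list)
    for name, description in objects.items():
        key = tuple(description[a] for a in attributes)
        classes[key].append(name)
    return list(classes.values())

# Illustrative, invented data: daily pollution-occurrence records
days = {
    'day1': {'wind': 'low',  'temp': 'high'},
    'day2': {'wind': 'low',  'temp': 'high'},
    'day3': {'wind': 'high', 'temp': 'low'},
}
```

Decision rules and attribute relevance are then derived from how these equivalence classes align with the decision classes (here, the pollution-level categories); DRSA replaces equivalence with dominance to respect the preference order.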