    AUGUR: Forecasting the Emergence of New Research Topics

    Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. We address this issue by introducing Augur, a novel approach to the early detection of research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. Here we also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically for supporting this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 interval and outperformed four alternative approaches in terms of both precision and recall.

    Assessing and predicting small industrial enterprises’ credit ratings: A fuzzy decision-making approach

    Corporate credit-rating assessment plays a crucial role in helping financial institutions make their lending decisions and in reducing the financial constraints of small enterprises. This paper presents a new approach for small industrial enterprises’ credit-rating assessment using fuzzy decision-making methods, and tests it using real bank loan data from 1,820 small industrial enterprises in China. The procedure of the proposed rating approach includes (1) using triangular fuzzy numbers to quantify the qualitative evaluation indicators; (2) adopting a correlation analysis, univariate analysis and stepping backwards feature selection method to select the input features; (3) employing the best-worst method (BWM) combined with the entropy weight method (EWM), the fuzzy c-means algorithm and the technique for order of preference by similarity to ideal solution (TOPSIS) to classify small enterprises into rating classes; and (4) applying the lattice degree of nearness to predict a new loan applicant’s rating. We also conduct a 10-fold cross-validation to evaluate the predictive performance of our proposed approach. The predictive results demonstrate that our proposed data-processing and feature selection approaches have better accuracy than the alternative approaches in predicting default, offering bankers a new valuable rating system to assist their decision making.
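
    As a rough illustration of the TOPSIS step named in the abstract above, the sketch below ranks alternatives by their closeness to an ideal solution. It is plain textbook TOPSIS only: the BWM/EWM weighting, the fuzzy c-means step and the paper's actual indicators are not reproduced, and the function name, the two example criteria and the equal weights are assumptions made for illustration.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Closeness coefficient per row, in [0, 1]; higher is closer to the ideal."""
    X = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # weights are normalised to sum to 1
    norms = np.linalg.norm(X, axis=0)        # vector normalisation per criterion
    norms[norms == 0] = 1.0
    V = X / norms * w                        # weighted normalised decision matrix
    benefit = np.asarray(benefit, dtype=bool)
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)   # distance to ideal solution
    d_neg = np.linalg.norm(V - anti, axis=1)    # distance to anti-ideal solution
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.full_like(d_neg, 0.5), where=denom > 0)

# Hypothetical applicants; columns: profit margin (benefit), debt ratio (cost).
scores = topsis([[0.20, 0.30], [0.10, 0.60], [0.15, 0.45]], [0.5, 0.5], [True, False])
ranking = np.argsort(-scores)                # best applicant first
```

    In this toy data the first applicant is best on both criteria and the second is worst on both, so they sit at the two ends of the ranking.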

    Image-based quantitative analysis of gold immunochromatographic strip via cellular neural network approach

    (c) 2014 IEEE. Gold immunochromatographic strip assay provides a rapid, simple, single-copy and on-site way to detect the presence or absence of the target analyte. This paper aims to develop a method for accurately segmenting the test line and control line of the gold immunochromatographic strip (GICS) image for quantitatively determining the trace concentrations in the specimen, which can provide more functional information than the traditional qualitative or semi-quantitative strip assay. The Canny operator and the mathematical morphology method are used to detect and extract the GICS reading-window. Then, the test line and control line of the GICS reading-window are segmented by the cellular neural network (CNN) algorithm, where the template parameters of the CNN are designed by the switching particle swarm optimization (SPSO) algorithm to improve the performance of the CNN. It is shown that the SPSO-based CNN offers a robust method for accurately segmenting the test and control lines, and therefore serves as a novel image methodology for the interpretation of GICS. Furthermore, a quantitative comparison is carried out among four algorithms in terms of the peak signal-to-noise ratio. It is concluded that the proposed CNN algorithm gives higher accuracy and that the CNN is capable of parallelism and analog very-large-scale integration implementation within a remarkably efficient time.

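    The peak signal-to-noise ratio used for the comparison in the abstract above has a standard definition, sketched below. The 8-bit peak value of 255 and the toy images are assumptions; the abstract does not say how the reference image was obtained.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    mse = np.mean((ref - tst) ** 2)          # mean squared pixel error
    if mse == 0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Hypothetical 4x4 image and a copy with a constant offset of 5 grey levels.
ref_img = np.zeros((4, 4))
noisy_img = ref_img + 5.0
value_db = psnr(ref_img, noisy_img)
```

    A higher value means the segmented or reconstructed image is closer to the reference.
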
    Development of Neurofuzzy Architectures for Electricity Price Forecasting

    In the 20th century, many countries liberalized their electricity markets. This liberalization has led generation companies as well as wholesale buyers to take on greater risk exposure than under the old centralized framework. In this setting, electricity price prediction has become crucial for market players in their decision-making and strategic planning. In this study, a prototype asymmetric-based neuro-fuzzy network (AGFINN) architecture has been implemented for short-term electricity price forecasting for the ISO New England market. The AGFINN framework has been designed with two different defuzzification schemes. Fuzzy clustering has been explored as an initial step for defining the fuzzy rules, while an asymmetric Gaussian membership function has been used in the fuzzification part of the model. Results for the minimum and maximum electricity prices for ISO New England emphasize the superiority of the proposed model over well-established learning-based models.
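
    The abstract above does not give the exact form of the asymmetric Gaussian membership function used in AGFINN. One common parameterisation, assumed here purely for illustration, uses a different spread on each side of the centre:

```python
import numpy as np

def asymmetric_gaussian(x, center, sigma_left, sigma_right):
    """Membership degree in (0, 1]; spread differs left and right of the centre."""
    x = np.asarray(x, dtype=float)
    # Choose the spread according to which side of the centre each point lies on.
    sigma = np.where(x < center, sigma_left, sigma_right)
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical fuzzy set "medium price" centred at 50 with a longer right tail.
prices = np.array([40.0, 50.0, 60.0])
degrees = asymmetric_gaussian(prices, center=50.0, sigma_left=5.0, sigma_right=20.0)
```

    With these assumed parameters a price 10 units above the centre keeps a higher membership degree than one 10 units below it, which is the kind of skew a symmetric Gaussian cannot express.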

    Modelling socio-ecological systems: Implementation of an advanced Fuzzy Cognitive Map framework for policy development for addressing complex real-life challenges

    This study implements a novel Fuzzy Cognitive Map (FCM) framework for addressing large complex socio-ecological problems. These are characterized as qualitative, dominated by uncertainty, human involvement with different and vague perceptions/expectations, and complex systems dynamics due to feedback relations. The FCM framework provides a participatory soft computing approach to develop consensus solutions. We demonstrate its implementation in a case study: a national-scale acute water scarcity crisis. The model has eight steps starting from collecting data from stakeholders in the form of FCMs (bi-directional graphs) represented by nodes and imprecise connections. All subsequent steps operate within a new fuzzy 2-tuple framework that overcomes previous FCM limitations through advanced processing methods, where large FCMs are fuzzified and analyzed, condensed, and aggregated using graph-theoretic measures. FCMs are simulated as Auto-Associative Neural Networks (AANN) to assess policy solutions to address the problem. In this study, very large cognitive maps were developed through interviews capturing perceptions of five different stakeholder groups taking into consideration the causes, consequences and challenges of the acute water scarcity problem in Jordan. The complex FCMs containing 186 variables comprehensively covered all aspects of water scarcity. FCMs were condensed into smaller maps in two levels. They were also combined into five stakeholder group FCMs and one whole system FCM (total 123 FCMs). AANN simulations of policy scenarios were conducted on the whole system FCM, first at the most condensed level and then moved top-down through the next two levels of granularity to explore potential solutions. These were ranked by a novel fuzzy Appropriateness criterion to provide a number of high level and effective strategies to mitigate the water crisis.
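
    For orientation, a conventional FCM simulation repeatedly passes the concept activations through the weight matrix and a squashing function until they settle. The sketch below shows only that generic iteration; the fuzzy 2-tuple processing and the AANN formulation from the abstract above are not reproduced, and the sigmoid, its steepness, and the three-concept map are assumptions.

```python
import numpy as np

def simulate_fcm(weights, initial_state, steps=100, tol=1e-6, steepness=1.0):
    """Iterate concept activations until they change by less than tol."""
    W = np.asarray(weights, dtype=float)     # W[i, j]: influence of concept i on j
    state = np.asarray(initial_state, dtype=float)
    for _ in range(steps):
        # Each concept sums its weighted inputs and squashes them into (0, 1).
        new_state = 1.0 / (1.0 + np.exp(-steepness * (state @ W)))
        if np.max(np.abs(new_state - state)) < tol:
            return new_state
        state = new_state
    return state

# Hypothetical three-concept map: demand raises scarcity, scarcity raises policy
# pressure, and policy pressure reduces demand.
W_example = np.array([[0.0, 0.8, 0.0],
                      [0.0, 0.0, 0.6],
                      [-0.7, 0.0, 0.0]])
steady = simulate_fcm(W_example, [1.0, 0.0, 0.0])
```

    A policy scenario can then be explored by clamping or changing one concept's initial activation and comparing the resulting states.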

    Distributed Semi-supervised Fuzzy Regression with Interpolation Consistency Regularization

    Recently, distributed semi-supervised learning (DSSL) algorithms have shown their effectiveness in leveraging unlabeled samples over interconnected networks, where agents cannot share their original data with each other and can only communicate non-sensitive information with their neighbors. However, existing DSSL algorithms cannot cope with data uncertainties and may suffer from high computation and communication overhead. To handle these issues, we propose a distributed semi-supervised fuzzy regression (DSFR) model with fuzzy if-then rules and interpolation consistency regularization (ICR). The ICR, which was proposed recently for semi-supervised problems, can force decision boundaries to pass through sparse data areas, thus increasing model robustness. However, its application in distributed scenarios has not been considered yet. In this work, we propose a distributed Fuzzy C-means (DFCM) method and a distributed interpolation consistency regularization (DICR), both built on the well-known alternating direction method of multipliers, to locate the parameters in the antecedent and consequent components of DSFR, respectively. Notably, the DSFR model converges very fast since it does not involve a back-propagation procedure, and it is scalable to large-scale datasets thanks to the use of DFCM and DICR. Experimental results on both artificial and real-world datasets show that the proposed DSFR model can achieve much better performance than the state-of-the-art DSSL algorithm in terms of both loss value and computational cost.
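
    The core idea of interpolation consistency regularization is that the prediction at an interpolated unlabeled point should match the interpolation of the predictions at its endpoints. The centralized sketch below illustrates only that penalty. The distributed ADMM formulation from the abstract above is not shown, and using the same model for the targets with a fixed mixing coefficient `lam` is a simplifying assumption.

```python
import numpy as np

def icr_loss(predict, u1, u2, lam):
    """Mean squared gap between f(mix(u1, u2)) and mix(f(u1), f(u2))."""
    mixed_input = lam * u1 + (1.0 - lam) * u2
    mixed_target = lam * predict(u1) + (1.0 - lam) * predict(u2)
    return float(np.mean((predict(mixed_input) - mixed_target) ** 2))

# Hypothetical unlabeled batches and two toy regressors.
rng = np.random.default_rng(0)
u_a = rng.normal(size=(8, 3))
u_b = rng.normal(size=(8, 3))
w = np.array([0.5, -1.0, 2.0])

def affine_model(x):
    return x @ w + 0.3              # affine, hence perfectly consistent

def curved_model(x):
    return np.sum(x ** 2, axis=1)   # strictly convex, hence penalised

loss_affine = icr_loss(affine_model, u_a, u_b, lam=0.4)
loss_curved = icr_loss(curved_model, u_a, u_b, lam=0.4)
```

    The affine model incurs essentially zero penalty, while the curved one is penalised wherever it bends between unlabeled points.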

    Development of c-means Clustering Based Adaptive Fuzzy Controller for A Flapping Wing Micro Air Vehicle

    Advanced and accurate modelling of a Flapping Wing Micro Air Vehicle (FW MAV) and its control is one of the recent research topics in the field of autonomous Unmanned Aerial Vehicles (UAVs). In this work, a four-wing Nature-inspired (NI) FW MAV is modelled and controlled, motivated by its advanced features such as quick flight, vertical take-off and landing, hovering, fast turns, and enhanced manoeuvrability compared with similar-sized fixed- and rotary-wing UAVs. The Fuzzy C-Means (FCM) clustering algorithm is used to build the NIFW MAV model, which has advantages over first-principles modelling: it does not depend on the system dynamics but is based on data, and it can incorporate various uncertainties such as sensor error. The same clustering strategy is used to develop an adaptive fuzzy controller. The controller is then used to control the altitude of the NIFW MAV and can adapt to environmental disturbances by tuning the antecedent and consequent parameters of the fuzzy system. Comment: this paper is currently under review in Journal of Artificial Intelligence and Soft Computing Research.
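
    The fuzzy c-means algorithm mentioned in the abstract above alternates between updating cluster centres and soft memberships. The sketch below is the standard textbook version on toy one-dimensional data. How the clusters are turned into fuzzy rules for the MAV model and controller is not covered by the abstract and is not attempted here; the function name, the fuzzifier `m=2` and the data are assumptions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, U); U[k, i] is the membership of sample k in cluster i."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)        # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # membership-weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)          # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)     # standard membership update
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return centers, U

# Hypothetical 1-D measurements forming two well-separated groups.
data = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
centers, memberships = fuzzy_c_means(data, n_clusters=2)
```

    In a clustering-based fuzzy model, each cluster centre typically seeds the antecedent of one rule, which is presumably the role it plays here.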

    Predictive Performance Of Machine Learning Algorithms For Ore Reserve Estimation In Sparse And Imprecise Data

    Thesis (Ph.D.) University of Alaska Fairbanks, 2006. Traditional geostatistical estimation techniques have been used predominantly in the mining industry for the purpose of ore reserve estimation. Determining mineral reserves has always posed a considerable challenge to mining engineers due to the geological complexities that are generally associated with ore body formation. Considerable research over the years has resulted in the development of a number of state-of-the-art methods for predictive spatial mapping tasks such as ore reserve estimation. Recent advances in the use of machine learning algorithms (MLA) have provided a new approach to this age-old problem. This thesis therefore focuses on the use of two MLAs, viz. the neural network (NN) and the support vector machine (SVM), for ore reserve estimation. The application of these MLAs is illustrated with two complex drill hole datasets. The first is a placer gold drill hole dataset characterized by a high degree of spatial variability, sparseness and noise, while the second is obtained from a continuous lode deposit. The success of the models developed using these MLAs depends to a large extent on the data subsets on which they are trained and, subsequently, on the selection of appropriate model parameters. Data subsets obtained by random division are undesirable in sparse data conditions, as random division usually yields statistically dissimilar subsets, thereby reducing their applicability. Therefore, an ideal technique for data subdivision has been suggested in the thesis. Additionally, issues pertaining to optimum model development have also been discussed. To investigate the accuracy and applicability of the MLAs for ore reserve estimation, their generalization ability was compared with that of the geostatistical ordinary kriging (OK) method. The analysis of Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Error (ME) and the coefficient of determination (R2) as indices of model performance indicated that the MLAs may significantly improve the predictive ability and thereby reduce the inherent risk in ore reserve estimation.
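
    The four performance indices named in the abstract above can be computed as in the sketch below. Two conventions are assumptions, since the abstract does not state them: ME is taken as predicted minus observed, and R2 as one minus the ratio of residual to total sum of squares. The grade values are made up.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, MAE, ME (bias) and R2 for paired observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true                        # assumed sign: predicted - observed
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MSE": float(np.mean(err ** 2)),
        "MAE": float(np.mean(np.abs(err))),
        "ME": float(np.mean(err)),               # near zero means little systematic bias
        "R2": float(1.0 - ss_res / ss_tot),
    }

# Hypothetical observed and predicted grades at four held-out drill holes.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

    Reporting ME alongside MSE and MAE separates systematic over- or under-estimation from overall scatter, which matters when estimates feed into reserve totals.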