    Concept Relation Discovery and Innovation Enabling Technology (CORDIET)

    Concept Relation Discovery and Innovation Enabling Technology (CORDIET) is a toolbox for gaining new knowledge from unstructured text data. At the core of CORDIET is C-K theory, which captures the essential elements of innovation. The tool uses Formal Concept Analysis (FCA), Emergent Self-Organizing Maps (ESOM) and Hidden Markov Models (HMM) as the main artifacts in the analysis process. The user can define temporal, text mining and compound attributes. The text mining attributes are used to analyze the unstructured text in documents, while the temporal attributes use the documents' timestamps for analysis. The compound attributes are XML rules based on text mining and temporal attributes. The user can cluster objects with object-cluster rules and can segment the data with segmentation rules. The artifacts are optimized for efficient data analysis; object labels in the FCA lattice and the ESOM map contain a URL on which the user can click to open the selected document.
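
    As a hedged illustration of the FCA step that CORDIET builds on, the Python sketch below enumerates the formal concepts of a tiny document-attribute context. The documents, attributes and the naive closure-based enumeration are illustrative assumptions, not CORDIET's actual implementation or data.

```python
from itertools import combinations

# Toy formal context: documents (objects) x text-mining attributes.
# Purely illustrative; not CORDIET data.
context = {
    "doc1": {"fraud", "report"},
    "doc2": {"fraud", "invoice"},
    "doc3": {"invoice", "report"},
}
attributes = set().union(*context.values())

def extent(attrs):
    """Objects possessing every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by all objects in objs."""
    return set.intersection(*(context[o] for o in objs)) if objs else set(attributes)

# Enumerate formal concepts (extent, intent) by closing every attribute subset.
found = set()
for r in range(len(attributes) + 1):
    for attrs in combinations(sorted(attributes), r):
        e = extent(set(attrs))
        found.add((frozenset(e), frozenset(intent(e))))

for e, i in sorted(found, key=lambda c: len(c[0])):
    print(sorted(e), sorted(i))
```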

    An overview of decision table literature 1982-1995.

    This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, an author-supplied abstract, a number of keywords and a classification are provided for each reference. In some cases our own comments are added; their purpose is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the review, the classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.

    Designing algorithms to aid discovery by chemical robots

    Recently, automated robotic systems have become very efficient thanks to improved coupling between sensor systems and algorithms, of which the latter have been gaining significance thanks to the increase in computing power over the past few decades. However, intelligent automated chemistry platforms for discovery-oriented tasks need to be able to cope with the unknown, which is a profoundly hard problem. In this Outlook, we describe how recent advances in the design and application of algorithms, coupled with the increased amount of chemical data available and with automation and control systems, may allow more productive chemical research and the development of chemical robots able to target discovery. This is shown through examples of workflow and data processing with automation and control, and through the use of both well-established and cutting-edge algorithms, illustrated using recent studies in chemistry. Finally, several algorithms are presented in relation to chemical robots and chemical intelligence for knowledge discovery.
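
    The closed-loop coupling of an algorithm to an automated platform can be sketched as below; every name here (run_experiment, the candidate conditions, the exploration-first policy) is a hypothetical stand-in, not code or a workflow from the Outlook.

```python
import random

# Hypothetical grid of reaction conditions the robot could be asked to run.
candidates = [{"temp_C": t, "reagent_ml": v} for t in (25, 40, 60) for v in (0.5, 1.0, 2.0)]

def run_experiment(conditions):
    """Stand-in for the robotic platform: returns a measured yield (simulated here)."""
    return random.random()

observations = []
for _ in range(5):  # fixed budget of automated experiments
    # Exploration-first policy: try unseen conditions before revisiting known ones.
    tried = [o["conditions"] for o in observations]
    untried = [c for c in candidates if c not in tried]
    conditions = random.choice(untried) if untried else max(
        observations, key=lambda o: o["yield"])["conditions"]
    observations.append({"conditions": conditions, "yield": run_experiment(conditions)})

print("best conditions so far:", max(observations, key=lambda o: o["yield"]))
```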

    Visualizing engineering design data using a modified two-level self-organizing map clustering approach

    Engineers tasked with designing large and complex systems are continually in need of decision-making aids able to sift through the enormous amounts of data produced through simulation and experimentation. Understanding these systems often requires visualizing multidimensional design data. Visual cues such as size, color, and symbols are often used to denote specific variables (dimensions) as well as characteristics of the data. However, these cues cannot effectively convey information about a system containing more than three dimensions. Two general techniques can be employed to reduce the complexity of the information presented to an engineer: dimension reduction and individual variable comparison. Each approach can provide a comprehensible visualization of the resulting design space, which is vital for an engineer deciding upon an appropriate optimization algorithm. Visualization techniques such as self-organizing maps (SOMs) offer powerful methods for reducing the complexity of n-dimensional data by producing easy-to-understand visual representations that quickly highlight trends to support decision-making. The SOM can be extended by providing relevant output information in the form of contextual labels. Furthermore, these contextual labels can be leveraged to visualize a set of output maps containing statistical evaluations of each node residing within a trained SOM. These maps give a designer visual context for the data set's natural topology by highlighting nodal performance across the maps. A drawback of using SOMs is the clustering of promising points together with predominantly less desirable data. Similar data groupings can be revealed from the trained output maps using visualization techniques such as the SOM, but these are not inherently cluster analysis methods. Cluster analysis is an approach able to assimilate similar data objects into "natural groups" without prior knowledge of the data set. Engineering data composed of design alternatives with associated variable parameters often contain data objects with unknown classification labels. Consequently, identifying the correct classifications can be difficult and costly. This thesis applies a cluster analysis technique to SOMs to segment a high-dimensional dataset into "meta-clusters". Furthermore, the thesis describes the algorithm created to establish these meta-clusters through the development of several computational metrics involving intra- and inter-cluster densities. The results from this work show the presented algorithm's ability to narrow a large, complex system's plethora of design alternatives down to a few overarching design groups with similar principal characteristics, which saves the time a designer would otherwise spend analyzing numerous design alternatives.
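
    A minimal sketch of the two-level idea, assuming the minisom package and using k-means over the trained node weights as a stand-in for the thesis's density-based meta-clustering metrics (which are not reproduced here); the data are synthetic.

```python
import numpy as np
from minisom import MiniSom          # pip install minisom
from sklearn.cluster import KMeans   # stand-in for the thesis's density-based meta-clustering

# Synthetic stand-in for multidimensional engineering design data.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))

# Level 1: train a small SOM so each node summarizes a region of the design space.
som = MiniSom(8, 8, input_len=6, sigma=1.5, learning_rate=0.5, random_seed=0)
som.train_random(data, 1000)

# Level 2: cluster the trained node weight vectors into a few "meta-clusters".
node_weights = som.get_weights().reshape(-1, 6)
meta = KMeans(n_clusters=4, n_init=10, random_state=0).fit(node_weights)

# Map each design alternative to its winning node, then to that node's meta-cluster.
for x in data[:5]:
    i, j = som.winner(x)
    print("node", (i, j), "meta-cluster", meta.labels_[i * 8 + j])
```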

    Adaptive Cooperative Learning Methodology for Oil Spillage Pattern Clustering and Prediction

    The serious environmental, economic and social consequences of oil spillages could devastate any nation of the world. Notable consequences include loss of (or serious threats to) lives, huge financial losses, and colossal damage to the ecosystem. Hence, understanding the spillage pattern and making precise predictions in real time (as opposed to existing rough and discrete predictions) is required to give decision makers a more realistic picture of the environment. This paper seeks to address this problem by exploiting oil spillage features with sets of collected data from oil spillage scenarios. The proposed system integrates three state-of-the-art tools: self-organizing maps (SOM), ensembles of deep neural networks (k-DNN) and an adaptive neuro-fuzzy inference system (ANFIS). It begins with unsupervised learning using SOM, where four natural clusters were discovered and used to make the data suitable for classification and prediction (supervised learning) by ensembles of k-DNN and ANFIS. The results obtained showed significant classification and prediction improvements, largely attributed to the hybrid learning approach, ensemble learning and cognitive reasoning capabilities. However, optimization of the k-DNN structure and weights would be needed for speed enhancement. The system would provide a means of understanding the nature, type and severity of oil spillages, thereby facilitating a rapid response to impending oil spillages. Keywords: SOM, ANFIS, Fuzzy Logic, Neural Network, Oil Spillage, Ensemble Learning
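
    A hedged sketch of the pipeline described above, assuming the minisom and scikit-learn packages: a small SOM produces cluster labels that a neural-network classifier is then trained on. The features are synthetic, the 2x2 SOM grid simply yields four clusters, and an MLP stands in for the paper's k-DNN ensemble; ANFIS is omitted.

```python
import numpy as np
from minisom import MiniSom                      # pip install minisom
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for oil-spillage feature vectors.
rng = np.random.default_rng(1)
features = rng.normal(size=(300, 5))

# Unsupervised step: a 2x2 SOM so each node acts as one of four natural clusters.
som = MiniSom(2, 2, input_len=5, sigma=0.8, learning_rate=0.5, random_seed=1)
som.train_random(features, 2000)
labels = np.array([som.winner(x)[0] * 2 + som.winner(x)[1] for x in features])

# Supervised step: an MLP stands in for the paper's k-DNN ensemble (ANFIS omitted).
X_train, X_test, y_train, y_test = train_test_split(features, labels, random_state=1)
clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=1000, random_state=1)
clf.fit(X_train, y_train)
print("held-out accuracy on SOM-derived labels:", clf.score(X_test, y_test))
```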

    Data collection and advanced statistical analysis in phytotoxic activity of aerial parts exudates of Salvia spp

    In order to define the phytotoxic potential of Salvia species, a database was developed for fast and efficient data collection in screening studies of the inhibitory activity of Salvia exudates on the germination of Papaver rhoeas L. and Avena sativa L. The structure of the database is associated with the use of algorithms for calculating the usual germination indices reported in the literature, plus the newly defined indices (Weighted Average Damage, Differential Weighted Average Damage, Germination Weighted Average Velocity) and other variables usually recorded in phytotoxicity experiments (LC50, LC90). Furthermore, other algorithms were designed to calculate a one-way ANOVA followed by Duncan's multiple range test, in order to automatically highlight significant differences between the species. The database model was designed to also be suitable for the development of further analyses based on the artificial neural network approach, using Self-Organising Maps (SOM).
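
    The index-plus-ANOVA workflow can be sketched as below, using final germination percentage as one usual germination index and SciPy's one-way ANOVA; the treatment counts are invented for illustration, and Duncan's multiple range test is only noted because SciPy does not ship it.

```python
from scipy.stats import f_oneway

# Illustrative germination counts (germinated seeds out of 30) for Papaver rhoeas
# under hypothetical Salvia exudate treatments; not data from the paper.
treatments = {
    "control":     [28, 27, 29, 26],
    "salvia_low":  [20, 22, 19, 21],
    "salvia_high": [8, 10, 7, 9],
}

# A simple germination index: final germination percentage per replicate.
germination_pct = {k: [100 * g / 30 for g in v] for k, v in treatments.items()}

# One-way ANOVA across treatments (the paper follows this with Duncan's multiple
# range test, which SciPy does not provide; Tukey's HSD is a common substitute).
f_stat, p_value = f_oneway(*germination_pct.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```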

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore's Law. Big Data analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the hugely accumulated data and leverage this information to gain insights into the finance industry. In order to obtain actionable insights into the business, data has become the most valuable asset of financial organisations, as there are no physical products in the finance industry to manufacture. This is where data mining techniques come to the rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work aims to survey the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been shown that hybrid methods are more accurate in prediction, closely followed by neural network techniques. This survey provides an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it offers a good vision of data mining techniques in computational finance for beginners who want to work in the field.