8 research outputs found
Recommended from our members
Mining learning preferences in web-based instruction: Holists vs. Serialists
Web-based instruction programs are used by learners with diverse knowledge, skills and needs. These differences determine their preferences for the design of Web-based instruction programs and ultimately influence learners' success in using them. Cognitive style has been found to significantly affect learners' preferences for Web-based instruction programs. However, the majority of previous studies focus on Field Dependence/Independence. Pask's Holist/Serialist dimension has conceptual links with Field Dependence/Independence but is left mostly unstudied. Therefore, this study focuses on identifying how this dimension of cognitive style affects learner preferences for Web-based instruction programs. A data mining approach is used to illustrate the difference in preferences between Holists and Serialists. The findings show that there are clear differences in regard to content presentation and navigation support. A set of design features was then produced to help designers incorporate cognitive styles into the development of Web-based instruction programs to ensure that they can accommodate learners' different preferences. This work is partially funded by National Science Council, Taiwan, ROC (NSC 98-2511-S-008-012-MY3; NSC 99-2511-S-008-003-MY2; NSC 99-2631-S-008-001).
A study on the prediction of flight delays of a private aviation airline
Delay is a crucial performance indicator of any transportation system, and flight delays
have financial and economic consequences for passengers and airlines. Hence, recognizing
them through prediction may improve marketing decisions. The goal is to use machine learning
techniques to predict an aviation challenge: flight delay above 15 minutes on departure for a
private airline. Business and data understanding of this particular segment of aviation are
reviewed against the literature, and data preparation, modelling and evaluation are addressed
to lead towards a model that may support decision-making in a private aviation
environment. The results show which algorithms performed better and which variables
contribute the most to the model and hence to delay on departure.
Flight delay is a key indicator across the air transport industry, and these delays
have economic and financial consequences for passengers and airlines. Recognizing
them through prediction may improve strategic and operational decisions. The objective is
to use machine learning techniques to predict a perennial challenge
in aviation: flight delay on departure, using data from a private airline.
Knowledge of the business context and of the data acquired, in a singular segment of
aviation, is reviewed in light of the current literature, and data preparation, modelling and
the respective evaluation are conducted so as to contribute to a decision-support
tool in the context of private aviation. The results obtained reveal which of the algorithms
used performs best and which variables in the data contribute
most to the model and consequently to delay on departure.
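Feature selection by integrating domain knowledge through decision-making methods in predictive modelling of time series with supervised machine learning (Izbor atributa integracijom znanja o domenu primenom metoda odlučivanja kod prediktivnog modelovanja vremenskih serija nadgledanim mašinskim učenjem)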
The aim of the research presented within this doctoral dissertation is
to develop a feature selection methodology through integrating
domain-specific knowledge by applying mathematical methods of
decision-making, to improve the feature selection process and the
precision of supervised machine learning methods for predictive
modeling of time series.
To integrate domain-specific knowledge, a multi-criteria decision
making method is used, namely the analytic hierarchy process, which
has proven successful in numerous studies carried out to date. This
approach was selected because it allows the selection of a set of
factors based on their relevance, even in the case of mutually
conflicting criteria.
In predicting the movement of time series, the possibility of
integrating feature relevance into support vector machines to improve
their prediction accuracy was studied.
The proposed methodology was applied as a feature-selection method
for the predictive modelling of movement of financial time series.
Unlike existing approaches, where the feature selection method is
based on a quantitative analysis of the input values, the proposed
methodology carries out a qualitative evaluation of the attributes in
relation to the prediction domain and represents a means of
integrating a priori knowledge of the prediction domain.
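
As a rough illustration of how an analytic hierarchy process can turn expert judgments into feature relevance weights, the sketch below derives priorities from a pairwise comparison matrix using the row geometric mean approximation. The feature names, the judgment values and the 0.2 cut-off are hypothetical and are not taken from the dissertation; the dissertation's actual procedure may differ.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical candidate features for a financial time series.
names = ["trading_volume", "moving_average", "day_of_week"]

# Hypothetical expert judgments: pairwise[i][j] states how much more relevant
# feature i is than feature j; the matrix is reciprocal.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)

# Assumed selection rule: keep features whose weight reaches a chosen cut-off.
threshold = 0.2
selected = [name for name, w in zip(names, weights) if w >= threshold]
```

The resulting weights could also be passed on as per-feature relevance to a downstream learner, which is the kind of integration the abstract describes for support vector machines.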
The influence of human factors on user's preferences of web-based applications: A data mining approach
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University on 20/12/2010. As the Web is fast becoming an integral feature in many of our daily lives, designers are faced with the challenge of designing Web-based applications for an increasingly diverse user group. In order to develop applications that successfully meet the needs of this user group, designers have to understand the influence of human factors upon users' needs and preferences. To address this issue, this thesis presents an investigation that analyses the influence of three human factors, namely cognitive style, prior knowledge and gender differences, on users' preferences for Web-based applications. In particular, two applications are studied: Web search tools and Web-based instruction tools. Previous research has suggested a number of relationships between these three human factors, so this thesis was driven by three research questions. Firstly, how similar are the two cognitive style dimensions of Witkin's Field Dependence/Independence and Pask's Holism/Serialism? Secondly, to what extent do computer experts have the same preferences as Internet experts, and computer novices the same preferences as Internet novices? Finally, to what extent are Field Independent users, experts and males alike, and Field Dependent users, novices and females alike? As traditional statistical analysis methods would struggle to capture such relationships, this thesis proposes an integrated data mining approach that combines feature selection and decision trees to capture users' preferences. From this, a framework is developed that integrates the combined effect of the three human factors and can be used to inform system designers.
The findings suggest, firstly, that there are links between these three human factors. In terms of cognitive style, the relationship between Field Dependent users and Holists can be seen more clearly than the relationship between Field Independent users and Serialists. In terms of prior knowledge, although it is shown that there is a link between computer experience and Internet experience, computer experts are shown to have similar preferences to Internet novices. In terms of the relationship between all three human factors, the results of this study highlighted that the links between cognitive style and gender and between cognitive style and system experience were stronger than the relationship between system experience and gender. This work contributes both theory and methodology to multiple academic communities, including human-computer interaction, information retrieval and data mining. In terms of theory, it has helped to deepen the understanding of the effects of single and multiple human factors on users' preferences for Web-based applications. In terms of methodology, an integrated data mining analysis approach was proposed and shown to be able to capture users' preferences.
The role of classifiers in feature selection: Number vs nature
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, Wrappers suffer from the fact that they only use a single classifier when selecting the features. The problem of using a single classifier is that each classifier is of a different nature and will have its own biases. This means that each classifier will select different feature subsets. To address this problem, this thesis aims to investigate the effects of using different classifiers for Wrapper feature selection. More specifically, it aims to investigate the effects of using different numbers of classifiers and classifiers of different nature.
This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). The WDT method has the ability to combine multiple classifiers from four different families, including Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine, to select relevant features and visualise the relationships among the selected features using decision trees. Specifically, the WDT method is applied to investigate three research questions of this thesis: (1) the effects of number of classifiers on feature selection results; (2) the effects of nature of classifiers on feature selection results; and (3) which of the two (i.e., number or nature of classifiers) has more of an effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to assist in answering these three research questions.
The results from the investigation revealed that the number of classifiers and nature of classifiers greatly affect feature selection results. In terms of number of classifiers, the results showed that few classifiers selected many relevant features whereas many classifiers selected few relevant features. In addition, it was found that using three classifiers resulted in highly accurate feature subsets. In terms of nature of classifiers, it was shown that Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused significant differences in both the number of features selected and the accuracy levels of the features. A comparison of results regarding number of classifiers and nature of classifiers revealed that the former has more of an effect on feature selection than the latter.
The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, this thesis proposes a new method called WDT which integrates the use of multiple classifiers for feature selection and decision trees to effectively select and visualise the most relevant features within a dataset. For the feature selection community, the results of this thesis have shown that the number of classifiers and nature of classifiers can truly affect the feature selection process. The results and suggestions based on the results can provide useful insight about classifiers when performing feature selection. For the HCI community, this thesis has shown the usefulness of feature selection for identifying a small number of highly relevant features for determining the preferences of different users.
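
The idea of running a wrapper with several classifiers and then combining their selections can be sketched in a few lines. This is not the WDT method itself: it assumes two toy classifiers (1-nearest-neighbour and nearest centroid) in place of the four families named in the abstract, an exhaustive subset search scored by leave-one-out accuracy, and a majority vote across classifiers, all of which are illustrative choices. The dataset is made up.

```python
from itertools import combinations

def nearest_neighbour(train, x):
    """Predict the label of the closest training row (squared Euclidean distance)."""
    return min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))[1]

def nearest_centroid(train, x):
    """Predict the label whose class mean is closest to x."""
    groups = {}
    for row, label in train:
        groups.setdefault(label, []).append(row)

    def dist(label):
        rows = groups[label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        return sum((a - b) ** 2 for a, b in zip(centroid, x))

    return min(groups, key=dist)

def loo_accuracy(classify, X, y, feats):
    """Leave-one-out accuracy of `classify` using only the columns in `feats`."""
    hits = 0
    for i in range(len(X)):
        train = [([X[j][f] for f in feats], y[j]) for j in range(len(X)) if j != i]
        hits += classify(train, [X[i][f] for f in feats]) == y[i]
    return hits / len(X)

def wrapper_select(classify, X, y):
    """Exhaustive wrapper: the smallest feature subset with the best accuracy."""
    n = len(X[0])
    best, best_acc = (), -1.0
    for k in range(1, n + 1):
        for feats in combinations(range(n), k):
            acc = loo_accuracy(classify, X, y, feats)
            if acc > best_acc:  # strict, so smaller subsets win ties
                best, best_acc = feats, acc
    return set(best), best_acc

# Made-up data: column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [0.0, 5.0, 1.0], [0.1, 1.0, 9.0], [0.2, 8.0, 4.0], [0.3, 3.0, 6.0],
    [1.0, 4.0, 2.0], [1.1, 9.0, 8.0], [1.2, 2.0, 5.0], [1.3, 7.0, 3.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

classifiers = {"1-NN": nearest_neighbour, "centroid": nearest_centroid}
votes = {}
for name, clf in classifiers.items():
    feats, _ = wrapper_select(clf, X, y)
    for f in feats:
        votes[f] = votes.get(f, 0) + 1

# Assumed combination rule: keep features chosen by a majority of classifiers.
consensus = {f for f, v in votes.items() if v > len(classifiers) / 2}
```

On real data the classifiers would usually disagree, which is where the choice of combination rule and of how many classifiers to include starts to matter.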
Individual and ensemble functional link neural networks for data classification
This study investigated the Functional Link Neural Network (FLNN) for solving data classification problems. FLNN-based models were developed using evolutionary methods as well as ensemble methods. The outcomes of the experiments covering benchmark classification problems positively demonstrated the efficacy of the proposed models for undertaking data classification problems.
The predictive power of stock micro-blogging sentiment in forecasting stock market behaviour
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Online stock forums have become a vital investing platform on which to publish relevant and valuable user-generated content (UGC) data such as investment recommendations and other stock-related information that allow investors to view the opinions of a large number of users and share trading ideas. This thesis applies methods from computational linguistics and text-mining techniques to analyse and extract, on a daily basis, sentiments from stock-related micro-blogging messages called “StockTwits”. The primary aim of this research is to provide an understanding of the predictive ability of stock micro-blogging sentiments to forecast future stock price movements by investigating the various roles played by investor sentiments in determining asset pricing on the stock market.
The empirical analysis in this thesis consists of four main parts based on the predictive power and the role of investor sentiment in the stock market. The first part discusses the findings of the text-mining procedure for extracting and predicting sentiments from stock-related micro-blogging data. The purpose is to provide a comparative textual analysis of different machine learning algorithms for the purpose of selecting the most accurate text-mining techniques for predicting sentiment analysis on StockTwits through the provision of two different applications of feature selection, namely filter and wrapper approaches. The second part of the analysis focuses on investigating the predictive correlations between StockTwits features and the stock market indicators. It aims to examine the explanatory power of StockTwits variables in explaining the dynamic nature of different financial market indicators. The third part of the analysis investigates the role played by noise traders in determining asset prices. The aim is to show that stock returns, volatility and trading volumes are affected by investor sentiment; it also seeks to investigate whether changes in sentiment (bullish or bearish) will have different effects on stock market prices. The fourth part offers an in-depth analysis of some tweet-market relationships which represent an open problem in the empirical literature (e.g. sentiment-return relations and volume-disagreement relations).
The results suggest that StockTwits sentiments exhibit explanatory power in explaining the dynamics of stock prices in the U.S. market. Taking different approaches by combining text-mining techniques with feature selection methods has proved successful in predicting StockTwits sentiments. The applications of the approach presented in this thesis offer real-time investment ideas that may provide investors and their peers with a decision support mechanism. Investor sentiment plays a critical role in determining asset prices in capital markets. Overall, the findings suggest that investor sentiment among noise traders is a priced factor. The findings confirm the existence of asymmetric spillover effects of bullish and bearish sentiments on the stock market. They also suggest that sentiment is a significant factor in explaining stock price behaviour in the capital market and imply the positive role of the stock market in the formation of investor sentiment in stock markets. Furthermore, the research findings demonstrate that disagreement is not only an important factor in determining trading volumes but it is also considered a very significant factor in influencing asset prices and returns in capital markets.
Overall, the findings of the thesis provide empirical evidence that failure to consider the role of investor sentiment in traditional finance theory could lead to an imperfect picture when explaining the behaviour of stock prices in the stock market.
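
The filter approach to feature selection mentioned in the first part of the analysis differs from a wrapper in that it scores features without training a classifier. A minimal sketch follows, assuming a simple score (the absolute difference in document frequency between bullish and bearish messages) and a handful of invented messages with a hypothetical ticker `$XYZ`; the thesis does not state which filter criterion it used.

```python
from collections import Counter

def filter_select(messages, labels, k):
    """Keep the k tokens whose document frequency differs most between classes."""
    doc_freq = {"bullish": Counter(), "bearish": Counter()}
    n_docs = Counter(labels)
    for text, label in zip(messages, labels):
        # Count each token at most once per message.
        doc_freq[label].update(set(text.lower().split()))
    vocab = set(doc_freq["bullish"]) | set(doc_freq["bearish"])

    def score(tok):
        return abs(doc_freq["bullish"][tok] / n_docs["bullish"]
                   - doc_freq["bearish"][tok] / n_docs["bearish"])

    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-score(t), t))[:k]

# Invented messages, for illustration only.
messages = [
    "buy $XYZ now",
    "strong buy on $XYZ",
    "sell $XYZ now",
    "time to sell $XYZ",
]
labels = ["bullish", "bullish", "bearish", "bearish"]

top_tokens = filter_select(messages, labels, 2)
```

Tokens that appear equally often in both classes, such as the ticker itself, score zero and drop out, which is the behaviour a filter is meant to provide before any sentiment classifier is trained.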