487 research outputs found

Comparative analysis of neural networks techniques to forecast Airfare Prices

Author: Alessandro Aliberti
Alessio Viticchie
Edoardo Patti
Enrico Macii
Yao Xin
Publication venue: IEEE
Publication date: 01/01/2023

With the growth of the tourism industry, air travel has become an affordable choice for medium- and long-distance trips. Accurate forecasting of flight ticket prices helps the aviation industry match supply to demand flexibly and optimize aviation resources. Airline companies use dynamic pricing strategies to set ticket prices and maximize profits, while passengers want to purchase tickets at the lowest selling price for the flight of their choice. However, airline tickets are a special commodity that is time-sensitive and scarce, and their price is affected by various factors. Our research work provides a systematic comparison of various traditional machine learning methods (i.e., Ridge Regression, Lasso Regression, K-Nearest Neighbor, Decision Tree, XGBoost, Random Forest) and deep learning methods (e.g., Fully Connected Networks, Convolutional Neural Networks, Transformer) to address the problem of airfare prediction while keeping consumers’ needs in view. Moreover, we propose innovative Bayesian neural networks, which, to the best of our knowledge, represent the first attempt to exploit Bayesian inference for the airfare prediction task. We evaluate the performance of our implemented and optimized models on an open dataset. The experimental results show that deep learning-based methods achieve better results on average than traditional ones, and that Bayesian neural networks outperform the other machine learning methods. However, taking into account both prediction performance and computational time, the Random Forest turns out to be the best choice to apply in this scenario.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Improving Intent Classification By Automatic Data Augmentation Using Word Sense Disambiguation

Publication date: 01/01/2018

Virtual digital assistants are automated software systems that assist humans by understanding natural languages such as English, in either voice or textual form. Recently, many digital applications have shifted towards providing a user experience through a natural language interface. This change has been brought about by the ease with which virtual digital assistants such as Google Assistant and Amazon Alexa can be integrated into an application. These assistants make use of a Natural Language Understanding (NLU) system, which acts as an interface that translates unstructured natural language data into a structured form. Such an NLU system uses an intent-finding algorithm that gives a high-level idea or meaning of a user query; this is termed intent classification. The intent classification step identifies the action(s) that a user wants the assistant to perform. It is followed by an entity recognition step, which identifies the entities in the utterance on which the intended action is performed. This step can be viewed as a sequence labeling task that maps an input word sequence to a corresponding sequence of slot labels, and it is also termed slot filling. In this thesis, we improve intent classification and slot filling in virtual voice agents by automatic data augmentation. Spoken Language Understanding systems face the issue of data sparsity, because it is hard for human-created training samples to represent all the patterns in a language. Due to the lack of relevant data, deep learning methods are unable to generalize the Spoken Language Understanding model. This thesis presents a way to overcome data sparsity in deep learning approaches to Spoken Language Understanding tasks. We describe the limitations of current intent classifiers and how the proposed algorithm uses existing knowledge bases to overcome them.
The method helps in creating a more robust intent classifier and slot filling system.
Dissertation/Thesis, Masters Thesis, Computer Science 201
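To make the sequence-labeling view of slot filling concrete, here is a minimal sketch. It assumes BIO-style slot labels and a tiny hand-written synonym table; `SYNONYMS`, `augment`, the intent name, and the example utterance are all hypothetical and merely stand in for the knowledge-base lookup and word sense disambiguation the thesis refers to. It is not the thesis's algorithm, only an illustration of augmenting an utterance while keeping tokens and slot labels aligned.

```python
from typing import Dict, List, Tuple

# One training example: (intent, tokens, slot labels aligned with tokens).
Example = Tuple[str, List[str], List[str]]

# Hypothetical stand-in for a knowledge-base lookup. A real system would pick
# synonyms for the disambiguated sense of each word in its context.
SYNONYMS: Dict[str, List[str]] = {
    "book": ["reserve"],
    "flight": ["plane ticket"],
}


def augment(example: Example) -> List[Example]:
    """Create new examples by swapping one non-slot token for a synonym."""
    intent, tokens, labels = example
    out: List[Example] = []
    for i, (tok, lab) in enumerate(zip(tokens, labels)):
        if lab != "O":
            continue  # leave slot values untouched so labels stay valid
        for syn in SYNONYMS.get(tok, []):
            syn_tokens = syn.split()
            new_tokens = tokens[:i] + syn_tokens + tokens[i + 1:]
            # A multi-word synonym needs one "O" label per inserted token.
            new_labels = labels[:i] + ["O"] * len(syn_tokens) + labels[i + 1:]
            out.append((intent, new_tokens, new_labels))
    return out


seed: Example = (
    "BookFlight",
    ["book", "a", "flight", "to", "new", "york"],
    ["O", "O", "O", "O", "B-destination", "I-destination"],
)
augmented = augment(seed)
```

Each augmented example keeps the original intent, so the same sketch adds data for both the intent classifier and the slot filler.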

ASU Digital Repository

Forecasting monthly airline passenger numbers with small datasets using feature engineering and a modified principal component analysis

Author: Sara Al-Ruzeiqi
Publication date: 14/05/2020

In this study, a machine learning approach based on time series models and on different feature engineering, feature extraction, and feature derivation techniques is proposed to improve air passenger forecasting. Different types of datasets were created to extract new features from the core data. An experiment was undertaken with artificial neural networks to test the performance of neurons in the hidden layer, to optimise the dimensions of all layers, and to obtain an optimal choice of connection weights, so that the nonlinear optimisation problem could be solved directly. A method of tuning deep learning models using H2O (a feature-rich, open source machine learning platform known for its R and Spark integration and its ease of use) is also proposed, where the trained network model is built from samples of selected features from the dataset in order to ensure diversity of the samples and to improve training. A successful application of deep learning requires setting numerous parameters in order to achieve greater model accuracy. The number of hidden layers and the number of neurons in each layer are key parameters of such a network. Hyper-parameter tuning approaches such as grid search and random search aid in setting these important parameters. Moreover, a new ensemble strategy is suggested that shows potential to optimise parameter settings and hence save computational resources throughout the tuning process of the models. The main objective, besides improving the performance metric, is to obtain a distribution on some hold-out datasets that resembles the original distribution of the training data. Particular attention is focused on creating a modified version of Principal Component Analysis (PCA) that derives new features from a different correlation matrix, obtained with a different correlation coefficient based on kinetic energy. The data were collected from several airline datasets to build a deep prediction model for forecasting airline passenger numbers.
Preliminary experiments show that fine-tuning provides an efficient approach for tuning the ultimate number of hidden layers and the number of neurons in each layer when compared with the grid search method. Similarly, the results show that the modified version of PCA is more effective in data dimension reduction, class separability, and classification accuracy than traditional PCA.
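The contrast between grid search and random search over the number of hidden layers and neurons can be sketched as follows. The search space `SPACE`, the scoring function `toy_validation_error`, and the trial budget are assumptions chosen for illustration; the toy function simply stands in for training a network and measuring its hold-out error, and nothing here uses the H2O API or reproduces the study's results.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]

# Hypothetical search space over the two parameters named in the abstract.
SPACE: Dict[str, List[int]] = {
    "hidden_layers": [1, 2, 3, 4],
    "neurons": [8, 16, 32, 64, 128],
}


def toy_validation_error(p: Params) -> float:
    """Stand-in for training a network and scoring it on hold-out data."""
    return (p["hidden_layers"] - 2) ** 2 + ((p["neurons"] - 32) / 32) ** 2


def grid_search(score: Callable[[Params], float]) -> Tuple[Params, int]:
    """Evaluate every combination in SPACE; return the best and the eval count."""
    keys = list(SPACE)
    candidates = [dict(zip(keys, vals)) for vals in itertools.product(*SPACE.values())]
    return min(candidates, key=score), len(candidates)


def random_search(
    score: Callable[[Params], float], n_trials: int, seed: int = 0
) -> Tuple[Params, int]:
    """Evaluate n_trials randomly sampled combinations from SPACE."""
    rng = random.Random(seed)
    candidates = [{k: rng.choice(v) for k, v in SPACE.items()} for _ in range(n_trials)]
    return min(candidates, key=score), n_trials


best_grid, grid_evals = grid_search(toy_validation_error)
best_rand, rand_evals = random_search(toy_validation_error, n_trials=8)
```

Grid search always finds the best point in the grid but pays for every combination, whereas random search caps the cost at the chosen number of trials and may settle for a slightly worse configuration.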

Loughborough University Institutional Repository

Forecasting flight prices with machine learning models : a comparative analysis between low and high-cost airlines

Author: Daly Sophia Maria
Publication date: 01/09/2023

Forecasting flight prices is a challenging task due to the complex nature of the pricing algorithms that airlines use. Apart from the fact that these algorithms are not public, they have to take into account many different variables that affect ticket prices. Since the airlines’ demand forecasts may not always hold true as a result of varying demand, prices need to be adjusted accordingly. This approach is called dynamic pricing. It is a technique of price discrimination based mainly on temporal differences, leading to the widespread assumption that the time of booking is a crucial determinant of the ticket price. This analysis shows that, apart from days to departure, flight distance in particular and airline type influence the price significantly. That is, longer flights as well as flights operated by full-service carriers, as opposed to low-cost carriers, are usually more expensive. This thesis uses a dataset including the flight fares and other flight-related characteristics of one-way flights in the US between April and October 2022, retrieved from the search engine Expedia.com. The data is used to train and compare the performance of several supervised learning models aiming to forecast flight prices. Each model is deployed three times, first with the entire dataset, and then once with data only from low-cost carriers and once with data only from full-service carriers. The most accurate models for all three datasets are random forests, followed by k-nearest-neighbor. The results of this thesis suggest that a large part of the flight price can be predicted using flight-related details such as days to departure and flight duration, yet they also show that there remains a certain inexplicable variability that could be due to external factors that are not included in the present analysis.

Forecasting flight prices is a challenging task due to the complex nature of the pricing algorithms that airlines commonly use. Besides being private, these algorithms take into account many different variables that thereby affect airfares. Since the demand forecast for airline routes does not always remain valid because of its variability over time, prices need to be adjusted continuously so as to favor the profitability of these companies. This practice is called dynamic pricing, a price discrimination technique based mainly on temporal differences, leading to the widespread perception that the time of booking is the main determinant of airfares. The present analysis reveals that, besides the number of days until the departure date, the type of airline and, above all, the flight distance also significantly influence the price. Thus, longer flights and flights operated by full-service carriers, as opposed to low-cost carriers, are generally more expensive. This thesis used a dataset including airfares and other characteristics of one-way flights in the US between April and October 2022, obtained through the search engine Expedia.com. These data were used to train and compare the performance of several supervised machine learning models with the aim of predicting flight prices. Each model was implemented three times, first with the full dataset, then with the records for low-cost carriers and, finally, only with the data for full-service carriers. The most accurate models for the three datasets are random forests, followed by k-nearest-neighbor models. The results of this work suggest that a significant part of the price can be predicted using flight-related details, such as the number of days until departure and the trip duration.
However, there remains a certain unexplained variability that may be due to external factors not included in the present analysis.

Repositório Institucional da Universidade Católica Portuguesa

Time Series Event Forecasting in Consumer Electronic Markets using Random Forests

Author: Buchwitz Benjamin
Falkenberg Anne
Küsters Ulrich
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2019

Consumers are price-sensitive and opportunistic about the place of purchase when buying electronic goods. However, services that advise customers on their purchase time decisions for those products are missing. Given the objective to provide a binary signal to customers to either wait or purchase immediately, classification algorithms are a direct methodological choice. Approaches like random forests allow for the derivation of a probability and class prediction but are usually not used in time series contexts. This is due to missing or time-invariant regressors and unclear prediction settings. We show how classification methods can be used to generate reliable predictions of price events and analyze whether they are subject to common market dependencies. Pooling univariate random forests and enhancing them with multivariate features shows that our approach generates stable and valuable recommendations. Because dependency structures between products are transferable, multivariate forecasting increases accuracy and issues recommendations where univariate approaches fail.

Publikationsserver der Katholischen Universität Eichstätt-Ingolstadt

AIS Electronic Library (AISeL)

The Impact of COVID-19 on Airfares-A Machine Learning Counterfactual Analysis

Author: Wozny Florian
Publication venue: MDPI AG
Publication date: 01/02/2022

This paper studies the performance of machine learning predictions for the counterfactual analysis of air transport. It is motivated by the dynamic and universally regulated international air transport market, where ex post policy evaluations usually lack counterfactual control scenarios. As an empirical example, this paper studies the impact of the COVID-19 pandemic on airfares in 2020 as the difference between predicted and actual airfares. Airfares are important from a policy maker’s perspective, as air transport is crucial for mobility. From a methodological point of view, airfares are also of particular interest given their dynamic character, which makes them challenging to predict. This paper adopts a novel multi-step prediction technique with walk-forward validation to increase the transparency of the model’s predictive quality. For the analysis, the universe of worldwide airline bookings is combined with detailed airline information. The results show that machine learning with walk-forward validation is powerful for the counterfactual analysis of airfares.
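As a rough illustration of the walk-forward validation mentioned in the abstract above, the sketch below generates expanding-window train/test splits over a time-ordered series and scores a naive multi-step forecaster on each test window. The function names (`walk_forward_splits`, `naive_forecast`, `walk_forward_mae`), the window sizes, and the toy fare values are assumptions for illustration only; the paper's actual model and data are not reproduced here.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, horizon: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        # Train on everything before `start`, test on the next `horizon` steps.
        yield range(0, start), range(start, start + horizon)
        start += horizon


def naive_forecast(history: List[float], horizon: int) -> List[float]:
    """Placeholder model: repeat the last observed value for every step."""
    return [history[-1]] * horizon


def walk_forward_mae(series: List[float], initial_train: int, horizon: int) -> float:
    """Mean absolute error pooled over all walk-forward test windows."""
    errors: List[float] = []
    for train_idx, test_idx in walk_forward_splits(len(series), initial_train, horizon):
        history = [series[i] for i in train_idx]
        preds = naive_forecast(history, horizon)
        errors.extend(abs(p - series[i]) for p, i in zip(preds, test_idx))
    return sum(errors) / len(errors)


# Toy, made-up fare series in time order.
fares = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 112.0, 115.0]
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(len(fares), 4, 2)]
mae = walk_forward_mae(fares, initial_train=4, horizon=2)
```

Because every test window lies strictly after its training window, no future information leaks into the forecast, which is the property that makes the per-window errors easy to inspect.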

Institute of Transport Research:Publications

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Multimedia Big Data Analytics and Fusion for Data Science

Author: Wang Tianyi
Publication date: 19/05/2023

Dissertation advisor: Shu-Ching Chen
Dissertation (Ph.D.)--Department of Computer Science and Electrical Engineering. University of Missouri--Kansas City, 2023

Big data is becoming increasingly prevalent in people's everyday lives due to the enormous quantity of data generated from social and economic activities worldwide. As a result, extensive research has been undertaken to support the big data revolution. However, as data grows in volume, traditional data analytic methods face various challenges, especially when raw data comes in multiple forms and formats. This dissertation proposes a multimodal big data analytics and fusion framework that addresses several challenges in data science for handling and learning from multimodal big data. The proposed framework addresses issues during a standard data science project workflow, including data fusion, spatio-temporal deep feature extraction, and model training optimization strategy. First, a hierarchical graph fusion network is presented to capture the inter-modality correlations among modalities. The network hierarchy models the modality-wise combinations with gradually increased complexity to explore all n-modality interactions. Next, an adaptive spatio-temporal graph network is proposed to capture the hidden patterns from spatio-temporal data. It exploits local and global node correlations by improving the pre-defined graph Laplacian and automatically generates the graph adjacency matrix based on a data-driven method. In addition, a dynamic multi-task learning method is introduced to optimize the model training progress by dynamically adjusting the loss weights assigned to each task. It systematically monitors the sample-level prediction errors, task-level weight parameter changing rate, and iteration-level total loss to adjust the weight balance among tasks.
The proposed framework has been evaluated on various datasets, including disaster event videos, social media, traffic flow, and other public datasets.

Contents: Introduction -- Related work -- Overview of the framework -- Dynamic multi-task learning -- Hierarchical graph fusion -- Spatio-temporal graph network -- Conclusions and future wor

University of Missouri: MOspace

Developing advanced methods to predict air traffic network growth

Author: Busquets J. G.

This dissertation describes a forecasting methodology that takes into account changes in the connectivity of an air transportation system and assesses the impact at other levels of the network, such as route demand and air traffic levels. To achieve this, the modelling framework looks at city pair demand generation, route demand assignment and air traffic estimation. While generating air traffic forecasts, the resulting model is also intended to highlight the most important factors driving air traffic network growth. This is achieved by considering a larger set of drivers than those considered in existing methodologies and research as well as exploring the use of alternative modelling techniques. Network evolution is incorporated in the method through an airport connectivity model which identifies how and when airport-pairs across the network change their connectivity status. The problem is split into two models: one identifying those airport-pairs that are added to the network; and another one identifying those airport-pairs that are removed from the network. The modelling approach explores the use of network theory metrics along with other input variables, such as passenger demand, to see whether existing models employing only network theory metrics could be improved. The impact of network evolution is assessed by the effect on air itinerary shares. Two itinerary choice models are developed using two different modelling approaches: multinomial logit and neural networks. While the multinomial logit formulation is the most common approach used to model itinerary shares, only a few studies have looked at modelling itinerary shares at the network level. Neural networks have yet to be explored in this field. In this research, air itinerary choice models have been developed at the most aggregate level, using open-source booking data, for a large group of city-pairs within the US Air Transportation System.
The output of the itinerary choice models, influenced by the consideration of network evolution, is then used to project air traffic levels and assess the impact of network structure changes on the number of operations in the US ATS. The results reflect the complexity behind network evolution, especially for cases when a mature system is considered (e.g. US ATS): comparisons between the case of a static network and the case when network evolution is considered indicate that the impact of network changes on overall system metrics is relatively minor in the US. However, they indicate that changes in fossil fuel prices may influence changes in the overall network characteristics, and consequently network evolution. The results also prove the feasibility of estimating a single itinerary choice model at the network level for an entire air transportation system. Although the multinomial logit model results have better accuracy, the potential of neural networks for this purpose is also demonstrated, the latter being more representative of the hub-and-spoke network strategy.
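The multinomial logit formulation for itinerary shares mentioned above assigns each itinerary i a share exp(V_i) / sum_j exp(V_j), where V_i is its utility. The sketch below is a minimal instance of that formula; the attribute names, the coefficients in `BETA`, and the three example itineraries are hypothetical values chosen for illustration and are not estimates from the dissertation.

```python
import math
from typing import Dict, List


def itinerary_shares(utilities: List[float]) -> List[float]:
    """Multinomial logit: share_i = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical utility coefficients (not estimated from any data).
BETA: Dict[str, float] = {"fare": -0.01, "stops": -0.8, "travel_time_h": -0.2}


def utility(itinerary: Dict[str, float]) -> float:
    """Linear-in-parameters utility of one itinerary."""
    return sum(BETA[name] * itinerary[name] for name in BETA)


# Three made-up itineraries serving the same city pair.
itineraries = [
    {"fare": 200.0, "stops": 0, "travel_time_h": 3.0},
    {"fare": 150.0, "stops": 1, "travel_time_h": 5.5},
    {"fare": 120.0, "stops": 2, "travel_time_h": 8.0},
]
shares = itinerary_shares([utility(it) for it in itineraries])
```

With these assumed coefficients the nonstop itinerary receives the largest share despite its higher fare, because the penalties for stops and travel time outweigh the fare difference.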

City Research Online