Generic Subsequence Matching Framework: Modularity, Flexibility, Efficiency
Subsequence matching has appeared to be an ideal approach for solving many
problems related to the fields of data mining and similarity retrieval. It has
been shown that almost any data class (audio, image, biometrics, signals) is or
can be represented by some kind of time series or string of symbols, which can
be seen as an input for various subsequence matching approaches. The variety of
data types, specific tasks and their partial or full solutions is so wide that
the choice, implementation and parametrization of a suitable solution for a
given task might be complicated and time-consuming; a possibly fruitful
combination of fragments from different research areas may be neither obvious
nor easy to realize. The leading authors of this field also mention the
implementation bias that makes a proper comparison of competing approaches
difficult. Therefore we present a new generic Subsequence Matching Framework
(SMF) that tries to overcome the aforementioned problems by a uniform frame
that simplifies and speeds up the design, development and evaluation of
subsequence matching related systems. We identify several relatively separate
subtasks that are solved differently across the literature, and SMF makes it
possible to combine them in a straightforward manner, achieving new quality and
efficiency. This framework can be used in many application domains and its
components can be reused effectively. Its strictly modular architecture and
openness also enable the incorporation of efficient solutions from different
fields, for instance efficient metric-based indexes. This is an extended
version of a paper published at DEXA 2012.
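The abstract does not say which algorithms SMF implements, so the following is only a minimal sketch of the basic task it addresses: finding the window of a longer time series that is closest to a query sequence. The function name `best_subsequence_match` and the choice of Euclidean distance over a brute-force sliding window are assumptions made for illustration, not part of the framework.

```python
import math


def best_subsequence_match(series, query):
    """Return (offset, distance) of the window of `series` closest to `query`.

    Hypothetical helper: brute-force sliding window with Euclidean distance.
    """
    m = len(query)
    if m == 0 or m > len(series):
        raise ValueError("query must be non-empty and no longer than series")

    best_offset, best_dist = -1, math.inf
    # Slide a window of the query's length over every valid start offset.
    for offset in range(len(series) - m + 1):
        window = series[offset:offset + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return best_offset, best_dist
```

A real system would typically replace the linear scan with an index and might swap in another distance function; the sketch only fixes the shape of the problem.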
Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review
The influence of machine learning technologies is rapidly increasing and penetrating almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. The most relevant papers were selected by searching the most popular databases and applying the corresponding filters. After thoroughly reviewing those papers, the main features were extracted, which served as a basis for linking and comparing them to each other. As a result, we can conclude that: (1) instead of using simple machine learning techniques, authors currently apply advanced and sophisticated techniques, (2) China was the leading country in terms of case studies, (3) particulate matter with a diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data with an hourly rate, (6) 49% of the papers used open data, and since 2016 this has tended to increase, and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.
The effect of google drive distance and duration in residential property in Sydney, Australia
© 2016 by World Scientific Publishing Co. Pte. Ltd. Predicting the market value of a residential property accurately without inspection by a professional valuer could be beneficial for a variety of organizations and people. Building an Automated Valuation Model could be beneficial if it is sufficiently accurate. This paper examined 47 machine learning models (linear and non-linear). These models were fitted on 1967 records of units from 19 suburbs of Sydney, Australia. The main aim of this paper is to compare the performance of these techniques on this data set and to investigate the effect of spatial information on valuation accuracy. The results demonstrated that the tree models eXtreme Gradient Boosting Linear, eXtreme Gradient Boosting Tree and Random Forest, respectively, performed best among the techniques, and that spatial information such as drive distance and duration to the CBD increases predictive model performance significantly.
A novel improved model for building energy consumption prediction based on model integration
Building energy consumption prediction plays an irreplaceable role in energy planning, management, and conservation. Constantly improving the performance of prediction models is the key to ensuring the efficient operation of energy systems. Moreover, accuracy is no longer the only factor in revealing model performance; it is more important to evaluate the model from multiple perspectives, considering the characteristics of engineering applications. Based on the idea of model integration, this paper proposes a novel improved integration model (stacking model) that can be used to forecast building energy consumption. The stacking model combines the advantages of various base prediction algorithms and forms them into “meta-features” to ensure that the final model can observe datasets from different spatial and structural angles. Two cases are used to demonstrate practical engineering applications of the stacking model. A comparative analysis is performed to evaluate the prediction performance of the stacking model in contrast with existing well-known prediction models including Random Forest, Gradient Boosted Decision Tree, Extreme Gradient Boosting, Support Vector Machine, and K-Nearest Neighbor. The results indicate that the stacking method achieves better performance than the other models regarding accuracy (improvement of 9.5%–31.6% for Case A and 16.2%–49.4% for Case B), generalization (improvement of 6.7%–29.5% for Case A and 7.1%–34.6% for Case B), and robustness (improvement of 1.5%–34.1% for Case A and 1.8%–19.3% for Case B). The proposed model enriches the diversity of algorithm libraries of empirical models.
Mobile Crowd Location Prediction with Hybrid Features using Ensemble Learning
With the explosive growth of location-based services on mobile devices, predicting users’ future locations and trajectories is of increasing importance to support proactive information services. In this paper, we model this problem as a supervised learning task and propose to use ensemble learning methods with hybrid features to solve it. We characterize the properties of users’ visited locations and movement patterns and then extract feature types (temporal, spatial, and system) to quantify the correlation between locations and features. Finally, we apply ensemble methods to predict users’ future locations with the extracted features. Moreover, we design an adaptive Markov Chain model to predict users’ trajectories between two locations. To evaluate the system performance, we use a real-life dataset from the Nokia Mobile Data Challenge. Experimental results reveal interesting findings: (1) For individual predictors, Bayes Networks outperform all others when data quality is good, while J48 delivers the best results when data quality is bad; (2) Ensemble predictors outperform individual predictors in general under all conditions; and (3) Ensemble predictor performance depends on the user movement patterns.
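The abstract gives no details of its adaptive Markov Chain model, so the sketch below shows only a plain first-order Markov chain over visited locations, which is the simplest member of that family. The helper names `fit_transitions` and `predict_next` and the sample location labels are hypothetical, and nothing here reproduces the adaptive behaviour or the ensemble predictors the paper describes.

```python
from collections import Counter, defaultdict


def fit_transitions(visits):
    """Count how often each location is followed by each other location."""
    counts = defaultdict(Counter)
    for current, nxt in zip(visits, visits[1:]):
        counts[current][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]


# Hypothetical visit history for one user.
history = ["home", "work", "cafe", "work", "home", "work", "cafe"]
model = fit_transitions(history)
```

With this history, `predict_next(model, "work")` picks the successor seen most often after "work"; a location that never appeared as a starting point yields `None` rather than a guess.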
Survey of Spectrum Sharing for Inter-Technology Coexistence
Increasing capacity demands in emerging wireless technologies are expected to
be met by network densification and spectrum bands open to multiple
technologies. These will, in turn, increase the level of interference and also
result in more complex inter-technology interactions, which will need to be
managed through spectrum sharing mechanisms. Consequently, novel spectrum
sharing mechanisms should be designed to allow spectrum access for multiple
technologies, while efficiently utilizing the spectrum resources overall.
Importantly, it is not trivial to design such efficient mechanisms, not only
due to technical aspects, but also due to regulatory and business model
constraints. In this survey we address spectrum sharing mechanisms for wireless
inter-technology coexistence by means of a technology circle that incorporates
in a unified, system-level view the technical and non-technical aspects. We
thus systematically explore the spectrum sharing design space consisting of
parameters at different layers. Using this framework, we present a literature
review on inter-technology coexistence with a focus on wireless technologies
with equal spectrum access rights, i.e. (i) primary/primary, (ii)
secondary/secondary, and (iii) technologies operating in a spectrum commons.
Moreover, we reflect on our literature review to identify possible spectrum
sharing design solutions and performance evaluation approaches useful for
future coexistence cases. Finally, we discuss spectrum sharing design
challenges and suggest future research directions.