18,855 research outputs found
Improvements on coronal hole detection in SDO/AIA images using supervised classification
We demonstrate the use of machine learning algorithms in combination with
segmentation techniques in order to distinguish coronal holes and filaments in
SDO/AIA EUV images of the Sun. Based on two coronal hole detection techniques
(intensity-based thresholding, SPoCA), we prepared data sets of manually
labeled coronal hole and filament channel regions present on the Sun during the
time range 2011 - 2013. By mapping the extracted regions from EUV observations
onto HMI line-of-sight magnetograms we also include their magnetic
characteristics. We computed shape measures from the segmented binary maps as
well as first order and second order texture statistics from the segmented
regions in the EUV images and magnetograms. These attributes were used in data
mining investigations to identify the best-performing rule for differentiating
between coronal holes and filament channels. We applied several classifiers,
namely Support Vector Machine (SVM), Linear SVM, Decision Tree,
and Random Forest, and found that all classification rules achieve good results
in general, with linear SVM providing the best performance (true skill
statistic of ~0.90). Additional information from magnetic field data
systematically improves the performance across all four classifiers for the
SPoCA detection. Since the calculation is computationally inexpensive, this
approach is well suited for applications on real-time data. This study
demonstrates how a machine learning approach may help improve upon an
unsupervised feature extraction method.
Comment: in press for SWS
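The true skill statistic reported above is sensitivity plus specificity minus one. As a minimal sketch (not the authors' code, and with purely hypothetical confusion counts), it can be computed from a binary confusion matrix like this:

```python
def true_skill_statistic(tp, fn, fp, tn):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)   # hit rate for coronal holes
    specificity = tn / (tn + fp)   # correct rejection of filament channels
    return sensitivity + specificity - 1.0

# Hypothetical confusion counts, for illustration only.
print(round(true_skill_statistic(tp=90, fn=10, fp=5, tn=95), 2))  # → 0.85
```

A score of 1 means perfect separation, while 0 means no skill over random guessing, which is why TSS is preferred over plain accuracy when the two classes are imbalanced.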
What attracts vehicle consumers’ buying: A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?
Purpose:
The increasingly booming development of e-commerce has stimulated vehicle consumers to express individual reviews through online forums. The purpose of this paper is to probe into vehicle consumers' consumption behavior and to make recommendations for potential consumers from a textual-comment viewpoint.
Design/methodology/approach:
A big data analytic-based approach is designed to discover vehicle consumer consumption behavior from an online perspective. To reduce the subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to perform sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain the specific sentiment value of each attribute class, contributing to multi-grade sentiment classification. To achieve intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytical viewpoint.
Findings:
The big data analytics show that the “cost-effectiveness” characteristic is the most important factor that vehicle consumers care about, and the data mining results enable automakers to better understand consumer consumption behavior.
Research limitations/implications:
The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation.
Originality/value:
Research on consumer consumption behavior is usually based on survey methods, and most previous studies of comment analysis focus on binary sentiment analysis. The hybrid SSC-VIKOR approach is developed to fill this gap from the big data perspective.
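The VIKOR step of the hybrid approach can be sketched as follows. This is a generic, minimal VIKOR ranking over hypothetical per-brand sentiment scores, not the paper's SSC-weighted implementation, and it assumes all criteria are benefit criteria with distinct best/worst values:

```python
def vikor_rank(matrix, weights, v=0.5):
    """Rank alternatives by VIKOR's Q index (benefit criteria assumed)."""
    m = len(matrix)
    best = [max(row[j] for row in matrix) for j in range(len(weights))]
    worst = [min(row[j] for row in matrix) for j in range(len(weights))]
    S, R = [], []
    for row in matrix:
        # Weighted, normalised distance to the ideal value per criterion.
        d = [weights[j] * (best[j] - row[j]) / (best[j] - worst[j])
             for j in range(len(weights))]
        S.append(sum(d))   # group utility
        R.append(max(d))   # individual regret
    Q = [v * (S[i] - min(S)) / (max(S) - min(S))
         + (1 - v) * (R[i] - min(R)) / (max(R) - min(R)) for i in range(m)]
    return sorted(range(m), key=lambda i: Q[i])  # lower Q ranks higher

# Hypothetical sentiment scores for three vehicle brands on two attributes.
scores = [[0.8, 0.6], [0.5, 0.9], [0.7, 0.7]]
print(vikor_rank(scores, weights=[0.6, 0.4]))  # → [2, 0, 1]
```

The parameter `v` trades off group utility against the worst single criterion; the conventional compromise value is 0.5.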
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
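Fuzzy matching against a translation memory is commonly scored with a token-level edit distance. The following is a minimal baseline sketch of such a score, not the SCATE matchers themselves, which go beyond plain edit distance:

```python
def fuzzy_match_score(source, candidate):
    """Token-level Levenshtein similarity in [0, 1] between two segments."""
    a, b = source.split(), candidate.split()
    # Classic dynamic-programming edit distance over token sequences.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, 1):
        curr = [i]
        for j, tok_b in enumerate(b, 1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))

print(round(fuzzy_match_score("the cat sat", "the cat slept"), 2))  # → 0.67
```

A translation memory lookup would then return stored segments whose score against the new source sentence exceeds a threshold (often around 0.7 in commercial tools).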
Replication issues in syntax-based aspect extraction for opinion mining
Reproducing experiments is an important instrument to validate previous work
and build upon existing approaches. It has been tackled numerous times in
different areas of science. In this paper, we introduce an empirical
replicability study of three well-known algorithms for syntax-centric
aspect-based opinion mining. We show that reproducing results continues to be a
difficult endeavor, mainly due to the lack of details regarding preprocessing
and parameter setting, as well as due to the absence of available
implementations that clarify these details. We consider these to be important
threats to the validity of research in the field, especially when compared to
other problems in NLP where public datasets and code availability are critical
validity components. We conclude by encouraging code-based research, which we
think has a key role in helping researchers better understand the
state of the art and generate continuous advances.
Comment: Accepted in the EACL 2017 SR
Historical collaborative geocoding
The latest developments in digital technology have provided large data sets that can
increasingly easily be accessed and used. These data sets often contain
indirect localisation information, such as historical addresses. Historical
geocoding is the process of transforming the indirect localisation information
to direct localisation that can be placed on a map, which enables spatial
analysis and cross-referencing. Many efficient geocoders exist for current
addresses, but they do not deal with the temporal aspect and are based on a
strict hierarchy (..., city, street, house number) that is hard or impossible
to use with historical data. Indeed, historical data are full of uncertainties
(temporal aspect, semantic aspect, spatial precision, confidence in the
historical source, ...) that cannot be resolved, as there is no way to go back in time to
check. We propose an open source, open data, extensible solution for geocoding
that is based on the building of gazetteers composed of geohistorical objects
extracted from historical topographical maps. Once the gazetteers are
available, geocoding a historical address is a matter of finding the
geohistorical object in the gazetteers that best matches the historical
address. The matching criteria are customisable and include several dimensions
(fuzzy semantic, fuzzy temporal, scale, spatial precision, ...). As the goal is
to facilitate historical work, we also propose web-based user interfaces that
help geocode (one address or batch mode) and display over current or historical
topographical maps, so that they can be checked and collaboratively edited. The
system is tested on the city of Paris for the 19th-20th centuries, shows a high
return rate and is fast enough to be used interactively.
Comment: WORKING PAPER
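The multi-dimensional matching described above can be sketched as a weighted score combining fuzzy name similarity with temporal validity. Everything below is illustrative: the gazetteer entries, the weights, and the use of `difflib` for string similarity are assumptions, not the project's actual implementation:

```python
from difflib import SequenceMatcher

def match_score(query_name, query_year, obj, w_sem=0.7, w_temp=0.3):
    """Score a gazetteer object against a historical address query.

    Combines fuzzy semantic similarity with a temporal-overlap term;
    the weights are hypothetical and would be user-customisable.
    """
    semantic = SequenceMatcher(None, query_name.lower(),
                               obj["name"].lower()).ratio()
    start, end = obj["valid"]  # years during which the object existed
    temporal = 1.0 if start <= query_year <= end else 0.0
    return w_sem * semantic + w_temp * temporal

# Toy gazetteer of geohistorical objects (names and dates are illustrative).
gazetteer = [
    {"name": "Rue de Rivoli", "valid": (1802, 1900)},
    {"name": "Rue Saint-Antoine", "valid": (1700, 1900)},
]
best = max(gazetteer, key=lambda o: match_score("rue de rivoli", 1850, o))
print(best["name"])  # → Rue de Rivoli
```

A real geocoder would add the remaining dimensions (scale, spatial precision, source confidence) as further weighted terms and return ranked candidates rather than a single winner.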
A fuzzy rule model for high level musical features on automated composition systems
Algorithmic composition systems are now well-understood. However, when they are used for specific tasks, such as creating material for a part of a piece, it is common to prefer, among all of their possible outputs, those exhibiting specific properties. Even though the number of valid outputs is huge, the selection is often performed manually, either using expertise in the algorithmic model, by means of sampling techniques, or sometimes even by chance. This process has traditionally been automated using machine learning techniques. However, whether these techniques are really capable of capturing, to a great degree, the human rationality through which the selection is done remains an open question. The present work discusses a possible approach that combines expert opinion and a fuzzy methodology for rule extraction to model high-level features. An early implementation able to explore the universe of outputs of a particular algorithm by means of the extracted rules is discussed. The rules search for objects similar to those having a desired, pre-identified feature. In this sense, the model can be seen as a finder of objects with specific properties.
Peer Reviewed. Postprint (author's final draft)
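A fuzzy rule of the kind extracted by such a methodology can be sketched as follows. The feature names ("density", "dissonance"), the triangular membership functions, and the min-based firing strength are all illustrative assumptions, not the paper's actual rule base:

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership function with support [a, c] and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def rule_strength(density, dissonance):
    """Hypothetical rule: IF density IS high AND dissonance IS low
    THEN the output resembles the target feature (strength = min of memberships)."""
    high_density = triangular(density, 0.5, 1.0, 1.5)
    low_dissonance = triangular(dissonance, -0.5, 0.0, 0.5)
    return min(high_density, low_dissonance)

print(round(rule_strength(0.9, 0.1), 2))  # → 0.8
```

Candidate outputs of the composition algorithm would be scored by the strength with which they fire the extracted rules, and the highest-scoring ones retained.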
Patient Specific Congestive Heart Failure Detection From Raw ECG signal
In this study, in order to diagnose congestive heart failure (CHF) patients,
non-linear second-order difference plots (SODPs), obtained from raw ECG records
sampled at 256 Hz and windowed over different durations, are used. All of the
data rows are labelled with the patient they belong to, so that classification
is more realistic. SODPs are divided into quadrant regions of different radii,
and the numbers of points falling in the quadrants are computed in order to
extract feature vectors. Fisher's linear discriminant, Naive Bayes, radial
basis function, and artificial neural network classifiers are used. The results
are evaluated with two validation methods: general k-fold cross-validation and
patient-based cross-validation. As a result, it is shown that, using a neural
network classifier with features obtained from SODPs, the constructed system
could distinguish normal and CHF patients with a 100% accuracy rate.
Comment: Keywords: Congestive heart failure, ECG, Second-Order Difference Plot,
classification, patient-based cross-validation
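The quadrant-count features described in this abstract can be sketched as follows. The SODP plots each successive difference x[n+1]-x[n] against the next one, x[n+2]-x[n+1]; the radius value here is an illustrative assumption:

```python
def sodp_quadrant_counts(signal, radius=1.0):
    """Count SODP points per quadrant within a given radius (one feature set)."""
    # Second-order difference plot coordinates:
    #   u[n] = x[n+1] - x[n],  w[n] = x[n+2] - x[n+1]
    u = [signal[i + 1] - signal[i] for i in range(len(signal) - 2)]
    w = [signal[i + 2] - signal[i + 1] for i in range(len(signal) - 2)]
    counts = [0, 0, 0, 0]  # quadrants I..IV
    for ui, wi in zip(u, w):
        if ui * ui + wi * wi > radius * radius:
            continue  # point lies outside this radius band
        q = (0 if ui >= 0 else 1) if wi >= 0 else (3 if ui >= 0 else 2)
        counts[q] += 1
    return counts

# Toy alternating signal; a real feature vector would stack the counts
# obtained for several radii over 256 Hz ECG windows.
print(sodp_quadrant_counts([0, 1, 0, 1, 0], radius=2.0))  # → [0, 1, 0, 2]
```

Repeating the count for several radii and concatenating the results yields the kind of feature vector the abstract feeds to the classifiers.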