122,787 research outputs found
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe
What attracts vehicle consumers’ buying:A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?
Purpose:
The increasingly booming e-commerce development has stimulated vehicle consumers to express individual reviews through online forum. The purpose of this paper is to probe into the vehicle consumer consumption behavior and make recommendations for potential consumers from textual comments viewpoint.
Design/methodology/approach:
A big data analytic-based approach is designed to discover vehicle consumer consumption behavior from online perspective. To reduce subjectivity of expert-based approaches, a parallel NaĂŻve Bayes approach is designed to analyze the sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain specific sentimental value of attribute class, contributing to the multi-grade sentiment classification. To achieve the intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytical viewpoint.
Findings:
The big data analytics argue that “cost-effectiveness” characteristic is the most important factor that vehicle consumers care, and the data mining results enable automakers to better understand consumer consumption behavior.
Research limitations/implications:
The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation.
Originality/value:
Researches of consumer consumption behavior are usually based on survey-based methods, and mostly previous studies about comments analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill the gap from the big data perspective
Assessment in anatomy
From an educational perspective, a very important problem is that of assessment, for establishing competency and as selection criterion for different professional purposes. Among the issues to be addressed are the methods of assessment and/or the type of tests, the range of scores, or the definition of honour degrees. The methods of assessment comprise such different forms such as the spotter examination, short or long essay questions, short answer questions, true-false questions, single best answer questions, multiple choice questions, extended match questions, or several forms of oral approaches such as viva voce examinations.Knowledge about this is important when assessing different educational objectives; assessing educational objectives from the cognitive domain will need different assessment instruments than assessing educational objectives from the psychomotor domain or even the affective domain.There is no golden rule, which type of assessment instrument or format will be the best in measuring certain educational objectives; but one has to respect that there is no assessment instrument, which is capable to assess educational objectives from all domains of educational objectives.Whereas the first two or three levels of progress can be assessed by well-structured written examinations such as multiple choice questions, or multiple answer questions, other and higher level progresses need other instruments, such as a thesis, or direct observation.This is no issue at all in assessment tools, where the students are required to select the appropriate answer from a given set of choices, as in true false questions, MCQ, EMQ, etc. The standard setting is done in these cases by the selection of the true answer
Content-Aware DataGuides for Indexing Large Collections of XML Documents
XML is well-suited for modelling structured data with
textual content. However, most indexing approaches perform
structure and content matching independently, combining
the retrieved path and keyword occurrences in a third
step. This paper shows that retrieval in XML documents can
be accelerated significantly by processing text and structure
simultaneously during all retrieval phases. To this end,
the Content-Aware DataGuide (CADG) enhances the wellknown
DataGuide with (1) simultaneous keyword and path
matching and (2) a precomputed content/structure join. Extensive
experiments prove the CADG to be 50-90% faster
than the DataGuide for various sorts of query and document,
including difficult cases such as poorly structured
queries and recursive document paths. A new query classification
scheme identifies precise query characteristics with
a predominant influence on the performance of the individual
indices. The experiments show that the CADG is applicable
to many real-world applications, in particular large
collections of heterogeneously structured XML documents
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
We develop a learning principle and an efficient algorithm for batch learning
from logged bandit feedback. This learning setting is ubiquitous in online
systems (e.g., ad placement, web search, recommendation), where an algorithm
makes a prediction (e.g., ad ranking) for a given input (e.g., query) and
observes bandit feedback (e.g., user clicks on presented ads). We first address
the counterfactual nature of the learning problem through propensity scoring.
Next, we prove generalization error bounds that account for the variance of the
propensity-weighted empirical risk estimator. These constructive bounds give
rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM
can be used to derive a new learning method -- called Policy Optimizer for
Exponential Models (POEM) -- for learning stochastic linear rules for
structured output prediction. We present a decomposition of the POEM objective
that enables efficient stochastic gradient optimization. POEM is evaluated on
several multi-label classification problems showing substantially improved
robustness and generalization performance compared to the state-of-the-art.Comment: 10 page
BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking
Data generation is a key issue in big data benchmarking that aims to generate
application-specific data sets to meet the 4V requirements of big data.
Specifically, big data generators need to generate scalable data (Volume) of
different types (Variety) under controllable generation rates (Velocity) while
keeping the important characteristics of raw data (Veracity). This gives rise
to various new challenges about how we design generators efficiently and
successfully. To date, most existing techniques can only generate limited types
of data and support specific big data systems such as Hadoop. Hence we develop
a tool, called Big Data Generator Suite (BDGS), to efficiently generate
scalable big data while employing data models derived from real data to
preserve data veracity. The effectiveness of BDGS is demonstrated by developing
six data generators covering three representative data types (structured,
semi-structured and unstructured) and three data sources (text, graph, and
table data)
- …