Exploiting Multi-Category Characteristics and Unified Framework to Extract Web Content
Web content extraction obtains the required data embedded in web pages, usually structured records (such as product information) and text content (such as news). Web pages use a large number of HTML tags to organize and present information. Limited knowledge of page structure and the mixing of different kinds of information within a page make it challenging to guarantee both extraction performance and extraction adaptability. This study proposes a unified web content extraction framework that can be applied in various web environments to extract both structured records and text content. First, we construct a characteristic container to hold the characteristics related to the extraction objectives, including visual text information, content semantics (instead of HTML tag semantics), and web page structure. Second, these characteristics are integrated into an extraction framework that makes extraction decisions across different web sites. In particular, we put forward two strategies for locating the extraction area by exploiting these characteristics: path aggregation for text content and a hidden Markov model (HMM) for structured records. Comparative experiments on multiple web sites against popular extraction methods, including CETR, CETD and CNBE, show that the proposed method provides better extraction precision and extraction adaptability.
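The abstract names path aggregation as its strategy for text content but does not describe it. As a rough illustration only, the sketch below assumes that "path aggregation" can be approximated by grouping visible text by its tag path and choosing the path that carries the most characters. The class `PathTextCollector`, the function `dominant_text_path`, and the `SKIPPED` and `VOID` tag sets are hypothetical names introduced here, not part of the paper's framework.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumption for this sketch: text inside these tags is never page content.
SKIPPED = {"script", "style"}
# Void elements have no closing tag, so they are never pushed onto the path.
VOID = {"br", "img", "hr", "meta", "link", "input"}


class PathTextCollector(HTMLParser):
    """Collects visible text per tag path, e.g. 'html/body/div/p'."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.text_by_path = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate unbalanced HTML.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and not SKIPPED.intersection(self.stack):
            self.text_by_path["/".join(self.stack)].append(text)


def dominant_text_path(html):
    """Return (path, texts) for the tag path holding the most text characters."""
    collector = PathTextCollector()
    collector.feed(html)
    collector.close()
    if not collector.text_by_path:
        return None, []
    path = max(
        collector.text_by_path,
        key=lambda p: sum(len(t) for t in collector.text_by_path[p]),
    )
    return path, collector.text_by_path[path]
```

On a page whose navigation links sit under one path and whose article paragraphs sit under another, the paragraph path would win because it aggregates far more text. A real extractor would also need the visual and semantic characteristics the abstract mentions, which this sketch ignores.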
Reporting methodological search filter performance comparisons : a literature review
© 2014 The authors. Health Information and Libraries Journal © 2014 Health Libraries Journal. Peer reviewed. Postprin
Product Redesign and Innovation Based on Online Reviews: A Multistage Combined Search Method
Online reviews published on the e-commerce platform provide a new source of information for designers to develop new products. Past research on new product development (NPD) using user-generated textual data commonly focused solely on extracting and identifying product features to be improved. However, the competitive analysis of product features and more specific improvement strategies have not been explored deeply. This study fully uses the rich semantic attributes of online review texts and proposes a novel online review–driven modeling framework. This new approach can extract fine-grained product features; calculate their importance, performance, and competitiveness; and build a competitiveness network for each feature. As a result, decision making is assisted, and specific product improvement strategies are developed for NPD beyond existing modeling approaches in this domain. Specifically, online reviews are first classified into redesign- and innovation-related themes using a multiple embedding model, and the redesign and innovation product features can be extracted accordingly using a mutual information multilevel feature extraction method. Moreover, the importance and performance of features are calculated, and the competitiveness and competitiveness network of features are obtained through a personalized unidirectional bipartite graph algorithm. Finally, the importance performance competitiveness analysis plot is constructed, and the product improvement strategy is developed via a multistage combined search algorithm. Case studies and comparative experiments show the effectiveness of the proposed method and provide novel business insights for stakeholders, such as product providers, managers, and designers
Expediting TTS Synthesis with Adversarial Vocoding
Recent approaches in text-to-speech (TTS) synthesis employ neural network
strategies to vocode perceptually-informed spectrogram representations directly
into listenable waveforms. Such vocoding procedures create a computational
bottleneck in modern TTS pipelines. We propose an alternative approach which
utilizes generative adversarial networks (GANs) to learn mappings from
perceptually-informed spectrograms to simple magnitude spectrograms which can
be heuristically vocoded. Through a user study, we show that our approach
significantly outperforms naïve vocoding strategies while being hundreds of
times faster than neural network vocoders used in state-of-the-art TTS systems.
We also show that our method can be used to achieve state-of-the-art results in
unsupervised synthesis of individual words of speech. Comment: Published as a conference paper at INTERSPEECH 201
Simulation in manufacturing and business: A review
Copyright © 2009 Elsevier B.V. This paper reports the results of a review of simulation applications published within peer-reviewed literature between 1997 and 2006 to provide an up-to-date picture of the role of simulation techniques within manufacturing and business. The review is characterised by three factors: wide coverage, broad scope of the simulation techniques, and a focus on real-world applications. A structured methodology was followed to narrow down the search from around 20,000 papers to 281. Results include interesting trends and patterns. For instance, although discrete event simulation is the most popular technique, it has lower stakeholder engagement than other techniques, such as system dynamics or gaming. This is highly correlated with modelling lead time and purpose. Considering application areas, modelling is mostly used in scheduling. Finally, this review shows an increasing interest in hybrid modelling as an approach to cope with complex enterprise-wide systems.
HOG, LBP and SVM based Traffic Density Estimation at Intersection
Increasing vehicular traffic on roads is a significant issue. Heavy traffic
causes congestion, unwanted delays, pollution, financial loss, health issues,
accidents, obstructed emergency vehicle passage and traffic violations, all of
which reduce productivity. During peak hours these problems become even worse.
Traditional traffic management and control systems fail to tackle this
problem. Currently, the traffic lights at intersections are not adaptive and
have fixed time delays. An optimized, sensible control system is needed to
improve the efficiency of traffic flow. Smart traffic systems estimate the
traffic density and adjust the traffic lights according to the amount of
traffic. We propose an efficient way to estimate the traffic density at an
intersection in real time using image processing and machine learning
techniques. The proposed method takes pictures of the traffic at a junction to
estimate the traffic density. We use a Histogram of Oriented Gradients (HOG),
Local Binary Patterns (LBP) and Support Vector Machine (SVM) based approach for
traffic density estimation. The approach is computationally inexpensive and can
run efficiently on a Raspberry Pi board. Code is released at
https://github.com/DevashishPrasad/Smart-Traffic-Junction. Comment: paper accepted at IEEE PuneCon 201
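To make one of the named feature descriptors concrete, here is a minimal pure-Python sketch of the basic 8-neighbour Local Binary Pattern and its histogram. It is not taken from the released repository, whose implementation may differ. The function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are assumptions of this sketch. In a pipeline like the one described, a histogram of this kind would presumably be combined with HOG features and passed to an SVM.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code of pixel (r, c): one bit per neighbour >= centre."""
    centre = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    # Border pixels are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Here `img` is a grayscale image given as a list of rows of integer intensities. The histogram is normalised so that images of different sizes yield comparable feature vectors.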