293,402 research outputs found

    Exploiting Multi-Category Characteristics and Unified Framework to Extract Web Content

    Abstract: Web content extraction obtains the required data embedded in web pages, usually structured records, such as product information, and text content, such as news. Web pages use a large number of HTML tags to organize and present information. Because little is known in advance about page structures, and because pages mix several kinds of information, it is challenging to guarantee both extraction performance and extraction adaptability. This study proposes a unified web content extraction framework that can be applied in various web environments to extract both structured records and text content. First, we construct a characteristic container that holds the characteristics relevant to the extraction objectives, including visual text information, content semantics (instead of HTML tag semantics), and web page structures, among others. Second, these characteristics are integrated into an extraction framework that makes extraction decisions on different web sites. In particular, we put forward two strategies for locating the extraction area by exploiting these characteristics: path aggregation for text content and a hidden Markov model (HMM) for structured records. Comparative experiments on multiple web sites against popular extraction methods, including CETR, CETD and CNBE, show that the proposed method provides better extraction precision and adaptability.

    Reporting methodological search filter performance comparisons : a literature review

    © 2014 The authors. Health Information and Libraries Journal © 2014 Health Libraries Journal. Peer reviewed. Postprint.

    Product Redesign and Innovation Based on Online Reviews: A Multistage Combined Search Method

    Online reviews published on the e-commerce platform provide a new source of information for designers to develop new products. Past research on new product development (NPD) using user-generated textual data commonly focused solely on extracting and identifying product features to be improved. However, the competitive analysis of product features and more specific improvement strategies have not been explored deeply. This study fully uses the rich semantic attributes of online review texts and proposes a novel online review–driven modeling framework. This new approach can extract fine-grained product features; calculate their importance, performance, and competitiveness; and build a competitiveness network for each feature. As a result, decision making is assisted, and specific product improvement strategies are developed for NPD beyond existing modeling approaches in this domain. Specifically, online reviews are first classified into redesign- and innovation-related themes using a multiple embedding model, and the redesign and innovation product features can be extracted accordingly using a mutual information multilevel feature extraction method. Moreover, the importance and performance of features are calculated, and the competitiveness and competitiveness network of features are obtained through a personalized unidirectional bipartite graph algorithm. Finally, the importance performance competitiveness analysis plot is constructed, and the product improvement strategy is developed via a multistage combined search algorithm. Case studies and comparative experiments show the effectiveness of the proposed method and provide novel business insights for stakeholders, such as product providers, managers, and designers

    Expediting TTS Synthesis with Adversarial Vocoding

    Recent approaches in text-to-speech (TTS) synthesis employ neural network strategies to vocode perceptually-informed spectrogram representations directly into listenable waveforms. Such vocoding procedures create a computational bottleneck in modern TTS pipelines. We propose an alternative approach which utilizes generative adversarial networks (GANs) to learn mappings from perceptually-informed spectrograms to simple magnitude spectrograms which can be heuristically vocoded. Through a user study, we show that our approach significantly outperforms naïve vocoding strategies while being hundreds of times faster than neural network vocoders used in state-of-the-art TTS systems. We also show that our method can be used to achieve state-of-the-art results in unsupervised synthesis of individual words of speech. Comment: Published as a conference paper at INTERSPEECH 201

    Simulation in manufacturing and business: A review

    Copyright © 2009 Elsevier B.V. This paper reports the results of a review of simulation applications published within peer-reviewed literature between 1997 and 2006 to provide an up-to-date picture of the role of simulation techniques within manufacturing and business. The review is characterised by three factors: wide coverage, broad scope of the simulation techniques, and a focus on real-world applications. A structured methodology was followed to narrow down the search from around 20,000 papers to 281. Results include interesting trends and patterns. For instance, although discrete event simulation is the most popular technique, it has lower stakeholder engagement than other techniques, such as system dynamics or gaming. This is highly correlated with modelling lead time and purpose. Considering application areas, modelling is mostly used in scheduling. Finally, this review shows an increasing interest in hybrid modelling as an approach to cope with complex enterprise-wide systems.

    HOG, LBP and SVM based Traffic Density Estimation at Intersection

    The growing volume of vehicular traffic on roads is a significant issue. Heavy traffic causes congestion, unwanted delays, pollution, financial loss, health issues, accidents, blocked emergency vehicle passage and traffic violations, all of which reduce productivity. In peak hours, these issues become even worse. Traditional traffic management and control systems fail to tackle this problem. Currently, the traffic lights at intersections are not adaptive and have fixed time delays. There is a need for an optimized and sensible control system that would enhance the efficiency of traffic flow. Smart traffic systems estimate traffic density and adjust the traffic lights according to the amount of traffic. We propose an efficient way to estimate the traffic density at an intersection in real time using image processing and machine learning techniques. The proposed method takes pictures of traffic at a junction to estimate the traffic density. We use an approach based on Histogram of Oriented Gradients (HOG), Local Binary Patterns (LBP) and Support Vector Machine (SVM) for traffic density estimation. The approach is computationally inexpensive and can run efficiently on a Raspberry Pi board. Code is released at https://github.com/DevashishPrasad/Smart-Traffic-Junction. Comment: paper accepted at IEEE PuneCon 201
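
To make the LBP part of this abstract concrete, here is a minimal pure-Python sketch of the basic 3x3 Local Binary Pattern descriptor. It is not taken from the paper or its repository: the function names, the clockwise neighbor ordering, and the `>=` comparison are assumptions chosen for illustration. A pipeline like the one described would presumably combine such a histogram with HOG features before passing them to an SVM.

```python
# Illustrative sketch only: basic 3x3 LBP on a grayscale image given as a
# list of rows. Names and bit ordering are assumptions, not the paper's code.

# Neighbor offsets (row, col), clockwise starting at the top-left pixel.
NEIGHBOR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, 1), (1, 1), (1, 0),
    (1, -1), (0, -1),
]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOR_OFFSETS):
        # A neighbor at least as bright as the center sets its bit.
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a normalized 256-bin histogram of LBP codes over interior pixels."""
    counts = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            counts[lbp_code(image, row, col)] += 1
    total = sum(counts)
    if total == 0:
        return counts  # image too small to have interior pixels
    return [count / total for count in counts]


if __name__ == "__main__":
    patch = [
        [1, 2, 3],
        [8, 5, 4],
        [7, 6, 5],
    ]
    print(lbp_code(patch, 1, 1))
```

The normalized histogram gives a fixed-length texture feature regardless of image size, which is what makes it usable as classifier input.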