5,488 research outputs found

    Growing Story Forest Online from Massive Breaking News

    We describe our experience of implementing a news content organization system at Tencent that discovers events from vast streams of breaking news and evolves news story structures in an online fashion. Our real-world system has distinct requirements in contrast to previous studies on topic detection and tracking (TDT) and event timeline or graph generation, in that we 1) need to accurately and quickly extract distinguishable events from massive streams of long text documents that cover diverse topics and contain highly redundant information, and 2) must develop the structures of event stories in an online manner, without repeatedly restructuring previously formed stories, in order to guarantee a consistent user viewing experience. To address these challenges, we propose Story Forest, a set of online schemes that automatically clusters streaming documents into events, while connecting related events in growing trees to tell evolving stories. We conducted an extensive evaluation on 60 GB of real-world Chinese news data through detailed pilot user experience studies; our ideas, however, are not language-dependent and can easily be extended to other languages. The results demonstrate the superior capability of Story Forest to accurately identify events and organize news text into a logical structure that is appealing to human readers, compared to multiple existing algorithm frameworks. Comment: Accepted by CIKM 2017, 9 pages

    Assessing seismic damage through stochastic simulation of ground shaking: the case of the 1998 Faial Earthquake (Azores Islands)

    In July 1998, an Mw = 6.2 earthquake struck the islands of Faial, Pico and San Jorge (in the Azores Archipelago), registering VIII on the Modified Mercalli Intensity scale and causing major destruction in the northeastern part of Faial. The main shock was located offshore, 8 km northeast of the island, and it triggered a seismic sequence that lasted for several weeks. The existing data for this earthquake include the general tectonic environment of the region and the teleseismic information, together with one strong-motion record obtained 15 km from the epicentre, the epicentre locations of the aftershocks, and a large collection of data on the damage inflicted on the building stock (poor rubble masonry, 2-3 storeys). The present study was carried out in two steps: first, ground motion was simulated with a finite-fault stochastic method at sites throughout the affected islands, for two possible locations of the rupturing fault and for a large number of combinations of rupture mechanisms (as a parametric analysis); secondly, the damage to buildings was modelled using a well-known macroseismic method that considers the building typologies and their associated vulnerabilities. The main intent was to integrate different data (geological, seismological and building features) into a scenario model that reproduces and explains the level of damage generated during the Faial earthquake. Finally, through validation of the results provided by these different approaches, we obtained a complete procedure for the parameters of a first model for producing seismic damage scenarios for the Azores Islands region.

    Application of random forest classification and remotely sensed data in geological mapping on the Jebel Meloussi area (Tunisia)

    Remotely sensed data such as satellite photos and radar images can be used to produce geological maps of arid regions, where the vegetation coverage does not have a significant effect. In central Tunisia, the Jebel Meloussi area has unique geological features and characteristic morphology (i.e. flat areas with dune fields in contrast with hills of folded and eroded stratigraphic sequences), which makes it an ideal area for testing new methods of automatic terrain classification. For this, data from the Sentinel 2 satellite sensor and the SRTM-based MERIT DEM (digital elevation model) were used in the present study. Using R scripts and the random forest classification method, modelling was performed on four lithological variables (derived from the different bands of the Sentinel 2 images) and two morphometric parameters for the area of the 1:50,000 geological map sheet no. 103. The four lithological variables were chosen to highlight the iron-bearing minerals, since the spectral parameters of the Sentinel 2 sensors are especially useful for this purpose. The training areas of the classification were selected on the geological map. The modelling identified Eocene and Cretaceous evaporite-bearing sedimentary series (such as the Jebs and the Bouhedma Formations) with the highest producer accuracy (> 60% of the predicted pixels match the map). The pyritic argillites of the Sidi Khalif Formation were recognized with the same accuracy, and the Quaternary sebkhas and dunes were also well predicted. The study concludes that the classification-based geological map is useful for field geologists prior to field surveys.

    Econometric Studies of Business Cycles in the History of Econometrics

    This study examines the evolution of econometric research in business cycle analysis during the 1960-90 period. It shows how the research was dominated by an assimilation of the tradition of NBER business cycle analysis by the Haavelmo-Cowles Commission approach, catalysed by time-series statistical methods. Methodological consequences of the assimilation are critically evaluated in light of the meagre achievement of the research in predicting the current global recession. Keywords: Business cycles, NBER, Forecasting