171 research outputs found

    Reconstructing human-generated provenance through similarity-based clustering

    In this paper, we revisit our method for reconstructing the primary sources of documents, which make up an important part of their provenance. Our method is based on the assumption that if two documents are semantically similar, there is a high chance that they also share a common source. We previously evaluated this assumption on an excerpt from a news archive, achieving 68.2% precision and 73% recall when reconstructing the primary sources of all articles. However, since we could not release this dataset to the public, our results were hard to compare with those of others. In this work, we extend the flexibility of our method by adding a new parameter, and re-evaluate it on the human-generated dataset created for the 2014 Provenance Reconstruction Challenge. The extended method achieves up to 86% precision and 59% recall, and is now directly comparable to any approach that uses the same dataset.
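
    The similarity assumption and the precision/recall evaluation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the bag-of-words cosine similarity, the `threshold` parameter, and all function names are assumptions chosen for illustration, and the abstract does not say what the paper's new parameter is.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_shared_source(docs, threshold):
    """Return the document pairs whose similarity reaches the threshold.

    Each returned pair is a guess that the two documents share a source.
    The threshold is a hypothetical tuning knob, not the paper's parameter.
    """
    vectors = {doc_id: bag_of_words(text) for doc_id, text in docs.items()}
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(vectors), 2)
        if cosine(vectors[a], vectors[b]) >= threshold
    }


def precision_recall(predicted, truth):
    """Score predicted pairs against ground-truth pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

    Raising the threshold in a sketch like this usually trades recall for precision, which is the kind of trade-off the reported figures (68.2%/73% versus 86%/59%) suggest.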

    Web-scale provenance reconstruction of implicit information diffusion on social media

    Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important because it contributes to the understanding of the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists who need to assess specific pieces of information promptly. Social media currently provide insufficient mechanisms for provenance tracking, publication and generation, while state-of-the-art social media research focuses mainly on explicit diffusion mechanisms (like retweets on Twitter or reshares on Facebook). The implicit diffusion mechanisms remain understudied because they are difficult to capture and properly understand. On the technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping the scale and speed requirements of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We prove that explicit mechanisms are insufficient to capture influence, and our analysis unravels a significant part of implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover a truly Web-scale scenario like major events. We can process datasets consisting of up to several millions of messages on a single machine at rates that cover bursty behaviour, without compromising result quality. By doing that, we provide online journalists, and social media users in general, with fine-grained provenance reconstruction that sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion, which also allows for fast relevance and trustworthiness assessment.

    Enabling automatic provenance-based trust assessment of web content

    Evaluating the impact of hierarchical deep-water slope channel architecture on fluid flow behavior, Cretaceous Tres Pasos Formation, Chile

    2021 Spring. Includes bibliographical references. Channelized deep-water reservoirs inherently contain sub-seismic scale heterogeneity, resulting in uncertainty when evaluating reservoir connectivity and flow patterns. Stratigraphic architectural features, including stacked channel elements, channel element fill, mass transport deposits (MTDs), and channel base drapes, can have a complex and significant impact on fluid flow pathways. While this detailed stratigraphic architecture can be difficult to capture at the development scale, it can be effectively modeled at the sector scale using high-resolution outcrop data. The characterization of flow behaviors and reservoir performance at this finer scale can then be used in the construction of lower-resolution development-scale simulations. This study uses a three-part sensitivity analysis to test how fluid flow behavior responds to channel element stacking patterns, net to gross ratio, channel base drape coverage, and MTD properties. First, simplified models are used to isolate key flow behaviors. Then, field data is incorporated from the seismic-scale Laguna Figueroa outcrop of the Cretaceous Tres Pasos Formation, Magallanes Basin, Chile to construct a deterministic outcrop model that incorporates realistic stacking patterns and architectural features, including MTDs. Finally, stochastic object-based methods are used to try to replicate the flow characteristics of the outcrop model using established geostatistical methods and limited data input. Fluid flow was simulated using a constant flux aquifer at the base of the model and three producing wells at the top, and the results of the three modeling methods were compared in an effort to elucidate key flow behaviors.

    DETRITAL-ZIRCON GEOCHRONOLOGIC PROVENANCE ANALYSES THAT TEST AND EXPAND THE EAST SIBERIA - WEST LAURENTIA RODINIA RECONSTRUCTION

    Laurentia's position in the Neoproterozoic supercontinent Rodinia is a topic of continuing debate. Several reconstructions involving Laurentia and Siberia have been proposed, including the east Siberia-west Laurentia connection. The east Siberia-west Laurentia Rodinia reconstruction includes lithostratigraphic correlations of conjugate rift miogeocline sediments located in SE Siberia and SW Laurentia. To constrain provenance relationships and to quantitatively test correlations, we determined detrital-zircon age spectra from samples from the Sette Daban Range of SE Siberia and the Death Valley and White-Inyo regions of SW Laurentia using SHRIMP and LAICPMS. Data indicate that several Siberian samples contain zircons correlative with Laurentian sources, and several Laurentian samples contain zircons correlative with Siberian sources. The results not only strengthen correlations between the two successions but also lead to a new tectono-sedimentary evolution model of the restored region. The new model involves thickening of the crust during the southern Laurentian Grenville orogeny, crustal flexure resulting in a foreland basin, and extension sub-perpendicular to the Grenville front resulting in extensional basins within the foreland basin on both cratons. Sediments were transported from the Grenville Orogen across the margin between the two cratons and were preserved in the extensional basins. The Grenville source for SE Siberian sediments was cut off in the latest Proterozoic due to successful continental rifting. Grenville sediments were cut off from SW Laurentia in the latest Proterozoic-Middle Cambrian either due to the development of an impediment between the orogen and the basin, or due to the transgression of the Cambrian Sea from the east. Petrographic analyses corroborate the new tectono-sedimentary model. My research resulted in other avenues of thought that were not anticipated from the outset. For example, I present a comparison of the age-equivalency of SHRIMP and LAICPMS detrital-zircon analyses obtained by analyzing the same zircon grains with both instruments. Also, I show that a possible extension of the East Siberia - West Laurentia connection could be the Anti-Atlas region of Morocco, based on trilobite evidence and detrital-zircon provenance information. Finally, I demonstrate two ways to involve K-12 students in geoscience research.

    Novel Applications Of Meteoric- And In Situ-Produced Beryllium-10 In The East Antarctic

    This work comprises three novel applications of in situ- and meteoric-produced beryllium-10 (Be-10) in East Antarctica. Sampled deposits cover a wide spatiotemporal transect through the Dry Valleys, from an inland, middle elevation location of Quaternary age, to a mid-valley, high elevation location of Miocene age, and finally to an offshore, submarine location of Pliocene age. Each research chapter we present is a unique project unto itself, but all chapters utilize the cosmogenic radionuclide Be-10. In the first application, we present "Difference Dating," a new approach to date glacial moraines in regions where traditional exposure age dating is fraught with complications. Difference Dating allows for the construction of deglaciation chronologies in regions where they are frequently precluded by inheritance issues. We use Difference Dating to constrain the ages of Quaternary moraines in an alpine glacial cirque, Wright Valley, Dry Valleys. The second and third applications use meteoric-produced Be-10 in two different depositional settings. In marine sediments, we recast the Be-10/Be-9 ratio as a proxy for East Antarctic Ice Sheet freshwater discharge during mid-Pliocene interglacials. Using this record, we suggest that zones of deep water formation may be significant in funneling Be into the global thermohaline circulation belt. We also apply the meteoric-produced Be-10 system to paleolake sediments, where extremely low concentrations are used to construct an age model extending to 14-17.5 Ma. This range is commensurate with lake sediment deposition during the Middle Miocene Climatic Optimum, a rare Antarctic terrestrial deposit of this globally significant warming event.

    Paleo-Ice Sheet and Deglacial History of the Southwestern Great Slave Lake Area

    The western Laurentide Ice Sheet (LIS) is known to have experienced complex ice-flow shifts during the last glaciation due to ice divide migration and increasing topographic influence during deglaciation. Several glacial lakes also developed at different elevations during ice margin retreat over the region. However, due to limited field-based studies and surficial mapping, the evolution of the western LIS is still poorly constrained. Improving reconstructions of the western LIS evolution and understanding its net effect on landscapes and surficial sediments can provide important insights into long-term glacial processes, as well as useful knowledge for mineral exploration in glaciated terrains. Furthermore, detailing retreat over this region can help refine continental-scale ice sheet models and help test suggested meltwater drainage pathways to the northwest down the Mackenzie River Valley, which have important implications in paleoclimatology. This research details relative ice-flow chronology and associated till stratigraphy and provides a reconstruction of ice margin retreat and glacial lake positions along a portion of the western LIS situated west of Great Slave Lake, in the Northwest Territories. Relative ice-flow chronology is established using glacial landforms, outcrop-scale ice-flow indicators, as well as till stratigraphic and provenance analyses. Outcrop-scale indicators show a shift in ice-flow direction from an oldest southwestern (230°) flow, to a western (250°) flow, to a final northwestern (305°) flow. This sequence counters the simple westward flow of other studies and suggests a younger rather than older northwestward ice-flow. Lodged boulders and till clast fabrics from till stratigraphic sections across the study area are broadly consistent with the clockwise ice-flow shift up the stratigraphic column. 
Indicators of northeast provenance include Canadian Shield (igneous and metamorphic) clasts that are in higher proportions than Mesozoic mudrocks and Paleozoic carbonate rocks that underlie the study area. Major oxides, from till matrix geochemistry, are enriched in metals (SiO2-Al2O3-Fe2O3-K2O-TiO2-Cr2O3) interpreted to indicate a northeast Canadian Shield provenance; however, there is overlap with the geochemical signature of the Mesozoic mudrocks. At least one till unit is associated with the oldest southwest ice-flow phase initially recognized in the striation and landform records based on its compositional signature as well as till fabrics. Younger tills were deposited during the clockwise ice-flow shift. These tills are located at surface in lower topographic regions throughout the study area and their composition has an increased carbonate signature from the underlying Paleozoic sedimentary rocks. These tills show some compositional inheritance from the older till unit(s). Within these upper tills is a unit sourced from hyper-saline beds to the northeast. Ultimately, the clockwise rotation of ice flow is preserved in both the erosional (landform and outcrop-scale ice-flow indicators) and depositional (till fabrics and composition) records. The ice-flow chronology shows compelling evidence for major shifts in ice sheet configuration and flow dynamics, as well as related subglacial conditions (e.g. changes in subglacial sediment entrainment) during the last glaciation. A retreat sequence showing ice margins and pro-glacial lake positions is established using sediment landform assemblages from surficial maps and topographic basins and drainage outlets from the 2 m resolution ArcticDEM. Seven optical ages, two from a beach ridge at 223 m a.s.l. and five from eolian dunes, and radiocarbon ages from wood and peat were obtained and provide additional chronological constraints within the region. 
A stepwise pattern of eastward retreat is reconstructed, which shows impounded drainage along the ice margin creating a series of pro-glacial lakes at different elevations along the margin and through time. During this eastward retreat the Snake Creek Moraine was deposited into a shallow pro-glacial lake. The optical age of the beach ridge currently at 223 m a.s.l. indicates deposition at 12.0 ± 0.7 ka BP. This is the most limiting age and suggests the previously published ice margin positions used for the region are older, as the deposition of the Snake Creek Moraine is estimated at 12.5 cal. ka BP. The eolian optical ages show continual eolian reworking, indicating the landscape was exposed after 10.4 ± 0.3 ka BP. The radiocarbon ages of 2.7–2.1 cal. ka BP from wood and peat are much younger and thus not related to deglaciation. The updated ice margin retreat sequence is more detailed than those currently used in continental-scale ice sheet models, and also provides new evidence to constrain the evolution of proglacial lakes, which were open to northwestward drainage down the Mackenzie River Valley. This study provides new insights into the ice-flow and deglacial history of the western LIS, which are constrained by field data and observations. The ice-flow history and till stratigraphy detailed in this research provide new constraints for establishing the locations of past ice divides. Updates to the ice margins and lake limits during deglaciation show complex eastward retreat, and geochronological ages indicate the area was deglaciated at an earlier time than previously thought (at least 500 years). Finally, all results from this study provide important new information that should inform mineral exploration in the area, especially for techniques that utilize surficial sediments to trace, characterize, and locate buried targets of interest in bedrock.

    Reconstructing Data Provenance from Log Files

    Data provenance describes the derivation history of data, capturing details such as the entities involved and the relationships between entities. Knowledge of data provenance can be used to address issues such as data quality assurance, data audit and system security. However, current computer systems are usually not equipped with means to acquire data provenance. Modifying underlying systems or introducing new monitoring software for provenance logging may be too invasive for production systems. As a result, data provenance may not always be available. This thesis investigates the completeness and correctness of data provenance reconstructed from log files with respect to the actual derivation history. To accomplish this, we designed and tested a solution that first extracts and models information from log files into provenance relations, then reconstructs the data provenance from those relations. The reconstructed output is then evaluated against the ground truth provenance. The thesis also details the methodology used for constructing a dataset for provenance reconstruction research. Experimental results revealed that data provenance that completely captures the ground truth can be reconstructed from system-layer log files. However, the outputs are susceptible to errors generated during event logging and errors induced by program dependencies. Results also show that using log files of different granularities collected from the system can help resolve the logging errors described. Experiments with removing suspected program dependencies using approaches such as blacklisting and clustering have shown that the number of errors can be reduced by a factor of one hundred. Conclusions drawn from this research contribute towards the work on using reconstruction as an alternative approach for acquiring data provenance from computer systems.
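
    The two-step pipeline in this abstract (extract relations from log files, then reconstruct provenance from them) might look roughly like the sketch below. Everything concrete here is an assumption for illustration: the `timestamp pid op path` log format, the rule that a written file is derived from every file the same process read earlier, and the `ignore` set standing in for the blacklisting of program dependencies. The thesis itself may work quite differently.

```python
from collections import defaultdict


def parse_log(lines):
    """Parse hypothetical 'timestamp pid op path' lines into time-ordered events."""
    events = []
    for line in lines:
        timestamp, pid, op, path = line.split()
        events.append((float(timestamp), pid, op, path))
    events.sort(key=lambda event: event[0])
    return [(pid, op, path) for _, pid, op, path in events]


def reconstruct(events, ignore=frozenset()):
    """Derive (output, input) pairs: a file written by a process is assumed
    to be derived from every file that process read before the write.

    Paths in `ignore` (for example shared libraries) are skipped, which is
    a simple stand-in for blacklisting program dependencies.
    """
    inputs_read = defaultdict(set)
    derived = set()
    for pid, op, path in events:
        if path in ignore:
            continue
        if op == "read":
            inputs_read[pid].add(path)
        elif op == "write":
            for source in inputs_read[pid]:
                if source != path:
                    derived.add((path, source))
    return derived


def spurious_and_missing(reconstructed, truth):
    """Compare against ground truth: extra pairs hurt correctness,
    missing pairs hurt completeness."""
    return reconstructed - truth, truth - reconstructed
```

    In a toy run, a process that reads both an input file and a shared library before writing its output yields one spurious derivation until the library is added to `ignore`, which mirrors the dependency-induced errors the abstract mentions.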

    Digital Forensics Investigation Frameworks for Cloud Computing and Internet of Things

    Rapid growth in Cloud computing and Internet of Things (IoT) introduces new vulnerabilities that can be exploited to mount cyber-attacks. Digital forensics investigation is commonly used to find the culprit and help expose the vulnerabilities. Traditional digital forensics tools and methods are unsuitable for use in these technologies. Therefore, new digital forensics investigation frameworks and methodologies are required. This research develops frameworks and methods for digital forensics investigations in cloud and IoT platforms

    Circum-Arctic Mineralogy & Pan-Arctic Chronostratigraphy of Late Pleistocene Sediments: Developing a Comprehensive Age Model for the Western Arctic Ocean Using Unique Ice-Rafted Signals

    To improve understanding of geographic mineral distribution from the circum-Arctic Ocean, samples from the Arctic periphery were collected and analyzed for (semi-) quantitative mineral composition. Most samples were collected from the North American region of the Arctic Ocean, a region which has had limited mineral investigation. In addition, more than 1000 published clay mineral data points were gathered to provide the most comprehensive clay mineral distribution map to date. The identification of a smectite source within the Canadian Arctic may reduce the usefulness of this mineral as a unique provenance signal for the Eurasian region. Smectite speciation may be useful in maintaining the use of this mineral for provenance determinations. Pyrophyllite, tridymite, zeolite and feldspar species were identified as potentially useful mineral provenance indicators. Strongly contrasting feldspar phases between North America and Eurasia provide an empirical signal for discerning sediment inputs to the Central Arctic. Similarities between ratios of potassium feldspar to plagioclase and quartz to total feldspar indicate that basin-wide sedimentation events likely occurred in the past. These basin-wide sedimentation events, if sufficiently constrained, could be useful for correlations between the eastern and western Arctic basins. An age model for a sub-Arctic core from Yermak Plateau (JPC22) is developed using oxygen-isotope stratigraphy. The model is based on similarities between the global oxygen-isotope model and a unique paleomagnetic signal. This model identified several periods of rapid sedimentation that, if they occurred over broad spatial scales, could be used as correlative tie points for the transfer of ages to central Arctic sediment records. Central Arctic sediments have long been poorly understood due to the lack of age control. 
Using the newly developed oxygen-isotope stratigraphy from JPC22 (Yermak Plateau), an age model was developed for a central Arctic core from Mendeleev Ridge (JPC9). Based on this model, sediment transport mechanisms were interpreted in the context of glacial and interglacial environmental changes. Periods of rapid sedimentation are identified in JPC9. Identification of these rapid events is typically hindered in central Arctic sediments due to reduced overall accumulation rates and reduced resolution. The timing of these unique depositional events appears to be coincident with Greenland stadial and interstadial events; therefore, their origin may be related to Dansgaard-Oeschger cycles and other similar or related phenomena.