    Building an Archive with Saada

    Saada transforms a set of heterogeneous FITS files or VOTables of various categories (images, tables, spectra ...) into a database without writing code. Databases created with Saada come with a rich Web interface and an Application Programming Interface (API). They support the four most common VO services. Such databases can mix various categories of data in multiple collections. They allow direct access to the original data while providing a homogeneous view thanks to an internal data model compatible with the characterization axis defined by the VO. The data collections can be bound to each other with persistent links, making relevant browsing paths and allowing data-mining oriented queries. Comment: 18 pages, 5 figures Special VO issu

    Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

    Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDoS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods. Comment: 13 single column pages, 5 figures, submitted to KDD 201

    Knowledge Rich Natural Language Queries over Structured Biological Databases

    Increasingly, keyword, natural language and NoSQL queries are being used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity is undeniable for obvious reasons, their engineering is far from simple. For the most part, semantics- and intent-preserving mapping of a well-understood natural language query expressed over a structured database schema to a structured query language is still a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware to facilitate such mappings that separate the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well-defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog-based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made.

    Challenges in Bridging Social Semantics and Formal Semantics on the Web

    This paper describes several results of Wimmics, a research lab whose name stands for: web-instrumented man-machine interactions, communities, and semantics. The approaches introduced here rely on graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities. The research results are applied to support and foster interactions in online communities and manage their resources.

    You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems

    Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains---astronomy, genetics, and material science---via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries. Comment: Accepted for presentation at IEEE VAST 2019, to be held October 20-25 in Vancouver, Canada. Paper will also be published in a special issue of IEEE Transactions on Visualization and Computer Graphics (TVCG) IEEE VIS (InfoVis/VAST/SciVis) 2019 ACM 2012 CCS - Human-centered computing, Visualization, Visualization design and evaluation method

    Metagenomic sequencing unravels gene fragments with phylogenetic signatures of O2-tolerant NiFe membrane-bound hydrogenases in lacustrine sediment

    Many promising hydrogen technologies utilising hydrogenase enzymes have been slowed by the fact that most hydrogenases are extremely sensitive to O2. Within the group 1 membrane-bound NiFe hydrogenase, naturally occurring tolerant enzymes do exist, and O2 tolerance has been largely attributed to changes in iron–sulphur clusters coordinated by different numbers of cysteine residues in the enzyme’s small subunit. Indeed, previous work has provided a robust phylogenetic signature of O2 tolerance [1], which when combined with new sequencing technologies makes bioprospecting in nature a far more viable endeavour. However, making sense of such a vast diversity is still challenging and could be simplified if known species with O2-tolerant enzymes were annotated with information on metabolism and natural environments. Here, we utilised a bioinformatics approach to compare O2-tolerant and sensitive membrane-bound NiFe hydrogenases from 177 bacterial species with fully sequenced genomes for differences in their taxonomy, O2 requirements, and natural environment. Following this, we interrogated a metagenome from lacustrine surface sediment for novel hydrogenases via high-throughput shotgun DNA sequencing using the Illumina™ MiSeq platform. We found 44 new NiFe group 1 membrane-bound hydrogenase sequence fragments, five of which segregated with the tolerant group on the phylogenetic tree of the enzyme’s small subunit, and four with the large subunit, indicating de novo O2-tolerant protein sequences that could help engineer more efficient hydrogenases.