
    A Peer-reviewed Newspaper About_ Excessive Research

    Research on machines, research with machines, and research as a machine. Publication resulting from research workshop at Exhibition Research Lab, Liverpool John Moores University, organised in collaboration with Liverpool John Moores University and Liverpool Biennial, and transmediale festival for art and digital culture, Berlin

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journa

    Scalable Architecture for Integrated Batch and Streaming Analysis of Big Data

    Thesis (Ph.D.) - Indiana University, Computer Sciences, 2015. As Big Data processing problems evolve, many modern applications demonstrate special characteristics. Data exists in the form of both large historical datasets and high-speed real-time streams, and many analysis pipelines require integrated parallel batch processing and stream processing. Despite the large size of the whole dataset, most analyses focus on specific subsets according to certain criteria. Correspondingly, integrated support for efficient queries and post-query analysis is required. To address the system-level requirements brought by such characteristics, this dissertation proposes a scalable architecture for integrated queries, batch analysis, and streaming analysis of Big Data in the cloud. We verify its effectiveness using a representative application domain - social media data analysis - and tackle related research challenges emerging from each module of the architecture by integrating and extending multiple state-of-the-art Big Data storage and processing systems. In the storage layer, we reveal that existing text indexing techniques do not work well for the unique queries of social data, which put constraints on both textual content and social context. To address this issue, we propose a flexible indexing framework over NoSQL databases to support fully customizable index structures, which can embed necessary social context information for efficient queries. The batch analysis module demonstrates that analysis workflows consist of multiple algorithms with different computation and communication patterns, which are suitable for different processing frameworks. To achieve efficient workflows, we build an integrated analysis stack based on YARN, and make novel use of customized indices in developing sophisticated analysis algorithms.
In the streaming analysis module, the high-dimensional data representation of social media streams poses special challenges to the problem of parallel stream clustering. Due to the sparsity of the high-dimensional data, the traditional synchronization method becomes expensive and severely impacts the scalability of the algorithm. Therefore, we design a novel strategy that broadcasts the incremental changes rather than the whole centroids of the clusters to achieve scalable parallel stream clustering algorithms. Performance tests using real applications show that our solutions for parallel data loading/indexing, queries, analysis tasks, and stream clustering all significantly outperform implementations using current state-of-the-art technologies.
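The idea of broadcasting incremental changes instead of whole centroids can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes each centroid is the mean of its member points, kept as a sparse running sum plus a count, and the function names (`local_delta`, `apply_deltas`, `centroid`) are hypothetical. Each worker sends only the dimensions it touched in the current batch, which for sparse data is far smaller than the full centroid.

```python
from collections import defaultdict


def local_delta(points):
    """Sum the sparse points one worker assigned to a cluster in this batch.

    Each point is a dict mapping dimension name -> value. The result holds
    only the dimensions that actually changed, plus the number of points.
    """
    delta = defaultdict(float)
    for point in points:
        for dim, value in point.items():
            delta[dim] += value
    return dict(delta), len(points)


def apply_deltas(centroid_sum, count, deltas):
    """Merge the sparse deltas from all workers into the running sum/count."""
    merged = dict(centroid_sum)
    for delta, n in deltas:
        for dim, value in delta.items():
            merged[dim] = merged.get(dim, 0.0) + value
        count += n
    return merged, count


def centroid(centroid_sum, count):
    """Recover the mean centroid from the running sum and count."""
    return {dim: total / count for dim, total in centroid_sum.items()}


# Shared state for one cluster: two points seen so far (toy data).
state_sum, state_count = {"cat": 2.0, "dog": 2.0}, 2

# Two workers each process a small batch and emit only their changes.
worker_a = local_delta([{"cat": 1.0}])
worker_b = local_delta([{"bird": 3.0}])

state_sum, state_count = apply_deltas(state_sum, state_count, [worker_a, worker_b])
print(centroid(state_sum, state_count))
```

In this toy run each worker transmits a single dimension, while the full centroid already has three; the gap widens as the vocabulary of a real social media stream grows.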

    Selection on Visual Opsin Genes in Diurnal Neotropical Frogs and Loss of the SWS2 Opsin in Poison Frogs

    Amphibians are ideal for studying visual system evolution because their biphasic (aquatic and terrestrial) life history and ecological diversity expose them to a broad range of visual conditions. Here, we evaluate signatures of selection on visual opsin genes across Neotropical anurans and focus on three diurnal clades that are well-known for the concurrence of conspicuous colors and chemical defense (i.e., aposematism): poison frogs (Dendrobatidae), Harlequin toads (Bufonidae: Atelopus), and pumpkin toadlets (Brachycephalidae: Brachycephalus). We found evidence of positive selection on 44 amino acid sites in LWS, SWS1, SWS2, and RH1 opsin genes, of which one in LWS and two in RH1 have been previously identified as spectral tuning sites in other vertebrates. Given that anurans have mostly nocturnal habits, the patterns of selection revealed new sites that might be important in spectral tuning for frogs, potentially for adaptation to diurnal habits and for color-based intraspecific communication. Furthermore, we provide evidence that SWS2, normally expressed in rod cells in frogs and some salamanders, has likely been lost in the ancestor of Dendrobatidae, suggesting that under low-light levels, dendrobatids have inferior wavelength discrimination compared to other frogs. This loss might follow the origin of diurnal activity in dendrobatids and could have implications for their behavior. Our analyses show that assessments of opsin diversification across taxa could expand our understanding of the role of sensory system evolution in ecological adaptation.

    Quootstrap: Scalable Unsupervised Extraction of Quotation-Speaker Pairs from Large News Corpora via Bootstrapping

    We propose Quootstrap, a method for extracting quotations, as well as the names of the speakers who uttered them, from large news corpora. Whereas prior work has addressed this problem primarily with supervised machine learning, our approach follows a fully unsupervised bootstrapping paradigm. It leverages the redundancy present in large news corpora, more precisely, the fact that the same quotation often appears across multiple news articles in slightly different contexts. Starting from a few seed patterns, such as ["Q", said S.], our method extracts a set of quotation-speaker pairs (Q, S), which are in turn used for discovering new patterns expressing the same quotations; the process is then repeated with the larger pattern set. Our algorithm is highly scalable, which we demonstrate by running it on the large ICWSM 2011 Spinn3r corpus. Validating our results against a crowdsourced ground truth, we obtain 90% precision at 40% recall using a single seed pattern, with significantly higher recall values for more frequently reported (and thus likely more interesting) quotations. Finally, we showcase the usefulness of our algorithm's output for computational social science by analyzing the sentiment expressed in our extracted quotations. Comment: Accepted at the 12th International Conference on Web and Social Media (ICWSM), 201
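The bootstrapping loop described in this abstract (patterns yield pairs, pairs yield new patterns, repeat) can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' implementation: patterns are whole-sentence templates with hypothetical `<Q>` and `<S>` placeholders, speakers are assumed to start with a capital letter, the three-sentence corpus is invented, and there is no filtering of noisy patterns, which any real system would need.

```python
import re


def compile_pattern(pattern):
    """Turn a template such as '"<Q>", said <S>.' into an anchored regex."""
    regex = ""
    for part in re.split(r"(<Q>|<S>)", pattern):
        if part == "<Q>":
            regex += '(?P<q>[^"]+)'           # quotation: anything but a quote mark
        elif part == "<S>":
            regex += r"(?P<s>[A-Z][\w ]*?)"   # speaker: capitalised, lazily matched
        else:
            regex += re.escape(part)          # literal context around the slots
    return re.compile("^" + regex + "$")


def extract(patterns, sentences):
    """Apply every pattern to every sentence and collect (quote, speaker) pairs."""
    pairs = set()
    for pattern in patterns:
        regex = compile_pattern(pattern)
        for sentence in sentences:
            match = regex.match(sentence)
            if match:
                pairs.add((match.group("q"), match.group("s")))
    return pairs


def induce_patterns(pairs, sentences):
    """Find sentences that repeat a known pair and generalise them to templates."""
    patterns = set()
    for quote, speaker in pairs:
        for sentence in sentences:
            if quote in sentence and speaker in sentence:
                patterns.add(sentence.replace(quote, "<Q>").replace(speaker, "<S>"))
    return patterns


def bootstrap(seed_patterns, sentences, rounds=2):
    """Alternate between extracting pairs and inducing new patterns."""
    patterns, pairs = set(seed_patterns), set()
    for _ in range(rounds):
        pairs |= extract(patterns, sentences)
        patterns |= induce_patterns(pairs, sentences)
    return pairs, patterns


corpus = [
    '"We will win", said Jane Doe.',
    'Jane Doe told reporters: "We will win"',
    'John Roe told reporters: "Taxes are too high"',
]
pairs, patterns = bootstrap(['"<Q>", said <S>.'], corpus)
print(sorted(pairs))
print(sorted(patterns))
```

The second sentence repeats a quotation already found by the seed, so it contributes a new template, and that template in turn recovers the third sentence's pair, which the seed alone would miss. This is the redundancy the abstract refers to.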

    The CbrB regulon: Promoter dissection reveals novel insights into the CbrAB expression network in Pseudomonas putida

    CbrAB is a high-ranking global regulatory system exclusive to the pseudomonads that responds to carbon-limiting conditions. It has become necessary to define the particular regulon of CbrB and to distinguish it from the downstream cascades that act through other regulatory components. We have performed in vivo binding analysis of CbrB in P. putida and determined that it directly controls the expression of at least 61 genes, 20% of which are involved in regulatory functions, including the previously identified CrcZ and CrcY small regulatory RNAs. The remainder are porins or transporters (20%), metabolic enzymes (16%), activities related to protein translation (5%) and ORFs of uncharacterised function (38%). Among the latter, we have selected the operon PP2810-13 for an exhaustive analysis of the CbrB binding sequences, together with those of crcZ and crcY. We describe the implication of three independent non-palindromic subsites with variable spacing in three different targets (CrcZ, CrcY and operon PP2810-13) in CbrAB activation. CbrB is a rather peculiar σN-dependent activator, since it is barely dependent on phosphorylation for transcriptional activation. With the depiction of the precise contacts of CbrB with the DNA, and the analysis of its multimerisation status and its dependence on other factors such as RpoN or IHF, we propose a model of transcriptional activation. Ministerio de Economía y Competitividad BIO2014-57545-