
    Automated retrieval and extraction of training course information from unstructured web pages

    Web Information Extraction (WIE) is the discipline dealing with the discovery, processing and extraction of specific pieces of information from semi-structured or unstructured web pages. The World Wide Web comprises billions of web pages, and there is a strong need for systems that locate, extract and integrate the acquired knowledge into organisations' practices. Some commercial automated web extraction software packages exist; however, their success comes from heavily involving their users in finding the relevant web pages, preparing the system to recognise items of interest on these pages, and manually evaluating and storing the extracted results. This research has explored WIE, specifically with regard to automating the extraction and validation of online training information. The work also includes research and development in the area of automated Web Information Retrieval (WIR), more specifically in Web Searching (or Crawling) and Web Classification. Several technologies were considered, and Naïve Bayes Networks were chosen as the most suitable for the development of the classification system. The extraction part of the system used Genetic Programming (GP) to generate web extraction solutions. Specifically, GP was used to evolve Regular Expressions, which were then used to extract specific training course information from the web, such as course names, prices, dates and locations. The experimental results indicate that all three aspects of this research perform very well, with the Web Crawler outperforming existing crawling systems, the Web Classifier performing with an accuracy of over 95% and a precision of over 98%, and the Web Extractor achieving an accuracy of over 94% for the extraction of course titles and an accuracy of just under 67% for the extraction of other course attributes such as dates, prices and locations.
Furthermore, the overall work is of great significance to the sponsoring company, as it simplifies and improves the existing time-consuming, labour-intensive and error-prone manual techniques, as will be discussed in this thesis. The prototype developed in this research works in the background and requires very little, often no, human assistance.
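
To make the extraction step in the abstract above concrete, here is a minimal sketch of regular expressions that pull prices and dates out of a course description. The patterns, the function name `extract_course_fields` and the sample text are all assumptions written by hand for illustration; the thesis evolves its expressions with Genetic Programming, and its actual expressions are not given in the abstract.

```python
import re

# Hypothetical, hand-written patterns (not the GP-evolved ones from the thesis).
# Price: a pound sign, digits with optional thousands separators, optional pence.
PRICE_RE = re.compile(r"£\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})?")
# Date: day number, month name (short or long), four-digit year.
DATE_RE = re.compile(
    r"\b\d{1,2}\s+"
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*"
    r"\s+\d{4}\b"
)


def extract_course_fields(text):
    """Return every price and date string found in the given text."""
    return {
        "prices": PRICE_RE.findall(text),
        "dates": DATE_RE.findall(text),
    }


# Made-up sample line standing in for text scraped from a training course page.
sample = "Project Management Basics, London, 12 March 2024, £1,250.00 per delegate"
fields = extract_course_fields(sample)
print(fields)
```

In a GP setting, candidate expressions like these would presumably be scored against labelled examples and recombined, rather than written by hand.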

    ZNF804A genotype modulates neural activity during working memory for faces

    Copyright © 2013 S. Karger AG, Basel. Peer reviewed. Publisher PD

    Rice Galaxy: An open resource for plant science

    Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to rice researchers who are not computer-savvy. Findings: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer include tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, population diversity, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant with Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. Conclusions: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.

    Widespread sex differences in gene expression and splicing in the adult human brain

    There is strong evidence to show that men and women differ in terms of neurodevelopment, neurochemistry and susceptibility to neurodegenerative and neuropsychiatric disease. The molecular basis of these differences remains unclear. Progress in this field has been hampered by the lack of genome-wide information on sex differences in gene expression and in particular splicing in the human brain. Here we address this issue by using post-mortem adult human brain and spinal cord samples originating from 137 neuropathologically confirmed control individuals to study whole-genome gene expression and splicing in 12 CNS regions. We show that sex differences in gene expression and splicing are widespread in adult human brain, being detectable in all major brain regions and involving 2.5% of all expressed genes. We give examples of genes where sex-biased expression is both disease-relevant and likely to have functional consequences, and provide evidence suggesting that sex biases in expression may reflect sex-biased gene regulatory structures.

    A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

    In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is, however, critical both for basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap, through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; centralized, open access data repository; compatibility with existing resources; and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget. Comment: 41 pages

    Neuronal glucose transporter isoform 3 deficient mice demonstrate features of autism spectrum disorders.

    Neuronal glucose transporter (GLUT) isoform 3 deficiency in null heterozygous mice led to abnormal spatial learning and working memory but normal acquisition and retrieval during contextual conditioning, abnormal cognitive flexibility with intact gross motor ability, electroencephalographic seizures, and perturbed social behavior with reduced vocalization and stereotypies at low frequency. This phenotypic expression is unique as it combines the neurobehavioral with the epileptiform characteristics of autism spectrum disorders. This clinical presentation occurred despite metabolic adaptations consisting of an increase in microvascular/glial GLUT1, neuronal GLUT8 and monocarboxylate transporter isoform 2 concentrations, with minimal to no change in brain glucose uptake but an increase in lactate uptake. Neuron-specific glucose deficiency has a negative impact on neurodevelopment, interfering with functional competence. This is the first description of GLUT3 deficiency that forms a possible novel genetic mechanism for pervasive developmental disorders, such as the neuropsychiatric autism spectrum disorders, requiring further investigation in humans.

    Neuroevolution of Self-Interpretable Agents

    Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of the selective attention in perception that lets us remain focused on important parts of our world without distraction from irrelevant details. Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck. By constraining access to only a small fraction of the visual input, we show that their policies are directly interpretable in pixel space. We find neuroevolution ideal for training self-attention architectures for vision-based reinforcement learning (RL) tasks, allowing us to incorporate modules that can include discrete, non-differentiable operations which are useful for our agent. We argue that self-attention has properties similar to indirect encoding, in the sense that large implicit weight matrices are generated from a small number of key-query parameters, thus enabling our agent to solve challenging vision-based tasks with at least 1000x fewer parameters than existing methods. Because our agents attend only to task-critical visual hints, they are able to generalize to environments where task-irrelevant elements are modified while conventional methods fail. Videos of our results and source code are available at https://attentionagent.github.io/ Comment: To appear at the Genetic and Evolutionary Computation Conference (GECCO 2020) as a full paper.
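
The indirect-encoding argument in the abstract above can be illustrated with a small sketch: two small key and query projection matrices produce an attention matrix over all pairs of image patches. The sizes (50 patches, 12 features per patch, projection width 4), the function names and the scaled-softmax form are assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(patches, w_key, w_query):
    """Build an n x n attention matrix from n patches and two small projections."""
    keys = matmul(patches, w_key)          # n x d
    queries = matmul(patches, w_query)     # n x d
    d = len(w_key[0])
    queries_t = [list(col) for col in zip(*queries)]  # d x n
    scores = matmul(keys, queries_t)       # n x n
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


# Assumed sizes, for illustration only.
n_patches, d_in, d = 50, 12, 4
rng = random.Random(0)
patches = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(n_patches)]
w_key = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d_in)]
w_query = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d_in)]

attn = attention_matrix(patches, w_key, w_query)
learned_params = 2 * d_in * d              # entries in w_key and w_query
implied_entries = n_patches * n_patches    # entries in the attention matrix
print(learned_params, implied_entries)
```

With these assumed sizes, 96 learned parameters determine a 50 x 50 matrix of 2,500 entries, and the gap widens as the number of patches grows because the learned parameter count does not depend on it.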

    Panzea: a database and resource for molecular and functional diversity in the maize genome

    Serving as a community resource, Panzea is the bioinformatics arm of the Molecular and Functional Diversity in the Maize Genome project. Maize, a classical model for genetic studies, is an important crop species and also the most diverse crop species known. On average, two randomly chosen maize lines have one single-nucleotide polymorphism every ∼100 bp; this divergence is roughly equivalent to the differences between humans and chimpanzees. This exceptional genotypic diversity underlies the phenotypic diversity maize needs to be cultivated in a wide range of environments. The Molecular and Functional Diversity in the Maize Genome project aims to understand how selection has shaped molecular diversity in maize and then relate molecular diversity to functional phenotypic variation. The project will screen 4000 loci for the signature of selection and create a wide range of maize and maize–teosinte mapping populations. These populations will be genotyped and phenotyped, permitting high-power and high-resolution dissection of the traits and relating the molecular diversity to functional variation. Panzea provides access to the genotype, phenotype and polymorphism data produced by the project through user-friendly web-based database searches and data retrieval/visualization tools, as well as a wide variety of information and services related to maize diversity.

    The International Human Epigenome Consortium: A Blueprint for Scientific Collaboration and Discovery

    The International Human Epigenome Consortium (IHEC) coordinates the generation of a catalog of high-resolution reference epigenomes of major primary human cell types. The studies now presented (see the Cell Press IHEC web portal at http://www.cell.com/consortium/IHEC) highlight the coordinated achievements of IHEC teams to gather and interpret comprehensive epigenomic datasets to gain insights into the epigenetic control of cell states relevant for human health and disease.