38 research outputs found

    Recognition of Promoters in DNA Sequences Using Weightily Averaged One-dependence Estimators

    The completion of the human genome project in the last decade has generated strong demand for computational analysis techniques that can fully exploit the acquired human genome database. The project produced a mass of genetic data that necessitates automatic genome annotation, and there is growing interest in gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Recognizing promoters is therefore one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes, and aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine learning approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. As a proof of concept, we carried out experiments on a dataset extracted from the biological literature. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend it to recognize promoter sequences in various species of higher eukaryotes.
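
The entropy-based selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the criterion is information gain per nucleotide position, which is one common entropy-based measure; the abstract does not say which measure the authors used. The function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(sequences, labels, position):
    """Drop in label entropy once the nucleotide at `position` is known."""
    groups = {}
    for seq, label in zip(sequences, labels):
        groups.setdefault(seq[position], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def rank_positions(sequences, labels):
    """Positions sorted from most to least informative (ties by index)."""
    gains = [(information_gain(sequences, labels, i), i)
             for i in range(len(sequences[0]))]
    return [i for _, i in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Hypothetical toy data: 1 = promoter, 0 = non-promoter.
toy_sequences = ["TATA", "TCCA", "GATA", "GCCA"]
toy_labels = [1, 1, 0, 0]
ranking = rank_positions(toy_sequences, toy_labels)
```

In this toy set only the first position separates the two classes, so it is ranked first; a classifier would then be trained on the top-ranked positions only.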

    Documentary sources on the history of the Belarusian Orthodox Church in the 20th century in the archives of Belarus: problems of their publication and scholarly use

    The phylogenetic status of Sivachoerus is re-evaluated on the basis of new material recovered from central Myanmar. Sivachoerus was previously known from the Pliocene Siwalik sediments of the Indian Subcontinent. Compared to the Siwalik specimens, the Myanmar specimens are poorly known, and their geological age has not yet been confirmed. The new discovery of Sivachoerus in the Irrawaddy Formation suggests that Sivachoerus appeared in Myanmar, Southeast Asia, during the Pliocene. The dental morphology and chronology of Sivachoerus support the 'African origin' hypothesis more strongly than the 'Asian origin' hypothesis for this genus. Sivachoerus probably evolved from the African Nyanzachoerus rather than the Asian Conohyus during the Late Miocene, and migrated to Asia during the latest Miocene.

    Computer-aided diagnosis of pulmonary nodules from chest X-rays using rotation forest

    A chest X-ray is a painless, non-invasive, and cost-effective medical examination in use today. A pulmonary nodule is a small round lesion or mass in the lungs that can indicate an infection or a neoplasm. Chest X-rays can be used to diagnose pulmonary nodules. This paper proposes a three-layered framework for the automatic diagnosis of pulmonary nodules. The first layer pre-processes the X-ray images. The second layer extracts texture features from the gray-level co-occurrence matrix. Finally, the third layer classifies whether the X-ray contains any signs of nodules using an ensemble technique called rotation forest. Experiments were carried out on a chest X-ray dataset from the Japanese Society of Radiological Technology. Satisfactory preliminary experimental results demonstrate the efficacy of our computer-aided pulmonary nodule diagnosis system.
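
The second layer described above, texture features from a gray-level co-occurrence matrix (GLCM), can be sketched in a few lines. This is an illustration only: the abstract does not state which offsets, gray-level quantization, or texture statistics the authors used, so the single horizontal offset, the three statistics, and all names below are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Contrast, angular second moment, and homogeneity of a GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Hypothetical 4x4 patch quantized to 4 gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(patch, levels=4))
```

The resulting feature dictionary, computed per image or per region, is the kind of vector that would be passed on to the classification layer.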

    User Profiling for Search Engines’ Help Systems

    The Help System information provided by search engines can facilitate or hinder users’ information-seeking process. This paper reports a study of how users would like search engines’ Help Systems to be organized and presented. Six aspects of Help Systems (navigation, design elements, technical help, conceptual help, terminological aspects, and strategic aspects) were used as the framework for developing a questionnaire for further study in stereotyping search engine users. Overall, users do not expect animations, videos, or speech as part of a search engine’s Help System; technical help is desirable; and navigation to the Help page and to relevant content is important.

    Confirmation of Skywalker Hoolock Gibbon (Hoolock tianxing) in Myanmar extends known geographic range of an endangered primate

    Characterizing genetically distinct populations of primates is important for protecting biodiversity and effectively allocating conservation resources. Skywalker gibbons (Hoolock tianxing) were first described in 2017, with the only confirmed population consisting of 150 individuals in Mt. Gaoligong, Yunnan Province, China. Based on river geography, the distribution of the skywalker gibbon has been hypothesized to extend into Myanmar between the N’Mai Kha and Ayeyarwaddy Rivers to the west, and the Salween River (named the Thanlwin River in Myanmar and Nujiang River in China) to the east. We conducted acoustic point-count sampling surveys, collected noninvasive samples for molecular mitochondrial cytochrome b gene identification, and took photographs for morphological identification at six sites in Kachin State and three sites in Shan State to determine the presence of skywalker gibbons in predicted suitable forest areas in Myanmar. We also conducted 50 semistructured interviews with members of communities surrounding gibbon range forests to understand potential threats. In Kachin State, we audio-recorded 23 gibbon groups with group densities ranging between 0.57 and 3.6 groups/km². In Shan State, we audio-recorded 21 gibbon groups with group densities ranging between 0.134 and 1.0 groups/km². Based on genetic data obtained from skin and saliva samples, the gibbons were identified as skywalker gibbons (99.54–100% identity). Although these findings increase the species’ known population size and confirmed distribution, skywalker gibbons in Myanmar are threatened by local habitat loss, degradation, and fragmentation. Most of the skywalker gibbon population in Myanmar exists outside protected areas. Therefore, the IUCN Red List status of the skywalker gibbon should remain as Endangered.

    Classification of eukaryotic splice-junction genetic sequences using averaged one-dependence estimators with subsumption resolution

    DNA is the building block of life and contains the encoded genetic instructions for building living organisms. Because proteins are constructed in accordance with the genetic instructions encoded in DNA, errors in RNA synthesis and translation into proteins can cause genetic disorders. Understanding and recognizing genetic sequences is therefore one step towards the treatment of these disorders. Since the discovery of DNA, there has been growing interest in the problem of genetic sequence recognition, motivated by its enormous potential to cure a wide range of genetic disorders. The completion of the human genome project in the last decade has generated strong demand for computational analysis techniques that can fully exploit the acquired human genome database. This paper describes a state-of-the-art machine learning approach called averaged one-dependence estimators with subsumption resolution to tackle the problem of recognizing an important class of genetic sequences known as eukaryotic splice junctions. To lower the computational complexity and to increase the generalization capability of the system, we employ a genetic algorithm to select relevant nucleotides that are directly responsible for splice-junction recognition. We carried out experiments on a dataset extracted from the biological literature. The proposed system achieved an accuracy of 96.68% in classifying splice-junction genetic sequences. The experimental results demonstrate the efficacy of our framework and encourage us to apply it to other types of genetic sequences.
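
For orientation, the classifier named in this abstract can be summarized by the usual averaged one-dependence estimator (AODE) formula from the machine learning literature. The abstract itself does not restate it, so the notation below is an assumption: y is the class, x_1, ..., x_n are the attribute values (here, nucleotides at the selected positions), F(x_i) is the training frequency of value x_i, and m is a minimum-frequency threshold.

```latex
\hat{P}(y \mid x_1, \dots, x_n) \;\propto\;
  \sum_{i \,:\, F(x_i) \ge m} \hat{P}(y, x_i)
  \prod_{j=1}^{n} \hat{P}(x_j \mid y, x_i)
```

Each term in the sum is a one-dependence model in which every attribute depends on the class and on one "parent" attribute x_i; AODE averages over all eligible parents. Subsumption resolution, as generally described, additionally drops an attribute value x_j at classification time when another observed value x_i implies it in the training data, that is, when the estimated P(x_j | x_i) equals 1.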

    Cancer recurrence prediction using machine learning

    Cancer is one of the deadliest diseases in the world and is responsible for around 13% of all deaths worldwide. The cancer incidence rate is growing at an alarming pace. Although cancer is preventable and curable in its early stages, the vast majority of patients are diagnosed very late. Furthermore, cancer commonly comes back after years of treatment. It is therefore of paramount importance to predict cancer recurrence so that specific treatments can be sought. Nonetheless, conventional methods of predicting cancer recurrence rely solely on histopathology, and the results are not very reliable. Microarray gene expression technology is a promising technology that could predict cancer recurrence by analyzing the gene expression of sample cells. It allows researchers to examine the expression of thousands of genes simultaneously. This paper describes a state-of-the-art machine learning approach called averaged one-dependence estimators with subsumption resolution to tackle the problem of predicting, from DNA microarray gene expression data, whether a particular cancer will recur within a specific timeframe, which is usually 5 years. To lower the computational complexity, we employ an entropy-based gene selection approach to select relevant prognostic genes that are directly responsible for recurrence prediction. The proposed system achieved an average accuracy of 98.9% in predicting cancer recurrence over 3 datasets. The experimental results demonstrate the efficacy of our framework.

    Bacteria identification from microscopic morphology: a survey

    Accurate bacteria identification requires extensive knowledge of and experience in microbiology. Automating bacteria identification is necessary because skilled microbiologists and clinicians might be in short supply at a time of great need. We propose an automatic bacteria identification framework that can classify three well-known classes of bacteria, namely Cocci, Bacilli, and Vibrio, from microscopic morphology using the Naïve Bayes classifier. The proposed framework comprises two steps. In the first step, the system is trained using a set of microscopic images containing Cocci, Bacilli, and Vibrio. The input images are normalized to emphasize the diameter and shape features, and edge-based descriptors are then extracted from them. In the second step, we use the Naïve Bayes classifier to perform probabilistic inference based on the input descriptors. The training set consisted of 64 images for each class of bacteria, and the test set consisted of 222 images comprising the three classes of bacteria and other random images such as humans and airplanes. No images are shared between the training set and the test set. The system was found to discriminate accurately between the three classes of bacteria. Moreover, it was also found to reject images that did not belong to any of the three classes. The preliminary results demonstrate how a simple machine learning classifier with a set of simple image-based features can achieve high classification accuracy. They also demonstrate the efficacy and efficiency of our two-step automatic bacteria identification approach and motivate us to extend this framework to identify a variety of other types of bacteria.
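
The two-step scheme in this abstract (train on labelled descriptors, then perform probabilistic inference) can be sketched with a small Gaussian Naïve Bayes classifier. This is an illustration under assumptions: the abstract does not say which likelihood model or descriptors were used, so the Gaussian likelihood, the two made-up shape descriptors, and every name below are hypothetical, and the rejection of out-of-class images is not modelled.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(features, labels, var_floor=1e-6):
    """Step 1: estimate a log prior and per-feature mean/variance per class."""
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive (assumed smoothing).
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[y] = (math.log(n / len(labels)), means, variances)
    return model


def predict(model, x):
    """Step 2: return the class with the highest log posterior for x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda y: log_posterior(model[y]))


# Hypothetical descriptors per image: (elongation, curvature).
train_x = [(1.0, 0.1), (1.1, 0.0), (3.0, 0.1), (3.2, 0.0), (3.0, 0.9), (2.8, 1.0)]
train_y = ["Cocci", "Cocci", "Bacilli", "Bacilli", "Vibrio", "Vibrio"]
nb_model = fit_gaussian_nb(train_x, train_y)
predicted = predict(nb_model, (2.9, 0.95))
```

Working in log space avoids numerical underflow when many descriptors are multiplied together, which is the usual reason for this design choice.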

    Cancer recognition from DNA microarray gene expression data using averaged one-dependence estimators

    Cancer is a leading cause of death and is responsible for around 13% of all deaths worldwide. The cancer incidence rate is growing at an alarming pace. Although cancer is preventable and curable in its early stages, the vast majority of patients are diagnosed very late. It is therefore of paramount importance to prevent cancer and detect it early. Nonetheless, conventional methods of detecting and diagnosing cancer rely solely on skilled physicians, with the help of medical imaging, to detect certain symptoms that usually appear in the late stages of cancer. Microarray gene expression technology is a promising technology that can detect cancerous cells in the early stages of cancer by analyzing the gene expression of tissue samples. It allows researchers to examine the expression of thousands of genes simultaneously. This paper describes a state-of-the-art machine learning approach called averaged one-dependence estimators with subsumption resolution to tackle the problem of recognizing cancer from DNA microarray gene expression data. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based gene selection approach to select relevant genes that are directly responsible for cancer discrimination. The proposed system achieved an average accuracy of 98.94% in recognizing and classifying cancer over 11 benchmark cancer datasets. The experimental results demonstrate the efficacy of our framework.