2 research outputs found

    Biological Sequence Kernels with Guaranteed Flexibility

    Applying machine learning to biological sequences - DNA, RNA and protein - has enormous potential to advance human health, environmental sustainability, and fundamental biological understanding. However, many existing machine learning methods are ineffective or unreliable in this problem domain. We study these challenges theoretically, through the lens of kernels. Methods based on kernels are ubiquitous: they are used to predict molecular phenotypes, design novel proteins, compare sequence distributions, and more. Many methods that do not use kernels explicitly still rely on them implicitly, including a wide variety of both deep learning and physics-based techniques. While kernels for other types of data are well-studied theoretically, the structure of biological sequence space (discrete, variable length sequences), as well as biological notions of sequence similarity, present unique mathematical challenges. We formally analyze how well kernels for biological sequences can approximate arbitrary functions on sequence space and how well they can distinguish different sequence distributions. In particular, we establish conditions under which biological sequence kernels are universal and characteristic and metrize the space of distributions. We show that a large number of existing kernel-based machine learning methods for biological sequences fail to meet our conditions and can as a consequence fail severely. We develop straightforward and computationally tractable ways of modifying existing kernels to satisfy our conditions, imbuing them with strong guarantees on accuracy and reliability. Our proof techniques build on and extend the theory of kernels with discrete masses. We illustrate our theoretical results in simulation and on real biological data sets.

    Biochemical and molecular identification with antimicrobial susceptibility of bacterial species isolated from organs and tissues of Alectoris chukar subspecies Kurdistanica

    The current study was conducted on 50 Alectoris chukar subspecies Kurdestanica that were collected from Sulaymaniyah Province, Kurdistan Region, Northern Iraq, from April to the end of September 2016. Samples of liver, gallbladder, spleen, kidneys, heart, lungs, gizzard, breast, and thigh muscle tissues were tested for bacterial isolates. Preliminary characterization of the isolated bacteria was carried out by morphological and biochemical methods. The VITEK 2® system was used to confirm the isolated species, and polymerase chain reaction (PCR) was used to detect the resistance gene in the bacterial isolates. The tested samples showed the presence of Staphylococcus sciuri and Escherichia coli. An antimicrobial susceptibility test was also performed to determine the bacterial susceptibility to various antibiotics. E. coli showed 100% susceptibility to penicillin, azithromycin, tetracycline, and doxycycline and 75% susceptibility to streptomycin. S. sciuri, on the other hand, exhibited 75% susceptibility to azithromycin, penicillin, and doxycycline, 50% susceptibility to streptomycin, and 25% susceptibility to tetracycline. Molecular identification showed that only the S. sciuri isolates carried the methicillin resistance gene mecA. To our knowledge, this is the first record of the methicillin resistance gene mecA in S. sciuri isolated from A. chukar subspecies Kurdestanica.