18 research outputs found

MalGrid: Visualization Of Binary Features In Large Malware Corpora

Author: Chandrasekaran Shivkumar
Chikkagoudar Satish
Manjunath B. S.
Mohammed Tajuddin Manhar
Nataraj Lakshmanan
Publication date: 04/11/2022

The number of malware is constantly on the rise. Though most new malware are modifications of existing ones, their sheer number is quite overwhelming. In this paper, we present a novel system to visualize and map millions of malware to points in a 2-dimensional (2D) spatial grid. This enables visualizing relationships within large malware datasets that can be used to develop triage solutions to screen different malware rapidly and provide situational awareness. Our approach links two visualizations within an interactive display. Our first view is a spatial point-based visualization of similarity among the samples based on a reduced dimensional projection of binary feature representations of malware. Our second spatial grid-based view provides a better insight into similarities and differences between selected malware samples in terms of the binary-based visual representations they share. We also provide a case study where the effect of packing on the malware data is correlated with the complexity of the packing algorithm.

Comment: Submitted version - MILCOM 2022 IEEE Military Communications Conference. The high-quality images in this paper can be found on Github (https://github.com/Mayachitra-Inc/MalGrid)

arXiv.org e-Print Archive

PLAST-ncRNA: Partition function Local Alignment Search Tool for non-coding RNA sequences

Author: Altschul
Copeland
D. R. Livesay
Griffiths-Jones
Roshan
S. Chikkagoudar
Smith
U. Roshan
Publication venue: Oxford University Press

Alignment-based programs are valuable tools for finding potential homologs in genome sequences. Previously, it has been shown that partition function posterior probabilities attuned to local alignment achieve a high accuracy in identifying distantly similar non-coding RNA sequences that are hidden in a large genome. Here, we present an online implementation of that alignment algorithm based on such probabilities. Our server takes as input a query RNA sequence and a large genome sequence, and outputs a list of hits that are above a mean posterior probability threshold. The output is presented in a format suited to local alignment. It can also be viewed within the PLAST alignment viewer applet that provides a list of all hits found and highlights regions of high posterior probability within each local alignment. The server is freely available at http://plastrna.njit.edu

Crossref

PubMed Central

GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores

Author: BJ Keating
H Zhou
HJ Cordell
J He
J Marchini
JE Stone
Kai Wang
L Dematte
MC Schatz
Mingyao Li
NA Davis
S Purcell
Satish Chikkagoudar
T Schupbach
VW Lee
Publication venue: BioMed Central
Publication date: 01/05/2011

Background: Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Findings: Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions: GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.

Crossref

Directory of Open Access Journals

PubMed Central

Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy

Author: AF Bompfunewerer
Alex Liu
B Langmead
B Ma
D Sankoff
E Rivas
E Torarinsson
J Buhler
JH Havgaard
Jikai Lei
JS Pedersen
KC Pang
Osama Aljawad
P Gardner
P Schattner
PG Higgs
R Klein
R Li
S Chikkagoudar
S Griffiths-Jones
S Schwartz
S Washietl
SF Altschul
T Coenye
Y Sun
Yanni Sun
ZJ Lu
Publication venue: BioMed Central
Publication date: 01/01/2012

Crossref

Springer - Publisher Connector

PubMed Central

New tests for departures from random behavior in spatial memory experiments

Author: A. C. Kamil
A. N. Pettit
D. Basu
D. F. Sherry
D. J. Olson
D. Raj
D. S. Olton
E. L. Lehmann
H. S. Konijn
J. D. Gibbons
M. Abramowitz
M. H. Degroot
M. L. Spetch
M. S. Chikkagoudar
N. L. Johnson
R. P. Balda
S. C. Hilton
S. D. Healy
S. J. Shettleworth
T. J. Ilersich
W. A. Roberts
W. Feller
W. J. Conover
Publication venue: Springer Science and Business Media LLC

Crossref

Hydrogen bond networks determine emergent mechanical and thermodynamic properties across a protein family

Author: A Cooper
AA Jensen
AG Murzin
BH Oh
BI Dahiyat
C Chennubhotla
CD Hsiao
Dang H Huynh
Dennis R Livesay
DI Kreimer
DJ Jacobs
Donald J Jacobs
DR Livesay
ER Henry
EW Myers
FA de Wolf
GI Makhatadze
GR Hedwig
H Nikaido
IN Shindyalov
JD Madura
JT Guo
KA Dill
LA Joachimiak
LS Busenlehner
M Cieplak
M Torrez
N Alexandrov
N Yao
O Keskin
P Das
PL Privalov
R Tam
RA Jensen
RM de Lorimier
RV Pappu
S Cheluvaraja
S Chikkagoudar
S Doniach
Sargis Dallakyan
SC Flores
U Roshan
V Munoz
VJ Hilser
WE Royer Jr
Y Xu
YJ Sun
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Gram-negative bacteria use periplasmic-binding proteins (bPBP) to transport nutrients through the periplasm. Despite immense diversity within the recognized substrates, all members of the family share a common fold that includes two domains that are separated by a conserved hinge. The hinge allows the protein to cycle between open (apo) and closed (ligated) conformations. Conformational changes within the proteins depend on a complex interplay of mechanical and thermodynamic response, which is manifested as an increase in thermal stability and decrease of flexibility upon ligand binding. Results: We use a distance constraint model (DCM) to quantify the give and take between thermodynamic stability and mechanical flexibility across the bPBP family. Quantitative stability/flexibility relationships (QSFR) are readily evaluated because the DCM links mechanical and thermodynamic properties. We have previously demonstrated that QSFR is moderately conserved across a mesophilic/thermophilic RNase H pair, whereas the observed variance indicated that different enthalpy-entropy mechanisms allow similar mechanical response at their respective melting temperatures. Our predictions of heat capacity and free energy show marked diversity across the bPBP family. While backbone flexibility metrics are mostly conserved, cooperativity correlation (long-range couplings) also demonstrates a considerable amount of variation. Upon ligand removal, heat capacity, melting point, and mechanical rigidity are, as expected, lowered. Nevertheless, significant differences are found in molecular cooperativity correlations that can be explained by the detailed nature of the hydrogen bond network. Conclusion: Non-trivial mechanical and thermodynamic variation across the family is explained by differences within the underlying H-bond networks. The mechanism is simple; variation within the H-bond networks results in altered mechanical linkage properties that directly affect intrinsic flexibility. Moreover, varying numbers of H-bonds and their strengths control the likelihood for energetic fluctuations as H-bonds break and reform, thus directly affecting thermodynamic properties. Consequently, these results demonstrate how unexpected large differences, especially within cooperativity correlation, emerge from subtle differences within the underlying H-bond network. This inference is consistent with well-known results that show allosteric response within a family generally varies significantly. Identifying the hydrogen bond network as a critical determining factor for these large variances may lead to new methods that can predict such effects.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts

Author: A Krogh
AN Tegge
C Notredame
CB Do
DF Feng
DG Higgins
F Jeanmougin
F Wilcoxon
G Pollastri
GH Gonnet
GJ Barton
GP Raghava
HY Zhou
J Cheng
J Heringa
J Pei
J Söding
JD Thompson
Jianlin Cheng
K Katoh
M Brudno
M Larkin
NK Kim
NS Boutonnet
O Poirot
PHA Sneath
R Chenna
R Durbin
RC Edgar
RK Bradley
RS Amarendran
S Chikkagoudar
SE Brenner
SH Sze
T Kawabata
TL Bailey
U Roshan
V Walle
Xin Deng
YC Liu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. Results: We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method is different from the multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences since the structural information of our method is fully predicted from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact map to multiple sequence alignment is novel. The rigorous benchmarking of our method on the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) clearly demonstrated that incorporating predicted protein structural information improves the multiple sequence alignment accuracy over the leading multiple protein sequence alignment tools that do not use this information, such as MSAProbs, ProbCons, Probalign, T-coffee, MAFFT and MUSCLE. The performance of the method is comparable to that of the state-of-the-art method PROMALS, which uses structural features and additional homologous sequences, with slightly lower scores. Conclusion: MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Locally most powerful rank tests for comparison of two failure rates based on multiple type-II censored data

Author: Biradar B. S.
Chikkagoudar M. S.
Publication venue: Informa UK Limited
Publication date: 01/01/2012

This article deals with the locally most powerful rank tests for testing the hypothesis that two failure rates are equal against the alternative that one failure rate is greater than the other, when the combined ordered sample is multiple Type-II censored. A modified version of the Dupač and Hájek (1969) theorem is used to establish their asymptotic normality under fixed alternative since the scores generating functions associated with these rank test statistics have a finite number of jump discontinuities. The modified version, which leads to a simpler centering constant, is proved by Dupač (1970) using the results of Hájek (1968). The Pitman AREs of these rank tests based on censored data relative to the corresponding tests based on complete data are obtained under some Lehmann-type alternative distributions such that their failure rates dominate the failure rates of the respective null distributions. The AREs are computed numerically for single (left or right) and double censored data, and the extent of loss due to these censoring schemes is discussed. The rank tests considered here include among them the Mann-Whitney-Wilcoxon (MWW) test, the Savage test, and the linear combination of these two tests. In the case of all the tests, except the MWW test, it is found that the loss of efficiency due to left censoring is considerably less than that due to right censoring. In the case of finite samples, Monte Carlo simulation results showing the empirical levels and empirical powers against some Lehmann alternatives are presented.

University of Mysore - Digital Repository of Research, Innovation and Scholarship (ePrints@UoM)

Rank Tests for Two-Sample Problems Based on Multiple Type-II Censored Data

Author: Biradar B. S.
Chikkagoudar M. S.
Publication venue: Informa UK Limited
Publication date: 01/01/2010

In this article, we study the effect of censoring on the asymptotic efficiency of the two-sample rank tests based on multiple Type-II censored data. Since the scores generating functions associated with these test statistics have a finite number of jump discontinuities, we use a slightly modified version of a theorem of Dupac and Hajek (1969) to obtain their asymptotic distributions under fixed alternatives. This modified version, which leads to a simpler centering constant, is proved by Dupac (1970) in the light of results of Hoeffding (1968), an earlier version of Hoeffding (1973). Hence, we obtain the Pitman ARE's of these rank tests relative to the corresponding tests based on the complete samples. The ARE's are computed for some well-known rank tests for two-sample location and scale problems, when the combined ordered samples from different underlying distributions are censored using triple and lower order Type-II censoring schemes. The effect of all these censoring schemes on the ARE's of the different tests is examined numerically. It is found that there is a gain in efficiency due to censoring in many of the cases considered here. This suggests that in such cases it is possible to improve the efficiency of rank tests by discarding suitable portions of the data.

University of Mysore - Digital Repository of Research, Innovation and Scholarship (ePrints@UoM)

The identification of outliers in exponential samples

Author: BALASOORIYA U.
CHIKKAGOUDAR M. S.
COCHRAN W. G.
DAVIES L.
GATHER U.
JEEVANAND E. S.
KALE B. K.
KIMBER A. C.
LIKES J.
ROUSSEEUW P. J.
SWEETING T.
Publication venue: Wiley

Crossref