15 research outputs found

    Not Available

    No full text
    This paper is published as a short communication; hence no abstract is available.

    Not Available

    No full text
    Five classical clustering methods, four hierarchical (single linkage, average-between linkage, average-within linkage, Ward's) and one non-hierarchical (k-means), using five different distance measures (squared Euclidean, city block, Chebychev's, Pearson correlation and Minkowski), have been compared on the basis of simulated multivariate data on paddy crop genotypes. The performance of the different clustering methods was compared based on the average percentage probability of misclassification and its standard error. The performance of the hierarchical clustering methods varied with the distance measure used, and squared Euclidean performed best among the five distances, followed by city block distance, in the majority of cases. Among the five methods, Ward's method performed best, with the least average percentage probability of misclassification, followed by the non-hierarchical k-means method, irrespective of the sample size. Among the different distance measures used under hierarchical clustering methods, the squared Euclidean distance showed the least average percentage probability of misclassification, followed by city block distance.
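    The five distance measures named in this abstract can be sketched in plain Python as below. This is an illustrative sketch, not the authors' code: the function names, the default Minkowski order p=3, the use of 1 - r to turn a Pearson correlation into a distance, and the two sample trait vectors are all assumptions made here.

```python
import math

def squared_euclidean(x, y):
    # Sum of squared coordinate differences (no square root taken).
    return sum((a - b) ** 2 for a, b in zip(x, y))

def city_block(x, y):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    # Largest absolute coordinate difference.
    return max(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, p=3):
    # General L_p distance; p=3 is an arbitrary default chosen for this sketch.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def pearson_distance(x, y):
    # Assumption: correlation r is converted to a dissimilarity as 1 - r.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

# Hypothetical trait vectors for two genotypes.
g1 = [1.0, 2.0, 3.0]
g2 = [2.0, 4.0, 7.0]

for measure in (squared_euclidean, city_block, chebyshev, minkowski, pearson_distance):
    print(measure.__name__, measure(g1, g2))
```

    Any of these functions could feed a pairwise distance matrix into a hierarchical clustering routine; the correlation-based one fails for a constant vector, since its standard deviation is zero.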

    Not Available

    No full text
    Most crop datasets contain missing values, which can cause severe problems in the analysis and limit the utility of the resulting inference. Classification techniques for grouping crop genotypes are used when the data are complete. However, the presence of missing values limits the utility of these techniques and creates bias in the resulting inferences. In the majority of cases, missing values are handled by deleting the genotypes or traits that contain them, thereby losing information on these genotypes. An interesting approach to this problem is to impute the missing values. In this paper, we provide some solutions for handling missing data in crop breeding experiments for the classification of crop genotypes. The performance of the imputation techniques is assessed using the hit ratio criterion, computed through four different classifiers in an extensive simulation procedure. This paper also attempts to describe the missing data mechanisms in agricultural experiments and the various imputation techniques for missing data analysis in classification problems. For lower proportions of missing data, all four imputation techniques provided satisfactory results for the classification of crop genotypes. For a moderate level of missingness, the regression and multiple imputation techniques provided the same level of precision. When there is a high proportion of missing data, the multiple imputation technique outperformed all the other imputation techniques. Among the classifiers, the k-th nearest neighbor is the best classification technique in missing data situations.
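    The evaluation idea in this abstract (impute, classify, then score by hit ratio) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: column-mean imputation is used only because it is the simplest technique to show (the abstract names regression and multiple imputation but does not list all four), the hit ratio is computed here by leave-one-out, and the data, labels, function names and k=3 are hypothetical.

```python
import math
from collections import Counter

def mean_impute(rows):
    # Replace each None with the mean of the observed values in its column.
    # Assumes every column has at least one observed value.
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_predict(train_x, train_y, query, k=3):
    # Majority vote among the k training points closest to the query.
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def hit_ratio(x, y, k=3):
    # Leave-one-out share of genotypes assigned to their true group.
    hits = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_x, train_y, x[i], k) == y[i]:
            hits += 1
    return hits / len(x)

# Hypothetical genotypes: two traits each, None marks a missing value.
traits = [
    [5.0, 1.0], [5.2, None], [4.8, 1.2], [None, 0.9],
    [9.0, 3.0], [9.3, 3.1], [None, 2.9], [8.8, None],
]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

completed = mean_impute(traits)
print("hit ratio after mean imputation:", hit_ratio(completed, groups))
```

    Swapping `mean_impute` for another imputation routine while keeping `hit_ratio` fixed gives the kind of side-by-side comparison the abstract describes.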

    Not Available

    No full text
    Most crop datasets contain missing values, which can cause severe problems in the analysis and limit the utility of the resulting inference. Classification techniques for grouping crop genotypes are used when the data are complete. However, the presence of missing values limits the utility of these techniques and creates bias in the resulting inferences. In the majority of cases, missing values are handled by deleting the genotypes or traits that contain them, thereby losing information on these genotypes. An interesting approach to this problem is to impute the missing values. In this paper, we provide some solutions for handling missing data in crop breeding experiments for the classification of crop genotypes. The performance of the imputation techniques is assessed using the hit ratio criterion, computed through four different classifiers in an extensive simulation procedure. This paper also attempts to describe the missing data mechanisms in agricultural experiments and the various imputation techniques for missing data analysis in classification problems. For lower proportions of missing data, all four imputation techniques provided satisfactory results for the classification of crop genotypes. For a moderate level of missingness, the regression and multiple imputation techniques provided the same level of precision. When there is a high proportion of missing data, the multiple imputation technique outperformed all the other imputation techniques. Among the classifiers, the k-th nearest neighbor is the best classification technique in missing data situations.

    Not Available

    No full text
    Heritability is one of the most important genetic parameters and is widely used in genetic improvement studies in plant and animal breeding. Several methodologies are available in the literature for estimating heritability in different experimental situations. Unfortunately, none of these always provides a valid estimate of heritability, and at times the estimate is so inadmissible that no conclusions can be drawn about the inheritance of the trait under consideration. In particular, there is no unique methodology suitable for estimating heritability in unbalanced situations. Keeping this in view, a need was felt to study at length, with the help of a computer, the sensitivity and robustness of this very widely used genetic parameter. Sensitivity here refers to how strongly the estimate of heritability depends on aberrant observations or outliers. The paper contains some of the results obtained by Bhatia et al. (2003).

    Building and Querying Microbial Ontology

    Get PDF
    Microbial taxonomy is based on the characteristics of microorganisms that can be objectively observed and measured. There are many schemes of microbial classification, but the latest, the three-domain system, is the most accepted. Ontologies are a new form of knowledge representation that acts in synergy with agents and the Semantic Web architecture. Ontologies define domain concepts and the relationships between them, and thus provide a domain language that is meaningful to both humans and machines. The relationships in an ontology are explicitly named and developed with rules and constraints so that they reflect the context of the domain for which the knowledge is modelled. Ontologies can be built using various GUI-based software tools known as ontology editors. Among these editors, Protégé is widely supported by a large research community. For effective use of an ontology, Protégé provides a query interface known as the SPARQL query panel. SPARQL is a language, syntactically similar to SQL, for querying RDF graphs. A Microbial Taxonomy Ontology is developed under the three-domain system for the domain Bacteria, which will be helpful for the study of agriculturally important microbes (bacteria). This ontology is built in the Protégé OWL editor from the Domain level down to the Genus level. Using this ontology, a query interface can be developed that will support detailed study of microbial taxonomy and classification of microbes, as well as knowledge exchange between software agents and systems.
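    The idea of querying a Domain-to-Genus hierarchy can be illustrated with a toy triple store in plain Python. This sketch does not use Protégé, OWL or SPARQL itself; it only mimics a single triple pattern with wildcards, which is the basic building block of a SPARQL query. The predicate names `subClassOf` and `hasRank`, the helper functions, and the choice of the Rhizobium lineage as sample data are assumptions for illustration and are not taken from the ontology described above.

```python
# Hypothetical (subject, predicate, object) triples for one bacterial lineage.
TRIPLES = [
    ("Proteobacteria", "subClassOf", "Bacteria"),
    ("Alphaproteobacteria", "subClassOf", "Proteobacteria"),
    ("Rhizobiales", "subClassOf", "Alphaproteobacteria"),
    ("Rhizobiaceae", "subClassOf", "Rhizobiales"),
    ("Rhizobium", "subClassOf", "Rhizobiaceae"),
    ("Bacteria", "hasRank", "Domain"),
    ("Proteobacteria", "hasRank", "Phylum"),
    ("Alphaproteobacteria", "hasRank", "Class"),
    ("Rhizobiales", "hasRank", "Order"),
    ("Rhizobiaceae", "hasRank", "Family"),
    ("Rhizobium", "hasRank", "Genus"),
]

def match(pattern, triples=TRIPLES):
    # Return every triple that agrees with the pattern; None is a wildcard,
    # playing the role of a SPARQL variable such as ?taxon.
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

def lineage(taxon):
    # Follow subClassOf links upward until a taxon with no parent is reached.
    path = [taxon]
    while True:
        parents = match((path[-1], "subClassOf", None))
        if not parents:
            return path
        path.append(parents[0][2])

# Roughly analogous to: SELECT ?taxon WHERE { ?taxon :hasRank "Genus" }
print([s for s, _, _ in match((None, "hasRank", "Genus"))])
print(" > ".join(reversed(lineage("Rhizobium"))))
```

    A real SPARQL engine additionally joins several such patterns on shared variables and can walk the hierarchy with property paths, which this sketch replaces with an explicit loop.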

    Building and Querying Microbial Ontology, Procedia Technology

    No full text
    Microbial taxonomy is based on the characteristics of microorganisms that can be objectively observed and measured. There are many schemes of microbial classification, but the latest, the three-domain system, is the most accepted. Ontologies are a new form of knowledge representation that acts in synergy with agents and the Semantic Web architecture. Ontologies define domain concepts and the relationships between them, and thus provide a domain language that is meaningful to both humans and machines. The relationships in an ontology are explicitly named and developed with rules and constraints so that they reflect the context of the domain for which the knowledge is modelled. Ontologies can be built using various GUI-based software tools known as ontology editors. Among these editors, Protégé is widely supported by a large research community. For effective use of an ontology, Protégé provides a query interface known as the SPARQL query panel. SPARQL is a language, syntactically similar to SQL, for querying RDF graphs. A Microbial Taxonomy Ontology is developed under the three-domain system for the domain Bacteria, which will be helpful for the study of agriculturally important microbes (bacteria). This ontology is built in the Protégé OWL editor from the Domain level down to the Genus level. Using this ontology, a query interface can be developed that will support detailed study of microbial taxonomy and classification of microbes, as well as knowledge exchange between software agents and systems.

    Not Available

    No full text
    The demand for food proteins, including plant and animal proteins, is increasing at an exponential rate. The demand for animal products will nearly double by 2030. Thus, to improve livestock production and meet the demand for animal protein, it is essential to apply interventions based on genomics, statistics and informatics. Such interventions are often used in animal improvement programs to develop offspring with desirable traits. More recently, with the emergence of high-throughput sequencing technologies, the genomes of farm animals, fishes and model organisms have been sequenced and made available in the public domain. With the advent of new silicon technologies, it has also become possible to manage the data generated by genome sequencing projects. The challenge now lies in analyzing and interpreting sequence data in a biologically meaningful manner, for which many algorithm-based analytical techniques and high-performance computing methods have been developed. Here, a brief review is presented of the various statistical and computational approaches used in genomic data analysis. Applications of these approaches to health management and sustainable animal and fish production are also discussed, from the viewpoint of vaccine and drug design, disease risk management, epigenomics and whole-genome SNP/CNV associations with traits. In addition, this paper is intended to help molecular biologists and other application scientists analyze the overwhelming amount of genomic data using the different methods outlined here.