5,346 research outputs found
PRT: Parallel program for a full backtranslation of oligopeptides
DNA hybridization methods have become the most widely used tools in molecular biology to identify organisms and evaluate gene expression levels. PCR (Polymerase Chain Reaction)-based methods, fluorescent in situ hybridization (FISH) and the recent development of DNA microarrays as a high throughput technology need efficient primers or probes design. Evaluation of the metabolic capacities of complex microbial communities found in terrestrial or aquatic environments requires new probe design algorithms that reflect the genetic diversity. As only a small part of the microbial diversity is known, gene sequences deposited in international databases do not reflect the entire diversity. In this context we propose to use oligopeptide sequences for the design of complete set of DNA probes that are able to target the entire genetic diversity of genes encoding enzymes. Due to the degenerated genetic code backtranslation must be managed efficiently. To our knowledge no software has been developed to propose a full backtranslation. This complexity is tractable since we only need to focus on short oligopeptides for DNA probe design. We propose new algorithms that perform a high performance oligopeptide backtranslation into all potential nucleic sequences. We use different efficient techniques such as memory mapping to perform such a computing. We also propose a MPI parallel computing that reduces the whole execution time using data load balancing and network file stream distribution on a cluster architecture
AI and OR in management of operations: history and trends
The last decade has seen a considerable growth in the use of Artificial Intelligence (AI) for operations management with the aim of finding solutions to problems that are increasing in complexity and scale. This paper begins by setting the context for the survey through a historical perspective of OR and AI. An extensive survey of applications of AI techniques for operations management, covering a total of over 1200 papers published from 1995 to 2004 is then presented. The survey utilizes Elsevier's ScienceDirect database as a source. Hence, the survey may not cover all the relevant journals but includes a sufficiently wide range of publications to make it representative of the research in the field. The papers are categorized into four areas of operations management: (a) design, (b) scheduling, (c) process planning and control and (d) quality, maintenance and fault diagnosis. Each of the four areas is categorized in terms of the AI techniques used: genetic algorithms, case-based reasoning, knowledge-based systems, fuzzy logic and hybrid techniques. The trends over the last decade are identified, discussed with respect to expected trends and directions for future work suggested
Recommended from our members
A resource aware distributed LSI algorithm for scalable information retrieval
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Latent Semantic Indexing (LSI) is one of the popular techniques in the information retrieval fields. Different from the traditional information retrieval techniques, LSI is not based on the keyword matching simply. It uses statistics and algebraic computations. Based on Singular Value Decomposition (SVD), the higher dimensional matrix is converted to a lower dimensional approximate matrix, of which the noises could be filtered. And also the issues of synonymy and polysemy in the traditional techniques can be overcome based on the investigations of the terms related with the documents. However, it is notable that LSI suffers a scalability issue due to the computing complexity of SVD.
This thesis presents a resource aware distributed LSI algorithm MR-LSI which can solve the scalability issue using Hadoop framework based on the distributed computing model MapReduce. It also solves the overhead issue caused by the involved clustering algorithm. The evaluations indicate that MR-LSI can gain significant enhancement compared to the other strategies on processing large scale of documents. One remarkable advantage of Hadoop is that it supports heterogeneous computing environments so that the issue of unbalanced load among nodes is highlighted. Therefore, a load balancing algorithm based on genetic algorithm for balancing load in static environment is proposed. The results show that it can improve the performance of a cluster according to heterogeneity levels.
Considering dynamic Hadoop environments, a dynamic load balancing strategy with varying window size has been proposed. The algorithm works depending on data selecting decision and modeling Hadoop parameters and working mechanisms. Employing improved genetic algorithm for achieving optimized scheduler, the algorithm enhances the performance of a cluster with certain heterogeneity levels
Performance Modeling and Analysis of a Massively Parallel DIRECT— Part 1
Modeling and analysis techniques are used to investigate
the performance of a massively parallel version
of DIRECT, a global search algorithm widely used
in multidisciplinary design optimization applications.
Several highdimensional
benchmark functions and
real world problems are used to test the design effectiveness
under various problem structures. Theoretical
and experimental results are compared for two
parallel clusters with different system scale and network
connectivity. The present work aims at studying
the performance sensitivity to important parameters
for problem configurations, parallel schemes,
and system settings. The performance metrics
include the memory usage, load balancing, parallel
efficiency, and scalability. An analytical bounding
model is constructed to measure the load balancing
performance under different schemes. Additionally,
linear regression models are used to characterize
two major overhead sources—interprocessor communication
and processor idleness, and also applied
to the isoefficiency functions in scalability analysis.
For a variety of highdimensional
problems and large
scale systems, the massively parallel design has
achieved reasonable performance. The results of
the performance study provide guidance for efficient
problem and scheme configuration. More importantly,
the generalized design considerations and
analysis techniques are beneficial for transforming
many global search algorithms to become effective
large scale parallel optimization tools
Recommended from our members
High performance latent dirichlet allocation for text mining
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Latent Dirichlet Allocation (LDA), a total probability generative model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be directly used in classification because it is a non-supervised learning model. It needs to be embedded into appropriate classification algorithms. LDA is a generative model as it normally generates the latent topics in the categories where the target documents do not belong to, producing the deviation in computation and reducing the classification accuracy. The number of topics in LDA influences the learning process of model parameters greatly. Noise samples in the training data also affect the final text classification result. And, the quality of LDA based classifiers depends on the quality of the training samples to a great extent. Although parallel LDA algorithms are proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method which combines the LDA model and Support Vector Machine (SVM) classification algorithm for an improved accuracy in classification when reducing the dimension of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noise data. When the noise ratio is large in the training data set, the noise reduction scheme can always produce a high level of accuracy in classification. Finally, the thesis parallelizes LDA using the MapReduce model which is the de facto computing standard in supporting data intensive applications. A genetic algorithm based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space
An Artificial Immune System Strategy for Robust Chemical Spectra Classification via Distributed Heterogeneous Sensors
The timely detection and classification of chemical and biological agents in a wartime environment is a critical component of force protection in hostile areas. Moreover, the possibility of toxic agent use in heavily populated civilian areas has risen dramatically in recent months. This thesis effort proposes a strategy for identifying such agents vis distributed sensors in an Artificial Immune System (AIS) network. The system may be used to complement electronic nose ( E-nose ) research being conducted in part by the Air Force Research Laboratory Sensors Directorate. In addition, the proposed strategy may facilitate fulfillment of a recent mandate by the President of the United States to the Office of Homeland Defense for the provision of a system that protects civilian populations from chemical and biological agents. The proposed system is composed of networked sensors and nodes, communicating via wireless or wired connections. Measurements are continually taken via dispersed, redundant, and heterogeneous sensors strategically placed in high threat areas. These sensors continually measure and classify air or liquid samples, alerting personnel when toxic agents are detected. Detection is based upon the Biological Immune System (BIS) model of antigens and antibodies, and alerts are generated when a measured sample is determined to be a valid toxic agent (antigen). Agent signatures (antibodies) are continually distributed throughout the system to adapt to changes in the environment or to new antigens. Antibody features are determined via data mining techniques in order to improve system performance and classification capabilities. Genetic algorithms (GAs) are critical part of the process, namely in antibody generation and feature subset selection calculations. Demonstrated results validate the utility of the proposed distributed AIS model for robust chemical spectra recognition
A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications
Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presented a comprehensive investigation of PSO. On one hand, we provided advances with PSO, including its modifications (including quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topology (as fully connected, von Neumann, ring, star, random, etc.), hybridization (with genetic algorithm, simulated annealing, Tabu search, artificial immune system, ant colony algorithm, artificial bee colony, differential evolution, harmonic search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementation (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we offered a survey on applications of PSO to the following eight fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey would be beneficial for the researchers studying PSO algorithms
Hybrid ant colony system and genetic algorithm approach for scheduling of jobs in computational grid
Metaheuristic algorithms have been used to solve scheduling problems in grid computing.However,
stand-alone metaheuristic algorithms do not always show good performance in every problem instance. This study proposes a high level hybrid approach between ant colony system and genetic algorithm for job scheduling in grid computing.The proposed approach is based on a high level hybridization.The proposed hybrid approach is evaluated using the static benchmark problems known as ETC matrix.Experimental results show that the proposed hybridization between the two algorithms outperforms the stand-alone algorithms in terms of best and average
makespan values
- …