Methodology for identifying alternative solutions in a population based data generation approach applied to synthetic biology
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Design is an essential component of sustainable development. Computational modelling has
become a useful technique that facilitates the design of complex systems. Variables that characterise
a complex system are encoded into a computational model using mathematical concepts,
and through simulation each of these variables, alone or in combination, is modified to observe
the changes in the outcome. This allows the researchers to make predictions on the behaviour
of the real system that is being studied in response to the changes. The ultimate goal of any
design process is to arrive at the best design: because resources are limited, this means minimizing cost
and resource consumption while maximizing performance, profits and efficiency. To optimize
means to find the best solution, the best compromise among several conflicting demands subject
to predefined requirements. Therefore, computational optimization, modelling and simulation
form an integral part of modern design practice.
This thesis defines a data-analytics-driven methodology which enables the identification of
alternative solutions of computational design by analysing the generational history of the population-based
heuristic search used to generate the templates. While optimisation is focused on
obtaining the optimal solution, this methodology focuses on alternative solutions: those that are
suboptimal by fitness, or solutions with similar fitness but different structures. When the optimal
design solution is less robust, alternative solutions can offer a sufficiently good accuracy and an
achievable resource requirement. The main advantage of the methodology is that it exploits the
exploration process of the solution space during a single run, by focusing also on suboptimal
solutions, which usually get neglected in the search for an optimal one. The history of the
heuristic search is analysed for the emergence of alternative solutions and the evolution of a solution.
By examining how an initial solution converts to an optimal solution, core design patterns are
identified, and these are used to improve the design process. Further, this method limits the
number of runs of the heuristic search, as more of the solution space is covered. The methodology is
generic because it can be applied to any instance where a population-based heuristic search is used
to generate optimal designs. The applicability of the methodology is demonstrated using
three case studies from mathematics (building of a mathematical function for a set target) and
biology (obtaining alternative designs for genomic metabolic models [GEM] and DNA walker
circuits). In each case a different heuristic search method was used: Gene expression programming
(mathematical expressions), genetic algorithms (GEM models) and simulated annealing
(DNA walker circuits). Descriptive analytics, visual analytics and clustering were the main techniques used to build
the data-analytics-driven approach to identifying alternative solutions. This data-analytics-driven
methodology is useful in optimising the computational design of complex systems.
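The idea of mining the history of a single population-based run for alternative solutions can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the toy bit-string objective (which has two structurally different optima), the simple genetic algorithm, the function names, and the use of a Hamming-distance filter in place of the clustering and visual analytics the abstract mentions.

```python
import random

def fitness(bits):
    # Toy objective (assumption): number of equal adjacent pairs.
    # All-zeros and all-ones are both optimal but structurally different.
    return sum(a == b for a, b in zip(bits, bits[1:]))

def hamming(a, b):
    # Structural difference between two equal-length bit strings.
    return sum(x != y for x, y in zip(a, b))

def run_ga(n_bits=12, pop_size=20, generations=15, seed=0):
    """Run a small genetic algorithm and record every evaluated individual."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    history = []  # entries are (generation, individual, fitness)
    for gen in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]
        history.extend((gen, ind, fit) for fit, ind in scored)
        scored.sort(reverse=True)
        parents = [ind for _, ind in scored[: pop_size // 2]]  # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)      # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.2:              # occasional single-bit mutation
                child[rng.randrange(n_bits)] ^= 1
            children.append(tuple(child))
        pop = children
    return history

def alternatives(history, tolerance=2, min_distance=3):
    """Pick near-optimal individuals from the history that differ structurally."""
    best_fit = max(fit for _, _, fit in history)
    best = next(ind for _, ind, fit in history if fit == best_fit)
    candidates = sorted(
        {(fit, ind) for _, ind, fit in history if fit >= best_fit - tolerance},
        reverse=True,
    )
    picked = [best]
    for _, ind in candidates:
        # Keep a candidate only if it is far from everything already kept.
        if all(hamming(ind, p) >= min_distance for p in picked):
            picked.append(ind)
    return best, picked[1:]

if __name__ == "__main__":
    best, alts = alternatives(run_ga())
    print("best:", best, fitness(best))
    for alt in alts:
        print("alternative:", alt, fitness(alt))
```

The `tolerance` and `min_distance` parameters are hypothetical knobs: the first bounds how far below the best fitness an alternative may fall, the second how different its structure must be from solutions already kept.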
Whole genome sequence and analysis of the Marwari horse breed and its genetic origin
Background: The horse (Equus ferus caballus) is one of the earliest domesticated species and has played an important role in the development of human societies over the past 5,000 years. In this study, we characterized the genome of the Marwari horse, a rare breed with unique phenotypic characteristics, including inwardly turned ear tips. It is thought to have originated from the crossbreeding of local Indian ponies with Arabian horses beginning in the 12th century.
Results: We generated 101 Gb (~30× coverage) of whole genome sequences from a Marwari horse using the Illumina HiSeq2000 sequencer. The sequences were mapped to the horse reference genome at a mapping rate of ~98% and with ~95% of the genome having at least 10× coverage. A total of 5.9 million single nucleotide variations, 0.6 million small insertions or deletions, and 2,569 copy number variation blocks were identified. We confirmed a strong Arabian and Mongolian component in the Marwari genome. Novel variants from the Marwari sequences were annotated, and were found to be enriched in olfactory functions. Additionally, we suggest a potential functional genetic variant in the TSHZ1 gene (p.Ala344>Val) associated with the inward-turning ear tip shape of the Marwari horses.
Conclusions: Here, we present an analysis of the Marwari horse genome. This is the first genomic data for an Asian breed, and is an invaluable resource for future studies of genetic variation associated with phenotypes and diseases in horses.
Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles
Previous research in multi-document news summarization has typically
concentrated on collating information that all sources agree upon. However, to
our knowledge, the summarization of diverse information dispersed across
multiple articles about an event has not been previously investigated. The
latter imposes a different set of challenges for a summarization model. In this
paper, we propose a new task of summarizing diverse information encountered in
multiple news articles encompassing the same event. To facilitate this task, we
outlined a data collection schema for identifying diverse information and
curated a dataset named DiverseSumm. The dataset includes 245 news stories,
with each story comprising 10 news articles and paired with a human-validated
reference. Moreover, we conducted a comprehensive analysis to pinpoint the
position and verbosity biases when utilizing Large Language Model (LLM)-based
metrics for evaluating the coverage and faithfulness of the summaries, as well
as their correlation with human assessments. We applied our findings to study
how LLMs summarize multiple news articles by analyzing which type of diverse
information LLMs are capable of identifying. Our analyses suggest that despite
the extraordinary capabilities of LLMs in single-document summarization, the
proposed task remains a complex challenge for them mainly due to their limited
coverage, with GPT-4 only able to cover less than 40% of the diverse
information on average.
From Classical to Modern Computational Approaches to Identify Key Genetic Regulatory Components in Plant Biology
The selection of plant genotypes with improved productivity and tolerance to environmental constraints has always been a major concern in plant breeding. Classical approaches based on the generation of variability and selection of better phenotypes from large variant collections have improved their efficacy and processivity due to the implementation of molecular biology techniques, particularly genomics, Next Generation Sequencing and other omics such as proteomics and metabolomics. In this regard, the identification of interesting variants with molecular markers, before they develop the phenotypic trait of interest, has advanced the breeding process of new varieties. Moreover, the correlation of phenotypic or biochemical traits with gene expression or protein abundance has boosted the identification of potential new regulators of the traits of interest, using a relatively low number of variants. These important breakthrough technologies, built on top of classical approaches, will be improved in the future by including the spatial variable, allowing the identification of gene(s) involved in key processes at the tissue and cell levels.
Do rent-seeking and interregional transfers contribute to urban primacy in sub-Saharan Africa?
We develop an economic geography model where mobile skilled workers choose to either work in a production sector or to become part of an unproductive elite. The elite sets income tax rates to maximize its own welfare by extracting rents, thereby influencing the spatial structure of the economy and changing the available range of consumption goods. We show that either unskilled labor mobility, or rent-seeking behavior, or both, are likely to favor the occurrence of agglomeration and of urban primacy. In equilibrium, the elite may tax the unskilled workers but does not tax the skilled workers, and there are rural-urban transfers towards the agglomeration. The size of the elite and the magnitude of the tax burden that falls on the unskilled decrease with product differentiation and with the expenditure share for manufacturing goods. All these results are broadly in line with observed patterns of urban primacy and economic development in sub-Saharan African countries.
Keywords: economic geography; rent-seeking; interregional transfers; urban primacy; Sub-Saharan Africa.