614 research outputs found

    Machine learning-guided directed evolution for protein engineering

    Machine learning (ML)-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. ML methods use data to predict how sequence maps to function without requiring a detailed model of the underlying physics or biological pathways. To demonstrate ML-guided directed evolution, we introduce the steps required to build ML sequence-function models and use them to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to using ML for protein engineering as well as the current literature and applications of this new engineering paradigm. ML methods accelerate directed evolution by learning from information contained in all measured variants and using that information to select sequences that are likely to be improved. We then provide two case studies that demonstrate the ML-guided directed evolution process. We also look to future opportunities where ML will enable discovery of new protein functions and uncover the relationship between protein sequence and function. Comment: Made significant revisions to focus on aspects most relevant to applying machine learning to speed up directed evolution.
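The loop described in this abstract (measure variants, learn a sequence-function model from all measurements, then pick the sequences predicted to be best) can be sketched as follows. This is a minimal illustration under stated assumptions, not the review's method: the four-letter alphabet, the `true_fitness` function standing in for a laboratory assay, and the crude additive model are all hypothetical.

```python
import random
from itertools import product

ALPHABET = "ACDE"  # toy alphabet (assumption, not from the review)
LENGTH = 4         # toy sequence length (assumption)

def true_fitness(seq):
    """Hypothetical stand-in for a wet-lab measurement: additive site effects."""
    return sum((ALPHABET.index(aa) + 1) * (pos + 1) for pos, aa in enumerate(seq))

def fit_additive_model(measured):
    """Crude model: each (position, residue) effect is the mean fitness of
    the measured variants that carry that residue at that position."""
    sums, counts = {}, {}
    for seq, y in measured.items():
        for key in enumerate(seq):  # key is a (position, residue) pair
            sums[key] = sums.get(key, 0.0) + y
            counts[key] = counts.get(key, 0) + 1
    overall = sum(measured.values()) / len(measured)
    effects = {key: sums[key] / counts[key] for key in sums}
    return effects, overall

def predict(model, seq):
    """Score a sequence; unseen (position, residue) pairs fall back to the overall mean."""
    effects, overall = model
    return sum(effects.get(key, overall) for key in enumerate(seq))

def ml_guided_evolution(rounds=3, batch=8, seed=0):
    """Alternate between fitting the model and measuring its top-ranked candidates."""
    rng = random.Random(seed)
    library = ["".join(p) for p in product(ALPHABET, repeat=LENGTH)]
    measured = {s: true_fitness(s) for s in rng.sample(library, batch)}
    history = [max(measured.values())]
    for _ in range(rounds):
        model = fit_additive_model(measured)           # learn from all measured variants
        candidates = [s for s in library if s not in measured]
        candidates.sort(key=lambda s: predict(model, s), reverse=True)
        for s in candidates[:batch]:                   # "measure" the predicted best
            measured[s] = true_fitness(s)
        history.append(max(measured.values()))
    return measured, history
```

Calling `ml_guided_evolution()` returns every measured variant together with the best fitness seen after each round. A real campaign would replace the additive model with a properly validated regressor and the toy assay with experimental data.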

    Shallow Water Equations in Hydraulics: Modeling, Numerics and Applications

    This Special Issue aims to provide a forum for the latest advances in hydraulic modeling based on shallow water and related models, as well as their novel application in practical engineering. Original contributions in areas including, but not limited to, the following will be considered for publication: new conceptual models and applications, flood inundation and routing, sediment transport and morphodynamic modeling, pollutant transport in water, irrigation and drainage modeling, numerical simulation in hydraulics, novel numerical methods for the shallow water equations and extended models, case studies, and high-performance computing.

    Visualization Design for Interactive Analysis of Premodern Land Registers and Cadastral Maps (전근대 토지대장과 지적도의 대화형 분석을 위한 시각화 설계)

    Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2016. 서진욱. We propose an interactive visualization design tool, called JigsawMap, for analyzing and mapping historical textual cadasters. A cadaster is an official register that records land properties (e.g., location, ownership, value, and size) for land valuation and taxation. Such mapping of old and new cadasters can help historians understand the social and economic background of changes in land use or ownership. JigsawMap can effectively connect past land survey results to modern cadastral maps. The connection process comprises three steps: (1) segmentation of the cadastral map, (2) visualization of the textual cadastre, and (3) mapping interaction. We conducted usability studies and long-term case studies to evaluate JigsawMap and received positive responses. We summarize the evaluation results and present design guidelines for participatory design projects with historians. Following our study of JigsawMap, we further investigated each component of our tool to make map connection more scalable. First, we designed a hybrid algorithm to semi-automatically segment land pieces on a cadastral map. The original JigsawMap provides an interface for the user to segment land pieces, and the experimental results show that the segmentation algorithm accurately extracts the regions. Next, we reconsidered the visual encoding and simplified it to make the textual cadastre more scalable. Since the former visual encoding relies on a traditional map legend, the visual encoding can be selected based on the user's level of expertise. Finally, we redesigned the layout algorithm to generate a better initial layout. We used an evolutionary algorithm to address the ambiguity problem of the textual cadastre, and the result suffered less from overlap. 
Overall, our visualization design tool provides an accurate segmentation result, gives the user the option to select a visual encoding suited to their level of expertise, and generates a more readable initial layout that gives an overview of the cadastre layout.
Contents:
Chapter 1 Introduction: 1.1 Background & Motivation; 1.2 Main Contribution; 1.3 Organization of the Dissertation
Chapter 2 Related Work: 2.1 Map Data Visualization; 2.2 Graph Layout Algorithms; 2.3 Collaborative Map Editing Service; 2.4 Map Image Segmentation; 2.5 Premodern Cadastral Maps; 2.6 Assessing Measures for Cartogram
Chapter 3 Visualizing and Mapping Premodern Textual Cadasters to Cadastral Maps: 3.1 Textual Cadastre; 3.2 Cadastral Maps; 3.3 Paper-based Mapping Process and Obstacles; 3.4 Task Flow in JigsawMap; 3.5 Design Rationale; 3.6 Evaluation; 3.7 Discussion; 3.8 Design Guidelines When Working with Historians
Chapter 4 Accurate Segmentation of Land Regions in Historical Cadastral Maps: 4.1 Segmentation Pipeline; 4.2 Preprocessing; 4.3 Removal of Grid Line; 4.4 Removal of Characters; 4.5 Reconstruction of Land Boundaries; 4.6 Generation of Polygons; 4.7 Experimental Result; 4.8 Discussion
Chapter 5 Approximating Rectangular Cartogram from Premodern Textual Cadastre: 5.1 Challenges of the Textual Cadastre Layout; 5.2 Quality Measures for Assessing Rectangular Cartogram; 5.3 Quality Measures for Assessing Textual Cadastre; 5.4 Graph Layout Algorithm; 5.5 Results; 5.6 Discussion
Chapter 6 Design of Scalable Node Representation for a Large Textual Cadastre: 6.1 Motivation; 6.2 Visual Encoding in JigsawMap; 6.3 Challenges of Current Visual Encoding; 6.4 Compact Visual Encoding; 6.5 Results; 6.6 Discussion
Chapter 7 Conclusion
Bibliography
Abstract in Korean

    Group implicit concurrent algorithms in nonlinear structural dynamics

    During the 1970s and 1980s, considerable effort was devoted to developing efficient and reliable time-stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problem are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of application are those that accurately integrate the low-frequency content of the response without requiring resolution of the high-frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in algorithm development in recent years has been the advent of parallel computers with multiprocessing capabilities. This work is therefore mainly concerned with the development of parallel algorithms for structural dynamics. A primary objective is to devise unconditionally stable and accurate time-stepping procedures that lend themselves to efficient implementation on concurrent machines. Some features of the new computer architectures are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, the Group Implicit (GI) algorithms, is introduced and analyzed. Numerical simulations show that GI algorithms hold considerable promise for application on coarse-grain as well as medium-grain parallel computers.
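The stability argument in this abstract can be illustrated on a single undamped mode, u'' + ω²u = 0. The sketch below is not the Group Implicit algorithm. It compares two textbook schemes chosen here purely for illustration: the explicit central-difference method, which is stable only when ωΔt ≤ 2, and the implicit trapezoidal rule, which stays bounded for any step size. The function names are hypothetical.

```python
def central_difference(omega, h, steps, u0=1.0, v0=0.0):
    """Explicit central-difference scheme; returns the largest |u| seen.
    Stable only when omega * h <= 2."""
    # Fictitious value u_{-1} obtained from a Taylor expansion about t = 0.
    u_prev = u0 - h * v0 + 0.5 * h * h * (-(omega ** 2) * u0)
    u = u0
    peak = abs(u)
    for _ in range(steps):
        u_next = 2.0 * u - u_prev - (h * omega) ** 2 * u
        u_prev, u = u, u_next
        peak = max(peak, abs(u))
    return peak

def trapezoidal(omega, h, steps, u0=1.0, v0=0.0):
    """Implicit trapezoidal rule for u' = v, v' = -omega^2 u; returns the largest |u| seen.
    For this scalar problem the implicit update is solved in closed form."""
    a = 0.5 * h
    b = 0.5 * h * omega ** 2
    u, v = u0, v0
    peak = abs(u)
    for _ in range(steps):
        u_next = ((1.0 - a * b) * u + 2.0 * a * v) / (1.0 + a * b)
        v = v - b * (u + u_next)
        u = u_next
        peak = max(peak, abs(u))
    return peak

if __name__ == "__main__":
    omega = 100.0  # a stiff, high-frequency mode (illustrative value)
    for h in (0.003, 0.03):  # omega * h = 0.3 and 3.0
        print(h, central_difference(omega, h, 50), trapezoidal(omega, h, 50))
```

With ωΔt = 3 the explicit response grows without bound while the trapezoidal response stays within its initial amplitude. This is why a stiff system with unresolved high-frequency modes forces either a very small explicit step or an implicit scheme.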

    BEYOND ROOTS ALONE: NOVEL METHODOLOGIES FOR ANALYZING COMPLEX SOIL AND MINIRHIZOTRON IMAGERY USING IMAGE PROCESSING AND GIS TOOLS

    Quantifying belowground dynamics is critical to our understanding of plant and ecosystem function and belowground carbon cycling, yet currently available tools for complex belowground image analyses are insufficient. We introduce novel techniques combining digital image processing tools and geographic information systems (GIS) analysis to permit semi-automated analysis of complex root and soil dynamics. We illustrate the methodologies with imagery from microcosms, minirhizotrons, and a rhizotron, in upland and peatland soils. We provide guidelines for correct image capture, a method that automatically stitches numerous minirhizotron images into one seamless image, and image analysis using image segmentation and classification in SPRING or change analysis in ArcMap. These methods facilitate spatial and temporal studies of root and soil interactions, providing a framework for a more comprehensive understanding of belowground dynamics.

    Generating and auto-tuning parallel stencil codes

    In this thesis, we present a software framework, Patus, which generates high-performance stencil codes for different types of hardware platforms, including current multicore CPU and graphics processing unit architectures. The ultimate goals of the framework are productivity, portability (of both the code and performance), and high performance on the target platform. A stencil computation updates every grid point in a structured grid based on the values of its neighboring points. This class of computations occurs frequently in scientific and general-purpose computing (e.g., in partial differential equation solvers or in image processing), justifying the focus on this kind of computation. The proposed key ingredients for achieving the goals of productivity, portability, and performance are domain-specific languages (DSLs) and the auto-tuning methodology. The Patus stencil specification DSL allows the programmer to express a stencil computation concisely and independently of hardware architecture-specific details. Thus, it increases programmer productivity by relieving the programmer of low-level programming model issues and of manually applying hardware platform-specific code optimization techniques. The use of domain-specific languages also implies code reusability: once implemented, the same stencil specification can be reused on different hardware platforms, i.e., the specification code is portable across hardware architectures. Constructing the language to be geared towards a special purpose makes it amenable to more aggressive optimizations and therefore to potentially higher performance. Auto-tuning provides performance and performance portability by automated adaptation of implementation-specific parameters to the characteristics of the hardware on which the code will run. 
By automating the process of parameter tuning (which essentially amounts to solving an integer programming problem in which the objective function is the number representing the code's performance as a function of the parameter configuration), the system can also be used more productively than if the programmer had to fine-tune the code manually. We show performance results for a variety of stencils, for which Patus was used to generate the corresponding implementations. The selection includes stencils taken from two real-world applications: a simulation of the temperature within the human body during hyperthermia cancer treatment and a seismic application. These examples demonstrate the framework's flexibility and ability to produce high-performance code.
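The two ideas in this abstract, a stencil update and an automated search over an implementation parameter, can be sketched in plain Python. This is not Patus or its DSL. The 5-point Jacobi stencil, the `block` tile-size parameter, and the exhaustive timing search (a simple stand-in for the integer programming problem mentioned above) are illustrative assumptions.

```python
import time

def jacobi_step(grid, block):
    """One 5-point Jacobi sweep over the interior points, visited in block x block tiles.
    The tile size changes only the traversal order, never the result."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i0 in range(1, n - 1, block):
        for j0 in range(1, m - 1, block):
            for i in range(i0, min(i0 + block, n - 1)):
                for j in range(j0, min(j0 + block, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out

def autotune(grid, candidates, repeats=3):
    """Time every candidate tile size and return the fastest one on this machine."""
    best_block, best_time = None, float("inf")
    for block in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            jacobi_step(grid, block)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_block, best_time = block, elapsed
    return best_block

def make_grid(n):
    """Deterministic n x n test grid (hypothetical data)."""
    return [[float((i * 31 + j * 17) % 7) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    grid = make_grid(64)
    print("fastest tile size here:", autotune(grid, [1, 4, 8, 16, 64]))
```

The tile size that wins depends on the machine and on timing noise. In an interpreted language the differences are small, but the structure (one specification, many parameterized variants, an empirical search) mirrors the auto-tuning idea.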

    TensorIR: An Abstraction for Automatic Tensorized Program Optimization

    Deploying deep learning models on various devices has become an important topic. The wave of hardware specialization brings a diverse set of acceleration primitives for multi-dimensional tensor computations. These new acceleration primitives, along with emerging machine learning models, bring tremendous engineering challenges. In this paper, we present TensorIR, a compiler abstraction for optimizing programs with these tensor computation primitives. TensorIR generalizes the loop nest representation used in existing machine learning compilers to make tensor computation a first-class citizen. Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives. Experimental results show that TensorIR compilation automatically uses the tensor computation primitives for given hardware backends and delivers performance that is competitive with state-of-the-art hand-optimized systems across platforms. Comment: Accepted to ASPLOS 202

    A μ-mode BLAS approach for multidimensional tensor-structured problems

    In this manuscript, we present a common tensor framework that can be used to generalize one-dimensional numerical tasks to arbitrary dimension d by means of tensor product formulas. This is useful, for example, in the context of multivariate interpolation, multidimensional function approximation using pseudospectral expansions, and the solution of stiff differential equations on tensor product domains. The key to obtaining an efficient-to-implement BLAS formulation is the suitable use of the μ-mode product (also known as the tensor-matrix product or mode-n product) and related operations, such as the Tucker operator. Their MathWorks MATLAB®/GNU Octave implementations are discussed in the paper and collected in the package KronPACK. We present numerical results from experiments in up to six dimensions, drawn from different fields of numerical analysis, which show the effectiveness of the approach.
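For readers unfamiliar with the operations named in this abstract, the standard definitions can be written out as below. The symbols (a tensor T of size n_1 × … × n_d with entries t, matrices A and A_μ, and a column-major vec operator) are notational assumptions made here and may differ from the paper's own conventions.

```latex
% mu-mode product of a tensor T (size n_1 x ... x n_d) with a matrix A (size m x n_mu)
\[
  (\mathcal{T} \times_\mu A)_{i_1 \ldots i_{\mu-1}\, j\, i_{\mu+1} \ldots i_d}
  = \sum_{i_\mu = 1}^{n_\mu} a_{j i_\mu}\, t_{i_1 \ldots i_\mu \ldots i_d},
  \qquad 1 \le j \le m .
\]
% Tucker operator: a mu-mode product along every direction
\[
  \mathcal{T} \times_1 A_1 \times_2 A_2 \cdots \times_d A_d .
\]
% Equivalent Kronecker-product form (vec stacks entries in column-major order)
\[
  \operatorname{vec}\!\left(\mathcal{T} \times_1 A_1 \cdots \times_d A_d\right)
  = (A_d \otimes \cdots \otimes A_1)\, \operatorname{vec}(\mathcal{T}) .
\]
```

For d = 2 the Tucker operator reduces to the matrix product A_1 T A_2^T. This shows why the left-hand form is attractive: it needs only small dense matrix products (BLAS level 3) and never forms the large Kronecker matrix on the right.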

    An Evolutionary Approach to Adaptive Image Analysis for Retrieving and Long-term Monitoring Historical Land Use from Spatiotemporally Heterogeneous Map Sources

    Land use changes have become a major contributor to anthropogenic global change. The ongoing dispersion and concentration of the human species, unprecedented in their magnitude, have indisputably altered Earth’s surface and atmosphere. The effects are so salient and irreversible that a new geological epoch, following the interglacial Holocene, has been announced: the Anthropocene. While some scholars date its onset back to the Neolithic revolution, it is commonly dated to the late 18th century. The rapid development since the industrial revolution and its implications gave rise to an increasing awareness of extensive anthropogenic land change and led to an urgent need for sustainable strategies for land use and land management. By preserving landscape and settlement patterns at discrete points in time, archival geospatial data sources such as remote sensing imagery and, in particular, historical geotopographic maps could give evidence of the dynamic land use change during this crucial period. In this context, this thesis set out to explore the potential of retrospective geoinformation for monitoring, communicating, modeling, and eventually understanding the complex and gradually evolving processes of land cover and land use change. Currently, large numbers of geospatial data sources such as archival maps are being made accessible online worldwide by libraries and national mapping agencies. Despite their abundance and relevance, the use of historical land use and land cover information in research is still often hindered by laborious visual interpretation, which limits the temporal and spatial coverage of studies. Thus, the core of the thesis is dedicated to the computational acquisition of geoinformation from archival map sources by means of digital image analysis. 
Based on a comprehensive review of the literature as well as the data and proposed algorithms, two major challenges for long-term retrospective information acquisition and change detection were identified: first, the diversity of geographical entity representations over space and time, and second, the uncertainty inherent in both the data source itself and its use for land change detection. To address the former challenge, image segmentation is treated as a global non-linear optimization problem. The segmentation methods and parameters are adjusted using a metaheuristic, evolutionary approach. To preserve adaptability in high-level image analysis, a hybrid model- and data-driven strategy, combining a knowledge-based and a neural net classifier, is recommended. To address the second challenge, a probabilistic object- and field-based change detection approach is developed for modeling the positional, thematic, and temporal uncertainty inherent in both data and processing. Experimental results indicate the suitability of the methodology in support of land change monitoring. In conclusion, potential applications and directions for further research are given.
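Treating segmentation as an optimization problem whose parameters are adjusted by an evolutionary search can be illustrated with a deliberately small example. The sketch below is not the thesis's method. It assumes a single threshold parameter, a synthetic grey-value image with a known reference mask, and a simple (1+λ) evolution strategy, all of which are hypothetical choices made for illustration.

```python
import random

def make_example(size=16, seed=1):
    """Synthetic image: a bright square on a dark background plus noise (hypothetical data)."""
    rng = random.Random(seed)
    reference = [[1 if 4 <= i < 12 and 4 <= j < 12 else 0 for j in range(size)]
                 for i in range(size)]
    image = [[min(1.0, max(0.0, (0.7 if reference[i][j] else 0.3)
                           + rng.uniform(-0.15, 0.15)))
              for j in range(size)] for i in range(size)]
    return image, reference

def segment(image, threshold):
    """Threshold segmentation: pixels at or above the threshold become foreground."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def agreement(mask, reference):
    """Fraction of pixels on which the mask and the reference agree (the fitness)."""
    total = sum(len(row) for row in reference)
    same = sum(m == r for mask_row, ref_row in zip(mask, reference)
               for m, r in zip(mask_row, ref_row))
    return same / total

def evolve_threshold(image, reference, generations=30, offspring=6, sigma=0.1, seed=0):
    """(1+lambda) evolution strategy: mutate the parent threshold and keep the
    best child whenever it is at least as fit as the parent."""
    rng = random.Random(seed)
    parent = rng.random()
    parent_fit = agreement(segment(image, parent), reference)
    history = [parent_fit]
    for _ in range(generations):
        children = [min(1.0, max(0.0, parent + rng.gauss(0.0, sigma)))
                    for _ in range(offspring)]
        scored = [(agreement(segment(image, c), reference), c) for c in children]
        best_fit, best_child = max(scored)
        if best_fit >= parent_fit:
            parent, parent_fit = best_child, best_fit
        history.append(parent_fit)
    return parent, history
```

`evolve_threshold(*make_example())` returns the tuned threshold and the fitness after each generation. A realistic setting would search over several segmentation methods and parameters at once and would need a fitness measure that does not rely on a complete reference mask.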