Autonomous In-Orbit Satellite Assembly from a Modular Heterogeneous Swarm
This paper presents a decentralized, distributed guidance and control scheme to combine a heterogeneous swarm of component satellites into a large satellite structure. The component satellites for the heterogeneous swarm are chosen to promote flexibility in final shape, inspired by crystal structures and Islamic tile art. After the ideal fundamental building blocks are selected, basic nanosatellite-class satellite designs are made to assist in simulations involving attitude control. The Swarm Orbital Construction Algorithm (SOCA) is a guidance and control algorithm that allows for the limited type heterogeneity and docking ability required for in-orbit assembly. The algorithm consists of two parts: a distributed auction, which uses barrier functions to ensure the proper agent selection for each target, and a trajectory generation portion, which leverages model predictive control and sequential convex programming to achieve optimal collision-free trajectories to the desired target point even with nonlinear system dynamics. The optimization constraints use a boundary layer to determine whether the collision avoidance or the docking constraints should be applied. The algorithm was tested in a simulated perturbed 6-DOF spacecraft dynamic environment for planar and out-of-plane final structures and on two robotic platforms, including a swarm of frictionless spacecraft simulation robots.
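The agent-to-target selection step in the abstract above can be illustrated with the classic auction algorithm for assignment problems. This is only a minimal, centralized sketch: it is not SOCA's distributed auction, it uses no barrier functions, and it ignores satellite types. The function name `auction_assign`, the benefit matrix, and the bid increment `eps` are all illustrative assumptions.

```python
def auction_assign(benefit, eps=0.01):
    """Assign n agents to n targets with a basic auction (illustrative only).

    benefit[i][j] is the (assumed) payoff of sending agent i to target j,
    e.g. the negative of an estimated transfer cost.
    Returns a list where entry i is the target won by agent i.
    """
    n = len(benefit)
    prices = [0.0] * n          # current price of each target
    owner = [None] * n          # owner[j]: agent currently holding target j
    assigned = [None] * n       # assigned[i]: target currently held by agent i
    unassigned = list(range(n))

    while unassigned:
        i = unassigned.pop()
        # Net value of every target for agent i at current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max((values[j] for j in range(n) if j != best),
                     default=values[best])
        # Raise the price by the bidding margin plus eps so the loop terminates.
        prices[best] += values[best] - second + eps
        previous = owner[best]
        if previous is not None:
            # The outbid agent goes back into the pool.
            assigned[previous] = None
            unassigned.append(previous)
        owner[best] = i
        assigned[i] = best
    return assigned


if __name__ == "__main__":
    demo = [[10, 5, 1], [9, 8, 2], [3, 4, 7]]
    print(auction_assign(demo))
```

With integer benefits and `eps` smaller than 1/n, the classic auction is known to end at an optimal assignment; a decentralized variant would additionally have to exchange bids and prices between agents.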
HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
With the rise of artificial intelligence in recent years, Deep Neural
Networks (DNNs) have been widely used in many domains. To achieve high
performance and energy efficiency, hardware acceleration (especially inference)
of DNNs is intensively studied both in academia and industry. However, we still
face two challenges: large DNN models and datasets, which incur frequent
off-chip memory accesses; and the training of DNNs, which is not well-explored
in recent accelerator designs. To truly provide high throughput and energy
efficient acceleration for the training of deep and large models, we inevitably
need to use multiple accelerators to explore the coarse-grain parallelism,
compared to the fine-grain parallelism inside a layer considered in most of the
existing architectures. This poses the key research question of finding the best
organization of computation and dataflow among accelerators. In this paper, we
propose a solution, HyPar, to determine layer-wise parallelism for deep neural
network training with an array of DNN accelerators. HyPar partitions the
feature map tensors (input and output), the kernel tensors, the gradient
tensors, and the error tensors for the DNN accelerators. A partition
constitutes the choice of parallelism for weighted layers. The optimization
target is to search for a partition that minimizes the total communication during
the training of a complete DNN. To solve this problem, we propose a communication
model to explain the source and amount of communications. Then, we use a
hierarchical layer-wise dynamic programming method to search for the partition
for each layer. Comment: To appear in the 2019 25th International Symposium on
High-Performance Computer Architecture (HPCA 2019)
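The layer-wise dynamic programming search described in the HyPar abstract above can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: each layer is assumed to pick one of two parallelism choices (labelled 0 and 1 here, for example data and model parallelism), the cost tables `intra_cost` and `inter_cost` are hypothetical stand-ins for the paper's communication model, and the hierarchical part of the method is omitted.

```python
def plan_parallelism(intra_cost, inter_cost):
    """Pick one parallelism choice per layer to minimize total communication.

    intra_cost[l][c]    : assumed communication inside layer l under choice c
    inter_cost[l][p][c] : assumed communication between layer l-1 (choice p)
                          and layer l (choice c); inter_cost[0] is unused
    Returns (minimum total cost, list of choices per layer).
    """
    n = len(intra_cost)
    best = list(intra_cost[0])   # best[c]: cheapest cost so far ending in choice c
    back = []                    # back[l-1][c]: best predecessor choice for layer l

    for l in range(1, n):
        new_best, choices = [], []
        for c in (0, 1):
            candidates = [best[p] + inter_cost[l][p][c] for p in (0, 1)]
            p_star = 0 if candidates[0] <= candidates[1] else 1
            new_best.append(candidates[p_star] + intra_cost[l][c])
            choices.append(p_star)
        best = new_best
        back.append(choices)

    # Trace the cheapest path back from the last layer to the first.
    c = 0 if best[0] <= best[1] else 1
    total = best[c]
    plan = [c]
    for choices in reversed(back):
        c = choices[c]
        plan.append(c)
    plan.reverse()
    return total, plan


if __name__ == "__main__":
    intra = [[4, 1], [2, 6], [5, 3]]
    inter = [None, [[0, 3], [3, 0]], [[0, 2], [2, 0]]]
    print(plan_parallelism(intra, inter))
```

The search visits each layer once and compares a constant number of options, so its cost grows linearly with the number of layers instead of exponentially as in exhaustive enumeration.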
Shape Design and Optimization for 3D Printing
In recent years, 3D printing technology has become increasingly popular, with widespread uses in rapid prototyping, design, art, education, medical applications, and the food and fashion industries. It enables distributed manufacturing, allowing users to easily produce customized 3D objects in the office or at home. Continued investment in 3D printing technology drives down the cost of 3D printers, making them more affordable to consumers.
As 3D printing becomes more available, it also demands better computer algorithms to assist users in quickly and easily generating 3D content for printing. Creating 3D content often requires considerably more effort and skill than creating 2D content. In this work, I study several aspects of 3D shape design and optimization for 3D printing. I start by discussing my work in geometric puzzle design, which is a popular application of 3D printing in recreational math and art. Given user-provided input figures, the goal is to compute the minimum (or best) set of geometric shapes that can satisfy the given constraints (such as dissection constraints). The puzzle design also has to consider feasibility, such as avoiding interlocking pieces. I present two optimization-based algorithms to automatically generate customized 3D geometric puzzles, which can be directly printed for users to enjoy. They are also great tools for geometry education.
Next, I discuss shape optimization for printing functional tools and parts. Although current 3D modeling software allows a novice user to easily design 3D shapes, the resulting shapes are not guaranteed to meet the required physical strength. For example, a poorly designed stool may easily collapse when a person sits on it; a poorly designed wrench may easily break under force. I study new algorithms to help users strengthen functional shapes in order to meet specific physical properties. The algorithm uses an optimization-based framework: it performs geometric shape deformation and structural optimization iteratively to minimize mechanical stresses in the presence of forces, assuming typical use scenarios. Physically-based simulation is performed at run-time to evaluate the functional properties of the shape (e.g., mechanical stresses based on finite element methods), and the optimizer makes use of this information to improve the shape. Experimental results show that my algorithm can successfully optimize various 3D shapes, such as chairs, tables, and utility tools, to withstand higher forces while preserving the original shape as much as possible.
To improve the efficiency of physics simulation for general shapes, I also introduce a novel SPH-based sampling algorithm, which can provide better tetrahedralization for use in the physics simulator. My new modeling algorithm can greatly reduce the design time, allowing users to quickly generate functional shapes that meet required physical standards.
Hierarchical categorisation of tags for delicious
In the scenario of social bookmarking, a user browsing the Web bookmarks web pages and assigns free-text labels (i.e., tags) to them according to their personal preferences.
In this technical report, we approach one of the practical aspects of representing users' interests from their tagging activity, namely the categorization of tags into high-level categories of interest. The reason is that representing user profiles on the basis of the myriad of tags available on the Web is infeasible from various practical perspectives, mainly the unavailability of data to measure interests reliably and accurately across such a fine-grained categorisation and, should the data be available, its overwhelming computational intractability. Motivated by this, our study presents the results of a categorization process whereby a collection of tags posted at Delicious (http://delicious.com) is classified into 200 subcategories of interest. Preprint
Predicting multibody assembly of proteins
This thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and designing new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking), which uses a multi-term scoring function that includes both a statistical thermodynamic approximation of molecular free energy and several knowledge-based terms. Parameters of the scoring model were learned based on a large set of positive/negative examples, and when tested on 176 protein complexes of various types, showed excellent accuracy in ranking correct configurations higher (F²Dock ranks the correct solution as the top ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals, over the occupied volume, boundary, or a set of discrete points (atom locations), of distance-dependent decaying kernels. We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms, assuming that the smallest feature size h is Θ(r_max), where r_max is the radius of the largest atom; updates in O(log m) time; and uses O(m) memory. We also developed the dynamic packing grids (DPG) data structure, which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word size in the RAM. DPG and DAG together result in O(k) time approximation of scoring terms, where k << m is the size of the contact region between proteins.
[...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles) and their global configurations (tilings). We further show that the tilings, and the mapping of proteins to tilings on arbitrarily sized shells, are parameterized by 3 discrete parameters and 6 continuous degrees of freedom, and that the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of proteins n). We also consider the case where a coarse model of the whole complex of proteins is available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restrict the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict accurate high-resolution models of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design. Computer Science
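The spherical neighborhood queries mentioned in the thesis abstract above can be illustrated with a plain hashed uniform grid. This sketch is a simplified stand-in, not the dynamic packing grid itself: it relies on Python dictionaries (expected constant-time access) and does not reproduce the O(log w) and O(log log w) bounds. The class name `UniformGrid`, its methods, and the choice of cell size are illustrative assumptions.

```python
import math
from collections import defaultdict
from itertools import product


class UniformGrid:
    """Hash atoms into cubic cells so nearby atoms can be found quickly."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # cell index -> ids of atoms inside it
        self.positions = {}             # atom id -> (x, y, z)

    def _key(self, point):
        return tuple(math.floor(x / self.cell_size) for x in point)

    def insert(self, atom_id, point):
        """Add an atom, or move it if the id is already present."""
        self.remove(atom_id)
        self.positions[atom_id] = point
        self.cells[self._key(point)].add(atom_id)

    def remove(self, atom_id):
        point = self.positions.pop(atom_id, None)
        if point is not None:
            key = self._key(point)
            self.cells[key].discard(atom_id)
            if not self.cells[key]:
                del self.cells[key]

    def query(self, center, radius):
        """Return sorted ids of atoms within `radius` of `center`."""
        reach = math.ceil(radius / self.cell_size)
        cx, cy, cz = self._key(center)
        hits = []
        # Only cells overlapping the query sphere's bounding cube are visited.
        for dx, dy, dz in product(range(-reach, reach + 1), repeat=3):
            for atom_id in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                if math.dist(center, self.positions[atom_id]) <= radius:
                    hits.append(atom_id)
        return sorted(hits)


if __name__ == "__main__":
    grid = UniformGrid(cell_size=2.0)
    grid.insert("a", (0.0, 0.0, 0.0))
    grid.insert("b", (1.5, 0.0, 0.0))
    grid.insert("c", (5.0, 5.0, 5.0))
    print(grid.query((0.0, 0.0, 0.0), 2.0))
```

Choosing a cell size close to the largest atom radius keeps the number of atoms per cell small, so a query touches only a handful of cells around the center.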
Hierarchical categorisation of web tags for Delicious
In the scenario of social bookmarking, a user browsing the Web bookmarks web pages and assigns free-text labels (i.e., tags) to them according to their personal preferences. The benefits of social tagging are clear: tags enhance Web content browsing and search. However, since these tags may be publicly available to any Internet user, a privacy attacker may collect this information and extract an accurate snapshot of users’ interests or user profiles, containing sensitive information such as health-related information, political preferences, salary or religion. In order to hinder attackers in their efforts to profile users, this report focuses on the practical aspects of capturing user interests from their tagging activity. More precisely, we study how to categorise a collection of tags posted by users in one of the most popular bookmarking services, Delicious (http://delicious.com). Preprint