Rack-Scale Memory Pooling for Datacenters

Author: Novakovic Stanko
Publication venue: Lausanne, EPFL
Publication date: 24/05/2017

The rise of web-scale services has led to a staggering growth in user data on the Internet. To transform such a vast amount of raw data into valuable information for the user and provide quality assurances, it is important to minimize access latency and enable in-memory processing. For more than a decade, the only practical way to accommodate ever-growing data in memory has been to scale out server resources, which has led to the emergence of large-scale datacenters and distributed non-relational databases (NoSQL). Such horizontal scaling of resources translates to an increasing number of servers that participate in processing individual user requests. Typically, each user request results in hundreds of independent queries targeting different NoSQL nodes (servers), and the larger the number of servers involved, the higher the fan-out. To complete a single user request, all of the queries associated with that request have to complete first, and thus the slowest query determines the completion time. Because of skewed popularity distributions and resource contention, the more servers we have, the harder it is to achieve high throughput and facilitate server utilization without violating service level objectives. This thesis proposes rack-scale memory pooling (RSMP), a new scaling technique for future datacenters that reduces networking overheads and improves the performance of core datacenter software. RSMP is an approach to building larger, rack-scale capacity units for datacenters through specialized fabric interconnects with support for one-sided operations, and using them, in lieu of conventional servers (e.g. 1U), to scale out. We define an RSMP unit to be a server rack connecting tens to hundreds of servers to a secondary network enabling direct, low-latency access to the global memory of the rack.
We then propose a new RSMP design, Scale-Out NUMA, that leverages integration and a NUMA fabric to narrow the gap between local and remote memory to only a 5× difference in access latency. Finally, we show how RSMP impacts NoSQL data serving, a key datacenter service used by most web-scale applications today. We show that using fewer, larger data shards leads to less load imbalance and higher effective throughput, without violating applications' service level objectives. For example, by using Scale-Out NUMA, RSMP improves the throughput of a key-value store by up to 8.2× over a traditional scale-out deployment.
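
The abstract's point that the slowest query determines a request's completion time can be illustrated with a small calculation. The sketch below is not taken from the thesis: it assumes that queries are independent and share one latency distribution, and the function names are hypothetical.

```python
def request_latency(query_latencies):
    """A request completes only when its slowest query completes."""
    return max(query_latencies)


def prob_request_hits_tail(fanout, per_query_quantile=0.99):
    """Probability that at least one of `fanout` queries is slower than
    the given per-query latency quantile.

    Assumption (not from the abstract): queries are independent and
    identically distributed.
    """
    return 1.0 - per_query_quantile ** fanout


if __name__ == "__main__":
    # Compare the chance of hitting the per-query 99th percentile
    # at several fan-out levels.
    for fanout in (1, 10, 100):
        print(fanout, round(prob_request_hits_tail(fanout), 3))
```

Under these assumptions, a request that fans out to 100 servers is more likely than not to include at least one query from the slowest 1%, which is one way to see why fewer, larger shards can help.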

Infoscience - École polytechnique fédérale de Lausanne

Logic, parallelism and semantic networks : the binary predicate execution model

Author: Lee Craig Alexander
Publication venue: eScholarship, University of California
Publication date: 01/01/1988

This thesis develops the Binary Predicate Execution Model, a distributed, massively parallel system for semantic networks and knowledge bases that is built on a subset of first-order predicate logic. The use of logic gives the model an easily understood programming paradigm and a well-defined semantics of execution. When expressed in binary predicates, a simple graphical interpretation can be used. All program facts are represented in an assertion graph. Each vertex is associated with a term appearing in a fact and the edges are labeled with the predicate names. Similar graphs are also associated with each rule body and the query. Finding all possible solutions corresponds to finding all possible matches between the query graph and the assertion graph. Invoking a rule corresponds to substituting the graph of its body constrained by the dependencies between its arguments. This can be implemented in a parallel, message-passing fashion where the assertion graph vertices are active processing elements which asynchronously exchange messages identifying different parts of the query that remain to be matched and containing any binding information from previous matching required to accomplish this. The model is data-driven since every message can be immediately processed without the need for any centralized control or centralized memory. By restricting how functional terms can occur, distributed data structures and remote data look-ups for unification are eliminated. Thus, in most cases, the model's performance on increasingly large problems scales up given increasingly large machines. Architectural support for the model is investigated and simulation results of a relatively simple software implementation are reported. This suggests performance on the order of 10^5 logical inferences per second for 256 processing elements in an n-cube configuration. Further research directions, including that of increasing efficiency, are discussed.
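
The idea of matching a query graph against an assertion graph can be sketched in a few lines. This is only a sequential backtracking illustration, not the parallel message-passing model the thesis describes. The triple layout `(predicate, subject, object)`, the `?` prefix for variables, and the function names are all assumptions made for the example.

```python
def is_var(term):
    """Query variables are written as strings starting with '?' (assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def match_query(facts, query, binding=None):
    """Yield every variable binding under which all query edges occur in `facts`.

    `facts` and `query` are lists of (predicate, subject, object) triples,
    i.e. labeled edges of the assertion graph and the query graph.
    """
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    (pred, a, b), rest = query[0], query[1:]
    for f_pred, f_a, f_b in facts:
        if f_pred != pred:
            continue
        new = dict(binding)
        consistent = True
        for q_term, f_term in ((a, f_a), (b, f_b)):
            if is_var(q_term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(q_term, f_term) != f_term:
                    consistent = False
                    break
            elif q_term != f_term:
                consistent = False
                break
        if consistent:
            yield from match_query(facts, rest, new)


if __name__ == "__main__":
    facts = [("parent", "ann", "bob"), ("parent", "bob", "cat")]
    grandparent = [("parent", "?x", "?y"), ("parent", "?y", "?z")]
    for solution in match_query(facts, grandparent):
        print(solution)
```

Each recursive call corresponds loosely to a message carrying the part of the query still to be matched together with the bindings found so far.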

eScholarship - University of California

Models and algorithms for parallel text retrieval

Author: Cambazoğlu Berkant Barla
Publication venue: Bilkent University
Publication date: 01/01/2006

In the last decade, search engines became an integral part of our lives. The current state of the art in search engine technology relies on parallel text retrieval. Basically, a parallel text retrieval system is composed of three components: a crawler, an indexer, and a query processor. The crawler component aims to locate, fetch, and store the Web pages in a local document repository. The indexer component converts the stored, unstructured text into a queryable form, most often an inverted index. Finally, the query processing component performs the search over the indexed content. In this thesis, we present models and algorithms for efficient Web crawling and query processing. First, for parallel Web crawling, we propose a hybrid model that aims to minimize the communication overhead among the processors while balancing the number of page download requests and storage loads of processors. Second, we propose models for document- and term-based inverted index partitioning. In the document-based partitioning model, the number of disk accesses incurred during query processing is minimized while the posting storage is balanced. In the term-based partitioning model, the total amount of communication is minimized while, again, the posting storage is balanced. Finally, we develop and evaluate a large number of algorithms for query processing in ranking-based text retrieval systems. We test the proposed algorithms over our experimental parallel text retrieval system, Skynet, currently running on a 48-node PC cluster. In the thesis, we also discuss the design and implementation details of another, somewhat untraditional, grid-enabled search engine, SE4SEE. Among our practical work, we present the Harbinger text classification system, used in SE4SEE for Web page classification, and the K-PaToH hypergraph partitioning toolkit, to be used in the proposed models.
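
The difference between document-based and term-based inverted index partitioning can be shown with a toy example. The sketch below uses naive round-robin assignment purely for illustration. It is not the partitioning model proposed in the thesis, and all function names are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: term -> sorted list of document ids."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            index[term].append(doc_id)
    return dict(index)


def partition_by_document(docs, k):
    """Document-based: each processor indexes its own subset of documents,
    so a term's postings may be spread over several processors."""
    parts = [dict() for _ in range(k)]
    for i, doc_id in enumerate(sorted(docs)):
        parts[i % k][doc_id] = docs[doc_id]
    return [build_index(part) for part in parts]


def partition_by_term(docs, k):
    """Term-based: each processor holds the complete posting lists
    for its own subset of terms."""
    full = build_index(docs)
    parts = [dict() for _ in range(k)]
    for i, term in enumerate(sorted(full)):
        parts[i % k][term] = full[term]
    return parts


if __name__ == "__main__":
    docs = {1: "web search engine", 2: "parallel search", 3: "web crawler"}
    print(partition_by_document(docs, 2))
    print(partition_by_term(docs, 2))
```

With document-based partitioning every processor may have to take part in answering a query, while with term-based partitioning only the processors that own the query's terms do, at the cost of exchanging posting data between them.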

Bilkent University Institutional Repository

Third International Symposium on Artificial Intelligence, Robotics, and Automation for Space 1994

The Third International Symposium on Artificial Intelligence, Robotics, and Automation for Space (i-SAIRAS 94), held October 18-20, 1994, in Pasadena, California, was jointly sponsored by NASA, ESA, and Japan's National Space Development Agency, and was hosted by the Jet Propulsion Laboratory (JPL) of the California Institute of Technology. i-SAIRAS 94 featured presentations covering a variety of technical and programmatic topics, ranging from underlying basic technology to specific applications of artificial intelligence and robotics to space missions. i-SAIRAS 94 featured a special workshop on planning and scheduling and provided scientists, engineers, and managers with the opportunity to exchange theoretical ideas, practical results, and program plans in such areas as space mission control, space vehicle processing, data analysis, autonomous spacecraft, space robots and rovers, satellite servicing, and intelligent instruments.

NASA Technical Reports Server

Structural Diversity of Biological Ligands and their Binding Sites in Proteins

Author: Stockwell GR
Publication venue: Queen Mary University of London
Publication date: 01/01/2005

The phenomenon of molecular recognition, which underpins almost all biological processes, is dynamic, complex and subtle. Establishing an interaction between a pair of molecules involves mutual structural rearrangements guided by a highly convoluted energy landscape, the accurate mapping of which continues to elude us. The analysis of interactions between proteins and small molecules has been a focus of intense interest for many years, offering as it does the promise of increased insight into many areas of biology, and the potential for greatly improved drug design methodologies. Computational methods for predicting which types of ligand a given protein may bind, and what conformation two molecules will adopt once paired, are particularly sought after. The work presented in this thesis aims to quantify the amount of structural variability observed in the ways in which proteins interact with ligands. This diversity is considered from two perspectives: to what extent ligands bind to different proteins in distinct conformations, and the degree to which binding sites specific for the same ligand have different atomic structures. The first study could be of value to approaches which aim to predict the bound pose of a ligand, since by cataloguing the range of conformations previously observed, it may be possible to better judge the biological likelihood of a newly predicted molecular arrangement. The findings show that several common biological ligands exhibit considerable conformational diversity when bound to proteins. Although binding in predominantly extended conformations, the analysis presented here highlights several cases in which the biological requirements of a given protein force its ligand to adopt a highly compact form. 
Comparing the conformational diversity observed within several protein families, the hypothesis that homologous proteins tend to bind ligands in a similar arrangement is generally upheld, but several families are identified in which this is demonstrably not the case. Consideration of diversity in the binding site itself, on the other hand, may be useful in guiding methods which search for binding sites in uncharacterised protein structures: identifying those regions of known sites which are less variable could help to focus the search only on the most important features. Analysis of the diversity of a non-redundant dataset of adenine binding sites shows that a small number of key interactions are conserved, with the majority of the fragment environment being highly variable. Just as ligand conformation varies between protein families, so the degree of binding site diversity is observed to be significantly higher in some families than others. Taken together, the results of this work suggest that the repertoire of strategies produced by nature for the purposes of molecular recognition is extremely extensive. Moreover, the importance of a given ligand conformation or pattern of interaction appears to vary greatly depending on the function of the particular group of proteins studied. As such, it is proposed that diversity analysis may form a significant part of future large-scale studies of ligand-protein interactions.

UCL Discovery

OpenGrey Repository

Critical Connections: Communication for the Future

Author: United States. Congress. Office of Technology Assessment.
Publication venue: United States. Congress. Office of Technology Assessment.
Publication date: 01/02/1990

The U.S. communication infrastructure is changing rapidly as a result of technological advances, deregulation, and an economic climate that is increasingly competitive. This change is affecting the way in which information is created, processed, transmitted, and provided to individuals and institutions. The report analyzes the implications of new communication technologies for business, politics, culture, and individuals, and suggests possible strategies and options for congressional consideration.

UNT Digital Library