38 research outputs found
Analyzing Clustered Latent Dirichlet Allocation
Dynamic Topic Models (DTM) are a way to extract time-variant information from a collection of documents. The only available implementation of DTM is slow, taking days to process a corpus of 533,588 documents. In order to see how topics (both their key words and their proportional size across all documents) change over time, we analyze Clustered Latent Dirichlet Allocation (CLDA) as an alternative to DTM. This algorithm is built from existing parallel components, using Latent Dirichlet Allocation (LDA) to extract topics at local times and k-means clustering to combine topics from different time periods. This method is two orders of magnitude faster than DTM and allows for more freedom in experiment design. Results show that most topics generated by this algorithm are similar to those generated by DTM at both the local and global level, as measured by the Jaccard index and Sørensen-Dice coefficient, and that this method's perplexity compares favorably to DTM's. We also explore tradeoffs in CLDA method parameters.
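A minimal sketch of the CLDA pipeline as described above, assuming scikit-learn: fit LDA independently on each time slice over a shared vocabulary, then cluster the resulting topic-word distributions with k-means so that cluster membership links topics across slices. All function names and hyperparameters are illustrative assumptions, not the authors' implementation; the small helpers mirror the Jaccard and Sørensen-Dice measures used in the evaluation.

```python
# Sketch of CLDA: per-slice LDA + k-means over topic-word vectors.
# Hyperparameters and names are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

def clda(docs_by_slice, n_local_topics=20, n_global_topics=10):
    # Shared vocabulary so topic-word vectors are comparable across slices.
    vectorizer = CountVectorizer(max_features=5000, stop_words="english")
    vectorizer.fit(d for docs in docs_by_slice for d in docs)

    local = []  # one (n_local_topics, n_words) matrix per time slice
    for docs in docs_by_slice:  # each slice is independent, so this can run in parallel
        X = vectorizer.transform(docs)
        lda = LatentDirichletAllocation(n_components=n_local_topics,
                                        random_state=0).fit(X)
        # Normalize rows into probability distributions over words.
        local.append(lda.components_ / lda.components_.sum(axis=1, keepdims=True))

    # k-means over all local topics; shared labels link topics across time.
    labels = KMeans(n_clusters=n_global_topics, n_init=10,
                    random_state=0).fit_predict(np.vstack(local))
    bounds = np.cumsum([m.shape[0] for m in local])[:-1]
    return np.split(labels, bounds)  # per-slice global-topic assignments

def jaccard(a, b):
    """Jaccard index of two top-word sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def dice(a, b):
    """Sørensen-Dice coefficient of two top-word sets."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))
```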
Automated Cluster Provisioning And Workflow Management for Parallel Scientific Applications in the Cloud
Many commercial cloud providers and tools are available that researchers could use to advance computational science research. However, adoption by the research community has been slow. In this paper we describe the automated Provisioning And Workflow (PAW) management tool for parallel scientific applications in the cloud. PAW is a comprehensive resource provisioning and workflow tool that automates the steps of dynamically provisioning a large-scale cluster environment in the cloud, executing a set of jobs or a custom workflow, and, after the jobs have completed, de-provisioning the cluster environment, all in a single operation. A key characteristic of PAW is that it separates the provisioning of cluster resources in the cloud from the management of scientific workflows on those resources, which enables fine-grained decisions about performance and cost trade-offs in a commercial cloud environment. This paper describes our initial AWS implementation of PAW for executing a large parameter sweep workflow, which we demonstrate with an MPI-based topic modeling application. PAW provides a standardized, simplified, and pluggable interface that can easily be extended to support a variety of underlying cloud or cluster hardware environments, user-facing scheduling systems, workflows, and scientific applications.
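As a hedged illustration of the lifecycle PAW automates (provision, execute, de-provision), the sketch below uses boto3 against EC2. PAW's actual interface is not shown in the abstract, so the function names, the AMI ID, the instance type, and the run_parameter_sweep stub are all placeholder assumptions; the point is the separation of cluster provisioning from workflow execution.

```python
# Hedged sketch of a provision -> execute -> de-provision lifecycle on AWS
# using boto3. This is not PAW's API; the AMI ID, instance type, and the
# workflow stub below are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def provision(n_nodes):
    """Launch a cluster of EC2 instances and block until they are running."""
    resp = ec2.run_instances(
        ImageId="ami-00000000",     # placeholder AMI with MPI preinstalled
        InstanceType="c5.4xlarge",  # placeholder instance type
        MinCount=n_nodes,
        MaxCount=n_nodes,
    )
    ids = [inst["InstanceId"] for inst in resp["Instances"]]
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)
    return ids

def run_parameter_sweep(instance_ids):
    """Placeholder workflow stage: dispatch jobs (e.g., MPI runs) to the cluster."""
    pass

def deprovision(instance_ids):
    """Terminate the cluster so billing stops once the workflow completes."""
    ec2.terminate_instances(InstanceIds=instance_ids)

# Provisioning and workflow are decoupled, so either stage can be swapped out.
cluster = provision(n_nodes=8)
try:
    run_parameter_sweep(cluster)
finally:
    deprovision(cluster)
```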
Random Access in Nondelimited Variable-length Record Collections for Parallel Reading with Hadoop
The industry-standard Packet CAPture (PCAP) format for storing network packet traces is normally readable only serially, due to its lack of delimiters, indexing, or blocking. This presents a challenge for parallel analysis of large networks, where packet traces can be many gigabytes in size. In this work we present RAPCAP, a novel method for random access into variable-length record collections such as PCAP that identifies a record boundary within a small number of bytes of the access point. Unlike related heuristic methods, which can limit scalability with a nonzero probability of error, the new method offers a correctness guarantee for a well-formed file and does not rely on prior knowledge of the contents. We include a practical implementation of the algorithm with an extension to the Hadoop framework, and a performance comparison to serial ingestion. Finally, we present a number of similar storage types that could use a modified version of RAPCAP for random access.
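To make the record-boundary problem concrete, here is a simplified Python sketch that scans forward from an arbitrary byte offset and accepts a position only if a chain of consecutive PCAP per-packet headers validates from it. Note the hedge: this is the heuristic-validation style of approach the paper contrasts itself with, not RAPCAP's guaranteed algorithm, and the constants and little-endian assumption are illustrative.

```python
# Simplified sketch: find a packet boundary in a PCAP byte buffer by scanning
# candidate offsets and validating a chain of per-packet headers. This is the
# heuristic style of method the paper improves on; RAPCAP's actual algorithm
# adds a correctness guarantee that this sketch lacks.
import struct

PKT_HDR = struct.Struct("<IIII")  # ts_sec, ts_usec, incl_len, orig_len (little-endian assumed)
MAX_SNAPLEN = 65535               # illustrative bound on captured length
CHAIN = 4                         # consecutive records that must validate

def plausible(ts_sec, ts_usec, incl_len, orig_len):
    """Field-level sanity check for a candidate per-packet header."""
    return ts_usec < 1_000_000 and 0 < incl_len <= MAX_SNAPLEN and orig_len >= incl_len

def find_boundary(buf, start):
    """Return the first offset >= start from which CHAIN records validate."""
    for off in range(start, len(buf) - PKT_HDR.size + 1):
        pos, ok = off, 0
        while ok < CHAIN and pos + PKT_HDR.size <= len(buf):
            ts_sec, ts_usec, incl_len, orig_len = PKT_HDR.unpack_from(buf, pos)
            if not plausible(ts_sec, ts_usec, incl_len, orig_len):
                break
            pos += PKT_HDR.size + incl_len  # skip header plus captured bytes
            ok += 1
        if ok == CHAIN:
            return off
    return None  # no validated boundary in this buffer
```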
Effects of Hypothalamic Neurodegeneration on Energy Balance
Normal aging in humans and rodents is accompanied by a progressive increase in adiposity. To investigate the role of hypothalamic neuronal circuits in this process, we used a Cre-lox strategy to create mice with specific and progressive degeneration of hypothalamic neurons that express agouti-related protein (Agrp) or proopiomelanocortin (Pomc), neuropeptides that promote positive or negative energy balance, respectively, through their opposing effects on melanocortin receptor signaling. In previous studies, Pomc mutant mice became obese, but Agrp mutant mice were surprisingly normal, suggesting potential compensation by neuronal circuits or genetic redundancy. Here we find that Pomc-ablated mice develop obesity similar to that described for Pomc knockout mice, but also exhibit defects in compensatory hyperphagia similar to those that occur during normal aging. Agrp-ablated female mice exhibit reduced adiposity with normal compensatory hyperphagia, while animals ablated for both Pomc and Agrp neurons exhibit an additive interaction phenotype. These findings provide new insight into the roles of hypothalamic neurons in energy balance regulation, and offer a model for understanding defects in human energy balance associated with neurodegeneration and aging.
Quantum-centric supercomputing for materials science: A perspective on challenges and future directions
Computational models are an essential tool for the design, characterization, and discovery of novel materials. Computationally hard tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their resources for simulation, analysis, and data processing. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of the computational tasks needed for materials science. In order to do that, the quantum technology must interact with conventional high-performance computing in several ways: approximate results validation, identification of hard problems, and synergies in quantum-centric supercomputing. In this paper, we provide a perspective on how quantum-centric supercomputing can help address critical computational problems in materials science, the challenges to face in order to solve representative use cases, and newly suggested directions.
The Bankruptcy Abuse Prevention and Consumer Protection Act: Means-Testing or Mean Spirited?
Thousands of U.S. households filed for bankruptcy just before the bankruptcy law changed in 2005. That rush to file was more pronounced, we find, in states with more generous bankruptcy exemptions and lower credit scores. We take that finding as evidence that the new law effectively reduces exemptions, which in turn should reduce the "demand" for bankruptcy and the resulting losses to suppliers of consumer credit. We expect the savings to suppliers will be shared with borrowers by way of lower credit card rates, although credit card spreads have not yet fallen. If cheaper credit is the upside of the new law, the downside is reduced bankruptcy "insurance" against bad luck. The overall impact of the new law on the average household depends on how one weighs those two sides.