
    Multidimensional catalogs for systematic exploration of component-based design spaces

    Most component-based approaches to software development require complete and consistent descriptions of components, but in practical settings component information is incomplete, imprecise, and changing, and requirements may be likewise. More realistically deployable are approaches that combine exploration of candidate architectures with their evaluation against requirements, and that deal with the fuzziness of the available component information. This article presents an approach to the systematic generation, evaluation, and re-generation of component assemblies, using potentially incomplete, imprecise, unreliable, and changing descriptions of requirements and components. The key ideas are the representation of NFRs as architectural policies, the systematic reification of policies into mechanisms and then into components that implement them, multi-dimensional characterizations of these three levels, and catalogs at each level. The Azimut framework embodies these ideas: it enables architectural traceability by supporting architecture-level reasoning, and it allows architects to engage in systematic exploration of design spaces. A detailed example illustrates the approach.
    1st International Workshop on Advanced Software Engineering: Expanding the Frontiers of Software Technology - Session 1: Software Architecture. Red de Universidades con Carreras en Informática (RedUNCI).
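
    The catalog idea lends itself to a small illustration. Below is a minimal sketch in Python of how policies, mechanisms, and components might be characterized along dimensions and filtered during design-space exploration; all names and fields are hypothetical and do not reflect the Azimut framework's actual API.

```python
# Minimal sketch (hypothetical names) of a multidimensional catalog:
# architectural policies are reified into mechanisms, mechanisms into
# concrete components, each level characterized along several dimensions.
from dataclasses import dataclass, field

@dataclass
class Policy:                 # architectural policy addressing an NFR
    nfr: str                  # e.g. "latency"
    dimensions: dict = field(default_factory=dict)

@dataclass
class Mechanism:              # reification of one or more policies
    name: str
    implements: list          # the policies this mechanism realizes
    dimensions: dict = field(default_factory=dict)

@dataclass
class Component:              # concrete component implementing a mechanism
    name: str
    mechanism: Mechanism
    dimensions: dict = field(default_factory=dict)

def candidates(catalog, nfr, **required):
    """Design-space exploration as filtering the component catalog."""
    return [c for c in catalog
            if any(p.nfr == nfr for p in c.mechanism.implements)
            and all(c.dimensions.get(k) == v for k, v in required.items())]

latency = Policy("latency", {"bound": "soft"})
caching = Mechanism("caching", [latency], {"layer": "data"})
redis = Component("redis-cache", caching, {"layer": "data", "license": "BSD"})
print(candidates([redis], "latency", layer="data"))   # [redis-cache]
```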

    Data Management and Mining in Astrophysical Databases

    We analyse the issues involved in the management and mining of astrophysical data. The traditional approach to data management in the astrophysical field cannot keep up with the increasing size of the data gathered by modern detectors. Automatic tools for information extraction from large datasets, i.e. data mining techniques such as clustering and classification algorithms, will assume an essential role in astrophysical research. This calls for an approach to data management based on data warehousing, emphasizing the efficiency and simplicity of data access; efficiency is obtained using multidimensional access methods, and simplicity is achieved by properly handling metadata. Clustering and classification techniques on large datasets pose additional requirements: computational and memory scalability with respect to the data size, and interpretability and objectivity of the clustering or classification results. In this study we address some possible solutions. Comment: 10 pages, Latex
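
    As a hedged sketch of the scalability requirement, scikit-learn's MiniBatchKMeans clusters data in small batches, so memory use stays bounded as catalogs grow; the feature matrix below is synthetic stand-in data, not a real astrophysical catalog.

```python
# Memory-scalable clustering of a large (synthetic) catalog with
# mini-batch k-means: data are processed in small batches rather than
# all at once, so the working set stays bounded as the catalog grows.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
catalog = rng.normal(size=(1_000_000, 5))   # stand-in for per-object features

km = MiniBatchKMeans(n_clusters=8, batch_size=10_000, n_init=3, random_state=0)
labels = km.fit_predict(catalog)            # cluster assignment per object
print(np.bincount(labels))                  # cluster sizes
```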

    Exploration of Parameter Spaces in a Virtual Observatory

    Like every other field of intellectual endeavor, astronomy is being revolutionised by the advances in information technology. There is an ongoing exponential growth in the volume, quality, and complexity of astronomical data sets, mainly through large digital sky surveys and archives. The Virtual Observatory (VO) concept represents a scientific and technological framework needed to cope with this data flood. Systematic exploration of the observable parameter spaces, covered by large digital sky surveys spanning a range of wavelengths, will be one of the primary modes of research with a VO. This is where the truly new discoveries will be made, and new insights gained about already known astronomical objects and phenomena. We review some of the methodological challenges posed by the analysis of large and complex data sets expected in VO-based research. The challenges are driven by the size and the complexity of the data sets (billions of data vectors in parameter spaces of tens or hundreds of dimensions), by the heterogeneity of the data and of the measurement errors, including differences in basic survey parameters across the federated data sets (e.g., positional accuracy and resolution, wavelength coverage, time baseline, etc.), by various selection effects, and by the intrinsic clustering properties (functional form, topology) of the data distributions in the parameter spaces of observed attributes. Answering these challenges will require substantial collaborative efforts and partnerships between astronomers, computer scientists, and statisticians. Comment: Invited review, 10 pages, Latex file with 4 eps figures, style files included. To appear in Proc. SPIE, v. 4477 (2001)
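
    As a concrete, hedged illustration of parameter-space exploration (not a procedure prescribed by the review), one can project a high-dimensional attribute space onto a few principal components and search the projection for clustering and for outliers:

```python
# Explore a (synthetic) high-dimensional parameter space: reduce the
# measured attributes to a low-dimensional projection, then look for
# clusters and outliers, i.e. candidate classes and anomalous objects.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
X = rng.normal(size=(50_000, 40))          # 40 measured attributes per source

Z = PCA(n_components=3).fit_transform(X)   # low-dimensional projection
labels = DBSCAN(eps=0.5, min_samples=20).fit_predict(Z)
print((labels == -1).sum())                # label -1 marks outlier candidates
```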

    Massive Datasets in Astronomy

    Astronomy has a long history of acquiring, systematizing, and interpreting large quantities of data. Starting from the earliest sky atlases through the first major photographic sky surveys of the 20th century, this tradition continues today, at an ever increasing rate. Like many other fields, astronomy has become a very data-rich science, driven by the advances in telescope, detector, and computer technology. Numerous large digital sky surveys and archives already exist, with information content measured in multiple Terabytes, and even larger, multi-Petabyte data sets are on the horizon. Systematic observations of the sky, over a range of wavelengths, are becoming the primary source of astronomical data. Numerical simulations are also producing comparable volumes of information. Data mining promises both to make the scientific utilization of these data sets more effective and more complete, and to open completely new avenues of astronomical research. Technological problems range from the issues of database design and federation, to data mining and advanced visualization, leading to a new toolkit for astronomical research. This is similar to challenges encountered in other data-intensive fields today. These advances are now being organized through the concept of Virtual Observatories: federations of data archives and services representing a new information infrastructure for astronomy in the 21st century. In this article, we provide an overview of some of the major datasets in astronomy, discuss different techniques used for archiving data, and conclude with a discussion of the future of massive datasets in astronomy. Comment: 46 Pages, 21 Figures, Invited Review for the Handbook of Massive Datasets, editors J. Abello, P. Pardalos, and M. Resende. Due to space limitations this version has low resolution figures. For full resolution review see http://www.astro.caltech.edu/~rb/publications/hmds.ps.g
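
    A recurring building block behind such federated archives is a multidimensional access method. The sketch below cross-matches two synthetic catalogs by position with a k-d tree; real cross-matching works in spherical coordinates with proper astrometry, so treat this purely as an illustration of the indexing idea.

```python
# Positional cross-match of two synthetic catalogs with a k-d tree:
# each nearest-neighbour query is logarithmic rather than a linear scan.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
survey_a = rng.uniform(0, 10, size=(100_000, 2))   # flat (x, y) positions, deg
survey_b = rng.uniform(0, 10, size=(80_000, 2))

tree = cKDTree(survey_a)
dist, idx = tree.query(survey_b, k=1)              # nearest source in survey A
matches = dist < 1.0 / 3600.0                      # within ~1 arcsecond
print(matches.sum(), "matched sources")
```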

    Report from the Tri-Agency Cosmological Simulation Task Force

    The Tri-Agency Cosmological Simulations (TACS) Task Force was formed when Program Managers from the Department of Energy (DOE), the National Aeronautics and Space Administration (NASA), and the National Science Foundation (NSF) expressed an interest in receiving input on the cosmological simulations landscape related to the upcoming DOE/NSF Vera Rubin Observatory (Rubin), NASA/ESA's Euclid, and NASA's Wide Field Infrared Survey Telescope (WFIRST). The Co-Chairs of TACS, Katrin Heitmann and Alina Kiessling, invited community scientists from the USA and Europe who are each subject matter experts and are also members of one or more of the surveys to contribute. The following report represents the input from TACS that was delivered to the Agencies in December 2018. Comment: 36 pages, 3 figures. Delivered to NASA, NSF, and DOE in Dec 2018

    Some statistical and computational challenges, and opportunities in astronomy

    The data complexity and volume of astronomical findings have increased in recent decades due to major technological improvements in instrumentation and data collection methods. The contemporary astronomer is flooded with terabytes of raw data that produce enormous multidimensional catalogs of objects (stars, galaxies, quasars, etc.) numbering in the billions, with hundreds of measured numbers for each object. The astronomical community thus faces a key task: to enable efficient and objective scientific exploitation of enormous multifaceted data sets and the complex links between data and astrophysical theory. In recognition of this task, the National Virtual Observatory (NVO) initiative recently emerged to federate numerous large digital sky archives, and to develop tools to explore and understand these vast volumes of data. The effective use of such integrated massive data sets presents a variety of new, challenging statistical and algorithmic problems that require methodological advances. An interdisciplinary team of statisticians, astronomers, and computer scientists from The Pennsylvania State University, California Institute of Technology, and Carnegie Mellon University is developing statistical methodology for the NVO. A brief glimpse into the Virtual Observatory and the work of the Penn State-led team is provided here.

    Immersive and Collaborative Data Visualization Using Virtual Reality Platforms

    Effective data visualization is a key part of the discovery process in the era of big data. It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding. Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping to identify and remove bad data from the analysis. However, the high complexity or high dimensionality of modern data sets represents a critical obstacle. How do we visualize interesting structures and patterns that may exist in hyper-dimensional data spaces? A better understanding of how we can perceive and interact with multi-dimensional information poses some deep questions in the field of cognition technology and human-computer interaction. To this effect, we are exploring the use of immersive virtual reality platforms for scientific data visualization, both as software and as inexpensive commodity hardware. These potentially powerful and innovative tools for multi-dimensional data visualization can also provide an easy and natural path to collaborative data visualization and exploration, where scientists can interact with their data and their colleagues in the same visual space. Immersion provides benefits beyond the traditional desktop visualization tools: it leads to a demonstrably better perception of a datascape geometry, more intuitive data understanding, and a better retention of the perceived relationships in the data. Comment: 6 pages, refereed proceedings of 2014 IEEE International Conference on Big Data, page 609, ISBN 978-1-4799-5665-
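
    As a hedged illustration of the preprocessing such tools need (no particular VR platform's data format is implied), one can project hyper-dimensional data onto three spatial axes and map further dimensions onto visual channels:

```python
# Project a (synthetic) 12-dimensional data set to 3D for an immersive
# "datascape", mapping two extra dimensions onto colour and point size.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(10_000, 12))

xyz = PCA(n_components=3).fit_transform(X)   # spatial axes of the datascape
colour, size = X[:, 3], np.abs(X[:, 4])      # extra dimensions as visual channels

np.savetxt("datascape.csv",
           np.column_stack([xyz, colour, size]),
           delimiter=",", header="x,y,z,colour,size", comments="")
```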

    Navigating Diverse Datasets in the Face of Uncertainty

    When exploring big volumes of data, one of the challenging aspects is their diversity of origin. Multiple files that have not yet been ingested into a database system may contain information of interest to a researcher, who must curate, understand, and sieve their content before being able to extract knowledge. Performance is one of the greatest difficulties in exploring these datasets. On the one hand, examining non-indexed, unprocessed files can be inefficient. On the other hand, any processing before the data is understood introduces latency and potentially unnecessary work if the chosen schema matches the data poorly. We have surveyed the state of the art and, fortunately, there exist multiple proposed solutions for handling data in situ efficiently. Another major difficulty is matching files from multiple origins, since their schema and layout may not be compatible or properly documented. Most surveyed solutions overlook this problem, especially for numeric, uncertain data, as is typical in fields like astronomy. The main objective of our research is to assist data scientists during the exploration of unprocessed, numerical, raw data distributed across multiple files, based solely on its intrinsic distribution. In this thesis, we first introduce the concept of Equally-Distributed Dependencies (EDDs), which provides the foundation for matching this kind of dataset. We propose PresQ, a novel algorithm that finds quasi-cliques in hypergraphs based on their expected statistical properties. The probabilistic approach of PresQ can be successfully exploited to mine EDDs between diverse datasets when the underlying populations can be assumed to be the same. Finally, we propose a two-sample statistical test based on Self-Organizing Maps (SOM). This method can outperform, in terms of power, other classifier-based two-sample tests, in some cases being comparable to kernel-based methods, with the advantage of being interpretable. Both PresQ and the SOM-based statistical test can provide insights that drive serendipitous discoveries.
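
    The general shape of a SOM-based two-sample test can be sketched as follows (the thesis' actual construction differs in detail): train a self-organizing map on the pooled samples, then compare the two samples' occupancy histograms over the map cells. The sketch uses the MiniSom library; all sizes and parameters are illustrative assumptions.

```python
# SOM-based two-sample test, general idea only: if the two samples come
# from the same distribution, they should occupy the trained map's cells
# in the same proportions; a chi-squared test compares the histograms.
import numpy as np
from minisom import MiniSom
from scipy.stats import chi2_contingency

rng = np.random.default_rng(4)
a = rng.normal(0.0, 1.0, size=(2_000, 4))    # sample from population A
b = rng.normal(0.2, 1.0, size=(2_000, 4))    # sample from a shifted population B

som = MiniSom(6, 6, 4, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(np.vstack([a, b]), 5_000)   # fit the map on pooled data

def hits(sample):                            # occupancy histogram over cells
    h = np.zeros((6, 6))
    for x in sample:
        i, j = som.winner(x)
        h[i, j] += 1
    return h.ravel()

table = np.vstack([hits(a), hits(b)]) + 1    # +1 smooths empty cells
_, p_value, _, _ = chi2_contingency(table)
print(f"p = {p_value:.4f}")                  # small p: distributions differ
```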

    A systematic approach for integrated product, materials, and design-process design

    Designers are challenged to manage customer, technology, and socio-economic uncertainty, which causes dynamic, unquenchable demands on limited resources. In this context, increased concept flexibility, referring to a designer's ability to generate concepts, is crucial. Concept flexibility can be significantly increased through the integrated design of product and material concepts. Hence, the challenge is to leverage knowledge of material structure-property relations that significantly affect system concepts for function-based, systematic design of product and material concepts in an integrated fashion. However, having selected an integrated product and material system concept, managing complexity in embodiment design-processes is important. Facing a complex network of decisions and evolving analysis models, a designer needs the flexibility to systematically generate and evaluate embodiment design-process alternatives. In order to address these challenges and respond to the primary research question of how to increase a designer's concept and design-process flexibility to enhance product creation in the conceptual and early embodiment design phases, the primary hypothesis in this dissertation is embodied as a systematic approach for integrated product, materials, and design-process design. The systematic approach consists of two components: i) a function-based, systematic approach to the integrated design of product and material concepts from a systems perspective, and ii) a systematic strategy for design-process generation and selection based on a decision-centric perspective and a value-of-information-based Process Performance Indicator. The systematic approach is validated using the validation-square approach, which consists of theoretical and empirical validation. Empirical validation of the framework is carried out using various examples, including: i) design of a reactive material containment system, and ii) design of an optoelectronic communication system.
    Ph.D. Committee Chair: Allen, Janet K.; Committee Member: Aidun, Cyrus K.; Committee Member: Klein, Benjamin; Committee Member: McDowell, David L.; Committee Member: Mistree, Farrokh; Committee Member: Yoder, Douglas P.
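
    The value-of-information idea behind such a Process Performance Indicator can be sketched generically (the dissertation's actual indicator differs): the expected value of perfect information bounds how much further analysis of a design decision can be worth.

```python
# Expected value of perfect information (EVPI) for a design decision:
# EVPI = E[max_i payoff(i, s)] - max_i E[payoff(i, s)], the gap between
# choosing after uncertainty resolves and choosing now. Payoffs are
# hypothetical Monte Carlo draws, not values from the dissertation.
import numpy as np

rng = np.random.default_rng(5)
payoff = rng.normal(size=(3, 10_000))        # 3 design alternatives x scenarios

best_now = payoff.mean(axis=1).max()         # best design chosen under uncertainty
best_with_info = payoff.max(axis=0).mean()   # best design per resolved scenario

print(f"EVPI = {best_with_info - best_now:.3f}")
```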