BigExcel: A Web-Based Framework for Exploring Big Data in Social Sciences
This paper argues that there are three fundamental challenges that need to be
overcome in order to foster the adoption of big data technologies in
non-computer science related disciplines: addressing issues of accessibility of
such technologies for non-computer scientists, supporting the ad hoc
exploration of large data sets with minimal effort and the availability of
lightweight web-based frameworks for quick and easy analytics. In this paper,
we address the above three challenges through the development of 'BigExcel', a
three-tier web-based framework for exploring big data to facilitate the
management of user interactions with large data sets, the construction of
queries to explore the data set and the management of the infrastructure. The
feasibility of BigExcel is demonstrated through two Yahoo Sandbox datasets. The
first dataset is the Yahoo Buzz Score data set we use for quantitatively
predicting trending technologies and the second is the Yahoo n-gram corpus we
use for qualitatively inferring the coverage of important events. A
demonstration of the BigExcel framework and source code is available at
http://bigdata.cs.st-andrews.ac.uk/projects/bigexcel-exploring-big-data-for-social-sciences/.
Comment: 8 pages
ROMOP: a light-weight R package for interfacing with OMOP-formatted electronic health record data.
Objectives: Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability of EHR data for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting more widespread adoption of EHR data for research and quality improvement. Materials and methods: We have created ROMOP: an R package for direct interfacing with EHR data in the OMOP CDM format. Results: ROMOP streamlines typical EHR-related data processes. Its functions include exploration of data types, extraction and summarization of patient clinical and demographic data, and patient searches using any CDM vocabulary concept. Conclusion: ROMOP is freely available under the Massachusetts Institute of Technology (MIT) license and can be obtained from GitHub (http://github.com/BenGlicksberg/ROMOP). We detail instructions for setup and use in the Supplementary Materials. Additionally, we provide a public sandbox server containing synthesized clinical data for users to explore OMOP data and ROMOP (http://romop.ucsf.edu).
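As a rough illustration of what a "patient search using a CDM vocabulary concept" involves, the following Python sketch joins three OMOP CDM tables (person, condition_occurrence, concept) in an in-memory SQLite database. It is not ROMOP's API: the function names build_demo_db and find_patients, the toy rows, and the concept IDs and names are all invented for this example, and only a handful of the CDM columns are included.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a subset of OMOP CDM columns.

    All rows, concept IDs, and concept names are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1950), (2, 1975), (3, 1990)],
    )
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [(1001, "toy condition A"), (1002, "toy condition B")],
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
        [(1, 1, 1001), (2, 2, 1002), (3, 3, 1001), (4, 1, 1002)],
    )
    return conn


def find_patients(conn, concept_name):
    """Return sorted person_ids with at least one condition matching the concept name."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.person_id
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        JOIN concept AS c ON c.concept_id = co.condition_concept_id
        WHERE c.concept_name = ?
        ORDER BY p.person_id
        """,
        (concept_name,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_patients(demo, "toy condition A"))
```

The point of the sketch is the join path: clinical event tables store concept IDs, so a search by name has to go through the concept table, which is the kind of domain knowledge the abstract says a package like ROMOP is meant to hide.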
Surfing the Waves: Live Audio Mosaicing of an Electric Bass Performance as a Corpus Browsing Interface
In this paper, the authors describe how they use an electric bass as a subtle, expressive and intuitive interface to browse the rich sample bank available to most laptop owners. This is achieved by audio mosaicing of the live bass performance audio, through corpus-based concatenative synthesis (CBCS) techniques, allowing a mapping of the multi-dimensional expressivity of the performance onto foreign audio material, thus recycling the virtuosity acquired on the electric instrument with a trivial learning curve. This design hypothesis is contextualised and assessed within the Sandbox#n series of bass+laptop meta-instruments, and the authors describe the technical means of implementation through the use of the open-source CataRT CBCS system adapted for live mosaicing. They also discuss their encouraging early results and provide a list of further explorations to be made with that rich new interface.
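The core selection step of corpus-based concatenative synthesis can be sketched as a nearest-neighbour lookup: each incoming frame of the live signal is described by a few audio descriptors, and the corpus unit with the closest descriptors is chosen for playback. The Python sketch below assumes two hypothetical descriptors already normalized to the range 0 to 1; the function names, unit names, and values are invented, and this is not how CataRT is implemented.

```python
import math


def nearest_unit(target, corpus):
    """Return the corpus unit whose descriptor vector is closest to the target.

    Distance is plain Euclidean distance; descriptors are assumed to be
    normalized so that no single dimension dominates.
    """
    return min(corpus, key=lambda unit: math.dist(target, unit["descriptors"]))


def mosaic(frames, corpus):
    """Map each analysed input frame to the name of its best-matching corpus unit."""
    return [nearest_unit(frame, corpus)["name"] for frame in frames]


if __name__ == "__main__":
    # Hypothetical corpus: (normalized pitch, normalized loudness) per unit.
    corpus = [
        {"name": "kick", "descriptors": (0.1, 0.9)},
        {"name": "bell", "descriptors": (0.9, 0.3)},
    ]
    # Hypothetical descriptor frames extracted from the live bass signal.
    frames = [(0.2, 0.8), (0.8, 0.2)]
    print(mosaic(frames, corpus))
```

A real-time system would additionally need descriptor extraction from audio, an efficient search structure for large corpora, and crossfading between selected units; none of that is shown here.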
SWISH: SWI-Prolog for Sharing
Recently, a new type of interface for programmers has emerged, based on web
technology: for example, JSFiddle, IPython Notebook and R-studio. Web
technology enables cloud-based solutions, embedding in tutorial web pages,
attractive rendering of results, web-scale cooperative development, etc. This
article describes SWISH, a web front-end for Prolog. A public website exposes
SWI-Prolog using SWISH, which is used to run small Prolog programs for
demonstration, experimentation and education. We connected SWISH to the
ClioPatria semantic web toolkit, where it allows for collaborative development
of programs and queries related to a dataset as well as performing maintenance
tasks on the running server and we embedded SWISH in the Learn Prolog Now!
online Prolog book.
Comment: International Workshop on User-Oriented Logic Programming (IULP
2015), co-located with the 31st International Conference on Logic Programming
(ICLP 2015), Proceedings of the International Workshop on User-Oriented Logic
Programming (IULP 2015), Editors: Stefan Ellmauthaler and Claudia Schulz,
pages 99-113, August 2015
Three-dimensional multifractal analysis of trabecular bone under clinical computed tomography
Purpose: An adequate understanding of bone structural properties is critical for predicting fragility conditions caused by diseases such as osteoporosis, and for gauging the success of fracture prevention treatments. In this work we aim to develop multiresolution image analysis techniques to extrapolate the predictive power of high-resolution images to images taken in clinical conditions. Methods: We performed multifractal analysis (MFA) on a set of 17 ex vivo human vertebrae clinical CT scans. The failure loads of the vertebrae (FFailure) were experimentally measured. We combined bone mineral density (BMD) with different multifractal dimensions, and BMD with multiresolution statistics (e.g., skewness, kurtosis) of MFA curves, to obtain linear models to predict FFailure. Furthermore, we obtained short- and long-term precisions from simulated in vivo scans, using a clinical CT scanner. Ground-truth data (high-resolution images) were obtained with a High-Resolution Peripheral Quantitative Computed Tomography (HRpQCT) scanner. Results: At the same level of detail, BMD combined with traditional multifractal descriptors (Lipschitz-Hölder exponents) and BMD combined with monofractal features showed similar power in predicting FFailure (87%, adj. R²). However, at different levels of detail, the prediction power of BMD with multifractal features rises to 92% (adj. R²) of FFailure. Our main finding is that a simpler but slightly less accurate model, combining BMD and the skewness of the resulting multifractal curves, predicts 90% (adj. R²) of FFailure. Conclusions: Compared to monofractal and standard bone measures, multifractal analysis captured key insights into the conditions leading to FFailure. Instead of raw multifractal descriptors, the statistics of multifractal curves can be used in several other contexts, facilitating further research.
Affiliation: Baravalle, Rodrigo Guillermo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Thomsen, Felix Sebastian Leo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional del Sur; Argentina
Affiliation: Delrieux, Claudio Augusto. Universidad Nacional del Sur; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Lu, Yongtao. Dalian University of Technology; China
Affiliation: Gómez, Juan Carlos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Stošić, Borko. Universidade Federal Rural Pernambuco; Brazil
Affiliation: Stošić, Tatijana. Universidade Federal Rural Pernambuco; Brazil
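The multifractal analysis abstract above reports model quality as adjusted R², which penalizes the ordinary R² for the number of predictors relative to the sample size. A minimal Python sketch of the standard formula follows; the function name is invented, and the example call (17 samples, 2 predictors, raw R² of 0.5) uses illustrative inputs rather than values reported in the paper.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n_samples is the number of observations (n) and n_predictors the number
    of explanatory variables (p), excluding the intercept.
    """
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


if __name__ == "__main__":
    # Illustrative inputs only: 17 samples, 2 predictors, raw R^2 of 0.5.
    print(adjusted_r2(0.5, 17, 2))
```

With only 17 samples, each additional predictor reduces the denominator noticeably, which is one reason a two-predictor model with slightly lower adjusted R² can be an attractive trade-off.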