1,288 research outputs found

    DataHub: Collaborative Data Science & Dataset Version Management at Scale

    Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system. We outline the challenges in providing dataset version control at scale. Comment: 7 pages

    Hypercarbons in polyhedral structures

    Though carbon is mostly tetravalent and tetracoordinated, there are several examples where the coordination number exceeds four. Structural varieties that exhibit hypercarbons in polyhedral structures such as polyhedral carboranes, sandwich complexes, encapsulated polyhedral structures and novel planar aromatic systems with atoms embedded in the middle are reviewed here. The structural variety anticipated with hypercoordinate carbon among carboranes is large, as there are many modes of condensation that could lead to a large number of new patterns. The relative stabilities of positional isomers of polyhedral carboranes, sandwich structures, and endohedral carboranes are briefly described. The mno rule accounts for the variety of structural patterns. Wheel-shaped and planar hypercoordinated molecules are recent theoretical developments in this area.

    Operationalizing Machine Learning: An Interview Study

    Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring of performance drops in production. When considered together, these responsibilities seem staggering -- how does anyone do MLOps, what are the unaddressed challenges, and what are the implications for tool builders? We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern success for a production ML deployment: Velocity, Validation, and Versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design. Comment: 20 pages, 4 figures

    Low-mass Solitons from Fractional Charges in Quantum Chromodynamics

    Slansky, Goldman, and Shaw have proposed a model to account for the observation of fractionally charged states. We show that in this model, there are expected to be several low-mass solitons (four being in the mass range ∼20-60 MeV) associated with the third homotopy group π₃(SU(3)/SO(3)) = Z₄, besides a low-mass (∼30 MeV) Z₂ monopole. Confirmation of these levels and hence of the model has important implications for Cabrera's results on the magnetic monopole. An efficient algorithm for the calculation of π₃(G/H) for a general Lie group G and a subgroup H is developed. It is pointed out that solitons associated with the third homotopy group are predicted by some grand-unified-theory scenarios.

    Soliton States in the Quantum-Chromodynamic Effective Lagrangian

    The work of Skyrme has shown that the SU(2)×SU(2) chiral model has nontrivial topological sectors which admit solitons for generic chiral Lagrangians. In this paper, we study such models in the presence of baryon fields. The baryon number and strangeness of the solitons, and the bound states of the nucleon to the soliton are investigated. It is found that long-lived levels with large baryon number B and strangeness (≳6 in magnitude) and masses somewhere in the range 1.8 to 5.6 GeV must exist. Some of these levels have half-integral electric charge and an exotic relation between B and spin s (e.g., even B and half-integer s). It is speculated that these levels may be related to the anomalous nuclei whose existence has been confirmed in cosmic-ray and LBL Bevalac experiments.

    Revisiting Prompt Engineering via Declarative Crowdsourcing

    Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering, the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows in particular, optimizing for quality while keeping cost bounded is a tedious, manual process. We put forth a vision for declarative prompt engineering. We view LLMs like crowd workers and leverage ideas from the declarative crowdsourcing literature (including leveraging multiple prompting strategies, ensuring internal consistency, and exploring hybrid-LLM-non-LLM approaches) to make prompt engineering a more principled process. Preliminary case studies on sorting, entity resolution, and imputation demonstrate the promise of our approach.