DataHub: Collaborative Data Science & Dataset Version Management at Scale

Author: Bhardwaj Anant
Bhattacherjee Souvik
Chavan Amit
Deshpande Amol
Elmore Aaron J.
Madden Samuel
Parameswaran Aditya G.
Publication date: 02/09/2014

Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system. We outline the challenges in providing dataset version control at scale. Comment: 7 pages

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Hypercarbons in polyhedral structures

Author: Jayasree Elambalassery G.
Jemmis Eluvathingal D.
Parameswaran Pattiyil
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 01/01/2006

Though carbon is mostly tetravalent and tetracoordinated, there are several examples where the coordination number exceeds four. Structural varieties that exhibit hypercarbons in polyhedral structures such as polyhedral carboranes, sandwich complexes, encapsulated polyhedral structures and novel planar aromatic systems with atoms embedded in the middle are reviewed here. The structural variety anticipated with hypercoordinate carbon among carboranes is large, as there are many modes of condensation that could lead to a large number of new patterns. The relative stabilities of positional isomers of polyhedral carboranes, sandwich structures, and endohedral carboranes are briefly described. The mno rule accounts for the variety of structural patterns. Wheel-shaped and planar hypercoordinated molecules are recent theoretical developments in this area.

Operationalizing Machine Learning: An Interview Study

Author: Garcia Rolando
Hellerstein Joseph M.
Parameswaran Aditya G.
Shankar Shreya
Publication date: 16/09/2022

Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring of performance drops in production. When considered together, these responsibilities seem staggering: how does anyone do MLOps, what are the unaddressed challenges, and what are the implications for tool builders? We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern success for a production ML deployment: Velocity, Validation, and Versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design. Comment: 20 pages, 4 figures

arXiv.org e-Print Archive

Low-mass Solitons from Fractional Charges in Quantum Chromodynamics

Author: Balachandran A. P.
Nair V. Parameswaran
Panchapakesan N.
Rajeev S. G.
Publication venue: CUNY Academic Works
Publication date: 01/12/1983

Slansky, Goldman, and Shaw have proposed a model to account for the observation of fractionally charged states. We show that in this model, there are expected to be several low-mass solitons (four being in the mass range ∼20-60 MeV) associated with the third homotopy group π₃(SU(3)/SO(3)) = Z₄, besides a low-mass (∼30 MeV) Z₂ monopole. Confirmation of these levels and hence of the model has important implications for Cabrera's results on the magnetic monopole. An efficient algorithm for the calculation of π₃(G/H) for a general Lie group G and a subgroup H is developed. It is pointed out that solitons associated with the third homotopy group are predicted by some grand-unified-theory scenarios.

City University of New York

Soliton States in the Quantum-Chromodynamic Effective Lagrangian

Author: Balachandran A. P.
Nair V. Parameswaran
Rajeev S. G.
Stern A.
Publication venue: CUNY Academic Works
Publication date: 01/03/1983

The work of Skyrme has shown that the SU(2)×SU(2) chiral model has nontrivial topological sectors which admit solitons for generic chiral Lagrangians. In this paper, we study such models in the presence of baryon fields. The baryon number and strangeness of the solitons, and the bound states of the nucleon to the soliton, are investigated. It is found that long-lived levels with large baryon number B and strangeness (≳6 in magnitude) and masses somewhere in the range 1.8 to 5.6 GeV must exist. Some of these levels have half-integral electric charge and an exotic relation between B and spin s (e.g., even B and half-integer s). It is speculated that these levels may be related to the anomalous nuclei whose existence has been confirmed in cosmic-ray and LBL Bevalac experiments.

City University of New York

Revisiting Prompt Engineering via Declarative Crowdsourcing

Author: Asawa Parth
Jain Naman
Parameswaran Aditya G.
Shankar Shreya
Wang Yujie
Publication date: 07/08/2023

Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering, the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows in particular, optimizing for quality while keeping cost bounded is a tedious, manual process. We put forth a vision for declarative prompt engineering. We view LLMs like crowd workers and leverage ideas from the declarative crowdsourcing literature (including leveraging multiple prompting strategies, ensuring internal consistency, and exploring hybrid LLM/non-LLM approaches) to make prompt engineering a more principled process. Preliminary case studies on sorting, entity resolution, and imputation demonstrate the promise of our approach.

arXiv.org e-Print Archive