Metamorphic Domain-Specific Languages: A Journey Into the Shapes of a Language
External or internal domain-specific languages (DSLs) or (fluent) APIs?
Whoever you are -- a developer or a user of a DSL -- you usually have to choose
your side; you should not! What about metamorphic DSLs that change their shape
according to your needs? We report on our four-year journey of providing the
"right" support (in the domain of feature modeling), leading us to develop an
external DSL, different shapes of an internal API, and maintain all these
languages. A key insight is that there is no one-size-fits-all solution and no
clear superiority of one solution over another. On the contrary, we found
that it does make sense to continue the maintenance of an external and internal
DSL. The vision that we foresee for the future of software languages is their
ability to be self-adaptable to the most appropriate shape (including the
corresponding integrated development environment) according to a particular
usage or task. We call such a language a metamorphic DSL: one able to change
from one shape to another.
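To make the idea of "shapes" concrete, here is a minimal sketch of one tiny feature-model language offered in two shapes: an internal fluent API and an external textual syntax. It is not the authors' language. The `Feature` class, the `requires`/`allows` methods, the `parse_feature` helper, and the `Root: A B [C];` syntax are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A feature with mandatory and optional child features (hypothetical model)."""
    name: str
    mandatory: list = field(default_factory=list)
    optional: list = field(default_factory=list)

    # Internal (fluent) shape: each method returns self so calls can be chained.
    def requires(self, *names):
        self.mandatory.extend(names)
        return self

    def allows(self, *names):
        self.optional.extend(names)
        return self


def parse_feature(text):
    """External shape: parse 'Root: A B [C];' where [X] marks an optional child.

    The syntax is an assumption made for this sketch, not a real DSL grammar.
    """
    head, _, body = text.strip().rstrip(";").partition(":")
    feature = Feature(head.strip())
    for token in body.split():
        if token.startswith("[") and token.endswith("]"):
            feature.optional.append(token[1:-1])
        else:
            feature.mandatory.append(token)
    return feature


# The same model, written in both shapes.
fluent = Feature("Car").requires("Engine", "Wheels").allows("Radio")
parsed = parse_feature("Car: Engine Wheels [Radio];")
print(fluent == parsed)
```

Both shapes build the same in-memory model, so tooling written against `Feature` works regardless of which shape a user prefers.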
A Data Science Course for Undergraduates: Thinking with Data
Data science is an emerging interdisciplinary field that combines elements of
mathematics, statistics, computer science, and knowledge in a particular
application domain for the purpose of extracting meaningful information from
the increasingly sophisticated array of data available in many settings. These
data tend to be non-traditional, in the sense that they are often live, large,
complex, and/or messy. A first course in statistics at the undergraduate level
typically introduces students to a variety of techniques to analyze small,
neat, and clean data sets. However, whether they pursue more formal training in
statistics or not, many of these students will end up working with data that is
considerably more complex, and will need facility with statistical computing
techniques. More importantly, these students require a framework for thinking
structurally about data. We describe an undergraduate course in a liberal arts
environment that provides students with the tools necessary to apply data
science. The course emphasizes modern, practical, and useful skills that cover
the full data analysis spectrum, from asking an interesting question to
acquiring, managing, manipulating, processing, querying, analyzing, and
visualizing data, as well as communicating findings in written, graphical, and
oral forms.
Comment: 21 pages total including supplementary material
A Grammar for Reproducible and Painless Extract-Transform-Load Operations on Medium Data
Many interesting data sets available on the Internet are of a medium
size---too big to fit into a personal computer's memory, but not so large that
they won't fit comfortably on its hard disk. In the coming years, data sets of
this magnitude will inform vital research in a wide array of application
domains. However, due to a variety of constraints they are cumbersome to
ingest, wrangle, analyze, and share in a reproducible fashion. These
obstructions hamper thorough peer-review and thus disrupt the forward progress
of science. We propose a predictable and pipeable framework for R (the
state-of-the-art statistical computing environment) that leverages SQL (the
venerable database architecture and query language) to make reproducible
research on medium data a painless reality.
Comment: 30 pages, plus supplementary material
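The general pattern of streaming data through extract, transform, and load steps into a SQL database can be sketched as follows. This is not the R framework the abstract proposes; it is a Python illustration using the standard-library `sqlite3` and `csv` modules. The `flights` table, its columns, and the sample rows are invented for the example, and an in-memory database stands in for the on-disk file that medium-sized data would need.

```python
import csv
import io
import sqlite3


def extract(source):
    """Yield rows lazily so the whole file never has to sit in memory."""
    yield from csv.DictReader(source)


def transform(rows):
    """Convert text fields to typed tuples and drop rows with a missing delay."""
    for row in rows:
        if row["delay"] != "":
            yield (row["carrier"], int(row["delay"]))


def load(conn, records, batch_size=2):
    """Insert records in small batches; only one batch is held in memory at a time."""
    conn.execute("CREATE TABLE IF NOT EXISTS flights (carrier TEXT, delay INTEGER)")
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO flights VALUES (?, ?)", batch)
            batch.clear()
    if batch:
        conn.executemany("INSERT INTO flights VALUES (?, ?)", batch)
    conn.commit()


# Hypothetical sample data; a real source would be a file too large to load at once.
raw = io.StringIO("carrier,delay\nAA,10\nUA,\nAA,30\nUA,5\n")
conn = sqlite3.connect(":memory:")  # use a file path to keep the data on disk
load(conn, transform(extract(raw)))

# Once loaded, the analysis is pushed down to SQL rather than done in memory.
result = conn.execute(
    "SELECT carrier, AVG(delay) FROM flights GROUP BY carrier ORDER BY carrier"
).fetchall()
print(result)
```

Because each stage is a generator feeding the next, the stages compose like a pipe, and rerunning the script from the raw source reproduces the database.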
Salience and Market-aware Skill Extraction for Job Targeting
At LinkedIn, we want to create economic opportunity for everyone in the
global workforce. To make this happen, LinkedIn offers a reactive Job Search
system, and a proactive Jobs You May Be Interested In (JYMBII) system to match
the best candidates with their dream jobs. One of the most challenging tasks
for developing these systems is to properly extract important skill entities
from job postings and then target members with matched attributes. In this
work, we show that the commonly used text-based salience and
market-agnostic skill extraction approach is sub-optimal because it only
considers skill mentions and ignores the salience level of a skill and its market
dynamics, i.e., the influence of market supply and demand on the importance of
skills. To address the above drawbacks, we present \model, our deployed
salience and market-aware skill extraction system. The proposed \model
shows promising results in improving the online performance of job
recommendation (JYMBII) ( job apply) and skill suggestions for job
posters ( suggestion rejection rate). Lastly, we present case studies to
show interesting insights that contrast the traditional skill recognition method
with the proposed \model at the occupation, industry, country, and individual
skill levels. Based on the above promising results, we deployed \model
online to extract job targeting skills for all M job postings served at
LinkedIn.
Comment: 9 pages, to appear in KDD202