10 research outputs found
Career Transitions and Trajectories: A Case Study in Computing
From artificial intelligence to network security to hardware design, it is
well-known that computing research drives many important technological and
societal advancements. However, less is known about the long-term career paths
of the people behind these innovations. What do their careers reveal about the
evolution of computing research? Which institutions were and are the most
important in this field, and for what reasons? Can insights into computing
career trajectories help predict employer retention?
In this paper we analyze several decades of post-PhD computing careers using
a large new dataset rich with professional information, and propose a versatile
career network model, R^3, that captures temporal career dynamics. With R^3 we
track important organizations in computing research history, analyze career
movement between industry, academia, and government, and build a powerful
predictive model for individual career transitions. Our study, the first of its
kind, is a starting point for understanding computing research careers, and may
inform employer recruitment and retention mechanisms at a time when the demand
for specialized computational expertise far exceeds supply.
Comment: To appear in KDD 201
CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction
Knowledge graph (KG) link prediction is a fundamental task in artificial
intelligence, with applications in natural language processing, information
retrieval, and biomedicine. Recently, promising results have been achieved by
leveraging cross-modal information in KGs, using ensembles that combine
knowledge graph embeddings (KGEs) and contextual language models (LMs).
However, existing ensembles are either (1) not consistently effective in terms
of ranking accuracy gains or (2) impractically inefficient on larger datasets
due to the combinatorial explosion problem of pairwise ranking with deep
language models. In this paper, we propose a novel tiered ranking architecture
CascadER to maintain the ranking accuracy of full ensembling while improving
efficiency considerably. CascadER uses LMs to rerank the outputs of more
efficient base KGEs, relying on an adaptive subset selection scheme aimed at
invoking the LMs minimally while maximizing accuracy gain over the KGE.
Extensive experiments demonstrate that CascadER improves MRR by up to 9 points
over KGE baselines, setting new state-of-the-art performance on four benchmarks
while improving efficiency by one or more orders of magnitude over competitive
cross-modal baselines. Our empirical analyses reveal that diversity of models
across modalities and preservation of individual models' confidence signals
help explain the effectiveness of CascadER, and suggest promising directions
for cross-modal cascaded architectures. Code and pretrained models are
available at https://github.com/tsafavi/cascader.
Comment: AKBC 202
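The core mechanism the abstract describes, an efficient base ranker whose top candidates are rescored by an expensive language model, can be sketched minimally as follows. This is a simplified illustration, not the paper's actual adaptive subset selection: the fixed cutoff `k`, the score dictionary, and the `lm_rescore` callable are hypothetical stand-ins.

```python
def cascade_rank(kge_scores, lm_rescore, k):
    """Tiered reranking sketch: a cheap KGE scores every candidate,
    then an expensive LM rescores only the top-k subset."""
    # Rank all candidates by KGE score, best first.
    ranked = sorted(kge_scores, key=kge_scores.get, reverse=True)
    top, rest = ranked[:k], ranked[k:]
    # Invoke the LM only on the small top-k subset; the tail keeps
    # its cheap KGE ordering.
    reranked = sorted(top, key=lm_rescore, reverse=True)
    return reranked + rest
```

Because the LM is called on at most `k` candidates rather than all pairs, cost grows with `k` instead of with the candidate set, which is the efficiency argument the abstract makes (CascadER itself chooses the subset adaptively rather than with a fixed `k`).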
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
The traditional Dialogue State Tracking (DST) problem aims to track user
preferences and intents in user-agent conversations. While sufficient for
task-oriented dialogue systems supporting narrow domain applications, the
advent of Large Language Model (LLM)-based chat systems has introduced many
real-world intricacies in open-domain dialogues. These intricacies manifest in
the form of increased complexity in contextual interactions, extended dialogue
sessions encompassing a diverse array of topics, and more frequent contextual
shifts. To handle these intricacies arising from evolving LLM-based chat
systems, we propose joint dialogue segmentation and state tracking per segment
in open-domain dialogue systems. Assuming a zero-shot setting appropriate to a
true open-domain dialogue system, we propose S3-DST, a structured prompting
technique that harnesses Pre-Analytical Recollection, a novel grounding
mechanism we designed for improving long context tracking. To demonstrate the
efficacy of our proposed approach in joint segmentation and state tracking, we
evaluate S3-DST on a proprietary anonymized open-domain dialogue dataset, as
well as publicly available DST and segmentation datasets. Across all datasets
and settings, S3-DST consistently outperforms the state-of-the-art,
demonstrating its potency and robustness for the next generation of LLM-based
chat systems.
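The abstract does not reproduce the S3-DST prompt itself. As a purely hypothetical illustration of its "recollect first, then track" structured-prompting idea, a prompt builder might look like the sketch below; the wording and requested output fields are assumptions, not the paper's template.

```python
def build_s3_prompt(dialogue_turns):
    """Hypothetical structured prompt in the spirit of S3-DST: ask the
    model to first write a short recollection of each turn's context
    (Pre-Analytical Recollection) before emitting segment boundaries
    and a per-segment state."""
    numbered = "\n".join(f"[{i}] {t}" for i, t in enumerate(dialogue_turns))
    return (
        "For each numbered turn below, first write a one-line recollection "
        "of its context. Then output dialogue segments as ranges of turn "
        "indices, with a state (topic, user intent) for each segment.\n\n"
        f"Dialogue:\n{numbered}"
    )
```

Forcing the model to restate each turn's context before segmenting is one plausible way to ground long-context tracking, which is the role the abstract attributes to Pre-Analytical Recollection.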
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers
Powerful large language models have facilitated the development of writing
assistants that promise to significantly improve the quality and efficiency of
composition and communication. However, a barrier to effective assistance is
the lack of personalization in LLM outputs to the author's communication style
and specialized knowledge. In this paper, we address this challenge by
proposing PEARL, a retrieval-augmented LLM writing assistant personalized with
a generation-calibrated retriever. Our retriever is trained to select historic
user-authored documents for prompt augmentation, such that they are likely to
best personalize LLM generations for a user request. We propose two key
novelties for training our retriever: 1) A training data selection method that
identifies user requests likely to benefit from personalization and documents
that provide that benefit; and 2) A scale-calibrating KL-divergence objective
that ensures that our retriever closely tracks the benefit of a document for
personalized generation. We demonstrate the effectiveness of PEARL in
generating personalized workplace social media posts and Reddit comments.
Finally, we showcase the potential of a generation-calibrated retriever to
double as a performance predictor and further improve low-quality generations
via LLM chaining.
Comment: Pre-print, work in progress
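The "scale-calibrating KL-divergence objective" can be illustrated with a minimal sketch: push the retriever's score distribution over candidate documents toward the distribution implied by each document's measured benefit to personalized generation. This is a simplification under assumed inputs (plain lists of scores and benefit values), not PEARL's actual training objective.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of raw scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def calibration_loss(retriever_scores, generation_benefits):
    """Sketch of a scale-calibrating objective: the retriever's
    distribution over documents (pred) should match the distribution
    implied by each document's benefit to generation (target)."""
    target = softmax(generation_benefits)
    pred = softmax(retriever_scores)
    return kl_divergence(target, pred)
```

The loss is zero exactly when the retriever's relative scores reproduce the relative benefits, which is the sense in which the retriever "closely tracks the benefit of a document" and can double as a performance predictor.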
Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies
Log data can reveal valuable information about how users interact with web
search services, what they want, and how satisfied they are. However, analyzing
user intents in log data is not easy, especially for new forms of web search
such as AI-driven chat. To understand user intents from log data, we need a way
to label them with meaningful categories that capture their diversity and
dynamics. Existing methods rely on manual or ML-based labeling, which are
either expensive or inflexible for large and changing datasets. We propose a
novel solution using large language models (LLMs), which can generate rich and
relevant concepts, descriptions, and examples for user intents. However, using
LLMs to generate a user intent taxonomy and apply it to log analysis can be
problematic for two main reasons: such a taxonomy is not externally validated,
and there may be an undesirable feedback loop. To overcome these issues, we
propose a new methodology with human experts and assessors to verify the
quality of the LLM-generated taxonomy. We also present an end-to-end pipeline
that uses an LLM with human-in-the-loop to produce, refine, and use labels for
user intent analysis in log data. Our method offers a scalable and adaptable
way to analyze user intents in web-scale log data with minimal human effort. We
demonstrate its effectiveness by uncovering new insights into user intents from
search and chat logs from Bing.
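The end-to-end pipeline the abstract describes, LLM-drafted taxonomy, expert validation, then large-scale labeling, can be sketched as below. The function names, the sample size, and the callable interfaces are hypothetical; the sketch only shows the generate / validate / apply flow.

```python
def intent_pipeline(logs, generate_taxonomy, human_review, label):
    """Hypothetical human-in-the-loop sketch: an LLM drafts an intent
    taxonomy from a sample of logs, human experts validate and refine
    it, and the validated taxonomy is applied to label the full log."""
    draft = generate_taxonomy(logs[:100])   # LLM proposes intent categories
    taxonomy = human_review(draft)          # experts validate / refine (breaks the feedback loop)
    # Apply the validated taxonomy at scale.
    return [(entry, label(entry, taxonomy)) for entry in logs]
```

Keeping the human review step between generation and application is what addresses the two failure modes the abstract names: the taxonomy is externally validated before use, and the LLM does not grade its own unreviewed output.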
Augmenting Structure with Text for Improved Graph Learning
Many important problems in machine learning and data mining, such as knowledge base reasoning, personalized entity recommendation, and scientific hypothesis generation, may be framed as learning and inference over a graph data structure. Such problems represent exciting opportunities for advancing graph learning, but also entail significant challenges. Because graphs are typically sparse and defined by a schema, they often do not fully capture the underlying complex relationships in the data. Models that combine graphs with rich auxiliary textual modalities have higher potential for expressiveness, but jointly processing such disparate modalities--that is, sparse structured relations and dense unstructured text--is not straightforward.
In this thesis, we consider the important problem of improving graph learning by combining structure and text. The first part of the thesis considers relational knowledge representation and reasoning tasks, demonstrating the great potential of pretrained contextual language models to add renewed depth and richness to graph-structured knowledge bases. The second part of the thesis goes beyond knowledge bases, toward improving graph learning tasks that arise in information retrieval and recommender systems by jointly modeling document interactions and content. Our proposed methodologies consistently improve accuracy over both single-modality and cross-modality baselines, suggesting that, with appropriately chosen inductive biases and careful model design, we can exploit the unique complementary aspects of structure and text to great effect.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/174515/1/tsafavi_1.pd