10 research outputs found
Career Transitions and Trajectories: A Case Study in Computing
From artificial intelligence to network security to hardware design, it is
well-known that computing research drives many important technological and
societal advancements. However, less is known about the long-term career paths
of the people behind these innovations. What do their careers reveal about the
evolution of computing research? Which institutions were and are the most
important in this field, and for what reasons? Can insights into computing
career trajectories help predict employer retention?
In this paper we analyze several decades of post-PhD computing careers using
a large new dataset rich with professional information, and propose a versatile
career network model, R^3, that captures temporal career dynamics. With R^3 we
track important organizations in computing research history, analyze career
movement between industry, academia, and government, and build a powerful
predictive model for individual career transitions. Our study, the first of its
kind, is a starting point for understanding computing research careers, and may
inform employer recruitment and retention mechanisms at a time when the demand
for specialized computational expertise far exceeds supply.
Comment: To appear in KDD 201
CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction
Knowledge graph (KG) link prediction is a fundamental task in artificial
intelligence, with applications in natural language processing, information
retrieval, and biomedicine. Recently, promising results have been achieved by
leveraging cross-modal information in KGs, using ensembles that combine
knowledge graph embeddings (KGEs) and contextual language models (LMs).
However, existing ensembles are either (1) not consistently effective in terms
of ranking accuracy gains or (2) impractically inefficient on larger datasets
due to the combinatorial explosion problem of pairwise ranking with deep
language models. In this paper, we propose a novel tiered ranking architecture
CascadER to maintain the ranking accuracy of full ensembling while improving
efficiency considerably. CascadER uses LMs to rerank the outputs of more
efficient base KGEs, relying on an adaptive subset selection scheme aimed at
invoking the LMs minimally while maximizing accuracy gain over the KGE.
Extensive experiments demonstrate that CascadER improves MRR by up to 9 points
over KGE baselines, setting new state-of-the-art performance on four benchmarks
while improving efficiency by one or more orders of magnitude over competitive
cross-modal baselines. Our empirical analyses reveal that diversity of models
across modalities and preservation of individual models' confidence signals
help explain the effectiveness of CascadER, and suggest promising directions
for cross-modal cascaded architectures. Code and pretrained models are
available at https://github.com/tsafavi/cascader.
Comment: AKBC 202
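The core mechanism the abstract describes, an efficient base ranker whose top candidates are rescored by an expensive language model, can be sketched minimally as follows. This is a simplified illustration, not the paper's actual adaptive subset selection: the fixed cutoff `k`, the score dictionary, and the `lm_rescore` callable are hypothetical stand-ins.

```python
def cascade_rank(kge_scores, lm_rescore, k):
    """Tiered reranking sketch: a cheap KGE scores every candidate,
    then an expensive LM rescores only the top-k subset."""
    # Rank all candidates by KGE score, best first.
    ranked = sorted(kge_scores, key=kge_scores.get, reverse=True)
    top, rest = ranked[:k], ranked[k:]
    # Invoke the LM only on the small top-k subset; the tail keeps
    # its cheap KGE ordering.
    reranked = sorted(top, key=lm_rescore, reverse=True)
    return reranked + rest
```

Because the LM is called on at most `k` candidates rather than all pairs, cost grows with `k` instead of with the candidate set, which is the efficiency argument the abstract makes (CascadER itself chooses the subset adaptively rather than with a fixed `k`).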
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
The traditional Dialogue State Tracking (DST) problem aims to track user
preferences and intents in user-agent conversations. While sufficient for
task-oriented dialogue systems supporting narrow domain applications, the
advent of Large Language Model (LLM)-based chat systems has introduced many
real-world intricacies in open-domain dialogues. These intricacies manifest in
the form of increased complexity in contextual interactions, extended dialogue
sessions encompassing a diverse array of topics, and more frequent contextual
shifts. To handle these intricacies arising from evolving LLM-based chat
systems, we propose joint dialogue segmentation and state tracking per segment
in open-domain dialogue systems. Assuming a zero-shot setting appropriate to a
true open-domain dialogue system, we propose S3-DST, a structured prompting
technique that harnesses Pre-Analytical Recollection, a novel grounding
mechanism we designed for improving long context tracking. To demonstrate the
efficacy of our proposed approach in joint segmentation and state tracking, we
evaluate S3-DST on a proprietary anonymized open-domain dialogue dataset, as
well as publicly available DST and segmentation datasets. Across all datasets
and settings, S3-DST consistently outperforms the state-of-the-art,
demonstrating its potency and robustness for the next generation of LLM-based
chat systems.
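The abstract does not reproduce the S3-DST prompt itself. As a purely hypothetical illustration of its "recollect first, then track" structured-prompting idea, a prompt builder might look like the sketch below; the wording and requested output fields are assumptions, not the paper's template.

```python
def build_s3_prompt(dialogue_turns):
    """Hypothetical structured prompt in the spirit of S3-DST: ask the
    model to first write a short recollection of each turn's context
    (Pre-Analytical Recollection) before emitting segment boundaries
    and a per-segment state."""
    numbered = "\n".join(f"[{i}] {t}" for i, t in enumerate(dialogue_turns))
    return (
        "For each numbered turn below, first write a one-line recollection "
        "of its context. Then output dialogue segments as ranges of turn "
        "indices, with a state (topic, user intent) for each segment.\n\n"
        f"Dialogue:\n{numbered}"
    )
```

Forcing the model to restate each turn's context before segmenting is one plausible way to ground long-context tracking, which is the role the abstract attributes to Pre-Analytical Recollection.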
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers
Powerful large language models have facilitated the development of writing
assistants that promise to significantly improve the quality and efficiency of
composition and communication. However, a barrier to effective assistance is
the lack of personalization in LLM outputs to the author's communication style
and specialized knowledge. In this paper, we address this challenge by
proposing PEARL, a retrieval-augmented LLM writing assistant personalized with
a generation-calibrated retriever. Our retriever is trained to select historic
user-authored documents for prompt augmentation, such that they are likely to
best personalize LLM generations for a user request. We propose two key
novelties for training our retriever: 1) A training data selection method that
identifies user requests likely to benefit from personalization and documents
that provide that benefit; and 2) A scale-calibrating KL-divergence objective
that ensures that our retriever closely tracks the benefit of a document for
personalized generation. We demonstrate the effectiveness of PEARL in
generating personalized workplace social media posts and Reddit comments.
Finally, we showcase the potential of a generation-calibrated retriever to
double as a performance predictor and further improve low-quality generations
via LLM chaining.
Comment: Pre-print, work in progress
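The "scale-calibrating KL-divergence objective" can be illustrated with a minimal sketch: push the retriever's score distribution over candidate documents toward the distribution implied by each document's measured benefit to personalized generation. This is a simplification under assumed inputs (plain lists of scores and benefit values), not PEARL's actual training objective.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of raw scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def calibration_loss(retriever_scores, generation_benefits):
    """Sketch of a scale-calibrating objective: the retriever's
    distribution over documents (pred) should match the distribution
    implied by each document's benefit to generation (target)."""
    target = softmax(generation_benefits)
    pred = softmax(retriever_scores)
    return kl_divergence(target, pred)
```

The loss is zero exactly when the retriever's relative scores reproduce the relative benefits, which is the sense in which the retriever "closely tracks the benefit of a document" and can double as a performance predictor.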
Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies
Log data can reveal valuable information about how users interact with web
search services, what they want, and how satisfied they are. However, analyzing
user intents in log data is not easy, especially for new forms of web search
such as AI-driven chat. To understand user intents from log data, we need a way
to label them with meaningful categories that capture their diversity and
dynamics. Existing methods rely on manual or ML-based labeling, which are
either expensive or inflexible for large and changing datasets. We propose a
novel solution using large language models (LLMs), which can generate rich and
relevant concepts, descriptions, and examples for user intents. However, using
LLMs to generate a user intent taxonomy and apply it to log analysis can be
problematic for two main reasons: such a taxonomy is not externally validated,
and there may be an undesirable feedback loop. To overcome these issues, we
propose a new methodology with human experts and assessors to verify the
quality of the LLM-generated taxonomy. We also present an end-to-end pipeline
that uses an LLM with human-in-the-loop to produce, refine, and use labels for
user intent analysis in log data. Our method offers a scalable and adaptable
way to analyze user intents in web-scale log data with minimal human effort. We
demonstrate its effectiveness by uncovering new insights into user intents from
search and chat logs from Bing.
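The end-to-end pipeline the abstract describes, LLM-drafted taxonomy, expert validation, then large-scale labeling, can be sketched as below. The function names, the sample size, and the callable interfaces are hypothetical; the sketch only shows the generate / validate / apply flow.

```python
def intent_pipeline(logs, generate_taxonomy, human_review, label):
    """Hypothetical human-in-the-loop sketch: an LLM drafts an intent
    taxonomy from a sample of logs, human experts validate and refine
    it, and the validated taxonomy is applied to label the full log."""
    draft = generate_taxonomy(logs[:100])   # LLM proposes intent categories
    taxonomy = human_review(draft)          # experts validate / refine (breaks the feedback loop)
    # Apply the validated taxonomy at scale.
    return [(entry, label(entry, taxonomy)) for entry in logs]
```

Keeping the human review step between generation and application is what addresses the two failure modes the abstract names: the taxonomy is externally validated before use, and the LLM does not grade its own unreviewed output.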
Augmenting Structure with Text for Improved Graph Learning
Many important problems in machine learning and data mining, such as knowledge base reasoning, personalized entity recommendation, and scientific hypothesis generation, may be framed as learning and inference over a graph data structure. Such problems represent exciting opportunities for advancing graph learning, but also entail significant challenges. Because graphs are typically sparse and defined by a schema, they often do not fully capture the underlying complex relationships in the data. Models that combine graphs with rich auxiliary textual modalities have higher potential for expressiveness, but jointly processing such disparate modalities--that is, sparse structured relations and dense unstructured text--is not straightforward.
In this thesis, we consider the important problem of improving graph learning by combining structure and text. The first part of the thesis considers relational knowledge representation and reasoning tasks, demonstrating the great potential of pretrained contextual language models to add renewed depth and richness to graph-structured knowledge bases. The second part of the thesis goes beyond knowledge bases, toward improving graph learning tasks that arise in information retrieval and recommender systems by jointly modeling document interactions and content. Our proposed methodologies consistently improve accuracy over both single-modality and cross-modality baselines, suggesting that, with appropriately chosen inductive biases and careful model design, we can exploit the unique complementary aspects of structure and text to great effect.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/174515/1/tsafavi_1.pd