SumREN: Summarizing Reported Speech about Events in News
A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws: Who, What, Where, When, and Why) and how people reacted to it (i.e., reported statements). However, existing work on news summarization focuses almost exclusively on the event details. In this work, we propose the novel task of summarizing the reactions of different speakers, as expressed by their reported statements, to a given event. To this end, we create a new multi-document summarization benchmark, SumREN, comprising 745 summaries of reported statements from various public figures, obtained from 633 news articles discussing 132 events. We propose an automatic silver-training-data generation approach for our task, which helps smaller models like BART achieve GPT-3-level performance. Finally, we introduce a pipeline-based framework for summarizing reported speech, which we empirically show generates summaries that are more abstractive and factual than those of baseline query-focused summarization approaches.
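To make the task shape concrete, here is a minimal sketch of a pipeline-style reported-speech summarizer: group reported statements by speaker for a given event, then summarize each speaker's statements with a seq2seq model such as BART. The stage layout, the `summarize_reactions` helper, and the choice of `facebook/bart-large-cnn` are illustrative assumptions, not the paper's actual system.

```python
# Minimal sketch of a pipeline for summarizing reported speech about an event.
# Stage names, data layout, and model choice are assumptions for exposition,
# not the SumREN paper's actual implementation.
from collections import defaultdict
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_reactions(statements):
    """statements: list of (speaker, reported_statement) pairs for one event.

    Returns a dict mapping each speaker to a summary of their statements.
    """
    by_speaker = defaultdict(list)
    for speaker, text in statements:
        by_speaker[speaker].append(text)

    summaries = {}
    for speaker, texts in by_speaker.items():
        joined = " ".join(texts)
        out = summarizer(joined, max_length=60, min_length=10, do_sample=False)
        summaries[speaker] = out[0]["summary_text"]
    return summaries
```

In a pipeline of this shape, quote extraction and speaker attribution would run upstream to produce the `(speaker, statement)` pairs; only the final summarization step is shown here.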
PLAtE: A Large-scale Dataset for List Page Web Extraction
Recently, neural models have been leveraged to significantly improve the performance of information extraction from semi-structured websites. However, a barrier to continued progress is the small number of datasets large enough to train these models. In this work, we introduce the PLAtE (Pages of Lists Attribute Extraction) dataset as a challenging new web extraction task. PLAtE focuses on shopping data, specifically extractions from product review pages with multiple items. PLAtE encompasses two tasks: (1) finding product-list segmentation boundaries and (2) extracting attributes for each product. It is composed of 53,905 items from 6,810 pages, making it the first large-scale list-page web extraction dataset. We construct PLAtE by collecting list pages from Common Crawl and annotating them on Mechanical Turk. Quantitative and qualitative analyses demonstrate that PLAtE has high-quality annotations. We establish strong baseline performance, with a SOTA model achieving an F1-score of 0.750 for attribute classification and 0.915 for segmentation, indicating opportunities for future research in web extraction.
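As a rough illustration of how the two tasks compose, the sketch below models a list page as a flat sequence of DOM-node records, uses boundary labels to group nodes into per-product segments (task 1), and carries an attribute label on each node (task 2). The `Node` fields and the attribute label set are hypothetical, chosen only to illustrate the task structure; consult the dataset release for the actual schema.

```python
# Illustrative sketch of the two PLAtE tasks as node-level labeling over a
# flattened DOM. Field names and the label set are assumptions for exposition.
from dataclasses import dataclass
from typing import List

ATTRIBUTES = ["product-name", "price", "rating", "none"]  # hypothetical labels

@dataclass
class Node:
    text: str
    is_boundary: bool  # task 1: does a new product segment start at this node?
    attribute: str     # task 2: which attribute (if any) this node carries

def segment(nodes: List[Node]) -> List[List[Node]]:
    """Group a page's nodes into per-product segments using boundary labels."""
    segments, current = [], []
    for node in nodes:
        if node.is_boundary and current:
            segments.append(current)  # close the previous product's segment
            current = []
        current.append(node)
    if current:
        segments.append(current)
    return segments
```

Under this framing, the reported baselines correspond to predicting `is_boundary` per node (segmentation F1 of 0.915) and `attribute` per node (classification F1 of 0.750), with gold labels structured as above.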