19 research outputs found

    The TechQA Dataset

    Full text link
    We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size (600 training, 310 dev, and 490 evaluation question/answer pairs), reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than to serve as a resource for building QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote, a technical document that addresses a specific technical issue. We also release the collection of 801,998 Technotes that were publicly available as of April 4, 2019 as a companion resource that can be used for pretraining, i.e., to learn representations of the IT domain language. Comment: Long version of conference paper to be submitted.
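    As a rough illustration of how such a release might be consumed, the sketch below (Python) iterates over TechQA-style question/answer records. The file name and the field names QUESTION_TEXT, ANSWER, and DOCUMENT are illustrative assumptions, not the published schema.

        import json

        # Minimal sketch of reading TechQA-style records for domain-adaptation
        # experiments. Field names are assumptions, not the released schema.
        def load_qa_pairs(path):
            with open(path, encoding="utf-8") as f:
                records = json.load(f)
            for rec in records:
                yield {
                    "question": rec["QUESTION_TEXT"],  # forum question body
                    "answer": rec["ANSWER"],           # answer span from the Technote
                    "technote_id": rec["DOCUMENT"],    # ID of the answering Technote
                }

        # Example: the training split should yield 600 pairs per the abstract.
        pairs = list(load_qa_pairs("training_Q_A.json"))  # hypothetical file name
        print(f"{len(pairs)} question/answer pairs loaded")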

    Speech Communication

    Get PDF
    Contains reports on five research projects. Funding: C.J. Lebel Fellowship; National Institutes of Health (Grant 5 T32 NS07040); National Institutes of Health (Grant 5 R01 NS04332); National Science Foundation (Grant IST 80-17599); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0254); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0341); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0290).

    Speech Communication

    Get PDF
    Contains table of contents for Part IV, table of contents for Section 1, and reports on five research projects. Funding: Apple Computer, Inc.; C.J. Lebel Fellowship; National Institutes of Health (Grant T32-NS07040); National Institutes of Health (Grant R01-NS04332); National Institutes of Health (Grant R01-NS21183); National Institutes of Health (Grant P01-NS23734); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0254); U.S. Navy, Office of Naval Research (Contract N00014-82-K-0727).

    Speech Communication

    Get PDF
    Contains reports on five research projects. Funding: C.J. Lebel Fellowship; National Institutes of Health (Grant 5 T32 NS07040); National Institutes of Health (Grant 5 R01 NS04332); National Institutes of Health (Grant 5 R01 NS21183); National Institutes of Health (Grant 5 P01 NS13126); National Institutes of Health (Grant 1 P01-NS23734); National Science Foundation (Grant BNS 8418733); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0254); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0341); U.S. Navy, Naval Electronic Systems Command (Contract N00039-85-C-0290); National Institutes of Health (Grant R01-NS21183), subcontract with Boston University; National Institutes of Health (Grant 1 P01-NS23734), subcontract with the Massachusetts Eye and Ear Infirmary.

    Creating word-level language models for large-vocabulary handwriting recognition

    No full text

    Confidence-Scoring Post-Processing for Off-Line Handwritten-Character Recognition Verification

    No full text
    We apply confidence-scoring techniques to verify the output of an off-line handwritten-character recognizer. We evaluate a variety of scoring functions, including likelihood ratios and estimated posterior probabilities of correctness, in a post-processing mode to generate confidence scores. Using the post-processor in conjunction with a neural-net-based recognizer on mixed-case letters, receiver-operating-characteristic (ROC) curves reveal that our post-processor correctly rejects 90% of recognizer errors while falsely rejecting only 18.6% of correctly recognized letters. For isolated-digit recognition, we achieve a correct-rejection rate of 95% while keeping false rejection down to 8.7%.
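    A minimal sketch of the posterior-based rejection idea on synthetic data: take the recognizer's top class posterior as the confidence score, reject outputs below a threshold, and sweep the threshold to trace the two quantities quoted above (correct rejection of recognizer errors vs. false rejection of correct outputs). The recognizer outputs here are randomly generated stand-ins, not the paper's model.

        import numpy as np

        def rejection_rates(posteriors, predictions, labels, threshold):
            # Confidence score: the estimated posterior of the predicted class.
            confidence = posteriors.max(axis=1)
            reject = confidence < threshold
            errors = predictions != labels
            correct_rejection = reject[errors].mean()   # recognizer errors caught
            false_rejection = reject[~errors].mean()    # good outputs discarded
            return correct_rejection, false_rejection

        # Toy stand-in for recognizer posteriors over 26 letter classes.
        rng = np.random.default_rng(0)
        posteriors = rng.dirichlet(np.ones(26), size=1000)
        predictions = posteriors.argmax(axis=1)
        labels = rng.integers(0, 26, size=1000)
        for t in (0.2, 0.4, 0.6, 0.8):
            cr, fr = rejection_rates(posteriors, predictions, labels, t)
            print(f"threshold={t:.1f}  correct-rejection={cr:.2%}  false-rejection={fr:.2%}")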

    Classifier combination techniques applied to coreference resolution

    No full text
    This paper examines the applicability of classifier combination approaches, such as bagging and boosting, to coreference resolution. To the best of our knowledge, this is the first effort to apply such techniques to coreference resolution. We provide experimental evidence indicating that the accuracy of a coreference engine can potentially be increased by bagging and boosting, without any additional features or training data. We implement and evaluate combination techniques at the mention, entity, and document levels, and also address issues specific to coreference resolution, such as entity alignment.
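    A minimal sketch of the combination idea using scikit-learn's stock bagging and boosting ensembles over a weak base learner. The synthetic binary task stands in for coreference mention-pair classification, since the paper's features and data are not described here; the estimator keyword assumes scikit-learn >= 1.2.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        # Stand-in for mention-pair feature vectors labeled coreferent / not.
        X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

        base = DecisionTreeClassifier(max_depth=3, random_state=0)
        for name, clf in [
            ("bagging", BaggingClassifier(estimator=base, n_estimators=50, random_state=0)),
            ("boosting", AdaBoostClassifier(estimator=base, n_estimators=50, random_state=0)),
        ]:
            scores = cross_val_score(clf, X, y, cv=5)
            print(f"{name}: mean accuracy {scores.mean():.3f}")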