Search CORE

28 research outputs found

CloudScan - A configuration-free invoice analysis system using recurrent neural networks

Author: Laws Florian
Palm Rasmus Berg
Winther Ole
Publication date: 01/01/2017

We present CloudScan, an invoice analysis system that requires zero configuration or upfront annotation. In contrast to previous work, CloudScan does not rely on templates of invoice layout; instead, it learns a single global model of invoices that naturally generalizes to unseen invoice layouts. The model is trained using data automatically extracted from end-user provided feedback. This automatic training data extraction removes the requirement for users to annotate the data precisely. We describe a recurrent neural network model that can capture long-range context and compare it to a baseline logistic regression model corresponding to the current CloudScan production system. We train and evaluate the system on 8 important fields using a dataset of 326,471 invoices. The recurrent neural network and baseline model achieve 0.891 and 0.887 average F1 scores, respectively, on seen invoice layouts. For the harder task of unseen invoice layouts, the recurrent neural network model outperforms the baseline with 0.840 average F1 compared to 0.788. Comment: Presented at ICDAR 201

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

Attend, Copy, Parse -- End-to-end information extraction from documents

Author: Laws Florian
Palm Rasmus Berg
Winther Ole
Publication date: 01/01/2019

Document information extraction tasks performed by humans create data consisting of a PDF or document image input, and extracted string outputs. This end-to-end data is naturally consumed and produced when performing the task because it is valuable in and of itself. It is naturally available, at no additional cost. Unfortunately, state-of-the-art word classification methods for information extraction cannot use this data, instead requiring word-level labels, which are expensive to create and consequently not available for many real-life tasks. In this paper we propose the Attend, Copy, Parse architecture, a deep neural network model that can be trained directly on end-to-end data, bypassing the need for word-level labels. We evaluate the proposed architecture on a large, diverse set of invoices, and outperform a state-of-the-art production system based on word classification. We believe our proposed architecture can be used on many real-life information extraction tasks where word classification cannot be used due to a lack of the required word-level labels.

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

End-to-end information extraction without token-level supervision

Author: Hovy Dirk
Laws Florian
Palm Rasmus Berg
Winther Ole
Publication date: 01/01/2017

Most state-of-the-art information extraction approaches rely on token-level labels to find the areas of interest in text. Unfortunately, these labels are time-consuming and costly to create, and consequently, not available for many real-life IE tasks. To make matters worse, token-level labels are usually not the desired output, but just an intermediary step. End-to-end (E2E) models, which take raw text as input and produce the desired output directly, need not depend on token-level labels. We propose an E2E model based on pointer networks, which can be trained directly on pairs of raw input and output text. We evaluate our model on the ATIS data set, the MIT restaurant corpus, and the MIT movie corpus and compare to neural baselines that do use token-level labels. We achieve competitive results, within a few percentage points of the baselines, showing the feasibility of E2E information extraction without the need for token-level labels. This opens up new possibilities, as for many tasks currently addressed by human extractors, raw input and output data are available, but not token-level labels.

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

Crossref

Online Research Database In Technology

Significant benefits of AIP testing and clinical screening in familial isolated and young-onset pituitary tumors

Author: Agnieszka Lebek-Szatanska
Ajith V Kumar
Akira Matsuno
Aled Rees
Alex Bickerton
Ali Imran
Alia Munir
Alper Gürlek
Amar Agha
Amy Ronaldson
Anand Kumar
Andrew Green
Andrew Levy
Angelos Kyriaku
Anna Aulinas-Masó
Anna Spada
Anne Katrin Lampe
Anthony P Goldstone
Antonia Brooke
António Ribeiro de Oliveira
Aparna Pal
Ariel Barkan
Arla Ogilivie
Ashley B Grossman
Astrid Weber
Atik Baborie
Attila Patócs
Barbara McGowan
Barbara Mosterman
Beatriz S Soares
Beckers
Bernard Khoo
Bijay Vaidya
Biju Jose
Blair
Brede Swantje
Brew Atkinson
Britt Edén-Engström
Caimari
Candy Sze
Carmen Bernal-González
Carmen Georgescu
Caroline Brain
Catherine Choong
Catherine Gilkes
Catherine Lyons
Catherine Patterson
Cathy Kiraly-Borri
Cazabat
Celia Rodd
Chandi Idampitiya
Charlotte Höybye
Cheri Deal
Chris Thompson
Christian Strasburger
Christina Daousi
Christine Burren
Christof Schöfl
Chung Thong Lim
Colin Johnston
Corin Badiu
Cuny
Daly
Damian G Morris
Daniel Cuthbertson
Darko Kastelan
David A Hilton
David Carty
David Collier
David Cove
de Laat
Debbie Shears
Dinesh Nagi
Dominic Cavlan
Donato Iacovazzo
Dragana Miljic
Dutta
Edward R Laws
Elaine Tham
Eleanor Hay
Eleanor Lin
Elena Aflorei
Elizabeth Crowne
Elizabeth Forsythe
Ellard
Emese Mezősi
Evelien Gevers
Farida Chentli
Federico Roncaroli
Felicity Kaplan
Fiona Ryan
Florian Wernig
Francesca Pecori Giraldi
Francesca Swords
Francisca Caimari
Fraser Pirie
Freda
Gina Twine
Graeme Suthers
Graham Dow
Graham Leese
Greenman
Greg Hong
Gregory Kaltsas
Gul Bano
Gábor L Kovács
Hakan Widell
Hamish Courtney
Hani Marcus
Harpal Randeva
Harvinder S Chahal
Helen L Storr
Helen Spoudeas
Hernández-Ramírez
Hilde Von Esch
Hoong-Wei Gan
Iacovazzo
Imre Zoltan Kun
Ioana Lambrescu
Ionela Pascanu
Isabelle Suter-Widmer
Jacob Dal
James Ahlquist
James E Goldman
James Sperber
Janet Lo
Janos Vadasz
Jo Blair
Joan Grieve
Joel Capraro
John A Wass
John Barton
John Monson
John Newell-Price
John S Bevan
Joshi
João Anselmo
Judit Dénes
Julian Barwell
Julian Davis
Juliana Drummond
Juliet Taylor
Justin Davies
Karen K Miller
Karen Stals
Karin Bradley
Katalin Horváth
Kate Ellis
Katie Snape
Katsuhiko Yoshimoto
Katznelson
Ken Darzy
Kesson Magid
Kevin Shotliff
Kevin Yuen
Khash Nikookam
Korbonits
Krystallenia Alexandraki
Krzysztof Lewandowski
Larisa Dzeranova
Laura C Hernández-Ramírez
Laura de Marinis
Lisa Bradley
Lisa Nachtigall
Louise Emmerson
Louise Izatt
Luis Gustavo Perez-Rivas
Luis V Syro
Luiz Griz
Lynette Penney
Lynn Greenhalgh
Madalina Musat
Mangupli
Maralyn Druce
Marek Bolanowski
Marek Niedziela
Margaret de Castro
Maria Elfving
Maria Elisabeth Street
Maria Fleseriu
Maria Stelmachowska-Banas
Marianne Elston
Marija Pfeifer
Marinella Tzanela
Mark Cohen
Mark E Molitch
Mark Gurnell
Marques
Martin O Weickert
Mary Gainsborough
Mary Lee Vance
Mehtap Cakir
Mehul Dattani
Melanie Kershaw
Michael Besser
Michael Buchfelder
Michael O Thorner
Michael Powell
Michaela Davies
Michelle Katz
Miklós Góth
Miklós Tóth
Miles J Levy
Milica Medic-Stojanoska
Mirjam Christ-Crain
Mirtha Guitelman
Mohamad Maghnie
Moisés Mercado-Atri
Molitch
Mona Waterhouse
Mothojakan
Márta Korbonits
Mónica Gadelha
Nadezhda Dalantaeva
Naomi Fersht
Narendra Reddy
Natalie Canham
Neil Dorward
Niamh Martin
Nicola N Zammitt
Nicola Poplawski
Nieman
Nigel Glynn
Nigel Mendoza
Niki Karavitaki
Niki Maartens
Nina Musolino
Noel Somasundaram
Oriola
Pamela Freda
Patricia Gallego
Patrick J Morrison
Patrick Yap
Paul Carroll
Paul Dimitri
Pedro Marques
Peter Bates
Peter Clayton
Peter J Trainer
Peter Pullan
Peter Shane Hamblin
Philip Harding
Philip Yeoh
Philippa Carter
Philippe F Backeljauw
Pierre Bouloux
Pinaki Dutta
Prakash Abraham
Preda
Péter Igaz
Rajesh V Thakker
Ramesh Nair
Rasa Verkauskiene
Richard J M Ross
Richard N Clayton
Richard Nelson
Richard Quinton
Richards
Robert D Murray
Robert Skelly
Robertas Knispelis
Roberto Salvatori
Roger Brown
Ronald M Lechan
Rosalind Eeles
Rostomyan
Sachith Mettananda
Salenave
Scott A Akker
Sema Yarman
Serban Radian
Shereen Ezzat
Shozo Yamada
Sian Ellard
Silvia Modenesi
Simon H Pearce
Simon Howell
Simon J Aylwin
Simona Fica
Siobhán E McQuaid
Stefan Fischli
Stephanie Baldeweg
Stephen Gallacher
Steve Ball
Steve M Orme
Steven Hunter
Stylianos Tsagarakis
Sujatha Jagadeesh
Susan Stewart
Susan Webb
Svetozar Damjanovic
Sándor Alföldi
Taffy Makaya
Takeo Iwata
Tara Kearney
Teng-Teng Chung
Thakker
Theodore Friedman
Tichomirowa
Tim Cheetham
Trevor A Howlett
Tristan Richardson
Vaclav Hana
Valerie Renals
Vera Popovic
Vierimaa
Vladimir Vaks
Warrick J Inder
Wiebke Arlt
William Drake
William Foulkes
Williams
Winnie Ho
Wu
Publication venue: 'The Endocrine Society'
Publication date: 01/06/2020

Context: Germline mutations in the aryl hydrocarbon receptor-interacting protein (AIP) gene are responsible for a subset of familial isolated pituitary adenoma (FIPA) cases and sporadic pituitary neuroendocrine tumors (PitNETs). Objective: To compare prospectively diagnosed AIP mutation-positive (AIPmut) PitNET patients with clinically presenting patients and to compare the clinical characteristics of AIPmut and AIPneg PitNET patients. Design: 12-year prospective, observational study. Participants & Setting: We studied probands and family members of FIPA kindreds and sporadic patients with disease onset ≤18 years or macroadenomas with onset ≤30 years (n = 1477). This was a collaborative study conducted at referral centers for pituitary diseases. Interventions & Outcome: AIP testing and clinical screening for pituitary disease. Comparison of characteristics of prospectively diagnosed (n = 22) vs clinically presenting AIPmut PitNET patients (n = 145), and AIPmut (n = 167) vs AIPneg PitNET patients (n = 1310). Results: Prospectively diagnosed AIPmut PitNET patients had smaller lesions with less suprasellar extension or cavernous sinus invasion and required fewer treatments with fewer operations and no radiotherapy compared with clinically presenting cases; there were fewer cases with active disease and hypopituitarism at last follow-up. When comparing AIPmut and AIPneg cases, AIPmut patients were more often males, younger, more often had GH excess, pituitary apoplexy, suprasellar extension, and more patients required multimodal therapy, including radiotherapy. AIPmut patients (n = 136) with GH excess were taller than AIPneg counterparts (n = 650). Conclusions: Prospectively diagnosed AIPmut patients show better outcomes than clinically presenting cases, demonstrating the benefits of genetic and clinical screening. AIP-related pituitary disease has a wide spectrum ranging from aggressively growing lesions to stable or indolent disease course.

RD&E Research Repository

Crossref

UCL Discovery

The University of Manchester - Institutional Repository

King's Research Portal

Queen Mary Research Online

White Rose Research Online

University of Melbourne Institutional Repository

Surgery of highly eloquent gliomas primarily assessed as non-resectable: risks and benefits in a cohort study

Author: A De Benedictis
Bernhard Meyer
C Cedzich
Doris Droese
EF Chang
Ehab Shiban
ER Laws Jr
Florian Ringel
G Neuloh
G Ojemann
GE Keles
H Duffau
J Martino
Jens Gempt
JF Xu
JL Clarke
KM Scheufler
Lea Schnurbus
M Barrie
M Taniguchi
MM Haglund
N Sanai
N Sollmann
N Thon
Niels Buchmann
O Schnell
OL Chinot
P Beauchesne
P Lioumis
R Soffietti
S Dutzmann
S Takahashi
Sandro M Krieg
SM Krieg
T Kombos
T Krings
T Picht
Thomas Obermueller
W Penfield
W Stummer
WWD Huber
YN Yordanova
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Estimation of conditional Probabilities with Decision Trees and an Application to Fine-Grained POS Tagging

Author: Florian Laws
Helmut Schmid
Publication date: 01/01/2008

We present an HMM part-of-speech tagging method which is particularly suited for POS tagsets with a large number of fine-grained tags. It is based on three ideas: (1) splitting of the POS tags into attribute vectors and decomposition of the contextual POS probabilities of the HMM into a product of attribute probabilities, (2) estimation of the contextual probabilities with decision trees, and (3) use of high-order HMMs. In experiments on German and Czech data, our tagger outperformed state-of-the-art POS taggers.

CiteSeerX

Crossref

Stopping criteria for active learning of named entity recognition

Author: Florian Laws
Hinrich Schütze
Publication date: 01/01/2008

Active learning is a proven method for reducing the cost of creating the training sets that are necessary for statistical NLP. However, there has been little work on stopping criteria for active learning. An operational stopping criterion is necessary to be able to use active learning in NLP applications. We investigate three different stopping criteria for active learning of named entity recognition (NER) and show that one of them, gradient-based stopping, (i) reliably stops active learning, (ii) achieves near-optimal NER performance, and (iii) needs only about 20% as much training data as exhaustive labeling.

CiteSeerX

Crossref

Attend, copy, parse end-to-end information extraction from documents

Author: Laws Florian
Palm Rasmus Berg
Winther Ole
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Crossref

Online Research Database In Technology