32 research outputs found

Authorship Analysis Approaches

Author: A Abbasi
DI Holmes
E Stamatatos
F Sebastiani
GU Yule
H Baayen
J Diederich
J Rudman
JF Burrows
JR Quinlan
LM Manevitz
M Koppel
N Cristianini
O De Vel
R Agrawal
R Zheng
SE Robertson
SL Salzberg
T Kucukyilmaz
Publication venue: Springer Science and Business Media LLC
Publication date: 05/12/2020

This chapter presents an overview of authorship analysis from multiple standpoints. It includes a historical perspective, a description of stylometric features, and a survey of authorship analysis techniques and their limitations.

ZU Scholars (Zayed University)

Crossref

Exploiting likely-positive and unlabeled data to improve the identification of protein-protein interaction articles

Author: A Yakushiji
A Zanzoni
AR Mendelsohn
B Liu
BJ Breitkreutz
EM Marcotte
GD Bader
Hong-Jie Dai
Hsi-Chuan Hung
I Xenarios
J Thomas
JA Hanley
JM Temkin
LM Manevitz
M Krallinger
M Lan
N Cristianini
Richard Tzong-Han Tsai
S Fields
S Fujita
S Peri
S Robertson
T Joachims
T Ono
U Güldener
Wen-Lian Hsu
Y Hao
Yi-Wen Lin
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract
Background: Experimentally verified protein-protein interactions (PPI) cannot be easily retrieved by researchers unless they are stored in PPI databases. The curation of such databases can be made faster by ranking newly published articles by their relevance to PPI, a task which we approach here by designing a machine-learning-based PPI classifier. All classifiers require labeled data, and the more labeled data available, the more reliable they become. Although many PPI databases with large numbers of labeled articles are available, incorporating these databases into the base training data may actually reduce classification performance, since the supplementary databases may not annotate exactly the same PPI types as the base training data. Our first goal in this paper is to find a method of selecting likely positive data from such supplementary databases. Extracting only likely positive data, however, will bias the classification model unless sufficient negative data is also added. Unfortunately, negative data is very hard to obtain because there are no resources that compile such information. Therefore, our second aim is to select such negative data from unlabeled PubMed data. Thirdly, we explore how to exploit these likely positive and negative data. Lastly, we look at the somewhat unrelated question of which term-weighting scheme is most effective for identifying PPI-related articles.
Results: To evaluate the performance of our PPI text classifier, we conducted experiments based on the BioCreAtIvE-II IAS dataset. Our results show that adding likely-labeled data generally increases AUC by 3-6%, indicating better ranking ability. Our experiments also show that our newly proposed term-weighting scheme has the highest AUC among all common weighting schemes. Our final model achieves an F-measure and an AUC that are 2.9% and 5.0% higher, respectively, than those of the top-ranking system in the IAS challenge.
Conclusion: Our experiments demonstrate the effectiveness of integrating unlabeled and likely labeled data to augment a PPI text classification system. Our mixed model is suitable for ranking purposes, whereas our hierarchical model is better for filtering. In addition, our results indicate that supervised weighting schemes outperform unsupervised ones. Our newly proposed weighting scheme, TFBRF, which considers documents that do not contain the target word, avoids some of the biases found in traditional weighting schemes. Our experimental results show TFBRF to be the most effective among several top weighting schemes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comparative Study of One-Class Classifiers for Item-based Filtering

Author: A. Gunawardana
LM Manevitz
N Japkowicz
Publication venue: Springer Science and Business Media LLC

Crossref

An improved one-class support vector machine classifier for outlier detection

Author: Chen C
Manevitz LM
Quinlan MJ
Roth V
Schölkopf B
Publication venue: SAGE Publications

Crossref

One‐class classification using a support vector machine with a quasi‐linear kernel

Author: Alashwal H
Cohen G
Manevitz LM
Robert P
Vert J‐P
Publication venue: Wiley

Crossref

Enhancing Labeled Data Using Unlabeled Data for Topic Tracking

Author: DM Blei
J Allan
JG Fiscus
K Markert
K Nigam
LM Manevitz
M Belkin
RE Schapire
Publication venue: Springer Science and Business Media LLC

Crossref

The Finite Element Method and Soft Computing

Author: A Razzaque
AR Gallant
B Irons
B Widrow
E Cuthill
G Cybenko
HL Pina
IP King
L Manevitz
LM Manevitz
M Leshno
M Shoham
NE Gibbs
OC Zienkiewicz
RE Bank
RJ Collins
SJ Fenves
Sloan and Randolph
SW Sloan
T Kohonen
TJR Hughes
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1999

Crossref

User identification via neural network based language models

Author: Alvarez M
Bengio Y
Collobert R
Lewis DD
Maaten Lvd
Manevitz LM
Ric M
Stone‐Gross B
Publication venue: Wiley

Crossref

Unsupervised anomaly detection using optimal transport for predictive maintenance

Author: A Atkinson
C Villani
F Camci
GA Susto
H Aguinis
J Zhang
LM Manevitz
M Abdel-Sayed
M Baptista
M Goldstein
MC Garcia
Publication venue: Springer Science and Business Media LLC
Publication date: 19/09/2019

Anomaly detection is of crucial importance in industrial environments, especially in the context of predictive maintenance. As it is very costly to add an extra monitoring layer on production machines, non-invasive solutions are favored to watch for precursory clues indicating the possible need for a maintenance operation. Those clues are to be detected in evolving and highly variable working environments, calling for online and unsupervised methods. This contribution proposes a framework grounded in optimal transport for the specific characterization of a system and the automatic detection of abnormal events. The method is evaluated on an acoustic dataset and demonstrates the superiority of metrics derived from optimal transport over Euclidean ones. On real datasets, the proposed method is shown to outperform one-class SVM, which is the state-of-the-art method for anomaly detection.

Crossref

HAL UVSQ

Improving the classification of call center service dialogue with key utterences

Author: B Xu
CC Aggarwal
F Sebastiani
GX Yuan
H Gao
K Kowsari
L Li
LM Manevitz
P Domingos
T Pranckevičius
W Zhang
X Ma
X Yang
Publication venue: Springer Science and Business Media LLC

Crossref