MCIndoor20000: A fully-labeled image dataset to advance indoor objects detection

Author: Ahmad P. Tafti
Eric LaRose
Fereshteh S. Bashiri
Peggy Peissig
Publication venue: Elsevier BV
Publication date: 01/04/2018

A fully-labeled image dataset provides a unique resource for reproducible research inquiries and data analyses in several computational fields, such as computer vision, machine learning, and deep learning machine intelligence. With the present contribution, a large-scale fully-labeled image dataset is provided and made publicly and freely available to the research community. The dataset, entitled MCIndoor20000, includes more than 20,000 digital images from three indoor object categories: doors, stairs, and hospital signs. To make a comprehensive dataset that addresses current challenges in indoor object modeling, we cover multiple sets of variations in the images, such as rotation, intra-class variation, and various noise models. The dataset is freely and publicly available at https://github.com/bircatmcri/MCIndoor20000. Keywords: Image dataset, Large-scale dataset, Image classification, Supervised learning, Indoor objects, Deep learning
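The abstract mentions rotation and noise models as image variations but does not describe how they were generated. The following is only a minimal sketch of what such variations can look like on a tiny grayscale grid. The function names, the salt-and-pepper noise model, and the toy image are assumptions for illustration, not the dataset authors' procedure.

```python
import random

def rotate_90_clockwise(image):
    """Rotate a 2D grid of pixel values by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def add_salt_and_pepper(image, probability, seed=0):
    """Replace each pixel by 0 or 255 with the given probability (assumed noise model)."""
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy.append([
            rng.choice((0, 255)) if rng.random() < probability else pixel
            for pixel in row
        ])
    return noisy

# Hypothetical 2x2 grayscale image.
image = [[10, 20], [30, 40]]
rotated = rotate_90_clockwise(image)
print(rotated)  # [[30, 10], [40, 20]]
noisy = add_salt_and_pepper(rotated, probability=0.1)
```

A fixed seed keeps the noisy variant reproducible, which matters when the variations are meant to support reproducible experiments.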

Directory of Open Access Journals

3DSEM: A Dataset for 3D SEM Surface Reconstruction

Author: Jessica D. Holz
Andrew B. Kirkpatrick
Heather A. Owen
Ahmad P. Tafti
Zeyun Yu
Publication venue: Harvard Dataverse
Publication date: 02/12/2015

The Scanning Electron Microscope (SEM), as a 2D imaging instrument, has been widely used in biological, mechanical, and materials sciences to determine the surface attributes (e.g., compositions or geometries) of microscopic specimens. An SEM offers an excellent capability to overcome the limitations of the human eye by achieving increased magnification, contrast, and resolution better than 1 nanometer. However, SEM micrographs still remain two-dimensional (2D). Having truly three-dimensional (3D) shapes from SEM micrographs would provide anatomic surfaces allowing for quantitative measurements and informative visualization of the objects being investigated. In biology, for example, 3D SEM surface reconstructions would enable researchers to investigate surface characteristics and recognize roughness, flatness, and waviness of a biological structure. There are also various applications in material and mechanical engineering in which 3D representations of material properties would allow us to accurately measure fractal dimension and surface roughness, and to design a micro-article that needs to fit into a tiny appliance. 3D SEM surface reconstruction employs several computational technologies, such as multi-view geometry, computer vision, optimization strategies, and machine learning, to tackle the inverse problem of going from 2D to 3D. In this contribution, an attempt is made to provide a 3D microscopy dataset, along with the underlying algorithms, publicly and freely available at http://selibcv.org/3dsem/ for the research community.

Harvard Dataverse Network

3DSEM: A 3D microscopy dataset

Author: Ahmad P. Tafti
Andrew B. Kirkpatrick
Heather A. Owen
Jessica D. Holz
Zeyun Yu
Publication venue: Elsevier
Publication date: 02/12/2015

The Scanning Electron Microscope (SEM), as a 2D imaging instrument, has been widely used in many scientific disciplines, including biological, mechanical, and materials sciences, to determine the surface attributes of microscopic objects. However, SEM micrographs still remain 2D images. To effectively measure and visualize the surface properties, we need to restore the true 3D shape model from 2D SEM images. Having 3D surfaces would provide the anatomic shape of micro-samples, which allows for quantitative measurements and informative visualization of the specimens being investigated. 3DSEM is a dataset for 3D microscopy vision that is freely available at [1] for any academic, educational, and research purposes. The dataset includes both 2D images and 3D reconstructed surfaces of several real microscopic samples. Keywords: 3D microscopy dataset, 3D microscopy vision, 3D SEM surface reconstruction, Scanning Electron Microscope (SEM)

Elsevier - Publisher Connector

Directory of Open Access Journals

PubMed Central

Tribological study in microscale using 3D SEM surface reconstruction

Author: Afsaneh Dorri Moghadam
Ahmad P. Tafti
Emad Omrani
Mojtaba F. Fathi
Pradeep Rohatgi
Roshan M. D'Souza
Zeyun Yu
Publication venue: Elsevier BV

Crossref

SparkText: Biomedical Text Mining on Big Data Framework

Author: Ahmad P. Tafti (3184725)
Kai Wang (21246)
Karen Y. He (3184719)
Max M. He (3184722)
Zhan Ye (655055)
Publication date: 01/01/2016

Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Comparing the time efficiency results, SparkText outperformed other available text mining tools with speeds up to 132 times faster on the larger dataset that included 29,437 full-text articles.

Author: Ahmad P. Tafti (3184725)
Kai Wang (21246)
Karen Y. He (3184719)
Max M. He (3184722)
Zhan Ye (655055)

Comparing the time efficiency results, SparkText outperformed other available text mining tools with speeds up to 132 times faster on the larger dataset that included 29,437 full-text articles.
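As a rough sanity check, a speed-up factor is simply the baseline run time divided by the new run time. The sketch below applies this to the rounded figures in the SparkText abstract (more than 11 hours versus approximately 6 minutes). The helper name is hypothetical, and the exact timings behind the 132-fold figure are not given in this listing.

```python
def speedup(baseline_minutes, new_minutes):
    """Return how many times faster the new run is than the baseline."""
    return baseline_minutes / new_minutes

# Rounded figures from the abstract: 11 hours versus 6 minutes.
print(speedup(11 * 60, 6))  # 110.0

# Baseline time (in hours) that a 132-fold speed-up would imply
# if the faster run took exactly 6 minutes.
print(132 * 6 / 60)  # 13.2
```

The rounded figures give a lower bound of about 110 times, which fits "more than 11 hours" alongside "up to 132 times".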

The Francis Crick Institute

The ROC curves for the dataset “Full-text Articles II”: the area under the curve for the SVM classifier represents a better result compared to that of the Naïve Bayes and Logistic Regression algorithms.

Author: Ahmad P. Tafti (3184725)
Kai Wang (21246)
Karen Y. He (3184719)
Max M. He (3184722)
Zhan Ye (655055)

The ROC curves for the dataset “Full-text Articles II”: the area under the curve for the SVM classifier represents a better result compared to that of the Naïve Bayes and Logistic Regression algorithms.
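For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way for a binary problem. The function name, labels, and scores are made up for illustration and are not the paper's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical scores from two classifiers on the same four examples.
labels = [1, 1, 0, 0]
print(roc_auc(labels, [0.9, 0.8, 0.3, 0.1]))  # 1.0
print(roc_auc(labels, [0.9, 0.2, 0.3, 0.1]))  # 0.75
```

A larger area means the classifier ranks positives above negatives more consistently, which is the sense in which one curve is "better" than another.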

The Francis Crick Institute

An example of a bag-of-words representation.

Author: Ahmad P. Tafti (3184725)
Kai Wang (21246)
Karen Y. He (3184719)
Max M. He (3184722)
Zhan Ye (655055)

The terms “biology”, “biopsy”, “biolab”, “biotin”, and “almost” are unigrams, whereas “cancer-surviv” and “cancer-stage” are bigrams. Under TF/IDF weighting, the feature value of the term “almost” equals zero.
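One way to see why a term such as “almost” can receive a weight of zero: under the plain definition idf = log(N / df), a term that occurs in every document has idf = log(1) = 0. The sketch below assumes that unsmoothed formula and a made-up three-document corpus. The exact weighting variant used by SparkText is not stated in this listing.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus in which "almost" appears in every document.
docs = [
    ["almost", "biopsy", "cancer-stage"],
    ["almost", "biology", "biolab"],
    ["almost", "biotin", "cancer-surviv"],
]
weights = tf_idf(docs)
print(weights[0]["almost"])      # 0.0, because log(3 / 3) == 0
print(weights[0]["biopsy"] > 0)  # True
```

Smoothed variants of IDF, which add constants inside the logarithm, would give such a term a small non-zero weight instead.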

The Francis Crick Institute

SparkText: Biomedical Text Mining on Big Data Framework - Fig 5

Author: Ahmad P. Tafti (3184725)
Kai Wang (21246)
Karen Y. He (3184719)
Max M. He (3184722)
Zhan Ye (655055)

Quantitative comparisons of the prediction models on text mining: (A) the accuracy, precision, and recall obtained from 19,681 abstracts; (B) the accuracy, precision, and recall on 12,902 full-text articles; and (C) the accuracy, precision, and recall on 29,437 full-text articles. Table 2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0162721#pone.0162721.t002) provides the details on these 3 datasets. Five-fold cross validation was used in all analyses.
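The three metrics and the five-fold split named in this caption can be sketched as follows. This is an illustrative outline only. The function names, the one-class-versus-rest treatment of precision and recall, the unshuffled contiguous folds, and the toy labels are all assumptions, since the caption does not say how the paper averaged over classes or built its folds.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy over all classes; precision and recall for one class versus the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    start = 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Hypothetical labels for five articles.
y_true = ["breast", "lung", "breast", "prostate", "breast"]
y_pred = ["breast", "breast", "breast", "prostate", "lung"]
accuracy, precision, recall = classification_metrics(y_true, y_pred, "breast")
print(accuracy)  # 0.6

folds = list(k_fold_indices(12, k=5))
print([len(test) for _, test in folds])  # [3, 3, 2, 2, 2]
```

In practice the data would usually be shuffled before splitting, and each metric would be averaged over the five held-out folds.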

The Francis Crick Institute