229 research outputs found

    Investigating Sociodemographic Disparities in Cancer Risk Using Web-Based Informatics

    Cancer health disparities due to demographic and socioeconomic factors are an area of great interest in the epidemiological community. Adjusting for such factors is important when developing cancer risk models. However, for digital epidemiology studies that rely on online sources, such information is not readily available. This paper presents a novel method for extracting demographic and socioeconomic information from openly available online obituaries. The method relies on tailored language processing rules and a probabilistic scheme to map subjects’ occupation history to the occupation classification codes and related earnings provided by the U.S. Census Bureau. Using this information, a case-control study is executed fully in silico to investigate how age, gender, parity, and income level impact breast and lung cancer risk. Based on 48,368 online obituaries (4,643 for breast cancer, 6,274 for lung cancer, and 37,451 cancer-free) collected automatically and a generalized cancer risk model, our study shows a strong association between age, parity, and socioeconomic status and cancer risk. Although the observed trends for breast cancer are very consistent with traditional epidemiological studies, some inconsistency is observed for lung cancer with respect to socioeconomic status.

    Deep Gaze Velocity Analysis During Mammographic Reading for Biometric Identification of Radiologists

    Several studies have confirmed that the gaze velocity of the human eye can be utilized as a behavioral biometric or personalized biomarker. In this study, we leverage the local feature representation capacity of convolutional neural networks (CNNs) for eye gaze velocity analysis as the basis for biometric identification of radiologists performing breast cancer screening. Using gaze data collected from 10 radiologists reading 100 mammograms of various diagnoses, we compared the performance of a CNN-based classification algorithm with two other deep learning classifiers (a deep neural network and a deep belief network) and a previously presented hidden Markov model classifier. The study showed that the CNN classifier outperforms the alternative classification methods based on macro F1-scores derived from 10-fold cross-validation experiments. Our results further support the efficacy of eye gaze velocity as a biometric identifier of medical imaging experts.
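
    The macro F1-score mentioned above is the unweighted mean of the per-class F1 scores, so each radiologist (class) counts equally regardless of how many samples they contribute. A minimal sketch of that metric follows; the function name and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example with two hypothetical reader identities "a" and "b"
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

    In a 10-fold cross-validation setting, this value would typically be computed once per fold and then averaged.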

    Automatic intensity windowing of mammographic images based on a perceptual metric

    Purpose: Initial auto-adjustment of the window level (WL) and width (WW) applied to mammographic images. The proposed intensity windowing (IW) method is based on the maximization of the mutual information (MI) between a perceptual decomposition of the original 12-bit sources and their screen-displayed 8-bit version. Besides zoom, color inversion, and panning operations, IW is the most commonly performed task in daily screening and has a direct impact on diagnosis and the time involved in the process. Methods: The authors present a human visual system and perception-based algorithm named GRAIL (Gabor-relying adjustment of image levels). GRAIL initially measures a mammogram's quality based on the MI between the original instance and its Gabor-filtered derivations. From this point on, the algorithm performs an automatic intensity windowing process that outputs the WL/WW that best displays each mammogram for screening. GRAIL starts with the default, high-contrast, wide dynamic range 12-bit data, and then maximizes the graphical information presented in ordinary 8-bit displays. Tests have been carried out with several mammogram databases. They comprise correlations and an ANOVA analysis with the manual IW levels established by a group of radiologists. A complete MATLAB implementation of GRAIL is available at . Results: Auto-leveled images show superior quality, both perceptually and objectively, compared to their full intensity range and to the application of other common methods such as global contrast stretching (GCS). The correlations between the human-determined intensity values and the ones estimated by our method surpass those of GCS. The ANOVA analysis with the upper intensity thresholds also reveals a similar outcome. GRAIL has also proven to perform especially well with images that contain micro-calcifications and/or foreign X-ray-opaque elements and with healthy BI-RADS A-type mammograms.
It can also speed up the initial screening time by a mean of 4.5 s per image. Conclusions: A novel methodology is introduced that enables a quality-driven balancing of the WL/WW of mammographic images. This correction seeks the representation that maximizes the amount of graphical information contained in each image. The presented technique can contribute to the diagnosis and the overall efficiency of the breast screening session by suggesting, at the beginning, an optimal and customized windowing setting for each mammogram. © 2017 American Association of Physicists in Medicine.
    This work has the support of IST S.L., University of Valencia (CPI15170), Consolider (CPAN13TR01), MINETUR (TSI1001012013019) and IFIC (Severo Ochoa Centre of Excellence SEV20140398). The authors would also like to thank C. Bellot M.D., M. Brouzet M.D., C. Calabuig M.D., J. Camps M.D., J. Coloma M.D., D. Erades M.D., Mr. V. Gutierrez, J. Herrero M.D., Dr. I. Maestre, Dr. A. Neco M.D., C. Ortola M.D., A. Rubio M.D., Dr. R. Sanchez, Dr. F. Sellers, A. Segura M.D., and the Spanish Cancer Association (AECC) for their effort, participation, counseling, and commitment in this research study. The authors report no conflicts of interest in conducting the research.
    Albiol Colomer, A.; Corbi, A.; Albiol Colomer, F. (2017). Automatic intensity windowing of mammographic images based on a perceptual metric. Medical Physics. 44(4):1369-1378. https://doi.org/10.1002/mp.12144
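
    For readers unfamiliar with intensity windowing, the sketch below shows how a given window level and width map 12-bit pixel values onto an 8-bit display range. It is a simplified linear mapping written for illustration only: it does not implement GRAIL, which selects the WL/WW by maximizing mutual information, and the function name and sample values are assumptions.

```python
def apply_window(pixels, level, width, out_max=255):
    """Linearly map values in [level - width/2, level + width/2] to [0, out_max].

    Values below the window clamp to 0 and values above it clamp to out_max.
    """
    if width <= 0:
        raise ValueError("window width must be positive")
    low = level - width / 2.0
    out = []
    for v in pixels:
        frac = (v - low) / width              # position inside the window
        frac = min(1.0, max(0.0, frac))       # clamp to [0, 1]
        out.append(int(frac * out_max + 0.5)) # round half up to an integer
    return out

# Hypothetical 12-bit samples (0..4095) shown with WL = 2048 and WW = 2048
displayed = apply_window([0, 1024, 2048, 3072, 4095], level=2048, width=2048)
```

    A narrower window increases displayed contrast inside the window at the cost of clipping more values outside it, which is why the choice of WL/WW matters for reading.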

    Osteology and relationships of Rhinopycnodus gabriellae gen. et sp. nov. (Pycnodontiformes) from the marine Late Cretaceous of Lebanon

    The osteology of Rhinopycnodus gabriellae gen. and sp. nov., a pycnodontiform fish from the marine Cenomanian (Late Cretaceous) of Lebanon, is studied in detail. This new fossil genus belongs to the family Pycnodontidae, as shown by the presence of a posterior brush-like process on its parietal. Its long and broad premaxilla, bearing one short and very broad tooth, is the principal autapomorphy of this fish. Within the phylogeny of Pycnodontidae, Rhinopycnodus occupies an intermediate position between Ocloedus and Tepexichthys.

    Using Case-Level Context to Classify Cancer Pathology Reports

    Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence; for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports, but also to capture aggregate information regarding the entire cancer case based on case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We test our approach on a corpus of 431,433 cancer pathology reports, and we show that incorporating case-level context significantly boosts classification accuracy across six classification tasks: site, subsite, laterality, histology, behavior, and grade. We expect that with minimal modifications, our add-on can be applied to a wide range of other clinical text-based tasks.

    Adiabatic Quantum Support Vector Machines

    Adiabatic quantum computers can solve difficult optimization problems (e.g., the quadratic unconstrained binary optimization problem), and they seem well suited to train machine learning models. In this paper, we describe an adiabatic quantum approach for training support vector machines. We show that the time complexity of our quantum approach is an order of magnitude better than the classical approach. Next, we compare the test accuracy of our quantum approach against a classical approach that uses the Scikit-learn library in Python across five benchmark datasets (Iris, Wisconsin Breast Cancer (WBC), Wine, Digits, and Lambeq). We show that our quantum approach obtains accuracies on par with the classical approach. Finally, we perform a scalability study in which we compute the total training times of the quantum approach and the classical approach as the number of features and the number of data points in the training dataset increase. Our scalability results show that the quantum approach obtains a 3.5 to 4.5 times speedup over the classical approach on datasets with many (millions of) features.

    Deep Active Learning for Classifying Cancer Pathology Reports

    Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Conclusions: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling
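
    The abstract does not define its uncertainty measures, so the sketch below uses the definitions commonly given for margin and ratio uncertainty sampling: both compare the two highest predicted class probabilities and pick the unlabelled samples where they are closest. All function names and the toy probabilities are illustrative assumptions, not the authors' implementation.

```python
def top_two(probs):
    """Return the largest and second-largest class probabilities."""
    first, second = sorted(probs, reverse=True)[:2]
    return first, second

def margin_score(probs):
    """Gap between the top two classes; a smaller gap means more uncertainty."""
    first, second = top_two(probs)
    return first - second

def ratio_score(probs):
    """Second-best over best probability; closer to 1 means more uncertainty."""
    first, second = top_two(probs)
    return second / first

def select_most_uncertain(pool_probs, k, method="margin"):
    """Return indices of the k most uncertain samples in the unlabelled pool."""
    if method == "margin":
        key = lambda i: margin_score(pool_probs[i])    # ascending margin
    elif method == "ratio":
        key = lambda i: -ratio_score(pool_probs[i])    # descending ratio
    else:
        raise ValueError("method must be 'margin' or 'ratio'")
    return sorted(range(len(pool_probs)), key=key)[:k]

# Hypothetical classifier outputs for three unlabelled reports
pool = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25], [0.60, 0.30, 0.10]]
to_label = select_most_uncertain(pool, k=2, method="margin")
```

    In an active learning loop, the selected indices would be sent to annotators, added to the labelled set, and the model retrained before the next round.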

    Limitations of Transformers on Clinical Text Classification

    Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state of the art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures (a word-level convolutional neural network and a hierarchical self-attention network) and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT, pretraining and WordPiece tokenization, may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases than on understanding the contextual meaning of sequences of text.