Search CORE

1,492 research outputs found

Challenges of Big Data Analysis

Author: Fan Jianqing
Han Fang
Liu Han
Publication venue: 'Oxford University Press (OUP)'
Publication date: 06/02/2014

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that cannot be detected with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and of how these features drive paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated because of incidental endogeneity. Such unvalidated assumptions can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
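The spurious correlation mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: the function names, sample size, and feature counts are assumptions chosen for illustration. It draws a response and a set of features that are all independent of it, then reports the largest absolute sample correlation, which tends to grow as the number of features increases while the sample size stays fixed.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_samples, n_features, rng):
    """Largest |correlation| between a response and features independent of it."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        # Each feature is pure noise, so any correlation with y is spurious.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(x, y)))
    return best


def average_max_correlation(n_samples, n_features, n_trials=20, seed=0):
    """Average the maximum spurious correlation over repeated simulations."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(
        max_spurious_correlation(n_samples, n_features, rng)
        for _ in range(n_trials)
    )
    return total / n_trials


if __name__ == "__main__":
    # Illustrative settings only: 50 samples, increasing dimensionality.
    for n_features in (10, 100, 1000):
        print(n_features, round(average_max_correlation(50, n_features), 3))
```

With the sample size held at 50, the printed value should rise as the feature count grows, even though no feature is actually related to the response.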

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

BrainIAK: The Brain Imaging Analysis Kit

Author: Anderson M.
Antony J.
Baldassano C.
Brooks P.
Cai M.
Capota M.
Chen P.
Cohen J.
Ellis C.
Hasson U.
Henselman-Petrusek G.
Huberdeau D.
Hutchinson J.
Kumar M.
Li K.
Li P.
Lu Q.
Manning J.
Mennen A.
Nastase S.
Norman K.
Ramadge P.
Richard H.
Schapiro A.
Schuck N.
Shvartsman M.
Sundaram N.
Suo D.
Turek J.
Turk-Browne N.
Vo V.
Wallace G.
Wang Y.
Willke T.
Zhang H.
Zhu X.
Publication venue: 'Center for Open Science'
Publication date: 01/01/2020

MPG.PuRe

The 5th International Conference on Biomedical Engineering and Biotechnology (ICBEB 2016)

Author: Ailong Cai
Baiying Lei
Baodong Gai
Baoliang Sun
Bin Wang
Bin Yan
Binquan Li
Changyu Tu
Chengxin Yan
Chiehhsuan Wei
Chunlan Yang
Cong Xu
Daisheng Luo
Dong Ni
Dongyan Yang
Fang Han
Farnaz Farokhian
Feng Shi
Feng Zhao
Fuwen Lai
Guanyu Li
Guixue Liu
Haibing Bu
Haijun Lei
Haizhu Xie
Hao Fang
Hasan Demirel
Hua Zhong
Huihong Gong
Huihui Yang
Iman Beheshti
Ioannis Manousakas
Jian Zhang
Jianping Yin
Jie Yang
Jiechuan Ren
Jiejue Ma
Jing Xiong
Jingke Zhang
Jingwen Zhuang
Junghua Ho
Junzheng Zheng
Juyoung Park
Ke Gan
Keming Mao
Kuan Li
Kyungtae Kang
Lanhua Zhang
Lei Li
Lili Zhao
Linyuan Wang
LiSha Tan
Manning Wang
Mao Wang
Mei Bai
Meixia Su
Minghua Zhao
Mingwu Jin
Mingyue Ding
Nan Fu
Ning Mao
Ping Sun
Preetha Phillips
Qi Mao
Qiang Liu
Qingchun Li
Qun Wang
Rongmao Li
Shaode Yu
Shaomao Lv
Shaoqing Wang
Shaowu Li
Shaoyin Duan
Shengli Li
Shihou Sheng
Shuguang Zhao
Shuicai Wu
Shuihua Wang
Shuo Li
Shuwen Chen
Sidan Du
Simin Lin
Siping Chen
Song Gao
Soyeun Kim
Tao Gong
Tianfu Wang
Tianxu Zhang
Wan Li
Wangsheng Lu
Wei Liu
Wei Peng
Wensheng Li
Wenyu Liang
Xianbin Cheng
Xiancun Yang
Xiaohui Hu
Xiaolei Song
Xiaolong Sun
Xin Zhang
Xinnuan Mu
Xuming Zhang
Y. F. Li
Yafeng Zhan
Yan Zhang
Yanchun Zhu
Yanhong Zhou
Yanhui Ding
Yaoqin Xie
Yaping Wang
Yifei Liu
Yijie Ren
Yin Chang
Yingnan Nie
Yixian Liu
Yongchao Wang
Yonghong Liu
Yongxin Zhang
Yudong Zhang
Yulu Song
Yun Liang
Yupei Chen
Yuxiang Wu
Zeyuan Lu
Zhang Yang
Zhen Yu
Zhengchao Dong
Zhenghao Shi
Zhenghua Huang
Zhijian Song
ZhiJun Gao
Zhimin Chen
Zhuofu Deng
Publication venue: Springer Nature
Publication date: 01/01/2016

Springer - Publisher Connector

Institutional Repository of Yantai Institute of Coastal Zone Research, CAS

Large-scale Data Analysis and Deep Learning Using Distributed Cyberinfrastructures and High Performance Computing

Author: Platania Richard Dodge
Publication venue: LSU Digital Commons
Publication date: 27/06/2019

Data in many research fields continue to grow in both size and complexity. For instance, recent technological advances have increased data throughput in various biology-related endeavors, such as DNA sequencing, molecular simulations, and medical imaging. In addition, the variety of data types (textual, signal, image, etc.) adds further complexity to the analysis. As such, there is a need for purpose-built applications that cater to the type of data. Several considerations must be made when creating a tool for a particular dataset. First, we must consider the type of algorithm required for analyzing the data. Next, since the size and complexity of the data impose high computation and memory requirements, it is important to select a suitable hardware environment on which to build the application. By carefully developing the algorithm and selecting the hardware, we can provide an effective environment in which to analyze huge amounts of highly complex data at large scale. In this dissertation, I describe in detail my applications of big data and deep learning techniques to the analysis of large and complex data. I investigate how big data frameworks, such as Hadoop, can be applied to problems such as large-scale molecular dynamics simulations. Following this, many popular deep learning frameworks are evaluated and compared to find those that suit certain hardware setups and deep learning models. Then, we explore an application of deep learning to a biomedical problem, namely ADHD diagnosis from fMRI data. Lastly, I demonstrate a framework for real-time and fine-grained vehicle detection and classification. Each of the works in this dissertation implements a unique large-scale analysis algorithm or deep learning model that caters to the problem and leverages specialized computing resources.

Louisiana State University

One-Class Classification: Taxonomy of Study and Review of Techniques

Author: Khan Shehroz S.
Madden Michael G.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 29/11/2013

One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled, or not well defined. This unique situation constrains the learning of efficient classifiers, because the class boundary must be defined with knowledge of the positive class only. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used, and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques, and methodologies, with a focus on their significance, limitations, and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research. Comment: 24 pages + 11 pages of references, 8 figures
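As a rough illustration of defining a class boundary from positive examples alone, the sketch below fits a centroid to the positive class and accepts new points that lie within a distance threshold taken from the training data. It is a deliberately simple boundary rule and is not one of the methods surveyed in the paper; the class name `CentroidOneClassClassifier` and the `quantile` parameter are hypothetical.

```python
import math


class CentroidOneClassClassifier:
    """Hypothetical sketch: accept points near the centroid of the positive class."""

    def __init__(self, quantile=0.95):
        # Fraction of training distances that should fall inside the boundary.
        self.quantile = quantile
        self.center = None
        self.threshold = None

    def _distance(self, point):
        """Euclidean distance from a point to the fitted centroid."""
        return math.sqrt(sum((p - c) ** 2 for p, c in zip(point, self.center)))

    def fit(self, positives):
        """Learn the centroid and distance threshold from positive examples only."""
        n = len(positives)
        dim = len(positives[0])
        self.center = [sum(row[j] for row in positives) / n for j in range(dim)]
        distances = sorted(self._distance(row) for row in positives)
        index = min(n - 1, int(self.quantile * n))
        self.threshold = distances[index]
        return self

    def predict(self, point):
        """Return True if the point falls inside the learned boundary."""
        return self._distance(point) <= self.threshold


if __name__ == "__main__":
    # Toy positive-class data; no negative examples are used for training.
    positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    model = CentroidOneClassClassifier().fit(positives)
    print(model.predict((0.6, 0.4)), model.predict((5.0, 5.0)))
```

No negative examples appear in training; the threshold alone determines what is rejected, which is why the choice of `quantile` matters in this sketch.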

arXiv.org e-Print Archive

Access to Research at National University of Ireland, Galway

Big data analytics: Machine learning and Bayesian learning perspectives—What is done? What is not?

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2018

Big data analytics provides an interdisciplinary framework that is essential to support the current trend of solving real-world problems collaboratively. The progression of the big data analytics framework must be clearly understood so that novel approaches can be developed to advance this state-of-the-art discipline. Failing to track the progression of this fast-growing discipline may lead to duplicated research and wasted effort. Its main companion field, machine learning, helps solve many big data analytics problems; therefore, it is also important to understand the progression of machine learning within the big data analytics framework. One of the current research efforts in big data analytics is the integration of deep learning and Bayesian optimization, which can help automate the initialization and optimization of deep learning hyperparameters and enhance the implementation of iterative algorithms in software. The hyperparameters include the weights used in deep learning and the number of clusters in Bayesian mixture models that characterize data heterogeneity. Big data analytics research also requires computer systems and software that are capable of storing, retrieving, processing, and analyzing big data that are generally large, complex, heterogeneous, unstructured, unpredictable, and exposed to scalability problems. Therefore, it is appropriate to introduce a new research topic, transformative knowledge discovery, that provides a research ground to study and develop smart machine learning models and algorithms that are automatic, adaptive, and cognitive, in order to address big data analytics problems and challenges. The new research domain will also create opportunities to work in this interdisciplinary research space and develop solutions to support research in other disciplines that may lack expertise in big data analytics.
For example, research such as the detection and characterization of retinal diseases in medical sciences and the classification of highly interacting species in environmental sciences can benefit from knowledge and expertise in big data analytics.

The University of North Carolina at Greensboro

Large-scale Machine Learning in High-dimensional Datasets

Author: Hansen Toke Jansen
Publication venue: Technical University of Denmark
Publication date: 01/01/2013

Online Research Database In Technology