9 research outputs found

    The analysis and advanced extensions of canonical correlation analysis

    Drug discovery is the process of identifying compounds that have potentially meaningful biological activity. A problem that arises is that the number of compounds to search over can be quite large, sometimes numbering in the millions, making experimental testing intractable. For this reason, computational methods are employed to filter out compounds that do not exhibit strong biological activity. This filtering step, also called virtual screening, reduces the search space so that the remaining compounds can be experimentally tested. In this dissertation I will provide an approach to the problem of virtual screening based on Canonical Correlation Analysis (CCA) and several extensions that use kernel and spectral learning ideas. Specifically, these methods will be applied to the protein-ligand matching problem. Additionally, theoretical results analyzing the behavior of CCA in the High Dimension Low Sample Size (HDLSS) setting will be provided.

    PEPR: pipelines for evaluating prokaryotic references


    The Fast RODEO for Local Polynomial Regression

    An open challenge in nonparametric regression is finding fast, computationally efficient approaches to estimating local bandwidths for large data sets, in particular in two or more dimensions. In the work presented here we introduce a novel local bandwidth estimation procedure for local polynomial regression that combines the greedy search of the RODEO algorithm with linear binning. The result is a fast, computationally efficient algorithm we refer to as the fast RODEO. We motivate the development of our algorithm by using a novel scale-space approach to derive the RODEO. We conclude with a toy example and a real-world example using data from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite validation study, where we show the fast RODEO's improvement in accuracy and computational speed over two other standard approaches.
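    As a rough illustration of the linear binning step mentioned in the abstract above, the sketch below spreads each one-dimensional observation over its two neighbouring grid points in proportion to its distance from them. The function name `linear_bin`, its parameters, and the restriction to one dimension are assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
import numpy as np


def linear_bin(x, y, grid_min, grid_max, num_bins):
    """Linearly bin (x, y) pairs onto an equally spaced 1-D grid.

    Hypothetical helper: assumes every x lies in [grid_min, grid_max]
    and num_bins >= 2. Returns the binned counts and binned responses.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    delta = (grid_max - grid_min) / (num_bins - 1)  # grid spacing
    pos = (x - grid_min) / delta                    # position in grid units

    # Index of the grid point to the left of each observation.
    left = np.clip(np.floor(pos).astype(int), 0, num_bins - 2)
    frac = pos - left                               # share sent to the right neighbour

    counts = np.zeros(num_bins)
    sums = np.zeros(num_bins)
    # np.add.at accumulates correctly when indices repeat.
    np.add.at(counts, left, 1.0 - frac)
    np.add.at(counts, left + 1, frac)
    np.add.at(sums, left, (1.0 - frac) * y)
    np.add.at(sums, left + 1, frac * y)
    return counts, sums
```

    After binning, a smoother can work with `num_bins` grid values rather than every raw observation, which is the usual source of the speed-up in binned estimators.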

    The Spatial LASSO With Applications to Unmixing Hyperspectral Biomedical Images

    Hyperspectral imaging (HSI) is a spectroscopic method that uses densely sampled measurements along the electromagnetic spectrum to identify the unique molecular composition of an object. Traditionally, HSI has been associated with remote sensing-type applications, but it has recently found increased use in biomedicine, from investigations at the cellular level to the tissue level. One of the main challenges in the analysis of HSI is estimating the proportions, also called abundance fractions, of each of the molecular signatures. While there is great promise for HSI in biomedicine, large variability in the measurements and artifacts related to the instrumentation have slowed adoption into more widespread practice. In this article, we propose a novel regularization and variable selection method called the spatial LASSO (SPLASSO). The SPLASSO incorporates spatial information via a graph Laplacian-based penalty to help improve the model estimation process for multivariate response data. We show the strong performance of this approach on a benchmark HSI dataset, with considerable improvement in predictive accuracy over the standard LASSO. Supplementary materials for this article are available online.
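    To make the idea of a graph Laplacian-based penalty concrete, here is a minimal sketch of one common form: a squared-error loss plus an L1 term plus a trace penalty that shrinks coefficient columns of neighbouring pixels toward each other. The exact SPLASSO objective is not given in the abstract, so the form below, the function names, and the tuning parameters `lam_l1` and `lam_graph` are all assumptions for illustration.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - W of a symmetric adjacency matrix W."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_penalty(coefs, laplacian):
    """tr(B L B^T): sum over edges of w_ij * ||b_i - b_j||^2.

    coefs (B) has one column per pixel, so the penalty is small when
    neighbouring pixels have similar coefficient vectors.
    """
    return float(np.trace(coefs @ laplacian @ coefs.T))


def penalized_objective(X, Y, coefs, laplacian, lam_l1, lam_graph):
    """Assumed objective: squared error + L1 sparsity + spatial smoothness."""
    residual = Y - X @ coefs
    loss = 0.5 * np.sum(residual ** 2)
    sparsity = lam_l1 * np.abs(coefs).sum()
    smoothness = lam_graph * laplacian_penalty(coefs, laplacian)
    return loss + sparsity + smoothness


# Three pixels in a chain (0 - 1 - 2), one signature per pixel.
chain = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
L = graph_laplacian(chain)
B = np.array([[1.0, 2.0, 4.0]])
print(laplacian_penalty(B, L))  # (1-2)^2 + (2-4)^2 = 5.0
```

    Setting `lam_graph` to zero in this sketch recovers an ordinary LASSO-type objective, which is the baseline the abstract compares against.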

    Differential Absorption LIDAR for Greenhouse Gas Detection

    Our goal is to develop and characterize measurement technology for ground-based quantification of greenhouse-gas emissions from natural and anthropogenic sources and sinks. Toward this end, we are developing a differential absorption LIDAR (DIAL) system capable of remote measurements of atmospheric levels of the major greenhouse gases with 10-100 meter range resolution over distances of several kilometers. Our DIAL system operates in the eye-safe 1600 nm wavelength region, where three critical greenhouse gases, carbon dioxide (CO2), nitrous oxide (N2O), and methane (CH4), have vibrational absorption bands. Our final system will be capable of measuring both concentration and wind speed to determine gas fluxes. The system has the potential to serve as validation for other methods of remote sensing. This talk will discuss the development of our high-energy infrared transmitter and receiver, novel detection strategies, recent results, and our 100-meter-long indoor test facility. The test facility will enable careful characterization of the DIAL system and may serve as a reference for other remote sensing systems.

    11:25 Absolute Radiance Re-calibration of FIRST
    Harri Latvakoski, Jason Swasey, Kendall Johnson – USU/Space Dynamics Laboratory; Martin Mylnczak, David Johnson, Richard Cageao – NASA Langley Research Center
    ABSTRACT: The FIRST (Far-InfraRed Spectroscopy of the Troposphere) instrument is a 10 to 100 micron spectrometer with 0.64 micron resolution designed to measure the complete mid- and far-infrared radiance of the Earth's atmosphere. FIRST has been successfully used to obtain high-quality atmospheric radiance data from the ground and from a high-altitude balloon. A Fourier transform interferometer is used to provide the spectral resolution, and two on-board blackbodies are used for calibration. This presentation will discuss the recent re-calibration of FIRST at Space Dynamics Laboratory for absolute radiance accuracy. The calibration used the LWRICS (Long Wave Infrared calibration source) blackbody, which NIST testing shows to be accurate to the ~100 mK level in brightness temperature. There are several challenges in calibrating FIRST, including the large dynamic range, out-of-phase light, and drift in the interferogram phase. The accuracy goal for FIRST was 0.2 K over most of the 10 to 100 micron range, and results show FIRST meets this goal for a range of target temperatures.