
    Occlusion Coherence: Detecting and Localizing Occluded Faces

    The presence of occluders significantly impacts object recognition accuracy. However, occlusion is typically treated as an unstructured source of noise, and explicit models for occluders have lagged behind those for object appearance and shape. In this paper we describe a hierarchical deformable part model for face detection and landmark localization that explicitly models part occlusion. The proposed model structure makes it possible to augment positive training data with large numbers of synthetically occluded instances. This allows us to easily incorporate the statistics of occlusion patterns in a discriminatively trained model. We test the model on several benchmarks for landmark localization and detection, including challenging new data sets featuring significant occlusion. We find that the addition of an explicit occlusion model yields a detection system that outperforms existing approaches on occluded instances while maintaining competitive accuracy in detection and landmark localization for unoccluded instances.
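    The augmentation step described above lends itself to a concrete illustration: overlay random occluders on positive training crops so that occlusion-pattern statistics enter discriminative training. The following is a minimal sketch, not the authors' code; rectangular occluders, NumPy image crops, and the returned occlusion mask are all assumptions.

```python
# Minimal sketch of synthetic-occlusion augmentation (illustrative only).
import numpy as np

def occlude(face: np.ndarray, rng: np.random.Generator,
            min_frac: float = 0.2, max_frac: float = 0.5):
    """Overlay a random rectangular occluder on an HxWx3 face crop.

    Returns the occluded image and a binary mask marking occluded pixels,
    so part-level occlusion labels could be derived for training.
    """
    h, w = face.shape[:2]
    oh = int(h * rng.uniform(min_frac, max_frac))
    ow = int(w * rng.uniform(min_frac, max_frac))
    y0 = rng.integers(0, h - oh + 1)
    x0 = rng.integers(0, w - ow + 1)

    occluded = face.copy()
    # A flat random colour stands in for an arbitrary occluding object.
    occluded[y0:y0 + oh, x0:x0 + ow] = rng.integers(0, 256, size=3)

    mask = np.zeros((h, w), dtype=bool)
    mask[y0:y0 + oh, x0:x0 + ow] = True
    return occluded, mask

rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)  # stand-in crop
aug, mask = occlude(face, rng)
```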

    EFICAz²: enzyme function inference by a combined approach enhanced by machine learning

    Background: We previously developed EFICAz, an enzyme function inference approach that combines predictions from non-completely overlapping component methods. Two of the four components in the original EFICAz are based on the detection of functionally discriminating residues (FDRs). FDRs distinguish between members of an enzyme family that are homofunctional (classified under the EC number of interest) or heterofunctional (annotated with another EC number or lacking enzymatic activity). Each of the two FDR-based components is associated with one of two specific kinds of enzyme families. EFICAz exhibits high precision, except when the maximal test-to-training sequence identity (MTTSI) is lower than 30%. To improve EFICAz's performance in this regime, we: i) increased the number of predictive components and ii) took advantage of consensual information from the different components to make the final EC number assignment. Results: We have developed two new EFICAz components, analogous to the two FDR-based components, in which the discrimination between homo- and heterofunctional members is based on the evaluation, via Support Vector Machine (SVM) models, of all the aligned positions between the query sequence and the multiple sequence alignments associated with the enzyme families. Benchmark results indicate that: i) the new SVM-based components outperform their FDR-based counterparts, and ii) both SVM-based and FDR-based components generate unique predictions. We developed classification tree models to optimally combine the results from the six EFICAz components into a final EC number prediction. The new implementation of our approach, EFICAz², exhibits highly improved prediction precision at MTTSI < 30% compared to the original EFICAz, with only a slight decrease in prediction recall. A comparative analysis of enzyme function annotation of the human proteome by EFICAz² and KEGG shows that: i) when both sources make EC number assignments for the same protein sequence, the assignments tend to be consistent, and ii) EFICAz² generates considerably more unique assignments than KEGG. Conclusion: Performance benchmarks and the comparison with KEGG demonstrate that EFICAz² is a powerful and precise tool for enzyme function annotation, with multiple applications in genome analysis and metabolic pathway reconstruction. The EFICAz² web service is available at http://cssb.biology.gatech.edu/skolnick/webservice/EFICAz2/index.htm (doi:10.1186/1471-2105-10-107)
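    The final assignment step, in which classification trees fold the six component predictions into one EC-number call, can be pictured with a small sketch. The feature layout below (one confidence score per component plus MTTSI) and the mock labels are hypothetical and not the authors' design; only the idea of a tree combining component outputs is taken from the abstract.

```python
# Illustrative sketch: combine several component scores with a classification tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Mock training set: 6 component confidence scores plus MTTSI per query;
# label = whether the consensus EC-number assignment is correct (synthetic).
X = rng.uniform(0.0, 1.0, size=(500, 7))
y = (0.5 * X[:, :6].mean(axis=1) + 0.5 * X[:, 6] > 0.5).astype(int)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

query = rng.uniform(0.0, 1.0, size=(1, 7))
accept = tree.predict(query)[0]               # 1 -> report the consensus EC number
confidence = tree.predict_proba(query)[0, 1]  # tree's confidence in that call
```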

    Measuring neutrino masses with a future galaxy survey

    We perform a detailed forecast of how well a Euclid-like photometric galaxy and cosmic shear survey will be able to constrain the absolute neutrino mass scale. Adopting conservative assumptions about the survey specifications and assuming complete ignorance of the galaxy bias, we estimate that the minimal mass sum of sum m_nu ~ 0.06 eV in the normal hierarchy can be detected at 1.5 sigma to 2.5 sigma significance, depending on the model complexity, using a combination of galaxy and cosmic shear power spectrum measurements in conjunction with CMB temperature and polarisation observations from Planck. With better knowledge of the galaxy bias, the significance of the detection could potentially reach 5.4 sigma. Interestingly, neither Planck+shear nor Planck+galaxy alone can achieve this level of sensitivity; it is the combined effect of galaxy and cosmic shear power spectrum measurements that breaks the persistent degeneracies between the neutrino mass, the physical matter density, and the Hubble parameter. Notwithstanding this remarkable sensitivity to sum m_nu, Euclid-like shear and galaxy data will not be sensitive to the exact mass spectrum of the neutrino sector; no significant bias (< 1 sigma) in the parameter estimation is induced by fitting inaccurate models of the neutrino mass splittings to the mock data, nor does the goodness-of-fit of these models suffer any significant degradation relative to the true one (Delta chi^2_eff < 1).
    Comment: v1: 29 pages, 10 figures. v2: 33 pages, 12 figures; added sections on shape evolution and constraints in more complex models; accepted for publication in JCA
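    The combination logic behind such forecasts (Fisher information from independent probes adds, and marginalised errors come from the inverse of the summed matrix) is easy to sketch. The matrices below are invented placeholders, not the paper's forecast; only the recipe of summing Fisher matrices and reading off sigma(sum m_nu) is the point.

```python
# Toy sketch of combining Fisher matrices from independent probes.
import numpy as np

params = ["sum_mnu_eV", "omega_m", "h"]  # illustrative parameter ordering

F_cmb    = np.diag([50.0, 4.0e4, 2.0e3])
F_galaxy = np.array([[300.0, -80.0, 40.0],
                     [-80.0, 6.0e3, -300.0],
                     [40.0, -300.0, 900.0]])
F_shear  = np.array([[200.0, -60.0, 30.0],
                     [-60.0, 4.0e3, -200.0],
                     [30.0, -200.0, 700.0]])

def sigma_mnu(*fishers):
    """Marginalised 1-sigma error on sum m_nu from the combined Fisher matrix."""
    cov = np.linalg.inv(sum(fishers))
    return np.sqrt(cov[0, 0])

for label, combo in [("Planck+galaxy", (F_cmb, F_galaxy)),
                     ("Planck+shear", (F_cmb, F_shear)),
                     ("Planck+galaxy+shear", (F_cmb, F_galaxy, F_shear))]:
    s = sigma_mnu(*combo)
    print(f"{label}: sigma = {s:.3f} eV, significance of 0.06 eV = {0.06 / s:.1f} sigma")
```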

    Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission

    By deploying machine-learning algorithms at the network edge, edge learning can leverage the enormous volume of real-time data generated by billions of mobile devices to train AI models, which in turn enable intelligent mobile applications. In this emerging research area, one key direction is to efficiently utilize radio resources for wireless data acquisition so as to minimize the latency of executing a learning task at an edge server. Along this direction, we consider the specific problem of the retransmission decision in each communication round, which must ensure both the reliability and the quantity of the training data in order to accelerate model convergence. To solve the problem, a new retransmission protocol called data-importance aware automatic-repeat-request (importance ARQ) is proposed. Unlike classic ARQ, which focuses merely on reliability, importance ARQ selectively retransmits a data sample based on its uncertainty, which reflects how much the sample helps learning and can be measured using the model under training. Underpinning the proposed protocol is a derived communication-learning relation between two corresponding metrics, i.e., signal-to-noise ratio (SNR) and data uncertainty. This relation facilitates the design of a simple threshold-based policy for importance ARQ. The policy is first derived for the classic classifier model of the support vector machine (SVM), where the uncertainty of a data sample is measured by its distance to the decision boundary. The policy is then extended to the more complex model of convolutional neural networks (CNNs), where data uncertainty is measured by entropy. Extensive experiments have been conducted for both the SVM and the CNN using real datasets with balanced and imbalanced distributions. Experimental results demonstrate that importance ARQ effectively copes with channel fading and noise in wireless data acquisition, achieving faster model convergence than conventional channel-aware ARQ.
    Comment: This is an updated version: 1) extension to general classifiers; 2) consideration of imbalanced classification in the experiments. Submitted to IEEE Journal for possible publication
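    The threshold-based decision rule can be illustrated with a short sketch: a sample whose uncertainty under the current model is high is held to a stricter SNR target before it is accepted. This is not the paper's exact policy; the mapping from uncertainty to SNR target and the constants below are assumptions.

```python
# Minimal sketch of an importance-ARQ-style retransmission decision.
import numpy as np

def svm_uncertainty(x, w, b):
    """Uncertainty of a sample for a linear SVM: inverse distance to the boundary."""
    margin = abs(w @ x + b) / np.linalg.norm(w)
    return 1.0 / (margin + 1e-9)

def entropy_uncertainty(probs):
    """Uncertainty of a sample for a CNN classifier: predictive entropy."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def should_retransmit(snr_db, uncertainty, base_snr_db=5.0, gain_db=3.0):
    """Retransmit while the received SNR is below an uncertainty-scaled target."""
    target_snr_db = base_snr_db + gain_db * uncertainty
    return snr_db < target_snr_db

# Example: a sample near the SVM decision boundary demands a higher SNR.
w, b = np.array([1.0, -0.5]), 0.1
x = np.array([0.05, 0.12])
u = svm_uncertainty(x, w, b)
print(should_retransmit(snr_db=8.0, uncertainty=u))
```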

    Thermal effects compensation and associated uncertainty for large magnet assembly precision alignment

    Big science and ambitious industrial projects continually push technical requirements beyond the grasp of conventional engineering techniques. Examples are the extremely tight micrometric assembly and alignment tolerances required for celestial telescopes, particle accelerators, and the aerospace industry. Achieving such extreme requirements for large assemblies is limited largely by the capability of the metrology used, namely its uncertainty in relation to the alignment tolerance required. The work described here was done as part of a Marie Curie European research project hosted at CERN, Geneva, relating to future accelerators that require the spatial alignment of several thousand metre-plus assemblies to a common datum within a targeted combined standard uncertainty (u_c,tg(y)) of 12 μm. The work identified several gaps in knowledge limiting such a capability. Among these was the lack of uncertainty statements for the thermal error compensation applied to correct for the assembly's dimensional instability after metrology and during assembly and alignment. A novel methodology was developed in which a mixture of probabilistic modelling and high-precision traceable reference measurements was used to quantify the uncertainty of the various thermal expansion models used, namely empirical models, Finite Element Method (FEM) models, and FEM metamodels. Results have shown that the suggested methodology can accurately predict the uncertainty of the thermal deformation predictions, and thus of the compensations applied. The analysis of the results further showed how, using this method, a 'digital twin' of the engineering structure can be calibrated with known uncertainty of its thermal deformation predictions in the micrometric range. Specifically, the combined standard uncertainties (u_c(y)) of prediction for the empirical, FEM, and FEM metamodel approaches were validated to be at most 8.7 μm, 11.28 μm, and 12.24 μm, respectively, for the studied magnet assemblies.
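    The probabilistic-modelling idea can be sketched as a Monte Carlo propagation of uncertain inputs through a thermal-expansion model, with the spread of the predicted deformation quoted as the combined standard uncertainty. The simple linear expansion model and the input distributions below are illustrative assumptions, not the validated empirical or FEM models of the work.

```python
# Sketch: Monte Carlo estimate of the uncertainty of a thermal compensation.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

L = 2.0                                  # assembly length [m] (assumed)
alpha = rng.normal(11.7e-6, 0.3e-6, n)   # expansion coefficient [1/K] (assumed)
delta_T = rng.normal(1.5, 0.2, n)        # temperature offset from reference [K]

delta_L = alpha * L * delta_T            # predicted thermal deformation [m]

compensation_um = delta_L.mean() * 1e6   # value used to correct the metrology
u_c_um = delta_L.std(ddof=1) * 1e6       # combined standard uncertainty u_c(y)

print(f"compensation = {compensation_um:.2f} um, u_c(y) = {u_c_um:.2f} um")
```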

    Autonomous integrated GPS/INS navigation experiment for OMV. Phase 1: Feasibility study

    The Phase 1 research focused on the experiment definition. A tightly integrated Global Positioning System/Inertial Navigation System (GPS/INS) navigation filter design was analyzed and was shown, via detailed computer simulation, to provide precise position, velocity, and attitude (alignment) data to support the navigation and attitude control requirements of future NASA missions. The integrated filter was also shown to provide the opportunity to calibrate inertial instrument errors, which is particularly useful in reducing INS error growth during GPS outages. While the Orbital Maneuvering Vehicle (OMV) provides a good target platform for demonstration and for possible flight implementation to provide improved capability, a successful proof-of-concept ground demonstration can be obtained using any simulated mission scenario data, such as the Space Transfer Vehicle, Shuttle-C, or Space Station.
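    The core of such a filter is a Kalman update that corrects the INS-propagated state with GPS information. The sketch below is a deliberate simplification with assumed numbers and a position-level (loosely coupled) measurement, whereas the experiment used a tightly integrated, pseudorange-level filter with instrument-error calibration states.

```python
# Highly simplified sketch of a GPS-aided INS correction (Kalman update).
import numpy as np

# INS-propagated state [position m, velocity m/s] and its covariance (assumed).
x = np.array([1000.0, 7.5])
P = np.diag([25.0, 0.04])

# GPS position measurement and its noise covariance (assumed).
H = np.array([[1.0, 0.0]])
R = np.array([[9.0]])
z = np.array([1004.0])

# Kalman measurement update: blend INS prediction with the GPS fix.
S = H @ P @ H.T + R
K = P @ H.T @ np.linalg.inv(S)
x = x + K @ (z - H @ x)
P = (np.eye(2) - K @ H) @ P

print("corrected state:", x)
```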

    Inference of Markovian Properties of Molecular Sequences from NGS Data and Applications to Comparative Genomics

    Next Generation Sequencing (NGS) technologies generate large amounts of short-read data for many different organisms. The fact that NGS reads are generally short makes it challenging to assemble the reads and reconstruct the original genome sequence. For clustering genomes using such NGS data, word-count-based alignment-free sequence comparison is a promising approach, but for this approach the underlying expected word counts are essential. A plausible model for this underlying distribution of word counts is obtained by modelling the DNA sequence as a Markov chain (MC). For single long sequences, efficient statistics are available to estimate the order of the MC and the transition probability matrix. Because NGS data do not provide a single long sequence, inference methods for Markovian properties that are based on single long sequences cannot be applied directly to NGS short-read data. Here we derive a normal approximation for such word counts. We also show that the traditional chi-square statistic has an approximate gamma distribution, using the Lander-Waterman model for physical mapping. We propose several methods to estimate the order of the MC based on NGS reads and evaluate them using simulations. We illustrate the applications of our results by clustering genomic sequences of several vertebrate and tree species based on NGS reads, using alignment-free sequence dissimilarity measures. We find that the estimated order of the MC has a considerable effect on the clustering results, and that clustering with a MC of the estimated order gives a plausible grouping of the species.
    Comment: accepted by RECOMB-SEQ 201
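    The general shape of order estimation from reads can be sketched by pooling word counts across reads, fitting Markov chains of increasing order by maximum likelihood, and selecting the order with an information criterion. This is only an illustration under simulated reads and BIC-based selection; the paper's actual statistics rely on approximate word-count distributions derived under the Lander-Waterman model.

```python
# Illustrative sketch: Markov-chain order selection from pooled read word counts.
from collections import Counter
from math import log
import random

random.seed(0)
reads = ["".join(random.choice("ACGT") for _ in range(100)) for _ in range(200)]

def log_likelihood(reads, k):
    """Log-likelihood of an order-k Markov chain fitted to the pooled reads."""
    ctx = Counter()    # counts of length-k contexts
    trans = Counter()  # counts of (context, next base) pairs
    for r in reads:
        for i in range(len(r) - k):
            ctx[r[i:i + k]] += 1
            trans[(r[i:i + k], r[i + k])] += 1
    return sum(c * log(c / ctx[w]) for (w, _), c in trans.items())

def bic_order(reads, max_order=4):
    """Select the Markov-chain order minimising BIC over the pooled reads."""
    n = sum(max(len(r) - 1, 0) for r in reads)  # total observed transitions
    scores = {}
    for k in range(max_order + 1):
        n_params = (4 ** k) * 3                 # free transition probabilities
        scores[k] = -2 * log_likelihood(reads, k) + n_params * log(n)
    return min(scores, key=scores.get)

print("estimated order:", bic_order(reads))
```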