484 research outputs found

    User-defined Machine Learning Functions

    In Data Science practice it is commonly accepted to abstract and slice big data architectures into functional layers, in particular a triad of governance, data analysis and persistence layers. However, moving input data to analysis, which is required when abstracting a data persistence layer from a data analysis layer, must be considered highly expensive at large scale. Especially in Machine Learning (ML), the data analytics layer requires intense data movement during preprocessing, data integration, preparation and analytics steps. Therefore, we propose to consider applying User-defined functions (UDFs) with ML capabilities directly at the data persistence layer, i.e. at the database. We observed that in traditional on-premise (i.e. non-cloud) RDBMS environments it might be most efficient overall to apply ML UDFs if only singular and self-contained ML tasks are to be integrated. Whereas ML functions in databases were predominantly available in proprietary solutions in the past, there are now entirely new opportunities to integrate Python ML libraries with open source RDBMS. With Python as one dominant language for ML applications in Data Science, the facilitation of Python ML UDFs now achieved opens a broad range of opportunities to add Python ML capabilities to already existing persistence layers, without having to build an additional data analysis layer and related pipeline. With this presentation we deliver preliminary results of our industry research on database-centric ML applications, and we open-source code for the application of (un)supervised learning models.
    Herrmann, M.; Fiedler, M. (2020). User-defined Machine Learning Functions. Editorial Universitat Politècnica de València. http://hdl.handle.net/10251/149581

    jPREdictor: a versatile tool for the prediction of cis-regulatory elements

    Gene regulation is the process through which an organism effects spatial and temporal differences in gene expression levels. Knowledge of cis-regulatory elements as key players in gene regulation is indispensable for the understanding of the latter and of the development of organisms. Here we present the tool jPREdictor for the fast and versatile prediction of cis-regulatory elements on a genome-wide scale. The prediction is based on clusters of individual motifs and any combination of these into multi-motifs with selectable minimal and maximal distances. Individual motifs can be of heterogeneous classes, such as simple sequence motifs or position-specific scoring matrices. Cluster scores are weighted occurrences of multi-motifs, where the weights are derived from positive and negative training sets. We illustrate the flexibility of the jPREdictor with a new prediction of Polycomb/Trithorax Response Elements in Drosophila melanogaster. jPREdictor is available as a graphical user interface for online use and for download at

    System Concepts for Bi- and Multi-Static SAR Missions

    The performance and capabilities of bi- and multistatic spaceborne synthetic aperture radar (SAR) are analyzed. Such systems can be optimized for a broad range of applications like frequent monitoring, wide swath imaging, single-pass cross-track interferometry, along-track interferometry, resolution enhancement or radar tomography. Further potential arises from digital beamforming on receive, which makes it possible to gather additional information about the direction of the scattered radar echoes. This directional information can be used to suppress interference, to improve geometric and radiometric resolution, or to increase the unambiguous swath width. Furthermore, a coherent combination of multiple receiver signals will allow for a suppression of azimuth ambiguities. For this, a reconstruction algorithm is derived which enables recovery of the unambiguous Doppler spectrum even when non-optimum receiver aperture displacements lead to a non-uniform sampling of the SAR signal. This algorithm also has great potential for systems relying on the displaced phase center (DPC) technique, like the high resolution wide swath (HRWS) SAR or the split antenna approach in the TerraSAR-X and Radarsat II satellites.

    Semantic-Aware Environment Perception for Mobile Human-Robot Interaction

    Current technological advances open up new opportunities for bringing human-machine interaction to a new level of human-centered cooperation. In this context, a key issue is the semantic understanding of the environment, which enables mobile robots to engage in more complex interactions and to communicate more easily with humans. Prerequisites are the vision-based registration of semantic objects and humans, where the latter are further analyzed as potential interaction partners. Despite significant research achievements, the reliable and fast registration of semantic information remains a challenging task for mobile robots in real-world scenarios. In this paper, we present a vision-based system for mobile assistive robots to enable a semantic-aware environment perception without additional a-priori knowledge. We deploy our system on a mobile humanoid robot, which enables us to test our methods in real-world applications.
    Comment: ISPA 201

    Automated Deception Detection from Videos: Using End-to-End Learning Based High-Level Features and Classification Approaches

    Deception detection is an interdisciplinary field attracting researchers from psychology, criminology, computer science, and economics. We propose a multimodal approach combining deep learning and discriminative models for automated deception detection. Using video modalities, we employ convolutional end-to-end learning to analyze gaze, head pose, and facial expressions, achieving promising results compared to state-of-the-art methods. Due to limited training data, we also utilize discriminative models for deception detection. Although sequence-to-class approaches are explored, discriminative models outperform them due to data scarcity. Our approach is evaluated on five datasets, including a new Rolling-Dice Experiment motivated by economic factors. Results indicate that facial expressions outperform gaze and head pose, and combining modalities with feature selection enhances detection performance. Differences in expressed features across datasets emphasize the importance of scenario-specific training data and the influence of context on deceptive behavior. Cross-dataset experiments reinforce these findings. Despite the challenges posed by low-stake datasets, including the Rolling-Dice Experiment, deception detection performance exceeds chance levels. Our proposed multimodal approach and comprehensive evaluation shed light on the potential of automating deception detection from video modalities, opening avenues for future research.
    Comment: 29 pages, 17 figures (19 if counting subfigures)

    Serum amino acid profiles and their alterations in colorectal cancer

    Mass spectrometry-based serum metabolic profiling is a promising tool to analyse complex cancer-associated metabolic alterations, which may broaden our pathophysiological understanding of the disease and may function as a source of new cancer-associated biomarkers. Highly standardized serum samples of patients suffering from colon cancer (n=59) and controls (n=58) were collected at the University Hospital Leipzig. We based our investigations on amino acid screening profiles using electrospray tandem-mass spectrometry. Metabolic profiles were evaluated using the Analyst 1.4.2 software. General, comparative and equivalence statistics were performed with R 2.12.2. Eleven of 26 serum amino acid concentrations were significantly different between colorectal cancer patients and healthy controls. We found a model including CEA, glycine, and tyrosine to be the best discriminating model and superior to CEA alone, with an AUROC of 0.878 (95% CI 0.815-0.941). Our serum metabolic profiling in colon cancer revealed multiple significant disease-associated alterations in the amino acid profile with promising diagnostic power. Further large-scale studies are necessary to elucidate the potential of our model also to discriminate between cancer and potential differential diagnoses. In conclusion, serum glycine and tyrosine in combination with CEA are superior to CEA alone for the discrimination between colorectal cancer patients and controls.
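The AUROC reported above can be read as the probability that a randomly chosen patient receives a higher model score than a randomly chosen control. The following is a minimal sketch of that rank-based estimate, not the study's R analysis; the function name `auroc` and the scores are made up for illustration.

```python
from itertools import product


def auroc(case_scores, control_scores):
    """Empirical AUROC: P(case score > control score), ties count one half."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores, not data from the study.
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
print(auroc(cases, controls))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.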

    Mechanism-based traps enable protease and hydrolase substrate discovery.

    Hydrolase enzymes, including proteases, are encoded by 2-3% of the genes in the human genome and 14% of these enzymes are active drug targets (ref. 1). However, the activities and substrate specificities of many proteases, especially those embedded in membranes, and other hydrolases remain unknown. Here we report a strategy for creating mechanism-based, light-activated protease and hydrolase substrate traps in complex mixtures and live mammalian cells. The traps capture substrates of hydrolases, which normally use a serine or cysteine nucleophile. Replacing the catalytic nucleophile with genetically encoded 2,3-diaminopropionic acid allows the first step reaction to form an acyl-enzyme intermediate in which a substrate fragment is covalently linked to the enzyme through a stable amide bond (ref. 2); this enables stringent purification and identification of substrates. We identify new substrates for proteases, including an intramembrane mammalian rhomboid protease RHBDL4 (refs. 3,4). We demonstrate that RHBDL4 can shed luminal fragments of endoplasmic reticulum-resident type I transmembrane proteins to the extracellular space, as well as promoting non-canonical secretion of endogenous soluble endoplasmic reticulum-resident chaperones. We also discover that the putative serine hydrolase retinoblastoma binding protein 9 (ref. 5) is an aminopeptidase with a preference for removing aromatic amino acids in human cells. Our results exemplify a powerful paradigm for identifying the substrates and activities of hydrolase enzymes.

    Pancreatic carcinoma, pancreatitis, and healthy controls: metabolite models in a three-class diagnostic dilemma

    Metabolomics, as one of the most rapidly growing technologies in the "-omics" field, denotes the comprehensive analysis of low molecular-weight compounds and their pathways. Cancer-specific alterations of the metabolome can be detected by high-throughput mass-spectrometric metabolite profiling and serve as a considerable source of new markers for the early differentiation of malignant diseases as well as their distinction from benign states. However, a comprehensive framework for the statistical evaluation of marker panels in a multi-class setting has not yet been established. We collected serum samples of 40 pancreatic carcinoma patients, 40 controls, and 23 pancreatitis patients according to standard protocols and generated amino acid profiles by routine mass-spectrometry. In an intrinsic three-class bioinformatic approach we compared these profiles, evaluated their selectivity and computed multi-marker panels combined with the conventional tumor marker CA19-9. Additionally, we tested for non-inferiority and superiority to determine the diagnostic surplus value of our multi-metabolite marker panels. Compared to CA19-9 alone, the combined amino acid-based metabolite panel had a superior selectivity for the discrimination of healthy controls, pancreatitis, and pancreatic carcinoma patients [volume under ROC surface (VUS) = 0.891 (95% CI 0.794-0.968)]. We combined highly standardized samples, a three-class study design, a high-throughput mass-spectrometric technique, and a comprehensive bioinformatic framework to identify metabolite panels selective for all three groups in a single approach. Our results suggest that metabolomic profiling necessitates appropriate evaluation strategies and, despite all its current limitations, can deliver marker panels with high selectivity even in multi-class settings.
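For a single scalar score, the volume under the ROC surface can be estimated as the fraction of (control, pancreatitis, carcinoma) triples whose scores fall in the expected order. The sketch below illustrates that estimator only. The abstract does not say how the VUS was computed, so the class ordering, the function name `vus`, and the scores are all assumptions made for illustration.

```python
from itertools import product


def vus(low_scores, mid_scores, high_scores):
    """Empirical VUS: share of triples with low < mid < high (strict order)."""
    if not (low_scores and mid_scores and high_scores):
        raise ValueError("all three groups need at least one score")
    ordered = sum(
        1
        for low, mid, high in product(low_scores, mid_scores, high_scores)
        if low < mid < high
    )
    return ordered / (len(low_scores) * len(mid_scores) * len(high_scores))


# Hypothetical panel scores, not data from the study.
# Assumed ordering: controls < pancreatitis < carcinoma.
control_scores = [0.1, 0.2, 0.5]
pancreatitis_scores = [0.3, 0.4]
carcinoma_scores = [0.35, 0.8, 0.9]
print(vus(control_scores, pancreatitis_scores, carcinoma_scores))
```

With three classes, a score that carries no information yields a VUS near 1/6 rather than 1/2, which is why a three-class VUS is not directly comparable to a two-class AUROC.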