39,625 research outputs found

    Developing a comprehensive framework for multimodal feature extraction

    Full text link
    Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions---ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package's architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability

    Avoiding Unnecessary Information Loss: Correct and Efficient Model Synchronization Based on Triple Graph Grammars

    Full text link
    Model synchronization, i.e., the task of restoring consistency between two interrelated models after a model change, is a challenging task. Triple Graph Grammars (TGGs) specify model consistency by means of rules that describe how to create consistent pairs of models. These rules can be used to automatically derive further rules, which describe how to propagate changes from one model to the other or how to change one model in such a way that propagation is guaranteed to be possible. Restricting model synchronization to these derived rules, however, may lead to unnecessary deletion and recreation of model elements during change propagation. This is inefficient and may cause unnecessary information loss, i.e., when deleted elements contain information that is not represented in the second model, this information cannot be recovered easily. Short-cut rules have recently been developed to avoid unnecessary information loss by reusing existing model elements. In this paper, we show how to automatically derive (short-cut) repair rules from short-cut rules to propagate changes such that information loss is avoided and model synchronization is accelerated. The key ingredients of our rule-based model synchronization process are these repair rules and an incremental pattern matcher informing about suitable applications of them. We prove the termination and the correctness of this synchronization process and discuss its completeness. As a proof of concept, we have implemented this synchronization process in eMoflon, a state-of-the-art model transformation tool with inherent support of bidirectionality. Our evaluation shows that repair processes based on (short-cut) repair rules have considerably decreased information loss and improved performance compared to former model synchronization processes based on TGGs.Comment: 33 pages, 20 figures, 3 table

    Automated verification of model transformations based on visual contracts

    Full text link
    The final publication is available at Springer via http://dx.doi.org/10.1007/s10515-012-0102-yModel-Driven Engineering promotes the use of models to conduct the different phases of the software development. In this way, models are transformed between different languages and notations until code is generated for the final application. Hence, the construction of correct Model-to-Model (M2M) transformations becomes a crucial aspect in this approach. Even though many languages and tools have been proposed to build and execute M2M transformations, there is scarce support to specify correctness requirements for such transformations in an implementation-independent way, i.e., irrespective of the actual transformation language used. In this paper we fill this gap by proposing a declarative language for the specification of visual contracts, enabling the verification of transformations defined with any transformation language. The verification is performed by compiling the contracts into QVT to detect disconformities of transformation results with respect to the contracts. As a proof of concept, we also report on a graphical modeling environment for the specification of contracts, and on its use for the verification of transformations in several case studies.This work has been funded by the Austrian Science Fund (FWF) under grant P21374-N13, the Spanish Ministry of Science under grants TIN2008-02081 and TIN2011-24139, and the R&D programme of the Madrid Region under project S2009/TIC-1650

    Automating embedded analysis capabilities and managing software complexity in multiphysics simulation part I: template-based generic programming

    Full text link
    An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering

    Customer churn prediction in telecom using machine learning and social network analysis in big data platform

    Full text link
    Customer churn is a major problem and one of the most important concerns for large companies. Due to the direct effect on the revenues of the companies, especially in the telecom field, companies are seeking to develop means to predict potential customer to churn. Therefore, finding factors that increase customer churn is important to take necessary actions to reduce this churn. The main contribution of our work is to develop a churn prediction model which assists telecom operators to predict customers who are most likely subject to churn. The model developed in this work uses machine learning techniques on big data platform and builds a new way of features' engineering and selection. In order to measure the performance of the model, the Area Under Curve (AUC) standard measure is adopted, and the AUC value obtained is 93.3%. Another main contribution is to use customer social network in the prediction model by extracting Social Network Analysis (SNA) features. The use of SNA enhanced the performance of the model from 84 to 93.3% against AUC standard. The model was prepared and tested through Spark environment by working on a large dataset created by transforming big raw data provided by SyriaTel telecom company. The dataset contained all customers' information over 9 months, and was used to train, test, and evaluate the system at SyriaTel. The model experimented four algorithms: Decision Tree, Random Forest, Gradient Boosted Machine Tree "GBM" and Extreme Gradient Boosting "XGBOOST". However, the best results were obtained by applying XGBOOST algorithm. This algorithm was used for classification in this churn predictive model.Comment: 24 pages, 14 figures. PDF https://rdcu.be/budK

    An Underwater SLAM System using Sonar, Visual, Inertial, and Depth Sensor

    Full text link
    This paper presents a novel tightly-coupled keyframe-based Simultaneous Localization and Mapping (SLAM) system with loop-closing and relocalization capabilities targeted for the underwater domain. Our previous work, SVIn, augmented the state-of-the-art visual-inertial state estimation package OKVIS to accommodate acoustic data from sonar in a non-linear optimization-based framework. This paper addresses drift and loss of localization -- one of the main problems affecting other packages in underwater domain -- by providing the following main contributions: a robust initialization method to refine scale using depth measurements, a fast preprocessing step to enhance the image quality, and a real-time loop-closing and relocalization method using bag of words (BoW). An additional contribution is the addition of depth measurements from a pressure sensor to the tightly-coupled optimization formulation. Experimental results on datasets collected with a custom-made underwater sensor suite and an autonomous underwater vehicle from challenging underwater environments with poor visibility demonstrate performance never achieved before in terms of accuracy and robustness
    • …
    corecore