19 research outputs found

    Big data techniques for processing massive data streams in real time (Técnicas big data para el procesamiento de flujos de datos masivos en tiempo real)

    Doctoral Program in Biotechnology, Engineering and Chemical Technology. Research line: Engineering, Data Science and Bioinformatics. Program code: DBI. Line code: 111. Machine learning techniques have become one of the resources most in demand by companies due to the large volume of data that surrounds us these days. The main objective of these technologies is to solve complex problems in an automated way using data. One of the current perspectives of machine learning is the analysis of continuous flows of data, or data streaming. This approach is increasingly requested by enterprises as a result of the large number of information sources producing time-indexed data at high frequency, such as sensors, Internet of Things devices, social networks, etc. However, research today is more focused on the study of historical data than on data received in streaming. One of the main reasons for this is the enormous challenge that this type of data presents for the modeling of machine learning algorithms. This Doctoral Thesis is presented in the form of a compendium of publications with a total of 10 scientific contributions in international conferences and journals with a high impact index in the Journal Citation Reports (JCR). The research developed during the PhD Program focuses on the study and analysis of real-time or streaming data through the development of new machine learning algorithms. Machine learning algorithms for real-time data involve a different type of modeling from the traditional one: the model is updated online to provide accurate responses in the shortest possible time. The main objective of this Doctoral Thesis is to contribute research value to the scientific community through three new machine learning algorithms. These algorithms are big data techniques, and two of them work with online or streaming data. In this way, contributions are made to the development of one of the current trends in Artificial Intelligence. 
With this purpose, algorithms are developed for descriptive and predictive tasks, i.e., unsupervised and supervised learning, respectively. Their common idea is the discovery of patterns in the data. The first technique developed during the dissertation is a triclustering algorithm to produce three-dimensional data clusters in offline or batch mode. This big data algorithm is called bigTriGen. In a general way, an evolutionary metaheuristic is used to search for groups of data with similar patterns. The model uses genetic operators such as selection, crossover, mutation or evaluation operators at each iteration. The goal of the bigTriGen is to optimize the evaluation function to achieve triclusters of the highest possible quality. It is used as the basis for the second technique implemented during the Doctoral Thesis. The second algorithm focuses on the creation of groups over three-dimensional data received in real-time or in streaming. It is called STriGen. Streaming modeling is carried out starting from an offline or batch model using historical data. As soon as this model is created, it starts receiving data in real-time. The model is updated in an online or streaming manner to adapt to new streaming patterns. In this way, the STriGen is able to detect concept drifts and incorporate them into the model as quickly as possible, thus producing triclusters in real-time and of good quality. The last algorithm developed in this dissertation follows a supervised learning approach for time series forecasting in real-time. It is called StreamWNN. A model is created with historical data based on the k-nearest neighbor or KNN algorithm. Once the model is created, data starts to be received in real-time. The algorithm provides real-time predictions of future data, keeping the model always updated in an incremental way and incorporating streaming patterns identified as novelties. 
The StreamWNN also identifies anomalous data in real time, allowing this feature to be used as a security measure during its application. The developed algorithms have been evaluated with real data from devices and sensors. These new techniques have proven to be very useful, providing meaningful triclusters and accurate predictions in real time. Universidad Pablo de Olavide de Sevilla. Departamento de Deporte e informátic
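
The forecasting idea described in this abstract (a nearest-neighbour model built from historical data, then updated incrementally as new values arrive) can be illustrated with a minimal sketch. This is not the StreamWNN algorithm itself: the class name `StreamingKNNForecaster`, the parameters `window` and `k`, the Euclidean distance, and the unweighted average are all assumptions made for illustration, and novelty and anomaly detection are omitted.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two equal-length windows."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class StreamingKNNForecaster:
    """Hypothetical one-step-ahead KNN forecaster with incremental updates."""

    def __init__(self, history, window=3, k=2):
        # Assumes len(history) > window so that at least one pattern exists.
        self.window = window
        self.k = k
        self.series = list(history)
        # Offline phase: each pattern pairs a window of past values
        # with the value that followed it in the historical data.
        self.patterns = [
            (tuple(self.series[i:i + window]), self.series[i + window])
            for i in range(len(self.series) - window)
        ]

    def predict(self):
        """Average the successors of the k patterns closest to the latest window."""
        query = tuple(self.series[-self.window:])
        nearest = sorted(self.patterns, key=lambda p: distance(p[0], query))[:self.k]
        return sum(successor for _, successor in nearest) / len(nearest)

    def update(self, value):
        """Online phase: store the newly observed pattern, then extend the series."""
        self.patterns.append((tuple(self.series[-self.window:]), value))
        self.series.append(value)


forecaster = StreamingKNNForecaster([1, 2, 3, 1, 2, 3, 1, 2, 3], window=3, k=2)
first = forecaster.predict()   # forecast from the historical model
forecaster.update(1)           # a new value arrives from the stream
second = forecaster.predict()  # forecast from the updated model
```

In this sketch every incoming value is simply appended to the pattern store; a real streaming system would also need to bound memory, for example by discarding old patterns.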

    Data Management for Dynamic Multimedia Analytics and Retrieval

    Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself and the curse of dimensionality that afflicts the representations we typically work with in its stead cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research that focuses on techniques that allow for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of staticity of data collections does not hold if one considers analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings and motivated by the fact that similar questions have once been answered for structured data in classical database research, this Thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing. 
We complement this by a cost-model that makes the often implicit trade-off between query execution speed and results quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest for converging the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other

    Thermodynamic Modeling and Thermoeconomic Optimization of Integrated Trigeneration Plants Using Organic Rankine Cycles

    In this study, the feasibility of using an organic Rankine cycle (ORC) in trigeneration plants is examined through thermodynamic modeling and thermoeconomic optimization. Three novel trigeneration systems are considered. Each one of these systems consists of an ORC, a heating-process heat exchanger, and a single-effect absorption chiller. The three systems are distinguished by the source of the heat input to the ORC. The systems considered are SOFC-trigeneration, biomass-trigeneration, and solar-trigeneration systems. For each system four cases are considered: electrical-power, cooling-cogeneration, heating-cogeneration, and trigeneration cases. A comprehensive thermodynamic analysis of each system is carried out. Furthermore, thermoeconomic optimization is conducted. The objective of the thermoeconomic optimization is to minimize the cost per exergy unit of the trigeneration product. The results of the thermoeconomic optimization are used to compare the three systems through thermodynamic and thermoeconomic analyses. This study illustrates key output parameters to assess the trigeneration systems considered. These parameters are energy efficiency, exergy efficiency, net electrical power, electrical to cooling ratio, and electrical to heating ratio. Moreover, exergy destruction modeling is conducted to identify and quantify the major sources of exergy destruction in the systems considered. In addition, an environmental impact assessment is conducted to quantify the amount of CO2 emissions in the systems considered. Furthermore, this study examines both the cost rate and cost per exergy unit of the electrical power and other trigeneration products. This study reveals that there is a considerable efficiency improvement when trigeneration is used, as compared to only electrical power production. In addition, the emissions of CO2 per MWh of trigeneration are significantly lower than those per MWh of electrical power. 
It was shown that the exergy destruction rates of the ORC evaporators for the three systems are quite high. Therefore, it is important to consider using more efficient ORC evaporators in trigeneration plants. In addition, this study reveals that the SOFC-trigeneration system has the highest electrical energy efficiency while the biomass-trigeneration system and the solar mode of the solar trigeneration system have the highest trigeneration energy efficiencies. In contrast, the SOFC-trigeneration system has the highest exergy efficiency for both electrical and trigeneration cases. Furthermore, the thermoeconomic optimization shows that the solar-trigeneration system has the lowest cost per exergy unit. Meanwhile the solar-trigeneration system has zero CO2 emissions and depends on a free renewable energy source. Therefore, it can be concluded that the solar-trigeneration system has the best thermoeconomic performance among the three systems considered

    Marine Power Systems

    Marine power systems have been designed to be a safer alternative to stationary plants in order to adhere to the regulations of classification societies. Marine steam boilers recently achieved 10 MPa pressure, in comparison to stationary plants, where a typical boiler pressure of 17 MPa was the standard for years. The latest land-based, ultra-supercritical steam boilers reach 25 MPa pressure and 620 °C temperatures, which increases plant efficiency and reduces fuel consumption. There is little chance that such a plant concept could be applied to ships. The reliability of marine power systems has to be higher due to the lack of available spare parts and services that are available for shore power systems. Some systems are still very expensive and are not able to be widely utilized for commercial merchant fleets such as COGAS, mainly due to the high cost of gas turbines. Submarine vehicles are also part of marine power systems, which have to be reliable and accurate in their operation due to their distant control centers. Materials that are used in marine environments are prone to faster corrosive wear, so special care also should be taken in this regard. The main aim of this Special Issue is to discuss the options and possibilities of utilizing energy in a more economical way, taking into account the reliability of such a system in operation

    On Signal Transduction in Human Embryonic Stem Cells: Towards a Systems View

    Human embryonic stem cells (hESC) have been a major cell source for research in regenerative medicine due to the demonstration of properties of self-renewal and efficient lineage specific differentiation, both on additions of external cues. Self-renewal provides the potential to extract large quantities of naïve cells that can then be differentiated to clinically relevant mature lineages. While there exists significant proof-of-concept to transform stem cells to the desired lineage, generating fully functional cell types is still an unmet challenge. A major reason for this is our limited understanding of the complexity of the transformation process. The overarching goal of this PhD research was to provide strategies to bring mathematical modeling into the realm of stem cell research, particularly to analyze the complex regulatory network of signaling events controlling cell fate. This work focused on the signaling pathways that in concert control the balance of self-renewal and endoderm differentiation of hESCs. We proposed a framework for developing mechanistic understanding from disparate signaling pathways using combinations of data-driven and equation based models. As a first step, we analyzed growth factor mediated PI3K/AKT pathway that must remain highly active to inhibit differentiation in self-renewal state. Using an integrated approach of mechanistic modeling, systems analysis and experimental validation we identified the role of a regulatory process (negative feedback) in maintaining signal amplitudes and controlling the propagation of parameter uncertainty down the pathway in the self-renewal state. To analyze endoderm differentiation, biclustering with bootstrapping formulation was used to identify co-regulated transcription factor patterns under a combinatorial modulation of endoderm inducing signaling pathways. 
In the final step, a detailed mechanistic analysis was done to characterize the dynamic features of the TGF-β/SMAD pathway for inducing endoderm. Utilizing a dynamic Bayesian network formalism, AKT-mediated crosstalk connections were inferred from the detailed time series data. Modeling of competing AKT-SMAD interactions followed by parametric ensemble analysis enabled identification of plausible hypotheses that could explain experimental observations. Using our integrated approach, we can now begin to rationally optimize for a desirable fate of hESCs with reduced variability and accelerate the path towards therapeutic applications of hESCs

    23-035-B

    The Annual Wheat Newsletter is edited by W.J. Raupp and published by the Wheat Genetic and Genomic Resources Center at Kansas State University. The scope of the Newsletter includes current project activities, cultivar releases, special reports, and publications of wheat researchers worldwide. The Newsletter annually has over 100 contributors from more than 30 countries

    Authentic self, incongruent acoustics : a corpus-based sociophonetic analysis of nonbinary speech.

    This thesis examines the ways six nonbinary speakers in Christchurch, New Zealand present their gender identity via speech. It examines their productions in reference to both established trends in the literature and speech collected from ten binary speakers (5M, 5F) at the same time. It seeks to examine whether, in addition to encoding binary gender, speech also encodes nonbinary gender. Three hypotheses are proposed and tested across multiple linguistic variables. The first hypothesis regards acoustic incongruence, and posits that nonbinary speakers may assert their nonbinary identities via speech that utilises particular combinations of variables which create either ambiguity or dissonance with regard to established binary-gender norms. Ambiguous gender incongruence arises from the use of speech that is neither reliably perceived as female, nor reliably perceived as male. Dissonant gender incongruence arises from the use of speech that is reliably perceived as both male and female. The second hypothesis predicts that nonbinary speakers will show greater variation in speech based on immediate contextual factors, compared to binary speakers. This difference is hypothesised to be due to nonbinary speakers paying greater attention to production, and the greater degree of variation in their own speech over time compared to binary speakers. Hypothesis 3 predicts that nonbinary speakers are not a uniform population, and that their use of incongruence will be influenced extensively by their individual condition, including their professed speech goals, history, and gender identity. The hypotheses are tested quantitatively with regard to five linguistic variables: pitch, pitch range, monophthong production, Vowel Space Area (VSA), and intervocalic /t/ frication rates. The interaction between multiple variables together is also considered. 
In-depth examinations of the variation utilised by a single speaker in the form of "Spotlights" address the hypotheses from a qualitative perspective. Overall, the thesis finds some evidence for Hypothesis 1. In every linguistic variable examined, nonbinary speakers show some distinction from binary speakers that is not explained fully via speaker Assigned Sex at Birth (ASAB). Some binary speakers also seem to produce incongruence, particularly binary women and particularly within single variables. The small scale of the study presents a limitation in addressing Hypothesis 2, but avenues for future work are identified. The qualitative evidence provides strong support for Hypothesis 3, in the examination of individual nonbinary speakers and the way their measured productions support their professed speech goals and identities. Overall, this dissertation presents one of the first comparative analyses of nonbinary speech, and presents a number of novel approaches to examining phonetic data from a statistical perspective that still accommodates an analysis of individual agency and goals in identity building

    22-063-B

    The Annual Wheat Newsletter is edited by W.J. Raupp and published by the Wheat Genetic and Genomic Resources Center at Kansas State University. The scope of the Newsletter includes current project activities, cultivar releases, special reports, and publications of wheat researchers worldwide. The Newsletter annually has over 100 contributors from more than 30 countries