
    Time Series Data Mining Algorithms for Identifying Short RNA in Arabidopsis thaliana

    The class of molecules called short RNAs (sRNAs) is known to play a key role in gene regulation. They are typically sequences of 21 to 25 nucleotides. The identification, clustering and classification of sRNA has recently become the focus of much research activity. The basic problem involves detecting regions of interest on the chromosome where the pattern of candidate matches is somehow unusual. Currently, there are no published algorithms for detecting regions of interest, and the unpublished methods that we are aware of involve bespoke rule-based systems designed for a specific organism. Work in this very new field has understandably focused on the outcomes rather than the methods used to obtain the results. In this paper we propose two generic approaches that place the specific biological problem in the wider context of time series data mining problems. Both methods are based on treating the occurrences on a chromosome, or “hit count” data, as a time series, then running a sliding window along a chromosome and measuring unusualness. This formulation means we can treat finding unusual areas of candidate RNA activity as a variety of time series anomaly detection problem. The first set of approaches is model-based. We specify a null hypothesis distribution for not being an sRNA, then estimate the p-values along the chromosome. The second approach is instance-based. We identify some typical shapes from known sRNA, then use dynamic time warping and Fourier transform based distance to measure how closely the candidate series matches. We demonstrate that these methods can find known sRNA on Arabidopsis thaliana chromosomes and illustrate the benefits of the added information provided by these algorithms.
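
The model-based idea in the abstract above (slide a window along the hit-count series and score each window against a null distribution) can be illustrated with a minimal sketch. The abstract does not say which null distribution the authors use, so the Poisson null below is an assumption, and the names `poisson_sf`, `window_p_values`, `width` and `rate` are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def window_p_values(hits, width, rate):
    """Return one p-value per window start position.

    hits  : hit counts per chromosome position
    width : window length in positions
    rate  : assumed background hits per position under the null
    """
    lam = rate * width
    total = sum(hits[:width])
    p_values = [poisson_sf(total, lam)]
    for start in range(1, len(hits) - width + 1):
        # Update the running sum instead of re-summing the window.
        total += hits[start + width - 1] - hits[start - 1]
        p_values.append(poisson_sf(total, lam))
    return p_values


if __name__ == "__main__":
    hits = [0, 0, 1, 0, 0, 5, 6, 4, 0, 0]  # made-up illustrative counts
    p = window_p_values(hits, width=3, rate=0.5)
    print("most unusual window starts at", p.index(min(p)))
```

Windows with the smallest p-values are the candidate regions of interest. In practice the background rate would have to be estimated from data, and the many overlapping windows would call for a multiple-testing correction.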

    ADVANCED FAULT AREA IDENTIFICATION AND FAULT LOCATION FOR TRANSMISSION AND DISTRIBUTION SYSTEMS

    Fault location reveals the exact information needed for utility crews to perform maintenance and system restoration promptly. Therefore, accurate fault location is a key function in reducing outage time and enhancing power system reliability. Modern power systems are witnessing a trend of integrating more distributed generation (DG) into the grid. DG power outputs may be intermittent and can no longer be treated as constants in fault location method development. DG modeling is also difficult for fault location purposes. Moreover, most existing fault location methods are not applicable to simultaneous faults. To address these challenges, this dissertation proposes three impedance-based fault location algorithms to pinpoint simultaneous faults for power transmission systems and distribution systems with high penetration of DGs. The proposed fault location algorithms utilize the voltage and/or current phasors that are captured by phasor measurement units. The bus impedance matrix technique is harnessed to establish the relationship between the measurements and unknown simultaneous fault locations. A distinct feature of the proposed algorithms is that neither fault types nor fault resistances are needed to determine the fault locations. In particular, the Type I and Type III algorithms do not need the information of source impedances and prefault measurements to locate the faults. Moreover, the effects of shunt capacitance are fully considered to improve fault location accuracy. The proposed algorithms for distribution systems are validated by evaluation studies using Matlab and Simulink SimPowerSystems on a 21-bus distribution system and the modified IEEE 34 node test system. The Type II fault location algorithm for transmission systems is applicable to untransposed lines and is validated by simulation studies using EMTP on a 27-bus transmission system. A fault area identification method is proposed to reduce the number of line segments to be examined for fault location. In addition, an optimal fault location method that can identify possible bad measurements is proposed to improve the fault location estimate. Evaluation studies show that the optimal fault location method is accurate and effective. The proposed algorithms can be integrated into the existing energy management system for enhanced fault management capability for power systems.

    Forwarder independent tracking systems : problem description and solution design proposal

    This thesis provides a concise review of previous literature discussing tracking and expands on it based on empirical observations and proposed solution design. The aim of the thesis is to describe why tracking has become an increasingly important area in logistics management, identify the key problems with current designs of tracking systems, and develop and test a solution proposal that addresses the identified problems. Current tracking systems are difficult to set up in short-term multi-company networks. This is a problem, for example, in project-oriented industries and when utilising spot markets for logistics services. The proposed Forwarder Independent Tracking (FIT) solution concept was developed and tested in four case studies, two of which included a pilot implementation. The main data collection methods of the case studies have been active involvement in the planning and the carrying out of the pilot implementations, acting as a helpdesk in the pilots, observation, constant contact with key personnel of the case companies, and semi-structured interviews. The studies show that FIT can be used to produce and disseminate reliable tracking and inventory transparency data in short-term multi-company distribution networks. Tracking systems built according to the FIT solution concept offer a possibility to gather tracking information from logistics service providers without a priori integration. The FIT solution concept also provides logistics companies currently without tracking systems (e.g. small, local companies) a possibility to offer tracking information to their customers. The thesis concludes that owing to these properties, the proposed FIT solution concept can have a high significance also in stable distribution networks. The thesis also examines when Radio Frequency Identification (RFID) technology offers the most benefits in tracking, and concludes that the benefits are linked to the efficiency and security of physical identification, and that RFID is not needed for implementing the FIT concept.

    Unravelling migratory connectivity in marine turtles using multiple methods

    1. Comprehensive knowledge of the fundamental spatial ecology of marine species is critical to allow the identification of key habitats and the likely sources of anthropogenic threats, thus informing effective conservation strategies. 2. Research on migratory marine vertebrates has lagged behind many similar terrestrial animal groups, but studies using electronic tagging systems and molecular techniques offer great insights. 3. Marine turtles have complex life history patterns, spanning wide spatio-temporal scales. As a result of this multidimensional complexity, and despite extensive effort, there are no populations for which a truly holistic understanding of the spatial aspects of the life history has been attained. There is a particular lack of information regarding the distribution and habitats utilized during the first few years of life. 4. We used satellite tracking technology to track individual turtles following nesting at the green turtle Chelonia mydas nesting colony at Poilão Island, Guinea-Bissau, the largest breeding aggregation in the eastern Atlantic. 5. We further contextualize these data with pan-Atlantic molecular data and oceanographic current modelling to gain insights into likely dispersal patterns of hatchlings and small pelagic juveniles. 6. All adult turtles remained in the waters of West Africa, with strong connectivity demonstrated with Banc D’Arguin, Mauritania. 7. Despite shortcomings in current molecular markers, we demonstrate evidence for profound sub-structuring of marine turtle stocks across the Atlantic, with a high likelihood based on oceanographic modelling that most turtles from Guinea-Bissau are found in the eastern Atlantic. 8. Synthesis and applications. There is an increased need for a better understanding of the spatial distribution of marine vertebrates demonstrating life histories with spatio-temporal complexity. We propose the synergistic use of the technologies and modelling used here as a working framework for the future rapid elucidation of the range and likely key habitats used by the different life stages of such species.

    Development and implementation of an HIV/AIDS trials management system: A geographical information systems approach

    Introduction. Researchers, practitioners and policymakers make decisions at all levels – from local to international. Accessible, integrated and up-to-date evidence is essential for successful and responsive decision-making. A current trials register of randomised and clinically controlled trials of HIV/AIDS interventions can provide invaluable information to decision-making processes. Using the newly emerging geographical information systems (GIS) technology, we have developed a tool which assists such decisions. Objective. To demonstrate how the tool provides consistent, quantitative information in an accessible format, making it a key tool in evidence-based decision-making. Methods. We identified all HIV/AIDS trials in relation to publications for the period 1980 - 2007, using both electronic and manual search methods. To facilitate searching the trials register, studies were coded by using a comprehensive but user-friendly coding sheet. We captured the geographical co-ordinates for each trial and used the ArcGIS 9 mapping software to design and develop a geodatabase of trials. Results. The geodatabase delivered the complete requirements for a data-driven information system, featuring the following functions: (i) a clear display of the spatial distribution of HIV/AIDS trials around the world; (ii) identification of and access to information about any particular trial on a map; and (iii) a global resource of potential information on the safety and efficacy of prevention and treatment measures. Conclusions. The building of a functioning HIV/AIDS trials management system can provide policymakers, researchers and practitioners with accessible, integrated and up-to-date evidence that is essential to successful and dynamic decision-making. Southern African Journal of HIV Medicine Vol. 9 (2) 2008: pp. 58-6

    Content-based indexing of low resolution documents

    At multimedia presentations, attendees increasingly use capture devices to photograph slides that interest them. To make these images more useful, they can be linked to an image or video database. Such a database can support file archiving, teaching and learning, research and knowledge management, all of which involve image search. However, images captured with devices such as cameras or mobile phones have low resolution as a result of poor lighting and noise. Content-Based Image Retrieval (CBIR) is considered among the most interesting and promising fields as far as image search is concerned. Image search is concerned with finding images in a given image database that are similar to a known query image. This thesis is concerned with methods for identifying documents captured using image capturing devices, and with a technique for retrieving images from an indexed image database. Both apply digital image processing techniques. To build an indexed structure for fast and high quality content-based retrieval of an image, some existing representative signatures and the key indexes used have been revised. Retrieval performance relies heavily on how the indexing is done. Existing retrieval approaches make use of shape, colour and texture features. Considering these features relative to individual databases, the majority of retrieval approaches give poor results on low resolution documents, consume a lot of time and, in some cases, return irrelevant images for the given query image. The identification and indexing method proposed in the thesis uses a Visual Signature (VS). The VS consists of the graphical information of the captured slide’s textual layout, the shape’s moment and the spatial distribution of colour. This signature-based approach is intended for fast and efficient matching to fulfil the needs of real-time applications. The approach is also able to overcome the problems of low resolution documents, such as noisy images, varying lighting conditions in the environment and complex backgrounds. We present hierarchical indexing techniques founded on trees and clustering. K-means clustering is used for visual features like colour, since their spatial distribution gives good global information about an image. Tree indexing for the extracted layout and shape features is structured hierarchically, and Euclidean distance is used to find similar images for CBIR. The proposed indexing scheme is assessed on recall and precision, a standard CBIR retrieval performance evaluation. We develop a CBIR system and conduct various retrieval experiments with the fundamental aim of comparing accuracy during image retrieval. A new algorithm that can be used with integrated visual signatures, especially in late fusion queries, is introduced. The algorithm is able to reduce the shortcomings associated with normalisation in the initial fusion technique. Slides from conference, lecture and meeting presentations are used to compare the performance of the proposed technique with that of existing approaches on real data. The findings of the thesis present exciting possibilities, as the CBIR system is able to produce high quality results even for queries that use low resolution documents. In the future, the use of multimodal signatures, relevance feedback and artificial intelligence techniques is recommended to further enhance the performance of CBIR systems.
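
The retrieval and evaluation steps named in the abstract above (Euclidean distance for similarity, recall and precision for assessment) can be sketched as follows. This is a minimal illustration, not the thesis's indexing scheme: the flat dictionary index, the two-element feature vectors and the names `retrieve` and `precision_recall` are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, index, k):
    """Return the ids of the k signatures closest to the query.

    index maps an image id to its feature vector (a stand-in for a
    visual signature; the real signature is richer than this).
    """
    ranked = sorted(index, key=lambda image_id: euclidean(query, index[image_id]))
    return ranked[:k]


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up signatures for four slides.
    index = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0], "d": [0.5, 0.5]}
    top = retrieve([0.0, 0.0], index, k=2)
    print(top, precision_recall(top, relevant={"a", "b", "d"}))
```

A linear scan like this is only practical for small collections; the tree and clustering indexes described in the abstract exist to avoid comparing the query against every stored signature.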

    Modern Power System Dynamic Performance Improvement through Big Data Analysis

    Higher penetration of Renewable Energy (RE) is causing generation uncertainty and a reduction of system inertia in the modern power system. This brings more challenges for power system dynamic behavior, especially frequency oscillation and excursion, and voltage and transient stability problems. This dissertation extracts the most useful information from power system features and improves system dynamic behavior through big data analysis in three aspects: inertia distribution estimation, actuator placement, and operational studies. First, pioneering work on finding the physical location of the center of inertia (COI) in the system and creating an accurate and useful inertia distribution map is presented. Theoretical proof and dynamic simulation validation are provided to support the proposed method for inertia distribution estimation based on PMU measurement data. Estimation results are obtained for a radial system, a meshed system, the IEEE 39-bus test system, the Chilean system, and a real utility system in the US. Then, this work provides two control actuator placement strategies using measurement data samples and machine learning algorithms. The first strategy is for systems with a single oscillation mode: control actuators should be placed at buses that are far away from the COI bus. This rule increased the damping ratio of example systems up to 14% and, in the simulation results of the Chilean system, greatly reduced the computational complexity. The second rule is created for systems with multiple dynamic problems. General and effective guidance for planners is obtained for the IEEE 39-bus and IEEE 118-bus systems using machine learning algorithms that relate the most significant system features to system dynamic performance. Lastly, the dissertation studies real-time voltage security assessment and key link identification in cascading failure analysis. A proposed deep-learning framework achieved the highest accuracy and lower computational time for real-time security analysis. In addition, key links are identified through distance matrix calculation and probability tree generation using 400,000 data samples from the Western Electricity Coordinating Council (WECC) system.
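
The abstract above does not describe how the COI location is estimated, so the sketch below only illustrates the standard definition of COI frequency as an inertia-weighted mean of generator frequencies. The bus-ranking step (smallest RMS deviation from the COI trace) is a hypothetical index added for illustration and is not claimed to be the dissertation's method; the function names and sample values are likewise assumptions.

```python
def coi_frequency(inertias, freqs):
    """Inertia-weighted mean frequency at each time step.

    inertias : inertia constants H_i, assumed to be on a common base
    freqs    : one frequency trace per generator, all of equal length
    """
    total = sum(inertias)
    steps = len(freqs[0])
    return [
        sum(h * trace[t] for h, trace in zip(inertias, freqs)) / total
        for t in range(steps)
    ]


def closest_bus(bus_freqs, f_coi):
    """Hypothetical index: the bus whose trace deviates least (RMS) from f_coi."""
    def rms(trace):
        return (sum((x - y) ** 2 for x, y in zip(trace, f_coi)) / len(f_coi)) ** 0.5
    return min(bus_freqs, key=lambda bus: rms(bus_freqs[bus]))


if __name__ == "__main__":
    # Made-up two-generator example with two time steps (Hz).
    f_coi = coi_frequency([3.0, 1.0], [[60.0, 59.8], [60.0, 59.4]])
    buses = {"B1": [60.0, 59.8], "B2": [60.0, 59.4], "B3": [60.0, 59.72]}
    print(f_coi, closest_bus(buses, f_coi))
```

Under this illustrative index, buses with the largest deviation would be the ones "far away from the COI bus" in the sense used by the first placement rule.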