534 research outputs found

    Diapause vs. reproductive programs: transcriptional phenotypes in a keystone copepod

    Get PDF
    © The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Lenz, P. H., Roncalli, V., Cieslak, M. C., Tarrant, A. M., Castelfranco, A. M., & Hartline, D. K. Diapause vs. reproductive programs: transcriptional phenotypes in a keystone copepod. Communications Biology, 4(1), (2021): 426, https://doi.org/10.1038/s42003-021-01946-0.

    Many arthropods undergo a seasonal dormancy termed “diapause” to optimize the timing of reproduction in highly seasonal environments. In the North Atlantic, the copepod Calanus finmarchicus completes one to three generations annually, with some individuals maturing into adults while others interrupt their development to enter diapause. It is unknown which individuals enter the diapause program, or why and when they do so. Transcriptomic data from copepods on known programs were analyzed using dimensionality reduction of gene expression and functional analyses to identify program-specific genes and biological processes. These analyses elucidated physiological differences and established protocols that distinguish between programs. Differences in gene expression were associated with maturation of individuals on the reproductive program, while those on the diapause program showed little change over time. Only two of six filters effectively separated copepods by developmental program. The first included all genes annotated to RNA metabolism, a result confirmed using differential gene expression analysis. The second filter identified 54 differentially expressed genes that were consistently up-regulated in individuals on the diapause program in comparison with those on the reproductive program. Annotated to oogenesis, RNA metabolism, and fatty acid biosynthesis, these genes are both indicators of diapause preparation and good candidates for functional studies.

    This work was supported by National Science Foundation (NSF) Grants OCE-1459235 and OCE-1756767 to P.H.L., D.K.H., and A.E. Christie, and OPP-1746087 to A.M.T.
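
    A minimal sketch of the kind of dimensionality-reduction step described above, assuming a samples-by-genes matrix of normalized transcript counts. The data, gene counts, and labels below are synthetic stand-ins, not the study's dataset or pipeline.

    ```python
    # Sketch: separating developmental programs via PCA on expression data.
    # All data here is synthetic illustration, not the study's dataset.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    n_samples, n_genes = 20, 500

    # Synthetic expression: diapause samples get a shifted mean on a gene
    # subset, mimicking the consistently up-regulated genes the study reports.
    expression = rng.lognormal(mean=2.0, sigma=1.0, size=(n_samples, n_genes))
    is_diapause = np.arange(n_samples) < 10
    expression[is_diapause, :54] *= 3.0  # 54 marker genes, as in the abstract

    # Log-transform stabilizes variance before PCA, a common choice for counts.
    log_expr = np.log1p(expression)
    pcs = PCA(n_components=2).fit_transform(log_expr)

    for label, mask in [("diapause", is_diapause), ("reproductive", ~is_diapause)]:
        print(f"{label} centroid in PC space: {pcs[mask].mean(axis=0).round(2)}")
    ```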

    ANALYZING TEMPORAL PATTERNS IN PHISHING EMAIL TOPICS

    Get PDF
    In 2020, the Federal Bureau of Investigation (FBI) found phishing to be the most common cybercrime, with a record number of complaints from Americans reporting losses exceeding $4.1 billion. Various phishing prevention methods exist; however, these methods are usually reactionary in nature, as they activate only after a phishing campaign has been launched. Priming people ahead of time with knowledge of which phishing topics are more likely to occur could be an effective proactive phishing prevention strategy. It has been noted that the volume of phishing emails tends to increase around key calendar dates and during times of uncertainty. This thesis aimed to create a classifier to predict which phishing topics have an increased likelihood of occurring in reference to an external event. After distilling around 1.2 million phishes until only meaningful words remained, a Latent Dirichlet allocation (LDA) topic model uncovered 90 latent phishing topics. On average, human evaluators agreed with the composition of a topic 74% of the time in one of the phishing topic evaluation tasks, showing accordance between human judgment and the topics produced by the LDA model. Each topic was turned into a time series by creating a frequency count over the dataset’s two-year timespan. Each time series was then converted into an intensity count to highlight the days of increased phishing activity. All phishing topics were analyzed and reviewed for influencing events. After the review, ten topics were identified whose intensities could plausibly have been influenced by external events. After performing the intervention analysis, none of the selected topics was found to correlate with its identified external event. The analysis stopped here, and no predictive classifiers were pursued. With this dataset, temporal patterns coupled with external events were not able to predict the likelihood of a phishing attack.
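
    The pipeline described above, LDA topics followed by per-topic time series, can be sketched with scikit-learn and pandas. The toy corpus, dates, and two-topic model below are stand-ins; the thesis's cleaning of 1.2 million phishes and its 90-topic model are not reproduced here.

    ```python
    # Sketch: LDA topic modeling over phishing text, then a daily frequency
    # series per topic (the precursor to the intensity analysis above).
    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    emails = pd.DataFrame({
        "text": [
            "verify your account password immediately",
            "invoice payment overdue wire transfer",
            "tax refund claim your irs payment",
            "package delivery failed click tracking link",
        ],
        "date": pd.to_datetime(["2020-01-02", "2020-01-02",
                                "2020-04-10", "2020-12-20"]),
    })

    # Bag-of-words counts over the cleaned text (the thesis's own distilling
    # step is not shown; sklearn's English stopword list stands in here).
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(emails["text"])

    lda = LatentDirichletAllocation(n_components=2, random_state=0)  # 90 in the thesis
    doc_topics = lda.fit_transform(counts)

    # Assign each email its dominant topic and build a daily frequency series.
    emails["topic"] = doc_topics.argmax(axis=1)
    daily = emails.groupby([emails["date"].dt.date, "topic"]).size()
    print(daily)
    ```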

    Automated early plant disease detection and grading system: Development and implementation

    Get PDF
    As the agriculture industry grows, many attempts have been made to ensure high quality of produce. Diseases and defects found in plants and crops affect the agriculture industry greatly. Hence, many techniques and technologies have been developed to help solve or reduce the impact of plant diseases. Image analysis tools and gas sensors are becoming more frequently integrated into smart systems for plant disease detection. Many disease detection systems incorporate image analysis tools and Volatile Organic Compound (VOC) profiling techniques to detect early symptoms of diseases and defects in plants, fruits, and vegetative produce. These disease detection techniques can be categorized into two main groups: preharvest and postharvest disease detection techniques. This thesis aims to introduce the available disease detection techniques and to compare them with the latest innovative smart systems that feature visible imaging, hyperspectral imaging, and VOC profiling. In addition, this thesis incorporates the use of image analysis tools and k-means segmentation to implement preharvest Offline and Online disease detection systems. The Offline system is intended for pathologists and agriculturists to measure plant leaf disease severity levels. K-means segmentation and triangle thresholding techniques are used together to achieve good background segmentation of leaf images. Moreover, a Mamdani-type fuzzy logic classification technique is used to accurately categorize leaf disease severity levels. Leaf images taken from a real field at varying resolutions were tested using the implemented system to observe the effect of resolution on disease grade classification. Background segmentation using k-means clustering and triangle thresholding proved effective, even under non-uniform lighting conditions. Integration of a fuzzy logic system for leaf disease severity level classification yielded classification accuracies of 98%. Furthermore, a robot is designed and implemented as a robotized Online system to provide field-based analysis of plant health using visible and near-infrared spectroscopy. Fused visible and near-infrared images are used to calculate the Normalized Difference Vegetation Index (NDVI) to measure and monitor plant health. The robot is designed to move along a specified path within an agricultural field and provide health information for leaves as well as position data. The system was tested in a tomato greenhouse under real field conditions. The developed system proved effective in classifying plant health into one of three classes (underdeveloped, unhealthy, and healthy) with an accuracy of 83%. A map of plant health and locations is produced for farmers and agriculturists to monitor plant health across different areas. This system is capable of providing early, vital health analysis of plants for immediate action and possible selective pesticide spraying.
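
    The NDVI computation at the heart of the Online system can be sketched as follows, assuming co-registered red and near-infrared (NIR) reflectance images as float arrays. The class cutoffs below are hypothetical placeholders, not the thesis's calibrated thresholds.

    ```python
    # Sketch: NDVI from fused visible (red) and NIR reflectance images,
    # followed by a toy three-class health labeling. Thresholds are
    # illustrative placeholders, not the thesis's calibrated boundaries.
    import numpy as np

    def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
        """NDVI = (NIR - Red) / (NIR + Red), in [-1, 1]."""
        denom = nir + red
        # Guard against division by zero on dark pixels.
        return np.where(denom > 0, (nir - red) / np.maximum(denom, 1e-9), 0.0)

    # Toy 2x2 reflectance images; healthy vegetation reflects strongly in NIR.
    nir = np.array([[0.60, 0.55], [0.30, 0.10]])
    red = np.array([[0.10, 0.12], [0.20, 0.09]])

    index = ndvi(nir, red)
    labels = np.digitize(index, bins=[0.2, 0.5])  # hypothetical cutoffs
    names = np.array(["underdeveloped", "unhealthy", "healthy"])
    print(index.round(2))
    print(names[labels])
    ```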

    Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes

    Full text link
    Robotic Information Gathering (RIG) is a foundational research topic that asks how a robot (or robot team) collects informative data to efficiently build an accurate model of an unknown target function under robot embodiment constraints. RIG has many applications, including but not limited to autonomous exploration and mapping, 3D reconstruction or inspection, search and rescue, and environmental monitoring. A RIG system relies on a probabilistic model's prediction uncertainty to identify critical areas for informative data collection. Gaussian Processes (GPs) with stationary kernels have been widely adopted for spatial modeling. However, real-world spatial data is typically non-stationary -- different locations do not have the same degree of variability. As a result, the prediction uncertainty does not accurately reveal prediction error, limiting the success of RIG algorithms. We propose a family of non-stationary kernels named the Attentive Kernel (AK), which is simple, robust, and can extend any existing kernel to a non-stationary one. We evaluate the new kernel in elevation mapping tasks, where AK provides better accuracy and uncertainty quantification than the commonly used stationary kernels and the leading non-stationary kernels. The improved uncertainty quantification guides the downstream informative planner to collect more valuable data around high-error areas, further increasing prediction accuracy. A field experiment demonstrates that the proposed method can guide an Autonomous Surface Vehicle (ASV) to prioritize data collection in locations with significant spatial variations, enabling the model to characterize salient environmental features.

    Comment: International Journal of Robotics Research (IJRR). arXiv admin note: text overlap with arXiv:2205.0642
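
    One way to picture a non-stationary kernel of this kind is as a mixture of RBF kernels with different lengthscales, weighted by input-dependent, normalized membership functions. The NumPy sketch below illustrates that idea with a fixed toy weight function; in the paper the weights are learned, so this is not the authors' implementation.

    ```python
    # Sketch: a non-stationary kernel built by mixing RBF base kernels of
    # different lengthscales with input-dependent weights, so variability
    # can differ across locations. The weight function is a fixed toy.
    import numpy as np

    def rbf(x1, x2, lengthscale):
        d2 = (x1[:, None] - x2[None, :]) ** 2
        return np.exp(-0.5 * d2 / lengthscale ** 2)

    def weights(x):
        # Hypothetical membership: x > 0 favors the short lengthscale.
        # Rows are normalized to unit length.
        w_short = 1.0 / (1.0 + np.exp(-4.0 * x))  # logistic gate
        w = np.stack([w_short, 1.0 - w_short], axis=1)
        return w / np.linalg.norm(w, axis=1, keepdims=True)

    def nonstationary_kernel(x1, x2, lengthscales=(0.2, 2.0)):
        # k(x, x') = sum_m w_m(x) w_m(x') k_RBF(x, x'; l_m)
        # Each term is D_m K_m D_m with D_m diagonal, so the sum stays PSD.
        w1, w2 = weights(x1), weights(x2)
        k = np.zeros((len(x1), len(x2)))
        for m, ell in enumerate(lengthscales):
            k += np.outer(w1[:, m], w2[:, m]) * rbf(x1, x2, ell)
        return k

    x = np.linspace(-3, 3, 5)
    print(nonstationary_kernel(x, x).round(3))
    ```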

    Automated classification of transients in optical time-domain surveys

    Get PDF
    Repeated sky surveys in the past decade have led to a proliferation in the discovery of transients. This has not come without its own challenges: the rate of discovery from current sky surveys greatly exceeds the human capacity to manually identify and classify newly discovered objects. The use of machine learning approaches to automate the discovery process with little to no human intervention is rapidly becoming standard practice in surveys. In this thesis we present the use of machine learning to classify objects discovered by sky surveys observing at optical wavelengths. The Gravitational-wave Optical Transient Observer (GOTO) is a survey with the aim of searching for the optical counterparts to gravitational waves, while also scanning the night sky for transients and variable objects. We use both a machine learning and a deep learning approach to classify objects observed by GOTO using their light curves, and compare the effectiveness and limitations of both methods for photometric classification. We find that a deep learning approach with recurrent neural networks works best to reliably classify objects from their light curves in real time. We investigate the use of Gaussian processes to create uniform representations of supernova light curves from different surveys. These are then used with a convolutional neural network for classification into supernova sub-types. Future surveys will initially lack the labelled data needed to train classifiers. We use transfer learning to show how data from another survey can be used to train a classifier for a new survey. Machine learning is a widely used methodology with applications in other fields of research. We show how the classifiers developed for GOTO light curve classification can be adapted for other classification tasks, and how they perform on them.
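
    The Gaussian-process representation step can be sketched with scikit-learn: fit a GP to an irregularly sampled light curve, then evaluate it on a uniform time grid shared across surveys. The kernel choice and toy light curve below are illustrative, not the thesis's settings.

    ```python
    # Sketch: resample an irregularly sampled supernova light curve onto a
    # uniform grid via GP regression, yielding a common representation that
    # a CNN can consume. Data and kernel here are toy illustrations.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(1)

    # Irregular observation times (days) and a toy rise-and-decline flux curve.
    t_obs = np.sort(rng.uniform(0, 60, size=15))
    flux = np.exp(-0.5 * ((t_obs - 20) / 8.0) ** 2) + 0.05 * rng.standard_normal(15)

    kernel = 1.0 * RBF(length_scale=10.0) + WhiteKernel(noise_level=0.05**2)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(t_obs[:, None], flux)

    # Uniform grid shared across surveys; mean and sigma can both feed the CNN.
    t_grid = np.linspace(0, 60, 50)
    mean, sigma = gp.predict(t_grid[:, None], return_std=True)
    print(mean.shape, sigma.shape)  # (50,), (50,)
    ```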

    Integrating optics and microfluidics to automatically identify algae species

    Get PDF

    Message traceback systems dancing with the devil

    Get PDF
    The research community has produced a great deal of work in recent years in the areas of IP, layer 2, and connection-chain traceback. We collectively designate these as message traceback systems, which invariably aim to locate the origin of network data, in spite of any alterations effected to that data (whether legitimately or fraudulently). This thesis provides a unifying definition of spoofing, and a classification based on it, which aims to encompass all streams of message traceback research. The feasibility of this classification is established through its application to our literature review of the numerous known message traceback systems. We propose two layer 2 (L2) traceback systems, switch-SPIE and COTraSE, which adopt different approaches to logging-based L2 traceback for switched Ethernet. Whilst message traceback in spite of spoofing is interesting, and perhaps more challenging than it at first seems, one might say that it is rather academic. Logging of network data is a controversial and unpopular notion, and network administrators do not want the added installation and maintenance costs. However, European Parliament Directive 2006/24/EC requires that providers of publicly available electronic communications networks retain data in a form similar to mobile telephony call records, from April 2009 and for periods of up to 2 years. This thesis identifies the relevance of work in all areas of message traceback to the European data retention legislation. In the final part of this thesis we apply our experiences with L2 traceback, together with our definitions and classification of spoofing, to discuss the issues that EU data retention implementations should consider. It is possible to 'do logging right' and even safeguard user privacy. However, this can only occur if we fully understand the technical challenges, requiring much further work in all areas of logging-based message traceback systems. We have no choice but to dance with the devil.

    EThOS - Electronic Theses Online Service, GB, United Kingdom
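
    Logging-based traceback in the SPIE family, which switch-SPIE builds on, stores compact digests of forwarded frames rather than the frames themselves, which is also where the privacy safeguards mentioned above can enter. The sketch below is a generic Bloom-filter digest log; its sizes and hashing scheme are illustrative, not the thesis's design.

    ```python
    # Sketch: a SPIE-style digest log. Each switch records hashes of
    # forwarded frames in a Bloom filter so later queries can ask "did this
    # frame pass through here?" without storing payloads. Sizes and the
    # hashing scheme are illustrative only.
    import hashlib

    class DigestLog:
        def __init__(self, n_bits: int = 1 << 20, n_hashes: int = 4):
            self.n_bits = n_bits
            self.n_hashes = n_hashes
            self.bits = bytearray(n_bits // 8)

        def _positions(self, frame: bytes):
            # Derive k bit positions from salted SHA-256 digests of the
            # invariant frame fields (here: the whole frame, for simplicity).
            for i in range(self.n_hashes):
                h = hashlib.sha256(i.to_bytes(2, "big") + frame).digest()
                yield int.from_bytes(h[:8], "big") % self.n_bits

        def record(self, frame: bytes) -> None:
            for pos in self._positions(frame):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def seen(self, frame: bytes) -> bool:
            # False positives are possible; false negatives are not.
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(frame))

    log = DigestLog()
    log.record(b"\x00\x11\x22\x33\x44\x55" + b"payload")
    print(log.seen(b"\x00\x11\x22\x33\x44\x55" + b"payload"))  # True
    print(log.seen(b"other frame"))  # almost surely False
    ```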

    Data Mining and Machine Learning in Astronomy

    Full text link
    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines; applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science; and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

    Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the text
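
    As a concrete example of one algorithm the review covers, the sketch below trains a support vector machine on a toy star/galaxy separation task. The two photometric features and all data are invented for illustration.

    ```python
    # Sketch: an SVM classifier on synthetic photometric features
    # (e.g. a color index and a concentration measure). Entirely toy data.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    n = 400

    # Stars and galaxies drawn from shifted 2-D feature distributions.
    stars = rng.normal([0.5, 2.5], 0.3, size=(n // 2, 2))
    galaxies = rng.normal([1.2, 1.8], 0.3, size=(n // 2, 2))
    X = np.vstack([stars, galaxies])
    y = np.array([0] * (n // 2) + [1] * (n // 2))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X_tr, y_tr)
    print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
    ```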