13,788 research outputs found

    Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control

    Full text link
    This paper provides an overview of the current state-of-the-art in selective harvesting robots (SHRs) and their potential for addressing the challenges of global food production. SHRs have the potential to increase productivity, reduce labour costs, and minimise food waste by selectively harvesting only ripe fruits and vegetables. The paper discusses the main components of SHRs, including perception, grasping, cutting, motion planning, and control. It also highlights the challenges in developing SHR technologies, particularly in the areas of robot design, motion planning and control. The paper also discusses the potential benefits of integrating AI and soft robots and data-driven methods to enhance the performance and robustness of SHR systems. Finally, the paper identifies several open research questions in the field and highlights the need for further research and development efforts to advance SHR technologies to meet the challenges of global food production. Overall, this paper provides a starting point for researchers and practitioners interested in developing SHRs and highlights the need for more research in this field.Comment: Preprint: to be appeared in Journal of Field Robotic

    The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions

    Full text link
    The Metaverse offers a second world beyond reality, where boundaries are non-existent, and possibilities are endless through engagement and immersive experiences using the virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. Also, for each of these components, we examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies to support decentralization, interoperability, user experiences, interactions, and monetization. Our presented study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive and allows users, scholars, and entrepreneurs to get an in-depth understanding of the Metaverse ecosystem to find their opportunities and potentials for contribution

    Corporate Social Responsibility: the institutionalization of ESG

    Get PDF
    Understanding the impact of Corporate Social Responsibility (CSR) on firm performance as it relates to industries reliant on technological innovation is a complex and perpetually evolving challenge. To thoroughly investigate this topic, this dissertation will adopt an economics-based structure to address three primary hypotheses. This structure allows for each hypothesis to essentially be a standalone empirical paper, unified by an overall analysis of the nature of impact that ESG has on firm performance. The first hypothesis explores the evolution of CSR to the modern quantified iteration of ESG has led to the institutionalization and standardization of the CSR concept. The second hypothesis fills gaps in existing literature testing the relationship between firm performance and ESG by finding that the relationship is significantly positive in long-term, strategic metrics (ROA and ROIC) and that there is no correlation in short-term metrics (ROE and ROS). Finally, the third hypothesis states that if a firm has a long-term strategic ESG plan, as proxied by the publication of CSR reports, then it is more resilience to damage from controversies. This is supported by the finding that pro-ESG firms consistently fared better than their counterparts in both financial and ESG performance, even in the event of a controversy. However, firms with consistent reporting are also held to a higher standard than their nonreporting peers, suggesting a higher risk and higher reward dynamic. These findings support the theory of good management, in that long-term strategic planning is both immediately economically beneficial and serves as a means of risk management and social impact mitigation. Overall, this contributes to the literature by fillings gaps in the nature of impact that ESG has on firm performance, particularly from a management perspective

    Semantic Segmentation Enhanced Transformer Model for Human Attention Prediction

    Full text link
    Saliency Prediction aims to predict the attention distribution of human eyes given an RGB image. Most of the recent state-of-the-art methods are based on deep image feature representations from traditional CNNs. However, the traditional convolution could not capture the global features of the image well due to its small kernel size. Besides, the high-level factors which closely correlate to human visual perception, e.g., objects, color, light, etc., are not considered. Inspired by these, we propose a Transformer-based method with semantic segmentation as another learning objective. More global cues of the image could be captured by Transformer. In addition, simultaneously learning the object segmentation simulates the human visual perception, which we would verify in our investigation of human gaze control in cognitive science. We build an extra decoder for the subtask and the multiple tasks share the same Transformer encoder, forcing it to learn from multiple feature spaces. We find in practice simply adding the subtask might confuse the main task learning, hence Multi-task Attention Module is proposed to deal with the feature interaction between the multiple learning targets. Our method achieves competitive performance compared to other state-of-the-art methods

    A real-time smart sensing system for automatic localization and recognition of vegetable plants for weed control

    Get PDF
    Tomato is a globally grown vegetable crop with high economic and nutritional values. Tomato production is being threatened by weeds. This effect is more pronounced in the early stages of tomato plant growth. Thus weed management in the early stages of tomato plant growth is very critical. The increasing labor cost of manual weeding and the negative impact on human health and the environment caused by the overuse of herbicides are driving the development of smart weeders. The core task that needs to be addressed in developing a smart weeder is to accurately distinguish vegetable crops from weeds in real time. In this study, a new approach is proposed to locate tomato and pakchoi plants in real time based on an integrated sensing system consisting of camera and color mark sensors. The selection scheme of reference, color, area, and category of plant labels for sensor identification was examined. The impact of the number of sensors and the size of the signal tolerance region on the system recognition accuracy was also evaluated. The experimental results demonstrated that the color mark sensor using the main stem of tomato as the reference exhibited higher performance than that of pakchoi in identifying the plant labels. The scheme of applying white topical markers on the lower main stem of the tomato plant is optimal. The effectiveness of the six sensors used by the system to detect plant labels was demonstrated. The computer vision algorithm proposed in this study was specially developed for the sensing system, yielding the highest overall accuracy of 95.19% for tomato and pakchoi localization. The proposed sensor-based system is highly accurate and reliable for automatic localization of vegetable plants for weed control in real time

    Generalized Relation Modeling for Transformer Tracking

    Full text link
    Compared with previous two-stream trackers, the recent one-stream tracking pipeline, which allows earlier interaction between the template and search region, has achieved a remarkable performance gain. However, existing one-stream trackers always let the template interact with all parts inside the search region throughout all the encoder layers. This could potentially lead to target-background confusion when the extracted feature representations are not sufficiently discriminative. To alleviate this issue, we propose a generalized relation modeling method based on adaptive token division. The proposed method is a generalized formulation of attention-based relation modeling for Transformer tracking, which inherits the merits of both previous two-stream and one-stream pipelines whilst enabling more flexible relation modeling by selecting appropriate search tokens to interact with template tokens. An attention masking strategy and the Gumbel-Softmax technique are introduced to facilitate the parallel computation and end-to-end learning of the token division module. Extensive experiments show that our method is superior to the two-stream and one-stream pipelines and achieves state-of-the-art performance on six challenging benchmarks with a real-time running speed.Comment: Accepted by CVPR 2023. Code and models are publicly available at https://github.com/Little-Podi/GR

    Examples of works to practice staccato technique in clarinet instrument

    Get PDF
    Klarnetin staccato tekniğini güçlendirme aşamaları eser çalışmalarıyla uygulanmıştır. Staccato geçişlerini hızlandıracak ritim ve nüans çalışmalarına yer verilmiştir. Çalışmanın en önemli amacı sadece staccato çalışması değil parmak-dilin eş zamanlı uyumunun hassasiyeti üzerinde de durulmasıdır. Staccato çalışmalarını daha verimli hale getirmek için eser çalışmasının içinde etüt çalışmasına da yer verilmiştir. Çalışmaların üzerinde titizlikle durulması staccato çalışmasının ilham verici etkisi ile müzikal kimliğe yeni bir boyut kazandırmıştır. Sekiz özgün eser çalışmasının her aşaması anlatılmıştır. Her aşamanın bir sonraki performans ve tekniği güçlendirmesi esas alınmıştır. Bu çalışmada staccato tekniğinin hangi alanlarda kullanıldığı, nasıl sonuçlar elde edildiği bilgisine yer verilmiştir. Notaların parmak ve dil uyumu ile nasıl şekilleneceği ve nasıl bir çalışma disiplini içinde gerçekleşeceği planlanmıştır. Kamış-nota-diyafram-parmak-dil-nüans ve disiplin kavramlarının staccato tekniğinde ayrılmaz bir bütün olduğu saptanmıştır. Araştırmada literatür taraması yapılarak staccato ile ilgili çalışmalar taranmıştır. Tarama sonucunda klarnet tekniğin de kullanılan staccato eser çalışmasının az olduğu tespit edilmiştir. Metot taramasında da etüt çalışmasının daha çok olduğu saptanmıştır. Böylelikle klarnetin staccato tekniğini hızlandırma ve güçlendirme çalışmaları sunulmuştur. Staccato etüt çalışmaları yapılırken, araya eser çalışmasının girmesi beyni rahatlattığı ve istekliliği daha arttırdığı gözlemlenmiştir. Staccato çalışmasını yaparken doğru bir kamış seçimi üzerinde de durulmuştur. Staccato tekniğini doğru çalışmak için doğru bir kamışın dil hızını arttırdığı saptanmıştır. Doğru bir kamış seçimi kamıştan rahat ses çıkmasına bağlıdır. Kamış, dil atma gücünü vermiyorsa daha doğru bir kamış seçiminin yapılması gerekliliği vurgulanmıştır. Staccato çalışmalarında baştan sona bir eseri yorumlamak zor olabilir. Bu açıdan çalışma, verilen müzikal nüanslara uymanın, dil atış performansını rahatlattığını ortaya koymuştur. Gelecek nesillere edinilen bilgi ve birikimlerin aktarılması ve geliştirici olması teşvik edilmiştir. Çıkacak eserlerin nasıl çözüleceği, staccato tekniğinin nasıl üstesinden gelinebileceği anlatılmıştır. Staccato tekniğinin daha kısa sürede çözüme kavuşturulması amaç edinilmiştir. Parmakların yerlerini öğrettiğimiz kadar belleğimize de çalışmaların kaydedilmesi önemlidir. Gösterilen azmin ve sabrın sonucu olarak ortaya çıkan yapıt başarıyı daha da yukarı seviyelere çıkaracaktır

    Deep Learning for Scene Flow Estimation on Point Clouds: A Survey and Prospective Trends

    Get PDF
    Aiming at obtaining structural information and 3D motion of dynamic scenes, scene flow estimation has been an interest of research in computer vision and computer graphics for a long time. It is also a fundamental task for various applications such as autonomous driving. Compared to previous methods that utilize image representations, many recent researches build upon the power of deep analysis and focus on point clouds representation to conduct 3D flow estimation. This paper comprehensively reviews the pioneering literature in scene flow estimation based on point clouds. Meanwhile, it delves into detail in learning paradigms and presents insightful comparisons between the state-of-the-art methods using deep learning for scene flow estimation. Furthermore, this paper investigates various higher-level scene understanding tasks, including object tracking, motion segmentation, etc. and concludes with an overview of foreseeable research trends for scene flow estimation

    Learning disentangled speech representations

    Get PDF
    A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody. The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and independent of the task at hand. The learned representations should also be able to answer counter-factual questions. In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks. This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically

    Review of Methodologies to Assess Bridge Safety During and After Floods

    Get PDF
    This report summarizes a review of technologies used to monitor bridge scour with an emphasis on techniques appropriate for testing during and immediately after design flood conditions. The goal of this study is to identify potential technologies and strategies for Illinois Department of Transportation that may be used to enhance the reliability of bridge safety monitoring during floods from local to state levels. The research team conducted a literature review of technologies that have been explored by state departments of transportation (DOTs) and national agencies as well as state-of-the-art technologies that have not been extensively employed by DOTs. This review included informational interviews with representatives from DOTs and relevant industry organizations. Recommendations include considering (1) acquisition of tethered kneeboard or surf ski-mounted single-beam sonars for rapid deployment by local agencies, (2) acquisition of remote-controlled vessels mounted with single-beam and side-scan sonars for statewide deployment, (3) development of large-scale particle image velocimetry systems using remote-controlled drones for stream velocity and direction measurement during floods, (4) physical modeling to develop Illinois-specific hydrodynamic loading coefficients for Illinois bridges during flood conditions, and (5) development of holistic risk-based bridge assessment tools that incorporate structural, geotechnical, hydraulic, and scour measurements to provide rapid feedback for bridge closure decisions.IDOT-R27-SP50Ope
    corecore