5,274 research outputs found

    Constellation Queries over Big Data

    A geometrical pattern is a set of points with all pairwise distances (or, more generally, relative distances) specified. Finding matches to such patterns has applications to spatial data in seismic, astronomical, and transportation contexts. For example, a particularly interesting geometric pattern in astronomy is the Einstein cross, which is an astronomical phenomenon in which a single quasar is observed as four distinct sky objects (due to gravitational lensing) when captured by earth telescopes. Finding such crosses, as well as other geometric patterns, is a challenging problem as the potential number of sets of elements that compose shapes is exponentially large in the size of the dataset and the pattern. In this paper, we denote geometric patterns as constellation queries and propose algorithms to find them in large data applications. Our methods combine quadtrees, matrix multiplication, and unindexed join processing to discover sets of points that match a geometric pattern within some additive factor on the pairwise distances. Our distributed experiments show that the choice of composition algorithm (matrix multiplication or nested loops) depends on the freedom introduced in the query geometry through the distance additive factor. Three clearly identified blocks of threshold values guide the choice of the best composition algorithm. Finally, solving the problem for relative distances requires a novel continuous-to-discrete transformation. To the best of our knowledge this paper is the first to investigate constellation queries at scale
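
    The matching criterion described in this abstract (every pairwise distance of a candidate point set must agree with the pattern within an additive factor) can be illustrated with a naive nested-loop search. This is only a minimal sketch under stated assumptions, not the paper's algorithm: it uses no quadtree and no matrix multiplication, and the names `match_constellation`, `pattern_dists`, and `eps` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def match_constellation(points, pattern_dists, eps):
    """Brute-force search for point tuples matching a distance pattern.

    points        -- list of coordinate tuples (hypothetical input format)
    pattern_dists -- symmetric matrix; pattern_dists[i][j] is the required
                     distance between pattern vertices i and j
    eps           -- additive tolerance allowed on every pairwise distance
    Returns a list of index tuples, one index per pattern vertex.
    """
    k = len(pattern_dists)
    matches = []

    def extend(chosen):
        # All k pattern vertices are assigned: record the match.
        if len(chosen) == k:
            matches.append(tuple(chosen))
            return
        i = len(chosen)  # pattern vertex to assign next
        for cand in range(len(points)):
            if cand in chosen:
                continue
            # The candidate must respect the distance to every vertex
            # that has already been assigned.
            if all(
                abs(dist(points[cand], points[chosen[j]]) - pattern_dists[i][j]) <= eps
                for j in range(i)
            ):
                extend(chosen + [cand])

    extend([])
    return matches


# Usage: look for a 3-4-5 right triangle among four points.
points = [(0, 0), (3, 0), (0, 4), (10, 10)]
pattern = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
found = match_constellation(points, pattern, eps=0.1)
```

    The search prunes a partial assignment as soon as one distance falls outside the tolerance, but its worst case is still exponential in the pattern size, which is the cost the abstract says the indexed methods are meant to reduce.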

    Engaging Girls in Computer Science: Do Single-Gender Interdisciplinary Classes Help?

    Computing-driven innovation cannot reach its full potential if only a fraction of the population is involved. Without girls and their non-stereotypical contribution, the innovation potential is severely limited. In computer science (CS) and software engineering (SE), the gender gap persists without any positive trend. Many girls find it challenging to identify with the subject of CS. However, we can capitalize on their interests and create environments for girls through interdisciplinary subcultures to spark and foster enthusiasm for CS. This paper presents and discusses the results of an intervention in which we applied a novel interdisciplinary online course in data science to get girls excited about CS and programming by contributing to the grand goal of solving colony collapse disorder from biology and geoecology. The results show the potential of such programs to get girls excited about programming, but also important implications in terms of the learning environment. The startling results show that girls from single-gender classes (SGCs) are significantly more open to CS-related topics and that the intervention evoked significantly more positive feelings in them than in girls from mixed-gender classes (MGCs). The findings highlight the importance of how CS-related topics are introduced in school and the crucial impact of the learning environment to meet the requirements of truly gender-inclusive education

    Enhanced Distributed File Replication Protocol for Efficient File Sharing in Wireless Mobile Ad-Hoc Networks.

    File sharing applications in mobile ad hoc networks (MANETs) have attracted more and more attention in recent years. The efficiency of file querying suffers from the distinctive properties of such networks, including node mobility and limited communication range and resources. An intuitive method to alleviate this problem is to create file replicas in the network. However, despite the efforts on file replication, no research has focused on globally optimal replica creation with minimum average querying delay. Specifically, current file replication protocols in mobile ad hoc networks have two shortcomings. First, they lack a rule to allocate limited resources to different files in order to minimize the average querying delay. Second, they simply consider storage as the available resource for replicas, but neglect the fact that the file holders' frequency of meeting other nodes also plays an important role in determining file availability. In fact, a node that has a higher meeting frequency with others provides higher availability to its files. This becomes even more evident in sparsely distributed MANETs, in which nodes meet disruptively. In this paper, we introduce a new concept of resource for file replication, which considers both node storage and meeting frequency. We theoretically study the influence of resource allocation on the average querying delay and derive a resource allocation rule to minimize the average querying delay. We further propose a distributed file replication protocol to realize the proposed rule. Extensive trace-driven experiments with synthesized traces and real traces show that our protocol can achieve shorter average querying delay at a lower cost than current replication protocols.

    ICT tools for data management and analysis to support decisional process oriented to sustainable agri-food chains

    The agri-food sector is facing global challenges. The first is feeding a world population that, according to United Nations projections, will reach 9.3 billion people in 2050. The second is consumer demand for high-quality products obtained from increasingly sustainable, safe, and transparent agri-food chains. In particular, sustainable agriculture is a management strategy that preserves the biological diversity, productivity, regeneration capacity, vitality, and ability to function of an agricultural ecosystem, ensuring, today and in the future, its ecological, economic, and social functions at the local, national, and global scales without harming other ecosystems. To meet the challenge of sustainable agriculture, farmers therefore need to increase the quality and quantity of production while reducing environmental impact through new management strategies and tools. This work explores the integration into the agri-food sector of several ICT technologies and methodologies for data acquisition, management, and analysis, such as Radio Frequency IDentification (RFID) technology, Farm Management Information Systems (FMIS), Data Warehouses (DW), and On-Line Analytical Processing (OLAP). Finally, the adoption of ICT technologies by real farms is evaluated through a survey. Regarding RFID, this work explores an opportunity for technology transfer related to the monitoring and control of agri-food products, based on miniaturized, smart, and innovative sensors. Information on the state of the product is transferred wirelessly in real time, as RFID technology provides. In particular, two RFID solutions are analysed, highlighting their advantages and critical points compared with the conventional systems used to ensure the traceability and quality of agri-food products. This work then explores the possibility of developing a framework that combines business intelligence (BI) technologies with Integrated Pest Management (IPM) principles to support farmers in the decision-making process, thereby decreasing environmental impact and improving production performance. IPM requires the simultaneous use of different crop protection techniques to control pests and pathogens through an ecological and economic approach. The proposed BI system, called BI4IPM, combines On-Line Transaction Processing (OLTP) with OLAP to verify adherence to the IPM technical specifications. BI4IPM is tested with data from real olive farms in Apulia. The olive tree is one of the most important crops at the global scale, and Apulia is the leading olive-producing region in Italy, with a large number of farms that generate IPM data. Crop protection strategies are correlated with climate conditions, given the strong relationship among climate, crops, and pests. This work therefore presents a new, advanced OLAP model that integrates the Growing Season Index (GSI), a phenology model, to compare farms indirectly from a climatic point of view. The proposed system makes it possible to analyse IPM data from different farms that share the same phenological conditions within a year, in order to identify best practices and to highlight and explain the different practices adopted by farms working under different climatic conditions. Finally, a survey was carried out to investigate how farms in Basilicata (Lucania) cluster according to the level of innovation adopted. A questionnaire asked farms whether they adopt ICT tools and, if so, in which production or management processes they are used. A cluster analysis was then performed on the collected data. The results show that k-means clustering yields two clusters: innovators and the rest. The boxplot representation, by contrast, yields three groups: innovators, early adopters, and laggards.

    FireAct: Toward Language Agent Fine-tuning

    Recent efforts have augmented language models (LMs) with external tools or environments, leading to the development of language agents that can reason and act. However, most of these agents rely on few-shot prompting techniques with off-the-shelf LMs. In this paper, we investigate and argue for the overlooked direction of fine-tuning LMs to obtain language agents. Using a setup of question answering (QA) with a Google search API, we explore a variety of base LMs, prompting methods, fine-tuning data, and QA tasks, and find language agents are consistently improved after fine-tuning their backbone LMs. For example, fine-tuning Llama2-7B with 500 agent trajectories generated by GPT-4 leads to a 77% HotpotQA performance increase. Furthermore, we propose FireAct, a novel approach to fine-tuning LMs with trajectories from multiple tasks and prompting methods, and show having more diverse fine-tuning data can further improve agents. Along with other findings regarding scaling effects, robustness, generalization, efficiency and cost, our work establishes comprehensive benefits of fine-tuning LMs for agents, and provides an initial set of experimental designs, insights, as well as open questions toward language agent fine-tuning. Comment: Code, data, and models are available at https://fireact-agent.github.i

    The future of Earth observation in hydrology

    In just the past 5 years, the field of Earth observation has progressed beyond the offerings of conventional space-agency-based platforms to include a plethora of sensing opportunities afforded by CubeSats, unmanned aerial vehicles (UAVs), and smartphone technologies that are being embraced by both for-profit companies and individual researchers. Over the previous decades, space agency efforts have brought forth well-known and immensely useful satellites such as the Landsat series and the Gravity Recovery and Climate Experiment (GRACE) system, with costs typically of the order of 1 billion dollars per satellite and with concept-to-launch timelines of the order of 2 decades (for new missions). More recently, the proliferation of smartphones has helped to miniaturize sensors and energy requirements, facilitating advances in the use of CubeSats that can be launched by the dozens, while providing ultra-high (3-5 m) resolution sensing of the Earth on a daily basis. Start-up companies that did not exist a decade ago now operate more satellites in orbit than any space agency, and at costs that are a mere fraction of traditional satellite missions. With these advances come new space-borne measurements, such as real-time high-definition video for tracking air pollution, storm-cell development, flood propagation, precipitation monitoring, or even for constructing digital surfaces using structure-from-motion techniques. Closer to the surface, measurements from small unmanned drones and tethered balloons have mapped snow depths, floods, and estimated evaporation at sub-metre resolutions, pushing back on spatio-temporal constraints and delivering new process insights. 
At ground level, precipitation has been measured using signal attenuation between antennae mounted on cell phone towers, while the proliferation of mobile devices has enabled citizen scientists to catalogue photos of environmental conditions, estimate daily average temperatures from battery state, and sense other hydrologically important variables such as channel depths using commercially available wireless devices. Global internet access is being pursued via high-altitude balloons, solar planes, and hundreds of planned satellite launches, providing a means to exploit the "internet of things" as an entirely new measurement domain. Such global access will enable real-time collection of data from billions of smartphones or from remote research platforms. This future will produce petabytes of data that can only be accessed via cloud storage and will require new analytical approaches to interpret. The extent to which today's hydrologic models can usefully ingest such massive data volumes is unclear. Nor is it clear whether this deluge of data will be usefully exploited, either because the measurements are superfluous, inconsistent, not accurate enough, or simply because we lack the capacity to process and analyse them. What is apparent is that the tools and techniques afforded by this array of novel and game-changing sensing platforms present our community with a unique opportunity to develop new insights that advance fundamental aspects of the hydrological sciences. To accomplish this will require more than just an application of the technology: in some cases, it will demand a radical rethink on how we utilize and exploit these new observing systems