
    City-level water withdrawal in China: Accounting methodology and applications

    In the context of the freshwater crisis, accounting for water withdrawal could help planners better regulate water use across sectors to combat water scarcity. However, water withdrawal statistics in China are patchy, and city-level water data covering all sectors are relatively insufficient. Hence, we develop a general framework to estimate, for the first time, the water withdrawal of 58 economic–social–environmental sectors in Chinese cities. The framework is designed for the situation in which only inconsistent water statistics, collected from different sources, are available at the city level, and we apply it to 18 representative Chinese cities. Contrary to the conventional perception that agriculture is the largest water user, industrial or household water withdrawal can account for the largest share of the water-use structure in some cities. The discrepancy in annual household water use per capita among the urban areas of different cities is relatively small (as is the case among rural areas), but the gap between urban and rural areas is large. As a result, increased attention should be paid to controlling industrial and urban household water use in particular cities. China should prepare annual water accounts at the city level and establish a timetable to tackle water scarcity, a basic step toward efficient and sustainable water crisis mitigation.
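The core accounting step the abstract describes, summing sectoral withdrawals into a city total and inspecting the water-use structure, can be sketched as follows. All sector names and figures here are made-up illustrations, not data from the study:

```python
# Toy sketch of city-level water accounting: sectoral withdrawals
# (illustrative values only) are summed into a city total, and the
# largest water-using sector is identified from the shares.

withdrawals_mcm = {  # million cubic metres per year, hypothetical city
    "agriculture": 120.0,
    "industry": 310.0,
    "urban_household": 95.0,
    "rural_household": 20.0,
    "environment": 15.0,
}

total = sum(withdrawals_mcm.values())
shares = {sector: vol / total for sector, vol in withdrawals_mcm.items()}
largest = max(shares, key=shares.get)

print(f"total withdrawal: {total:.1f} Mm^3/yr")
print(f"largest user: {largest} ({shares[largest]:.0%})")
```

In this hypothetical city, industry rather than agriculture dominates, mirroring the paper's observation that the conventional "agriculture is largest" assumption does not hold everywhere.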

    Gentopia: A Collaborative Platform for Tool-Augmented LLMs

    Augmented Language Models (ALMs) empower large language models with the ability to use tools, transforming them into intelligent agents for real-world interactions. However, most existing ALM frameworks are, to varying degrees, deficient in critical features: flexible customization, collaborative democratization, and holistic evaluation. We present gentopia, an ALM framework that enables flexible customization of agents through simple configurations, seamlessly integrating various language models, task formats, prompting modules, and plugins into a unified paradigm. Furthermore, we establish gentpool, a public platform for registering and sharing user-customized agents. Agents registered in gentpool are composable, so they can be assembled together for agent collaboration, advancing the democratization of artificial intelligence. To ensure high-quality agents, gentbench, an integral component of gentpool, thoroughly evaluates user-customized agents across diverse aspects such as safety, robustness, and efficiency. We release gentopia on GitHub and will continue to develop it.
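The abstract does not spell out gentopia's actual API, so the following is a purely hypothetical toy illustrating the register-and-compose pattern it describes: agents are registered in a shared pool, and registered agents can be chained so that one agent's output feeds the next. The names `Agent` and `AgentPool` are invented for this sketch and are not gentopia identifiers:

```python
# Hypothetical sketch of a composable agent pool (not gentopia's API).
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Agent:
    name: str
    run: Callable[[str], str]  # maps a task string to an answer string

@dataclass
class AgentPool:
    agents: Dict[str, Agent] = field(default_factory=dict)

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def compose(self, *names: str) -> Agent:
        # Chain registered agents: each one's output is the next one's input.
        def pipeline(task: str) -> str:
            for n in names:
                task = self.agents[n].run(task)
            return task
        return Agent(name="+".join(names), run=pipeline)

pool = AgentPool()
pool.register(Agent("upper", lambda s: s.upper()))
pool.register(Agent("exclaim", lambda s: s + "!"))
combo = pool.compose("upper", "exclaim")
print(combo.run("hello"))  # HELLO!
```

Real tool-augmented agents would wrap language-model calls and plugins rather than string functions, but the composition mechanics are the same idea.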

    Upper bound of a band complex

    The band structure of a crystal generally consists of connected components in energy-momentum space, known as band complexes. Here, we explore a fundamental question: what is the maximal number of bands that can be accommodated in a single band complex? We show that, for certain space groups, a band complex can in principle have no finite upper bound, meaning that infinitely many bands can entangle together, forming a connected pattern stable against symmetry-preserving perturbations. This is demonstrated by an inductive construction procedure we develop, through which a given band complex can always be grown into a larger one by gluing a basic building block to it. As a by-product, we demonstrate the existence of arbitrarily large accordion-type band structures containing $N_C = 4n$ bands, with $n \in \mathbb{N}$. Comment: 6 pages, 4 figures

    Improved shrunken centroid method for better variable selection in cancer classification with high throughput molecular data

    Master of Science, Department of Statistics, Haiyan Wang. Cancer type classification with high-throughput molecular data has received much attention, and many methods have been published in this area. One of them, PAM (the nearest shrunken centroid algorithm), is simple and efficient and can give very good prediction accuracy. A problem with PAM is that it selects too many genes, some of which may have no influence on cancer type. One reason for this is that PAM assumes all genes have an identical distribution and applies a common threshold parameter for gene selection. This may not hold in reality, since expressions from different genes can have very different distributions due to complicated biological processes. We propose a new method aimed at improving the ability of PAM to select informative genes. Keeping informative genes while reducing false positive variables can lead to more accurate classification and help pinpoint target genes for further study. To achieve this goal, we introduce a variable-specific test based on an Edgeworth expansion to select informative genes. We apply this test to each gene and screen genes based on its result, so that a large number of genes are excluded. Afterward, soft thresholding with cross-validation can be applied to decide a common threshold value. Simulation and a real application show that our method can reduce irrelevant information and select the informative genes more precisely. The simulation results give more insight into where the newly proposed procedure improves accuracy, especially when the data set is skewed or unbalanced. The method can be applied to broad molecular data, including, for example, lipidomic data from mass spectrometry, copy number data from genomics, and eQTL analysis with GWAS data. We expect the proposed method to help life scientists accelerate discoveries with high-throughput data.
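The thesis's variable-specific Edgeworth-expansion test is its own contribution; what can be sketched from standard references is the PAM soft-thresholding step it builds on, where each gene's class-versus-overall centroid difference is shrunk toward zero and genes that shrink to zero in every class are dropped. The toy numbers below are illustrative:

```python
# Sketch of the soft-thresholding step in nearest shrunken centroids (PAM):
# d' = sign(d) * max(|d| - delta, 0). A gene whose shrunken statistic is
# zero in every class no longer contributes and is effectively deselected.
import numpy as np

def soft_threshold(d, delta):
    """Shrink each statistic toward zero by delta, clipping at zero."""
    return np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)

# rows = genes, cols = classes; standardized centroid differences (toy data)
d = np.array([[ 2.5, -1.8],
              [ 0.3, -0.2],   # weak gene: shrinks to zero in both classes
              [-3.1,  0.9]])

d_shrunk = soft_threshold(d, delta=1.0)
selected = np.any(d_shrunk != 0.0, axis=1)
print(d_shrunk)
print(selected)  # [ True False  True]
```

A single `delta` is shared by all genes, which is exactly the assumption the thesis relaxes: its per-gene test screens genes first, so the common threshold is applied only to genes that survive the variable-specific filter.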

    Disaster tweet text and image analysis using deep learning approaches

    Doctor of Philosophy, Department of Computer Science, Doina Caragea. Fast analysis of damage information after a disaster can inform responders and aid agencies, accelerate real-time response, and guide the allocation of resources. Once the details of damage information have been collected by a response center, rescue resources can be assigned more efficiently according to the needs of different areas. The challenge in collecting this information is that traditional communication lines can be damaged or unavailable at the beginning of a disaster. With the fast growth of social media platforms, situational awareness and damage data can instead be collected from affected people, who post information on social media during a disaster. Analyzing social media disaster data poses several challenges. One challenge comes from the nature of the data itself, which is generally large but noisy and comes in various formats, such as text or images; this can be addressed with deep learning approaches, which have achieved good performance on image processing and natural language processing. Another challenge is that disaster-related social media data needs to be analyzed in real time because of the urgent need for damage and situational awareness information, yet it is not feasible to train supervised classifiers at the beginning of a disaster given the lack of labeled data from the disaster of interest. Domain adaptation can address this challenge: a model learned from pre-labeled data of a prior source disaster can be adapted to the current ongoing disaster. In this dissertation, I propose deep learning approaches to analyze disaster-related tweet text and image data. First, domain adaptation approaches are proposed to identify informative images or text, respectively. Second, a multimodal approach is proposed to further improve performance by utilizing information from both images and text. Third, an approach for localizing and qualifying damage in informative images is proposed. Experimental results show that the proposed approaches can efficiently identify informative tweets and images, and that the type and area of damage can be localized effectively.
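The dissertation's multimodal architecture is not specified in the abstract; as a purely illustrative stand-in, one of the simplest ways to combine text and image evidence is late fusion, averaging the per-modality probabilities that a tweet is informative. The function and numbers below are hypothetical:

```python
# Minimal late-fusion sketch (illustrative, not the dissertation's model):
# combine an informativeness probability from a text classifier with one
# from an image classifier via a weighted average.

def fuse(p_text: float, p_image: float, w_text: float = 0.5) -> float:
    """Weighted average of per-modality 'tweet is informative' probabilities."""
    return w_text * p_text + (1.0 - w_text) * p_image

# Toy tweet: the text is ambiguous (0.55) but the image clearly shows
# damage (0.90), so fusion pushes the decision toward "informative".
p = fuse(0.55, 0.90)
label = "informative" if p >= 0.5 else "not informative"
print(f"{p:.3f} -> {label}")  # 0.725 -> informative
```

Late fusion is only one option; the intuition it captures, that one modality can compensate for ambiguity in the other, is why a multimodal approach can outperform either single-modality classifier.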