9 research outputs found

    Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering

    While large language models (LLMs) now accept longer text inputs than before, they still struggle to find the correct information in long contexts. The "lost in the middle" problem challenges most LLMs, referring to the dramatic decline in accuracy when the correct information is located in the middle of the context. To overcome this crucial issue, this paper proposes to enhance the information searching and reflection ability of LLMs in long contexts via specially designed tasks called Attention Strengthening Multi-doc QA (ASM QA). Following these tasks, our model focuses more precisely on the desired information. Experimental results show substantial improvement on Multi-doc QA and other benchmarks, surpassing state-of-the-art models by a 13.7% absolute gain in shuffled settings and by 21.5% in the passage retrieval task. We release our model, Ziya-Reader, to promote related research in the community.

    Sampling Uncertainties of Long-Term Remote-Sensing Suspended Sediments Monitoring over China's Seas: Impacts of Cloud Coverage and Sediment Variations

    Satellite-based ocean color sensors have provided an unprecedentedly large amount of information on ocean, coastal and inland waters at varied spatial and temporal scales. However, observations are often adversely affected by cloud coverage and other poor observing conditions, such as sun glint, and this influences the accuracy associated with long-term monitoring of water quality parameters. This study uses long-term (2013-2017) and high-frequency (eight observations per day) datasets from the Geostationary Ocean Color Imager (GOCI), the first geostationary ocean color satellite sensor, to quantify the cloud coverage over China's seas, the resultant interrupted observations in remote sensing, and their impacts on the retrieval of total suspended sediments (TSS). The monthly mean cloud coverage for the East China Sea (ECS), Bohai Sea (BS) and Yellow Sea (YS) was 62.6%, 67.3% and 69.9%, respectively. Uncertainties regarding the long-term retrieved TSS were affected by a combination of the effects of cloud coverage and TSS variations. The effects of the cloud coverage dominated at the monthly scale, with the mean normalized bias (P-bias) at 14.1% (± 2.6%), 7.6% (± 2.3%) and 12.2% (± 4.3%) for TSS of the ECS, BS and YS, respectively. Cloud coverage-interfering observations with the Terra/Aqua MODIS systems were also estimated, with monthly P-bias ranging from 6.5% (± 7.4%) to 20% (± 13.1%) for TSS products, and resulted in a smaller data range and lower maximum to minimum ratio compared to the eight GOCI observations. Furthermore, with approximately 16.7% monthly variations being missed during the periods, significant "missing trends" effects were revealed in monthly TSS variations from Terra/Aqua MODIS. For the entire region and the Bohai Sea, the most appropriate timeframe for sampling ranges from 12:30 to 15:30, while this timeframe was narrowed to 13:30 to 15:30 for observations in the East China Sea and the Yellow Sea. This research project evaluated the effects of cloud coverage and sampling times on the remote sensing monitoring of ocean color constituents, which would suggest the most appropriate timeframe for ocean color sensor scans, as well as in situ data collection, and can provide design specification guidance for future satellite sensor systems.

    Temporal Variation of Chlorophyll-a Concentrations in Highly Dynamic Waters from Unattended Sensors and Remote Sensing Observations

    Monitoring of water quality changes in highly dynamic inland lakes is frequently impeded by insufficient spatial and temporal coverage, for both field surveys and remote sensing methods. To track short-term variations of chlorophyll fluorescence and chlorophyll-a (Chl-a) concentrations in Poyang Lake, the largest freshwater lake in China, high-frequency in situ measurements were collected from two fixed stations. The K-means clustering method was also applied to identify clusters with similar spatio-temporal variations, using remote sensing Chl-a data products from the MERIS satellite sensor, taken from 2003 to 2012. Four lake area classes were obtained with distinct spatio-temporal patterns, two of which were selected for in situ measurement. Distinct daily periodic variations were observed, with peaks at approximately 3:00 PM and troughs at night or early morning. Short-term variations of chlorophyll fluorescence and Chl-a levels were revealed, with a maximum intra-diurnal ratio of 5.1 and inter-diurnal ratio of 7.4, respectively. Using geostatistical analysis, the temporal range of chlorophyll fluorescence and corresponding Chl-a variations was determined to be 9.6 h, which indicates that there is a temporal discrepancy between Chl-a variations and the sampling frequency of current satellite missions. An analysis of the optimal sampling strategies demonstrated that the influence of the sampling time on the mean Chl-a concentrations observed was higher than 25%, and the uncertainty of any single Terra/MODIS or Aqua/MODIS observation was approximately 15%. Therefore, sampling twice a day is essential to resolve Chl-a variations with a bias level of 10% or less. The results highlight short-term variations of critical water quality parameters in freshwater, and they help identify specific design requirements for geostationary earth observation missions, so that they can better address the challenges of monitoring complex coastal and inland environments around the world.

    Spatio-temporal patterns of Ulva prolifera blooms and the corresponding influence on chlorophyll-a concentration in the Southern Yellow Sea, China

    The world's largest macroalgal blooms (MABs), caused by Ulva prolifera outbreaks, have occurred every summer since 2007 in the Southern Yellow Sea, China. Accumulating evidence shows that MABs may deteriorate the regional marine environment and influence the growth of some primary producers such as phytoplankton. In this study, we investigated the spatio-temporal patterns of U. prolifera green tides and chlorophyll-a concentration in the Southern Yellow Sea in 2015 using satellite images obtained from HJ-1 CCD, MODIS, and GOCI. The correlation between the distributions of U. prolifera abundance and chlorophyll-a concentration was analyzed quantitatively by setting up a series of 5 x 5 km experimental grids, and we also discussed the possible mechanisms by which U. prolifera blooms influence the other floating microalgae. The results showed that the development of U. prolifera blooms in the Southern Yellow Sea in 2015 followed the sequence "appearance - development - outbreak - decline - disappearance", while the concentration of chlorophyll-a showed "increase - sharp decline - slow recovery - stabilization" from April to August. We also found that the concentration of chlorophyll-a had the following temporal relationships with U. prolifera: (1) the concentration of chlorophyll-a increased with the growth of U. prolifera from April to mid-May; (2) the chlorophyll-a concentration decreased sharply with the dramatically increased coverage of U. prolifera in June; and (3) the chlorophyll-a concentration slowly recovered and finally stabilized as U. prolifera decreased in July. Generally, there was a negative correlation between the occurrence of U. prolifera and chlorophyll-a concentration in the Southern Yellow Sea, China. Our results showed that the outbreak of U. prolifera does have a certain impact on the growth and reproduction of planktonic microalgae, and they suggest that U. prolifera blooms have potentially altered the ecological balance in the coastal waters of the Southern Yellow Sea. (C) 2018 Elsevier B.V. All rights reserved.

    Monitoring Secchi depth of the Yellow Sea and the East China Sea using a semi-analytical algorithm

    Secchi depth, an important optical characteristic of water, is a useful index of water quality and is widely used in many environmental studies. The Yellow Sea and the East China Sea are typical case 2 waters, where concentrations of suspended matter, phytoplankton pigments, and colored dissolved organic matter are higher than those in the open ocean. Two cruises were conducted to investigate the water optical characteristics in the Yellow Sea and the East China Sea in May and June 2009. Secchi disk depth was measured in situ in the daytime at 62 water sampling stations, with values in the range of 0.0112 to 15.6 m, a mean of 6.72 m and a standard deviation of 3.18 m. In this paper, we adapted a quasi-analytical algorithm to estimate the Secchi depth from satellite ocean data in both coastal and oceanic waters. The development of the algorithm is based on the use of in situ measurements and 8-day MODIS-Aqua remote sensing reflectance data with 4 km spatial resolution. More than 39 matchups were compiled for the MODIS sensor by spatio-temporal matching. The comparison between water transparency retrievals from remote sensing data and in situ measurements showed a determination coefficient of 0.60 and a root mean square error of 8.4 m. This study suggests that the quasi-analytical algorithm provides a promising result on in situ data. In the future, maps of ocean transparency for this area will be derived using this algorithm.
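The matchup comparison in the abstract above reports a determination coefficient and a root mean square error. As a rough illustration of how such figures are commonly computed, the sketch below assumes the determination coefficient is the squared Pearson correlation between in situ and retrieved values; the abstract does not state which definition was used. The function name `matchup_metrics` and the sample depths are hypothetical and are not data from the study.

```python
import math


def matchup_metrics(in_situ, retrieved):
    """Return (r_squared, rmse) for paired in situ and satellite-retrieved values.

    Assumption: r_squared is the squared Pearson correlation coefficient.
    """
    n = len(in_situ)
    if n != len(retrieved) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    mean_x = sum(in_situ) / n
    mean_y = sum(retrieved) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in in_situ)
    syy = sum((y - mean_y) ** 2 for y in retrieved)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(in_situ, retrieved))

    r_squared = sxy ** 2 / (sxx * syy)

    # RMSE of the retrieval against the in situ reference.
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(in_situ, retrieved)) / n)
    return r_squared, rmse


# Hypothetical Secchi depths in metres (not values from the study).
in_situ_depths = [2.0, 4.0, 6.0, 8.0]
retrieved_depths = [3.0, 5.0, 7.0, 9.0]
r_squared, rmse = matchup_metrics(in_situ_depths, retrieved_depths)
```

In this toy case the retrieval is offset by a constant 1 m, so the correlation is perfect while the RMSE is still 1 m, which shows why the two figures are usually reported together.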

    Spatio-temporal Variability in Sea Surface Temperatures for the Yellow Sea based on MODIS Dataset

    The spatio-temporal variabilities in sea surface temperature (SST) were analyzed using a time series of MODIS datasets for four separate regions in the Yellow Sea (YS) that were located along a north-south axis. The space-variant temporal anomaly was further decomposed using an empirical orthogonal function (EOF) for estimating spatially distributed SST. The monthly SSTs showed similar temporal patterns in each region, ranging from 2.4 °C to 28.4 °C in the study years 2011 to 2013, with seasonal cycles being stronger at the higher latitudes and weaker at the lower latitudes. Spatially, although there were no significant differences among the four regions (p<0.05) in any year, the geographical distribution of SST was characterized by an obvious gradient whereby SST decreased along the north-south axis. The monthly thermal difference among regions was largest in winter since the SST in the southeast was mainly affected by the Yellow Sea Warm Currents. The EOF1 mode accounted for 56% of the total spatial variance and exhibited a warming signal during the study period. The EOF2 mode accounted for 8% of the total variance and indicated the warm current features in the YS. The EOF3 mode accounted for 6% of the total variance and indicated the topographical features. The methodology used in this study demonstrated the spatio-temporal variabilities in the YS.
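The EOF decomposition mentioned in the abstract above can be sketched with a singular value decomposition of the anomaly matrix. This is a minimal illustration on synthetic data, assuming anomalies are formed by removing the time mean at each grid point; the abstract does not describe the authors' preprocessing or weighting, and the function name `eof_decompose` and the synthetic field are hypothetical.

```python
import numpy as np


def eof_decompose(field, n_modes=3):
    """EOF analysis of a (n_times, n_points) array via SVD.

    Returns spatial patterns (n_modes, n_points), principal-component
    time series (n_times, n_modes), and the fraction of total variance
    explained by each returned mode.
    """
    # Assumption: anomalies are departures from the time mean at each point.
    anomalies = field - field.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s ** 2 / np.sum(s ** 2)
    patterns = vt[:n_modes]
    pcs = u[:, :n_modes] * s[:n_modes]
    return patterns, pcs, variance_fraction[:n_modes]


# Synthetic monthly "SST" field: a seasonal cycle whose amplitude grows
# along a 50-point axis, plus noise (illustrative only, not MODIS data).
rng = np.random.default_rng(0)
months = np.arange(36)
seasonal_cycle = np.sin(2 * np.pi * months / 12)
amplitude = np.linspace(1.0, 2.0, 50)
sst = 15.0 + 8.0 * np.outer(seasonal_cycle, amplitude) + rng.normal(0.0, 0.5, (36, 50))

patterns, pcs, var_frac = eof_decompose(sst)
```

Here `var_frac` plays the role of the percentages quoted for EOF1 to EOF3 in the abstract. In the synthetic case the first mode dominates because the field was built from a single seasonal signal.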

    Mapping Ulva prolifera green tides from space: A revisit on algorithm design and data products

    Since the first report in 2008, macroalgal blooms of Ulva prolifera (often called green tides) in the Yellow Sea have occurred every year, with their origins, transport pathways, temporal changes, as well as causes and consequences studied extensively. Of these studies, satellite remote sensing has been used widely to detect the bloom presence and quantify the bloom size (i.e., U. prolifera coverage in km² or biomass in kilotons). However, substantial variability has been found in the refereed literature in the remote sensing methodology, results, and interpretation of the U. prolifera coverage, especially in the attempts to study inter-annual changes or long-term trends. There are often inconsistent or contradicting results even from the same satellite sensor. Such inconsistencies or contradictions create difficulty not only within the remote sensing community when presenting new methodology or results, but also to researchers when attempting to use the remote sensing results to make predictions or perform impact assessments. Here, we review the literature on the remote sensing methodology to detect and quantify U. prolifera blooms, and make recommendations based on physical principles. Specifically, we propose the following conceptual guidelines: 1) a reliable index or algorithm should be relatively tolerant to perturbations by non-optimal observing conditions (thick aerosols, thin clouds, moderate sun glint, cloud-adjacent straylight, which can all be found frequently in the study region) for presence/absence detection, as well as to small errors in the selected thresholds to quantify U. prolifera; 2) a reliable index or algorithm should also make it relatively easy to account for variability in subpixel coverage of U. prolifera (i.e., through pixel unmixing) in order to obtain an accurate estimate of total U. prolifera coverage from an image; 3) a reliable data product (i.e., U. prolifera maps) should be able to account for the variable clouds when interpreting spatial patterns or temporal changes, with uncertainty estimates provided whenever possible; and 4) both the algorithm and the data product should minimize manual work in order to make them more objective and repeatable by other researchers. Finally, we show different types of time series of U. prolifera amounts in the Yellow Sea using the approaches based on these guidelines and Moderate Resolution Imaging Spectroradiometer (MODIS) observations, and discuss their implications on the interpretation of annual changes in interdisciplinary studies.