16 research outputs found

    Initial design and concept of operations for a clandestine data relay UUV to circumvent jungle canopy effects on satellite communications

    Communication within jungle environments has always been a difficult proposition. This is especially true of collection assets beneath triple-canopy jungle that need to communicate with overhead national assets. The traditional methods of countering the negative effects of the canopy on EM signals have been to increase the power to offset the losses, or to utilize new, more canopy-transparent portions of the EM spectrum. However, there are complications with both of these methods. Simply increasing transmitted power increases the drain on the system's power supply, thus lowering effective on-station time. Shifting to a different portion of the EM spectrum can negatively affect the transmission rate of the system and requires specialized equipment such as antennas and modulators. This work addresses the issue by designing a semi-autonomous UUV, which will clandestinely relay data from the embedded jungle systems to overhead national assets. Rather than trying to punch through the canopy directly, the proposed UUV will take advantage of the fact that most jungle waterways have, at the very least, a thinner canopy overhead, if not a clear view of the sky, for less lossy satellite communications. This shifts the primary communications from an Earth-Sky problem to a lateral wave model where the communications travel parallel to the canopy. While the jungle is still not an ideal medium for communications, other methods can be used to address these losses. The proposed UUV will be designed to be cheap and constructed from existing systems. It will also be small and lightweight enough to be delivered and deployed in theater via aircraft, boats, and operators on the ground. Additionally, it will be capable of long on-station times due to the ability to recharge on station. http://archive.org/details/initialdesignndc109455537 Approved for public release; distribution is unlimited.

    Interactive Information Retrieval with Structured Documents

    In recent years there has been a growing realisation in the IR community that the interaction of searchers with information is an indispensable component of the IR process. As a result, issues relating to interactive IR have been extensively investigated in the last decade. This research has been performed in the context of unstructured documents or in the context of the loosely defined structure encountered in web pages. XML documents, on the other hand, define a different context by offering the possibility of navigating within the structure of a single document, or of following links to other documents. Relatively little work has been carried out to study user interaction with IR systems that make use of the additional features offered by XML documents. As part of the INEX initiative for the evaluation of XML retrieval, the INEX interactive track has focused on interactive XML retrieval since 2004. Here, a user-friendly presentation of various features of XML documents is provided, and some new features are designed and implemented to give searchers efficient access to the information they want. In this study, interaction entails three levels: formulating the query, inspecting the result list, and examining the detail. For query formulation, suggesting related terms is a conventional method of assisting searchers. Here we investigate related terms derived from two different co-occurrence units: elements and documents. In addition, a contextual aspect is added to help searchers select appropriate terms. Results showed the usefulness of suggesting related terms and some acceptance of the contextual related-terms tool. For inspecting the result list, classic document retrieval systems such as web search engines retrieve whole documents and leave it to the searchers to collect the information they require from a possibly lengthy text. In contrast, element retrieval aims at a focused view of information by pointing to the optimal access points of the document. A number of strategies have been investigated for presenting result lists. For examining the detail of a document, the complete document is traditionally presented to the searcher, and here again the searcher has to put in effort to reach the required information. We investigated the use of additional support, such as a table of contents alongside the document detail. In addition, we investigated graphical representations of documents that depict their structure and the granularity of retrieved elements along with their estimated relevance. The table of contents was found to be a very useful feature for examining details. To analyse searchers' interaction, a visualisation technique based on Tree Map was developed. It depicts the search interaction with an element retrieval system. A number of browsing strategies have been identified with the help of this tool. The value of element retrieval for searchers and a comparison between two focused approaches, element retrieval and passage retrieval, were also evaluated. The study suggests that searchers find elements useful for their tasks and that they locate much of the relevant information in specific elements rather than full documents. Sections, in particular, appear to be helpful. In order to provide user-specific support, the system needs feedback from searchers, who in turn are very reluctant to give this information explicitly. Therefore, we investigated to what extent the different features can be used as relevance predictors. Of the five features considered, reading time is the most useful relevance predictor. Overall, relevance predictors for structured documents seem to be much weaker than for atomic documents.
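
The abstract above contrasts related terms derived from two co-occurrence units, elements and documents. As a rough illustration only, the sketch below counts co-occurrences over either unit. The toy collection, the function name `related_terms`, and the raw-count ranking with alphabetical tie-breaking are all assumptions made for this example, not the method used in the thesis.

```python
from collections import Counter
from typing import Iterable, List

# Hypothetical toy collection: each document is a list of elements,
# and each element is a list of index terms.
DOCUMENTS = [
    [["xml", "retrieval", "element"], ["user", "interface"]],
    [["xml", "markup"], ["retrieval", "evaluation"]],
]


def element_units(documents) -> List[List[str]]:
    """Treat every element as its own co-occurrence unit."""
    return [element for doc in documents for element in doc]


def document_units(documents) -> List[List[str]]:
    """Treat every whole document as one co-occurrence unit."""
    return [[term for element in doc for term in element] for doc in documents]


def related_terms(query_term: str, units: Iterable[List[str]], top_k: int = 3) -> List[str]:
    """Rank terms by how many units they share with query_term.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter()
    for unit in units:
        terms = set(unit)
        if query_term in terms:
            counts.update(terms - {query_term})
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    print(related_terms("xml", element_units(DOCUMENTS)))
    print(related_terms("xml", document_units(DOCUMENTS)))
```

With element units, only terms from the same element as the query term are suggested. With document units, terms from anywhere in the same document can surface, which gives broader but less focused suggestions.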

    Eighth Biennial Report : April 2005 – March 2007

    No full text

    Focused Retrieval

    Traditional information retrieval applications, such as Web search, return atomic units of retrieval, which are generically called "documents". Depending on the application, a document may be a Web page, an email message, a journal article, or any similar object. In contrast to this traditional approach, focused retrieval helps users better pinpoint their exact information needs by returning results at the sub-document level. These results may consist of predefined document components (such as pages, sections, and paragraphs), or they may consist of arbitrary passages, comprising any sub-string of a document. If a document is marked up with XML, a focused retrieval system might return individual XML elements or ranges of elements. This thesis proposes and evaluates a number of approaches to focused retrieval, including methods based on XML markup and methods based on arbitrary passages. It considers the best unit of retrieval, explores methods for efficient sub-document retrieval, and evaluates formulae for sub-document scoring. Focused retrieval is also considered in the specific context of Wikipedia, where methods for automatic vandalism detection and automatic link generation are developed and evaluated.

    The Role of Context in Matching and Evaluation of XML Information Retrieval

    With the growth of electronic collections, searching becoming an everyday activity, and the spread of mobile devices, one goal in developing information retrieval methods is to achieve ever more precise search results; even in long documents, the aim is to point the searcher precisely to the relevant content. The searcher is thus freed from needless browsing of documents. On the Internet and in other electronic publishing, parts of documents are often marked up in XML for automatic processing. XML markup makes it possible to exploit the internal structure of documents. In other words, this markup can be exploited in developing precision-oriented (focused) information retrieval systems and methods. The dissertation deals with precision-oriented retrieval in which explicit XML markup can be exploited. It has two main themes, the first of which covers the development, implementation, and evaluation of the XML retrieval system TRIX (Tampere Retrieval and Indexing for XML). The second theme covers empirical evaluation methods for focused retrieval systems. The most significant contribution of the first theme is contextualization, in which the scarcity of textual evidence typical of XML retrieval is compensated for in matching by exploiting the content of higher-level or sibling parts of the XML hierarchy (that is, the context). The effectiveness of the method is demonstrated empirically. As a result of the research, the term contextualization has become established in the field's general international vocabulary. The second theme finds that the methods used to measure the effectiveness of focused retrieval are deficient in many ways. To remedy these deficiencies, the dissertation develops more realistic evaluation methods that take into account the context of the returned retrieval units, the reading order, and the effort that browsing costs the user. The metric developed in the study, T2I(300), has been selected as the official metric in the international INEX (Initiative for the Evaluation of XML Retrieval) project, a research forum for XML retrieval founded in 2002.
    This dissertation addresses focused retrieval, especially its sub-concept XML (eXtensible Mark-up Language) information retrieval (XML IR). In XML IR, the retrievable units are either individual elements or sets of elements grouped together, typically by document. These units are ranked according to their estimated relevance by an XML IR system. In traditional information retrieval, the retrievable unit is an atomic document. Due to this atomicity, many core characteristics of the document retrieval paradigm are not appropriate for XML IR. Of these characteristics, this dissertation explores element indexing, scoring, and evaluation methods, which form two main themes: (1) element indexing, scoring, and contextualization, and (2) focused retrieval evaluation. To investigate the first theme, an XML IR system based on structural indices is constructed. The structural indices offer analyzing power for studying element hierarchies. The main finding in the system development is the utilization of surrounding elements as supplementary evidence in element scoring. This method is called contextualization, for which we distinguish three models: vertical, horizontal, and ad hoc contextualization. The models are tested with the tools provided by (or derived from) the Initiative for the Evaluation of XML Retrieval (INEX). The results indicate that the evidence from element surroundings improves the scoring effectiveness of XML retrieval. The second theme entails a task where the retrievable elements are grouped by document. The aim of this theme is to create methods for measuring XML IR effectiveness credibly in a laboratory environment. Credibility is pursued by assuming the chronological reading order of a user together with a point where the user becomes frustrated after reading a certain amount of non-relevant material. Novel metrics are created based on these assumptions. The relative rankings of systems measured with the metrics differ from those delivered by contemporary metrics. In addition, the focused retrieval strategies benefit from the novel metrics over traditional full document retrieval.
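
The abstract above describes contextualization as using surrounding elements as supplementary evidence when scoring an element. A minimal sketch of the vertical case (evidence from ancestors) might look like the following. The `Element` class, the linear blend, and the `context_weight` parameter are assumptions made for illustration; the dissertation's actual scoring formula is not given in this abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """A node in a toy XML tree with a precomputed basic retrieval score."""
    name: str
    base_score: float
    parent: Optional["Element"] = field(default=None, repr=False, compare=False)
    children: List["Element"] = field(default_factory=list, repr=False, compare=False)

    def add(self, child: "Element") -> "Element":
        """Attach a child element and return it for chaining."""
        child.parent = self
        self.children.append(child)
        return child


def vertical_contextualized_score(element: Element, context_weight: float = 0.5) -> float:
    """Blend an element's own score with the mean score of its ancestors.

    The linear blend is a hypothetical stand-in for the real formula.
    A root element has no ancestors and keeps its own score.
    """
    ancestor_scores = []
    node = element.parent
    while node is not None:
        ancestor_scores.append(node.base_score)
        node = node.parent
    if not ancestor_scores:
        return element.base_score
    context = sum(ancestor_scores) / len(ancestor_scores)
    return (1 - context_weight) * element.base_score + context_weight * context


if __name__ == "__main__":
    article = Element("article", 0.6)
    section = article.add(Element("section", 0.4))
    paragraph = section.add(Element("p", 0.2))
    # A short paragraph with little text evidence borrows support from its ancestors.
    print(vertical_contextualized_score(paragraph))
```

Horizontal contextualization would draw on sibling elements instead of ancestors; the same blending idea applies, with `children` of the parent supplying the context scores.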

    Recherche d'information et contexte

    My research work is related to the field of Information Retrieval (IR), whose objective is to enable a user to find information that meets their needs within a large volume of information. Work in IR has focused primarily on improving information processing, both in terms of indexing, to obtain optimal representations of documents and queries, and in terms of matching between these representations. Contributions long made no distinction between searches, assuming a unique type of search and proposing a model intended to be effective for this unique type of search. The growing volume of information and diversity of situations have marked the limits of existing IR approaches, bringing out the field of contextual IR. Contextual IR aims to better respond to users' needs by taking into account the search context. The principle is to differentiate searches by integrating into the IR process contextual factors that will influence the effectiveness of the IR system (IRS). The notion of context is broad and refers to all knowledge related to the search conducted by a user querying an IRS. My research has been directed toward taking into account three contextual factors: the domain of information, the information structure, and the user. The first three directions of my work consist in proposing models that incorporate each of these elements of context, and a fourth direction aims at exploring how to adapt the process to each search according to its context. Various European and national projects have provided application frameworks for this research and have allowed us to validate our proposals. This research has also led to the development of various prototypes and supported PhD theses and research internships.
    My research falls within the field of information retrieval (IR), whose objective is to enable a user to find information that meets their need within a large volume of information. IR research was at first system-oriented. For a long time it remained focused on matching, to assess the correspondence between queries and documents, and on indexing documents and queries to obtain a representation that supports matching them. This led to the definition of theoretical IR models such as the vector space model and the probabilistic model. The initial aim was to propose an IR model whose overall behaviour is as effective as possible. IR long relied on simplifying assumptions, notably considering a single type of query and applying the same processing to every query. The context in which the search takes place was ignored. The scope of IR has kept expanding, notably thanks to the rise of the Internet. The ever-growing volume of information, combined with the now widespread use of IR systems (IRS), has led to a diversity of situations. This growth has made it harder to identify the information corresponding to each need expressed by a user, thereby marking the limits of existing IR approaches. In response, proposals have emerged that aim to evolve IR by bringing the user closer to the system, such as the notions of user relevance feedback and user profile. With the aim of unifying this work and offering IRS that respond more precisely to the user's need, the field of contextual IR has recently emerged. The objective is to differentiate searches at the level of IR models by integrating contextual elements likely to influence IRS performance. The notion of context is broad and refers to any knowledge related to the search of the user querying an IRS. My research has been directed toward taking into account the contextual elements of information domain, information structure, and the user. It consists, in the first three directions, of proposing models that integrate each of these contextual elements and, in a fourth direction, of studying how to adapt the processes to each search according to its context. Various European and national projects have served as application frameworks for this research and thus allowed us to validate our proposals. My research has also been developed in various prototypes and has supported PhD theses and research internships.

    Seventh Biennial Report : June 2003 - March 2005

    No full text

    Diagnosis of the atmospheric hydrological cycle and its variability in the present-day climate

    This thesis investigates some important aspects of the atmospheric branch of the hydrological cycle in the modern day climate from an observational perspective. Data quality is evaluated, focusing on two state-of-the-art reanalysis products, ERA-I and JRA-55. Regional-scale discrepancies among reanalyses and observations, especially in their annual cycles, are found in the warm pool, Amazon, Gulf stream and Indian subcontinent regions. In the tropics, oceanic evaporation and its temporal variability are notably greater in JRA-55 than in ERA-I and satellite-based estimates, while both reanalyses overestimate precipitation. Higher tropical precipitation and evaporation, accompanied by a slightly lower level of total column water (TCW), might suggest a more intense hydrological cycle, but this can be an ill-defined concept especially when analysis increments mask “spin-down” errors in reanalysis models. Analysis increments arise to remove unphysical residuals in the atmospheric water budget, and these are explored via a cluster analysis to identify regimes with common behavior. Consistent for ERA-I and JRA-55, the regime with the largest negative residuals (greater moisture outputs than inputs) exceeding 50% of mean precipitation occurs during the dry season of some low latitude regions that feature strong seasonality, high evapotranspiration and high moisture divergence. Errors in the moisture divergence are likely responsible because they correlate strongly with the budget residual. Empirical Orthogonal Function (EOF) and Self Organizing Map (SOM) analyses are applied to identify the dominant inter-annual patterns of vertically-integrated moisture divergence variability. They reveal that the transition from strong La Niña through to extreme El Niño events is not a linear one and that the EOF orthogonality constraint results in the patterns being split between leading EOFs that are non-linearly related. 
The SOM analysis captures the range of responses to the El Niño Southern Oscillation (ENSO), indicating that the distinction between the moderate and extreme El Niños can be as great as the difference between La Niña and moderate El Niños, from a moisture divergence point of view. On diurnal time scales, horizontal moisture fluxes vary in response to thermodynamic and dynamic effects. TCW shows a global-scale diurnal cycle that peaks around 1800–2100 local time with a peak-to-trough magnitude of 0.4 mm. Semi-diurnal variations in surface winds and pressure, consistent with atmospheric tidal theory, create a westward-propagating moisture convergence/divergence wave along the equator. Finally, the importance of Tropical Cyclones (TCs) as a source of freshwater for the North American continent is estimated using an ensemble of schemes designed to attribute onshore moisture fluxes to TCs. Averaged over the 2004–2012 hurricane seasons and integrated over the western, southern and eastern coasts of North America, the seven schemes attribute 7 to 18% (mean 14%) of total net onshore flux to Atlantic TCs. A reduced contribution of 10% (range 9 to 11%) was found for the 1980–2003 period, though only two schemes could be applied to this earlier period. Over the whole 1980–2012 period, a further 8% (range 6 to 9% from two schemes) was attributed to East Pacific TCs, resulting in a total TC contribution of 19% (range 17 to 22%) to the ocean-to-land moisture transport onto the North American continent between May and November. The inter-annual variability does not appear to be strongly related to ENSO.

    Evaluation of effective XML information retrieval

    XML is being adopted as a common storage format in scientific data repositories, digital libraries, and on the World Wide Web. Accordingly, there is a need for content-oriented XML retrieval systems that can efficiently and effectively store, search and retrieve information from XML document collections. Unlike traditional information retrieval systems where whole documents are usually indexed and retrieved as information units, XML retrieval systems typically index and retrieve document components of varying granularity. To evaluate the effectiveness of such systems, test collections where relevance assessments are provided according to an XML-specific definition of relevance are necessary. Such test collections have been built during four rounds of the INitiative for the Evaluation of XML Retrieval (INEX). There are many different approaches to XML retrieval; most approaches either extend full-text information retrieval systems to handle XML retrieval, or use database technologies that incorporate existing XML standards to handle both XML presentation and retrieval. We present a hybrid approach to XML retrieval that combines text information retrieval features with XML-specific features found in a native XML database. Results from our experiments on the INEX 2003 and 2004 test collections demonstrate the usefulness of applying our hybrid approach to different XML retrieval tasks. A realistic definition of relevance is necessary for meaningful comparison of alternative XML retrieval approaches. The three relevance definitions used by INEX since 2002 comprise two relevance dimensions, each based on topical relevance. We perform an extensive analysis of the two INEX 2004 and 2005 relevance definitions, and show that assessors and users find them difficult to understand. We propose a new definition of relevance for XML retrieval, and demonstrate that a relevance scale based on this definition is useful for XML retrieval experiments. 
Finding the appropriate approach to evaluate XML retrieval effectiveness is the subject of ongoing debate within the XML information retrieval research community. We present an overview of the evaluation methodologies implemented in the current INEX metrics, which reveals that the metrics follow different assumptions and measure different XML retrieval behaviours. We propose a new evaluation metric for XML retrieval and conduct an extensive analysis of the retrieval performance of simulated runs to show what is measured. We compare the evaluation behaviour obtained with the new metric to the behaviours obtained with two of the official INEX 2005 metrics, and demonstrate that the new metric can be used to reliably evaluate XML retrieval effectiveness. To analyse the effectiveness of XML retrieval in different application scenarios, we use evaluation measures in our new metric to investigate the behaviour of XML retrieval approaches under the following two scenarios: the ad-hoc retrieval scenario, exploring the activities carried out as part of the INEX 2005 Ad-hoc track; and the multimedia retrieval scenario, exploring the activities carried out as part of the INEX 2005 Multimedia track. For both application scenarios we show that, although different values for retrieval parameters are needed to achieve the optimal performance, the desired textual or multimedia information can be effectively located using a combination of XML retrieval approaches