2,101 research outputs found

    Knowledge discovery for friction stir welding via data driven approaches: Part 2 – multiobjective modelling using fuzzy rule based systems

    Get PDF
    In this final part of this extensive study, a new systematic data-driven fuzzy modelling approach has been developed, taking into account both the modelling accuracy and its interpretability (transparency) as attributes. For the first time, a data-driven modelling framework has been proposed designed and implemented in order to model the intricate FSW behaviours relating to AA5083 aluminium alloy, consisting of the grain size, mechanical properties, as well as internal process properties. As a result, ‘Pareto-optimal’ predictive models have been successfully elicited which, through validations on real data for the aluminium alloy AA5083, have been shown to be accurate, transparent and generic despite the conservative number of data points used for model training and testing. Compared with analytically based methods, the proposed data-driven modelling approach provides a more effective way to construct prediction models for FSW when there is an apparent lack of fundamental process knowledge

    On enhancing the robustness of timeline summarization test collections

    Get PDF
    Timeline generation systems are a class of algorithms that produce a sequence of time-ordered sentences or text snippets extracted in real-time from high-volume streams of digital documents (e.g. news articles), focusing on retaining relevant and informative content for a particular information need (e.g. topic or event). These systems have a range of uses, such as producing concise overviews of events for end-users (human or artificial agents). To advance the field of automatic timeline generation, robust and reproducible evaluation methodologies are needed. To this end, several evaluation metrics and labeling methodologies have recently been developed - focusing on information nugget or cluster-based ground truth representations, respectively. These methodologies rely on human assessors manually mapping timeline items (e.g. sentences) to an explicit representation of what information a ‘good’ summary should contain. However, while these evaluation methodologies produce reusable ground truth labels, prior works have reported cases where such evaluations fail to accurately estimate the performance of new timeline generation systems due to label incompleteness. In this paper, we first quantify the extent to which the timeline summarization test collections fail to generalize to new summarization systems, then we propose, evaluate and analyze new automatic solutions to this issue. In particular, using a depooling methodology over 19 systems and across three high-volume datasets, we quantify the degree of system ranking error caused by excluding those systems when labeling. We show that when considering lower-effectiveness systems, the test collections are robust (the likelihood of systems being miss-ranked is low). However, we show that the risk of systems being mis-ranked increases as the effectiveness of systems held-out from the pool increases. To reduce the risk of mis-ranking systems, we also propose a range of different automatic ground truth label expansion techniques. Our results show that the proposed expansion techniques can be effective at increasing the robustness of the TREC-TS test collections, as they are able to generate large numbers missing matches with high accuracy, markedly reducing the number of mis-rankings by up to 50%

    Towards automatic generation of relevance judgments for a test collection

    Get PDF
    This paper represents a new technique for building a relevance judgment list for information retrieval test collections without any human intervention. It is based on the number of occurrences of the documents in runs retrieved from several information retrieval systems and a distance based measure between the documents. The effectiveness of the technique is evaluated by computing the correlation between the ranking of the TREC systems using the original relevance judgment list (qrels) built by human assessors and the ranking obtained by using the newly generated qrels

    Aggregated search: a new information retrieval paradigm

    Get PDF
    International audienceTraditional search engines return ranked lists of search results. It is up to the user to scroll this list, scan within different documents and assemble information that fulfill his/her information need. Aggregated search represents a new class of approaches where the information is not only retrieved but also assembled. This is the current evolution in Web search, where diverse content (images, videos, ...) and relational content (similar entities, features) are included in search results. In this survey, we propose a simple analysis framework for aggregated search and an overview of existing work. We start with related work in related domains such as federated search, natural language generation and question answering. Then we focus on more recent trends namely cross vertical aggregated search and relational aggregated search which are already present in current Web search

    Approaches to implement and evaluate aggregated search

    Get PDF
    La recherche d'information agrĂ©gĂ©e peut ĂȘtre vue comme un troisiĂšme paradigme de recherche d'information aprĂšs la recherche d'information ordonnĂ©e (ranked retrieval) et la recherche d'information boolĂ©enne (boolean retrieval). Les deux paradigmes les plus explorĂ©s jusqu'Ă  aujourd'hui retournent un ensemble ou une liste ordonnĂ©e de rĂ©sultats. C'est Ă  l'usager de parcourir ces ensembles/listes et d'en extraire l'information nĂ©cessaire qui peut se retrouver dans plusieurs documents. De maniĂšre alternative, la recherche d'information agrĂ©gĂ©e ne s'intĂ©resse pas seulement Ă  l'identification des granules (nuggets) d'information pertinents, mais aussi Ă  l'assemblage d'une rĂ©ponse agrĂ©gĂ©e contenant plusieurs Ă©lĂ©ments. Dans nos travaux, nous analysons les travaux liĂ©s Ă  la recherche d'information agrĂ©gĂ©e selon un schĂ©ma gĂ©nĂ©ral qui comprend 3 parties: dispatching de la requĂȘte, recherche de granules d'information et agrĂ©gation du rĂ©sultat. Les approches existantes sont groupĂ©es autours de plusieurs perspectives gĂ©nĂ©rales telle que la recherche relationnelle, la recherche fĂ©dĂ©rĂ©e, la gĂ©nĂ©ration automatique de texte, etc. Ensuite, nous nous sommes focalisĂ©s sur deux pistes de recherche selon nous les plus prometteuses: (i) la recherche agrĂ©gĂ©e relationnelle et (ii) la recherche agrĂ©gĂ©e inter-verticale. * La recherche agrĂ©gĂ©e relationnelle s'intĂ©resse aux relations entre les granules d'information pertinents qui servent Ă  assembler la rĂ©ponse agrĂ©gĂ©e. En particulier, nous nous sommes intĂ©ressĂ©s Ă  trois types de requĂȘtes notamment: requĂȘte attribut (ex. prĂ©sident de la France, PIB de l'Italie, maire de Glasgow, ...), requĂȘte instance (ex. France, Italie, Glasgow, Nokia e72, ...) et requĂȘte classe (pays, ville française, portable Nokia, ...). Pour ces requĂȘtes qu'on appelle requĂȘtes relationnelles nous avons proposĂ©s trois approches pour permettre la recherche de relations et l'assemblage des rĂ©sultats. Nous avons d'abord mis l'accent sur la recherche d'attributs qui peut aider Ă  rĂ©pondre aux trois types de requĂȘtes. Nous proposons une approche Ă  large Ă©chelle capable de rĂ©pondre Ă  des nombreuses requĂȘtes indĂ©pendamment de la classe d'appartenance. Cette approche permet l'extraction des attributs Ă  partir des tables HTML en tenant compte de la qualitĂ© des tables et de la pertinence des attributs. Les diffĂ©rentes Ă©valuations de performances effectuĂ©es prouvent son efficacitĂ© qui dĂ©passe les mĂ©thodes de l'Ă©tat de l'art. DeuxiĂšmement, nous avons traitĂ© l'agrĂ©gation des rĂ©sultats composĂ©s d'instances et d'attributs. Ce problĂšme est intĂ©ressant pour rĂ©pondre Ă  des requĂȘtes de type classe avec une table contenant des instances (lignes) et des attributs (colonnes). Pour garantir la qualitĂ© du rĂ©sultat, nous proposons des pondĂ©rations sur les instances et les attributs promouvant ainsi les plus reprĂ©sentatifs. Le troisiĂšme problĂšme traitĂ© concerne les instances de la mĂȘme classe (ex. France, Italie, Allemagne, ...). Nous proposons une approche capable d'identifier massivement ces instances en exploitant les listes HTML. Toutes les approches proposĂ©es fonctionnent Ă  l'Ă©chelle Web et sont importantes et complĂ©mentaires pour la recherche agrĂ©gĂ©e relationnelle. Enfin, nous proposons 4 prototypes d'application de recherche agrĂ©gĂ©e relationnelle. Ces derniers peuvent rĂ©pondre des types de requĂȘtes diffĂ©rents avec des rĂ©sultats relationnels. Plus prĂ©cisĂ©ment, ils recherchent et assemblent des attributs, des instances, mais aussi des passages et des images dans des rĂ©sultats agrĂ©gĂ©s. Un exemple est la requĂȘte ``Nokia e72" dont la rĂ©ponse sera composĂ©e d'attributs (ex. prix, poids, autonomie batterie, ...), de passages (ex. description, reviews, ...) et d'images. Les rĂ©sultats sont encourageants et illustrent l'utilitĂ© de la recherche agrĂ©gĂ©e relationnelle. * La recherche agrĂ©gĂ©e inter-verticale s'appuie sur plusieurs moteurs de recherche dits verticaux tel que la recherche d'image, recherche vidĂ©o, recherche Web traditionnelle, etc. Son but principal est d'assembler des rĂ©sultats provenant de toutes ces sources dans une mĂȘme interface pour rĂ©pondre aux besoins des utilisateurs. Les moteurs de recherche majeurs et la communautĂ© scientifique nous offrent dĂ©jĂ  une sĂ©rie d'approches. Notre contribution consiste en une Ă©tude sur l'Ă©valuation et les avantages de ce paradigme. Plus prĂ©cisĂ©ment, nous comparons 4 types d'Ă©tudes qui simulent des situations de recherche sur un total de 100 requĂȘtes et 9 sources diffĂ©rentes. Avec cette Ă©tude, nous avons identifiĂ©s clairement des avantages de la recherche agrĂ©gĂ©e inter-verticale et nous avons pu dĂ©duire de nombreux enjeux sur son Ă©valuation. En particulier, l'Ă©valuation traditionnelle utilisĂ©e en RI, certes la moins rapide, reste la plus rĂ©aliste. Pour conclure, nous avons proposĂ© des diffĂ©rents approches et Ă©tudes sur deux pistes prometteuses de recherche dans le cadre de la recherche d'information agrĂ©gĂ©e. D'une cĂŽtĂ©, nous avons traitĂ© trois problĂšmes importants de la recherche agrĂ©gĂ©e relationnelle qui ont portĂ© Ă  la construction de 4 prototypes d'application avec des rĂ©sultats encourageants. De l'autre cĂŽtĂ©, nous avons mis en place 4 Ă©tudes sur l'intĂ©rĂȘt et l'Ă©valuation de la recherche agrĂ©gĂ©e inter-verticale qui ont permis d'identifier les enjeux d'Ă©valuation et les avantages du paradigme. Comme suite Ă  long terme de ce travail, nous pouvons envisager une recherche d'information qui intĂšgre plus de granules relationnels et plus de multimĂ©dia.Aggregated search or aggregated retrieval can be seen as a third paradigm for information retrieval following the Boolean retrieval paradigm and the ranked retrieval paradigm. In the first two, we are returned respectively sets and ranked lists of search results. It is up to the time-poor user to scroll this set/list, scan within different documents and assemble his/her information need. Alternatively, aggregated search not only aims the identification of relevant information nuggets, but also the assembly of these nuggets into a coherent answer. In this work, we present at first an analysis of related work to aggregated search which is analyzed with a general framework composed of three steps: query dispatching, nugget retrieval and result aggregation. Existing work is listed aside different related domains such as relational search, federated search, question answering, natural language generation, etc. Within the possible research directions, we have then focused on two directions we believe promise the most namely: relational aggregated search and cross-vertical aggregated search. * Relational aggregated search targets relevant information, but also relations between relevant information nuggets which are to be used to assemble reasonably the final answer. In particular, there are three types of queries which would easily benefit from this paradigm: attribute queries (e.g. president of France, GDP of Italy, major of Glasgow, ...), instance queries (e.g. France, Italy, Glasgow, Nokia e72, ...) and class queries (countries, French cities, Nokia mobile phones, ...). We call these queries as relational queries and we tackle with three important problems concerning the information retrieval and aggregation for these types of queries. First, we propose an attribute retrieval approach after arguing that attribute retrieval is one of the crucial problems to be solved. Our approach relies on the HTML tables in the Web. It is capable to identify useful and relevant tables which are used to extract relevant attributes for whatever queries. The different experimental results show that our approach is effective, it can answer many queries with high coverage and it outperforms state of the art techniques. Second, we deal with result aggregation where we are given relevant instances and attributes for a given query. The problem is particularly interesting for class queries where the final answer will be a table with many instances and attributes. To guarantee the quality of the aggregated result, we propose the use of different weights on instances and attributes to promote the most representative and important ones. The third problem we deal with concerns instances of the same class (e.g. France, Germany, Italy ... are all instances of the same class). Here, we propose an approach that can massively extract instances of the same class from HTML lists in the Web. All proposed approaches are applicable at Web-scale and they can play an important role for relational aggregated search. Finally, we propose 4 different prototype applications for relational aggregated search. They can answer different types of queries with relevant and relational information. Precisely, we not only retrieve attributes and their values, but also passages and images which are assembled into a final focused answer. An example is the query ``Nokia e72" which will be answered with attributes (e.g. price, weight, battery life ...), passages (e.g. description, reviews ...) and images. Results are encouraging and they illustrate the utility of relational aggregated search. * The second research direction that we pursued concerns cross-vertical aggregated search, which consists of assembling results from different vertical search engines (e.g. image search, video search, traditional Web search, ...) into one single interface. Here, different approaches exist in both research and industry. Our contribution concerns mostly evaluation and the interest (advantages) of this paradigm. We propose 4 different studies which simulate different search situations. Each study is tested with 100 different queries and 9 vertical sources. Here, we could clearly identify new advantages of this paradigm and we could identify different issues with evaluation setups. In particular, we observe that traditional information retrieval evaluation is not the fastest but it remains the most realistic. To conclude, we propose different studies with respect to two promising research directions. On one hand, we deal with three important problems of relational aggregated search following with real prototype applications with encouraging results. On the other hand, we have investigated on the interest and evaluation of cross-vertical aggregated search. Here, we could clearly identify some of the advantages and evaluation issues. In a long term perspective, we foresee a possible combination of these two kinds of approaches to provide relational and cross-vertical information retrieval incorporating more focus, structure and multimedia in search results

    EOS: A project to investigate the design and construction of real-time distributed embedded operating systems

    Get PDF
    The EOS project is investigating the design and construction of a family of real-time distributed embedded operating systems for reliable, distributed aerospace applications. Using the real-time programming techniques developed in co-operation with NASA in earlier research, the project staff is building a kernel for a multiple processor networked system. The first six months of the grant included a study of scheduling in an object-oriented system, the design philosophy of the kernel, and the architectural overview of the operating system. In this report, the operating system and kernel concepts are described. An environment for the experiments has been built and several of the key concepts of the system have been prototyped. The kernel and operating system is intended to support future experimental studies in multiprocessing, load-balancing, routing, software fault-tolerance, distributed data base design, and real-time processing

    Qualities, objects, sorts, and other treasures : gold digging in English and Arabic

    Get PDF
    In the present monograph, we will deal with questions of lexical typology in the nominal domain. By the term "lexical typology in the nominal domain", we refer to crosslinguistic regularities in the interaction between (a) those areas of the lexicon whose elements are capable of being used in the construction of "referring phrases" or "terms" and (b) the grammatical patterns in which these elements are involved. In the traditional analyses of a language such as English, such phrases are called "nominal phrases". In the study of the lexical aspects of the relevant domain, however, we will not confine ourselves to the investigation of "nouns" and "pronouns" but intend to take into consideration all those parts of speech which systematically alternate with nouns, either as heads or as modifiers of nominal phrases. In particular, this holds true for adjectives both in English and in other Standard European Languages. It is well known that adjectives are often difficult to distinguish from nouns, or that elements with an overt adjectival marker are used interchangeably with nouns, especially in particular semantic fields such as those denoting MATERIALS or NATlONALlTIES. That is, throughout this work the expression "lexical typology in the nominal domain" should not be interpreted as "a typology of nouns", but, rather, as the cross-linguistic investigation of lexical areas constitutive for "referring phrases" irrespective of how the parts-of-speech system in a specific language is defined

    Spatial stream modeling of Louisiana Waterthrush (\u3ci\u3eParkesia motacilla\u3c/i\u3e) foraging substrate and aquatic prey in a watershed undergoing shale gas development

    Get PDF
    We demonstrate the use of spatial stream network models (SSNMs) to explore relationships between a semiaquatic bioindicator songbird, Louisiana Waterthrush (Parkesia motacilla), and stream monitoring and benthic macroinvertebrate data in an area undergoing shale gas development. SSNMs allowed us to account for spatial autocorrelation inherent to these environmental data types and stream properties that traditional modeling approaches cannot capture to elucidate factors that affect waterthrush foraging locations. We monitored waterthrush along 58.1 km of 1st- and 2nd-order headwater stream tributaries (n = 14) in northwestern West Virginia over a two year period (2013–2014), sampled benthic macroinvertebrates in waterthrush territories, and collected wetted perimeter stream channel and water chemistry data along a 50 m fixed point stream grid. Spatial models outperformed traditional regression models and made a statistical difference in whether stream covariates of interest were considered relatable to waterthrush foraging. Waterthrush foraging probability index (FPI) was greater in areas where family and genus-level multi-metric indices of biotic stream integrity were higher (i.e. WVSCI and GLIMPSS). Waterthrush were found foraging both among stream flow connected and unconnected sampled sites on relatively further upstream locations where WVSCI and GLIMPSS were predicted to be highest. While there was no significant relationship found between FPI and shale gas land use on a catchment area scale, further information on waterthrush trophic dynamics and bioaccumulation of surface contaminants is needed before establishing the extent to which waterthrush foraging may be affected by shale gas development
    • 

    corecore