
    Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop


    Extraction of ontology and semantic web information from online business reports

    CAINES (Content Analysis and INformation Extraction System) applies an information extraction (IE) methodology to unstructured text on the Web and can build an ontology and a Semantic Web from it. Unlike traditional IE systems, CAINES examines the syntactic and semantic relationships within the unstructured text of online business reports. It provides more relevant results than manual searching or standard keyword searching. More than most extraction systems, CAINES makes extensive use of information extraction from natural language, Key Words in Context (KWIC), and semantic analysis. A total of 21 online business reports, averaging about 100 pages each, were used in this study. Based on the opinions of financial experts, extraction rules were created to extract information, an ontology, and a Semantic Web of data from financial reports. Using CAINES, one can extract information about global and domestic market conditions, the impacts of those conditions, and the business outlook. A Semantic Web comprising 107,533 rows of data was created from Merrill Lynch reports; it displays information regarding mergers, acquisitions, and business segment news between 2007 and 2009. User testing of CAINES resulted in recall of 85.91%, precision of 87.16%, and an F-measure of 86.46%. Extraction with CAINES was also faster than manual extraction. Users agreed that CAINES quickly and easily extracts unstructured information from financial reports on the EDGAR database.
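The recall, precision, and F-measure figures quoted in the abstract above follow the standard IE evaluation definitions. The sketch below shows how such scores are usually computed; the function names and the example counts are hypothetical and are not taken from the CAINES study, which does not state how its per-user results were aggregated.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw extraction counts.

    The counts are illustrative inputs; the abstract reports only
    the final percentages, not the underlying counts.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = f_measure(precision, recall)
    return precision, recall, f1


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was extracted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct extractions, 2 spurious, 2 missed.
    print(precision_recall_f1(8, 2, 2))
    # Harmonic mean of the precision and recall reported in the abstract.
    print(round(f_measure(0.8716, 0.8591), 4))
```

Applying `f_measure` directly to the reported precision and recall gives roughly 86.53%, slightly above the 86.46% in the abstract. One possible explanation, offered only as an assumption, is that the reported F-measure was averaged over individual users rather than computed from the aggregate precision and recall.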

    From Debate to Design: Issues in Clean Energy and Climate Change Law and Policy

    A report on the work of the REIL Network 2007-200

    Toponym Disambiguation in Information Retrieval

    In recent years, geography has acquired great importance in the context of Information Retrieval (IR) and, more generally, of the automated processing of information in text. Mobile devices that can surf the web and at the same time report their position are now common, together with applications that exploit this data to provide users with locally customised information, such as directions or advertisements. It is therefore important to deal properly with the geographic information included in electronic texts. Most of this information takes the form of place names, or toponyms. Toponym ambiguity is an important issue in Geographical Information Retrieval (GIR), because queries are geographically constrained. It has been a struggle to find specific geographical IR methods that actually outperform traditional IR techniques. Toponym ambiguity may be a relevant factor in the inability of current GIR systems to take advantage of geographical knowledge. Recently, some Ph.D. theses have dealt with Toponym Disambiguation (TD) from different perspectives, from the development of resources for the evaluation of Toponym Disambiguation (Leidner (2007)) to the use of TD to improve geographical scope resolution (Andogah (2010)). The Ph.D. thesis presented here introduces a TD method based on WordNet and carries out a detailed study of the relationship of Toponym Disambiguation to some IR applications, such as GIR, Question Answering (QA) and Web retrieval. The work presented in this thesis starts with an introduction to the applications in which TD may prove useful, together with an analysis of the ambiguity of toponyms in news collections. It would not be possible to study the ambiguity of toponyms without studying the resources that are used as placename repositories; these resources are the equivalent of language dictionaries, which provide the different meanings of a given word.
    Buscaldi, D. (2010). Toponym Disambiguation in Information Retrieval [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8912

    Carbon Free Boston: Social equity report 2019

    OVERVIEW: In January 2019, the Boston Green Ribbon Commission released its Carbon Free Boston: Summary Report, identifying potential options for the City of Boston to meet its goal of becoming carbon neutral by 2050. The report found that reaching carbon neutrality by 2050 requires three mutually reinforcing strategies in key sectors: 1) deepen energy efficiency while reducing energy demand, 2) electrify activity to the fullest practical extent, and 3) use fuels and electricity that are 100 percent free of greenhouse gases (GHGs). The Summary Report detailed the ways in which these technical strategies will transform Boston’s physical infrastructure, including its buildings, energy supply, transportation, and waste management systems. The Summary Report also highlighted that how these strategies are designed and implemented matters most in ensuring an effective and equitable transition to carbon neutrality. Equity concerns exist for every option the City has to reduce GHG emissions. The services provided by each sector are not experienced equally across Boston’s communities. Low-income families and families of color are more likely to live in residences that are in poor physical condition, leading to high utility bills, unsafe and unhealthy indoor environments, and high GHG emissions. Those same families face greater exposure to harmful outdoor air pollution than others. Access to and reliability of public transportation are disproportionately worse in neighborhoods with large populations of people of color, and large swaths of vulnerable neighborhoods, from East Boston to Mattapan, do not have ready access to the city’s bike network. Income inequality is a growing national issue and is particularly acute in Boston, which consistently ranks among the highest US cities with regard to income disparities. With the release of Imagine Boston 2030, Mayor Walsh committed to make Boston more equitable, affordable, connected, and resilient.
    The Summary Report outlined the broad strokes of how action to reach carbon neutrality intersects with equity. A just transition to carbon neutrality improves environmental quality for all Bostonians, prioritizes socially vulnerable populations, seeks to redress current and past injustice, and creates economic and social opportunities for all. This Carbon Free Boston: Social Equity Report provides a deeper equity context for Carbon Free Boston as a whole, and for each strategy area, by demonstrating how inequitable and unjust the playing field is for socially vulnerable Bostonians and why equity must be integrated into policy design and implementation. This report summarizes the current landscape of climate action work for each strategy area and evaluates how it currently impacts inequity. Finally, this report provides guidance to the City and partners on how to do better; it lays out the attributes of an equitable approach to carbon neutrality, framed around three guiding principles: 1) plan carefully to avoid unintended consequences, 2) be intentional in design through a clear equity lens, and 3) practice inclusivity from start to finish.

    Term Association Modelling in Information Retrieval

    Many traditional Information Retrieval (IR) models assume that query terms are independent of each other. For those models, a document is normally represented as a bag of words/terms and their frequencies. Although traditional retrieval models can achieve reasonably good performance in many applications, the corresponding independence assumption has limitations. Some recent studies investigate how to model term associations/dependencies by proximity measures. However, the theoretical modeling of term associations under the probabilistic retrieval framework is still largely unexplored. In this thesis, I propose a new concept named Cross Term, to model term proximity, with the aim of boosting retrieval performance. With Cross Terms, the association of multiple query terms can be modeled in the same way as a simple unigram term. In particular, an occurrence of a query term is assumed to have an impact on its neighboring text. The degree of the query term impact gradually weakens with increasing distance from the place of occurrence. Shape functions are used to characterize such impacts. Based on this assumption, I first propose a bigram CRoss TErm Retrieval (CRTER2) model for probabilistic IR and a language-model-based variant, CRTER2LM. Specifically, a bigram Cross Term occurs when the corresponding query terms appear close to each other, and its impact can be modeled by the intersection of the respective shape functions of the query terms. Second, I propose a generalized n-gram CRoss TErm Retrieval (CRTERn) model recursively for n query terms where n>2. For n-gram Cross Terms, I develop several distance metrics with different properties and employ them in the proposed models for ranking. Third, an enhanced context-sensitive proximity model is proposed to boost the CRTER models, where the contextual relevance of term proximity is studied.
    The models are validated on several large standard data sets and show improved performance over other state-of-the-art approaches. I also discuss the practical impact of the proposed models. The approaches in this thesis can also benefit term association modeling in other domains.
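The bigram Cross Term idea in the abstract above can be illustrated with a small sketch. It assumes a Gaussian shape function with a hypothetical width `sigma`, and it assumes that the impacts of all occurrence pairs are simply summed; the abstract names neither the kernel, its parameters, nor the aggregation, so every function name and value here is illustrative rather than the thesis's actual model. When both terms use the same symmetric, decreasing kernel, the two curves intersect at the midpoint between the occurrences, so the value at the intersection equals the kernel evaluated at half the distance.

```python
import math


def gaussian_kernel(distance, sigma=25.0):
    """Impact of a term occurrence at a given distance (assumed Gaussian shape)."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def cross_term_impact(pos_a, pos_b, sigma=25.0):
    """Kernel value where the shape functions of two occurrences intersect.

    With identical symmetric kernels centred at pos_a and pos_b, the curves
    cross at the midpoint, i.e. at distance |pos_a - pos_b| / 2 from each.
    """
    return gaussian_kernel(abs(pos_a - pos_b) / 2.0, sigma)


def cross_term_frequency(positions_a, positions_b, sigma=25.0):
    """Sum the impact over all occurrence pairs (an assumed aggregation)."""
    return sum(
        cross_term_impact(a, b, sigma)
        for a in positions_a
        for b in positions_b
    )


if __name__ == "__main__":
    # Hypothetical token positions of two query terms in one document.
    near = cross_term_frequency([10, 200], [12], sigma=25.0)
    far = cross_term_frequency([10, 200], [500], sigma=25.0)
    print(near > far)  # closer occurrences yield a larger cross term weight
```

A pseudo-frequency of this kind could then be fed into a unigram-style weighting function, which matches the abstract's statement that a Cross Term is modeled in the same way as a simple unigram term.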

    Growing a Green Economy for All: From Green Jobs to Green Ownership

    This Democracy Collaborative report provides the first comprehensive survey of community wealth building institutions in the green economy. Featuring ten cases, the report identifies how policy and philanthropy can build on these examples to create "green jobs you can own."