189 research outputs found

    2000 - Geology Bibliography of California, 1854–2000

    The scope of this database is much broader than that of the original list of consultant reports submitted to Monterey County governmental agencies; it includes extensive references on regional geologic mapping, hydrogeology, economic geology, and research done in connection with the Parkfield Earthquake Prediction Experiment. Major sources of information include:
    • Monterey County Planning Department: a database of approximately 2,000 references within the categories of geology, soil, water resources, and water quality.
    • Monterey County Water Resources Agency.
    • Monterey Peninsula Water Management District.
    • The American Geological Institute's GeoRef database.
    • The U.S. Geological Survey's National Geologic Map Database.
    • An online bibliography of research conducted in the northern Santa Lucia Mountains, Big Sur, and the surrounding area, published for the Santa Lucia Natural History Symposium (sponsored by Esalen Institute and University of California Big Creek Reserve, 1994–1997).
    • Library catalogs of the U.S. Geological Survey, University of California, California State University, Stanford University, and the California Institute of Technology.
    The list of nearly 4,300 references was prepared to support the County of Monterey's 21st Century General Plan Update, so that the most complete data would be available for planning and policy decisions.

    Climate variability and change : hydrological impacts


    Data-efficient methods for information extraction

    Structured knowledge representation systems such as knowledge bases or knowledge graphs provide insights into real-world entities and the relationships among them; such systems can be employed in various natural language processing applications, including semantic search, question answering, and text summarization. Populating them manually is infeasible and inefficient. In this work, we develop methods to automatically extract named entities and the relationships among them from plain text; our methods can therefore be used either to complete existing, incomplete knowledge representation systems or to create a new structured knowledge representation system from scratch. Unlike mainstream supervised methods for information extraction, our methods focus on the low-data scenario and do not require a large amount of annotated data.
    In the first part of the thesis, we focus on named entity recognition. We participated in the Bacteria Biotope 2019 shared task, which consists of recognizing and normalizing biomedical entity mentions. Our linguistically informed named entity recognition system consists of a deep-learning-based model that can extract both nested and flat entities; the model employs several linguistic features and auxiliary training objectives to enable efficient learning in data-scarce scenarios. Our entity normalization system employs string matching, fuzzy search, and semantic search to link the extracted named entities to biomedical databases. Our combined named entity recognition and entity normalization system achieved the lowest slot error rate of 0.715 and ranked first in the shared task. We also participated in two further shared tasks, Adverse Drug Effect Span Detection (English) and Profession Span Detection (Spanish), both of which collect data from the social media platform Twitter. We developed a named entity recognition model that improves the model's input representation by stacking heterogeneous embeddings from diverse domains; our empirical results demonstrate complementary learning from these heterogeneous embeddings. Our submission ranked 3rd in both shared tasks.
    In the second part of the thesis, we explored synthetic data augmentation strategies to address low-resource information extraction in specialized domains. Specifically, we adapted backtranslation to the token-level task of named entity recognition and the sentence-level task of relation extraction. We demonstrate that backtranslation can generate linguistically diverse and grammatically coherent synthetic sentences and serves as a competitive augmentation strategy for both tasks. In most real-world relation extraction tasks, no annotated data is available, but a large unannotated text corpus often is. Bootstrapping methods for relation extraction can operate on such a corpus because they require only a handful of seed instances. However, bootstrapping methods tend to accumulate noise over time (a phenomenon known as semantic drift), which has a drastic negative impact on the final precision of the extractions. We develop two methods that constrain the bootstrapping process to minimise semantic drift for relation extraction; they leverage graph theory and pre-trained language models to explicitly identify and remove noisy extraction patterns. We report experimental results on the TACRED dataset for four relations. In the last part of the thesis, we demonstrate the application of domain adaptation to the challenging task of multilingual acronym extraction. Our experiments show that domain adaptation can improve acronym extraction in scientific and legal domains in six languages, including low-resource languages such as Persian and Vietnamese.
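    As a minimal illustration of the backtranslation augmentation described above, the sketch below round-trips English sentences through German to obtain paraphrases usable as synthetic training data. It assumes the Hugging Face transformers library and the publicly available Helsinki-NLP MarianMT checkpoints; the thesis's actual pipeline, including token-level label projection for named entity recognition, is not reproduced here, and the helper functions are illustrative.

        # Sentence-level backtranslation sketch (assumed setup: pip install transformers sentencepiece torch)
        from transformers import MarianMTModel, MarianTokenizer

        def load_pair(name):
            # Load a MarianMT translation model and its tokenizer.
            tokenizer = MarianTokenizer.from_pretrained(name)
            model = MarianMTModel.from_pretrained(name)
            return tokenizer, model

        def translate(sentences, tokenizer, model):
            # Translate a batch of sentences.
            batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
            outputs = model.generate(**batch)
            return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

        en_de = load_pair("Helsinki-NLP/opus-mt-en-de")
        de_en = load_pair("Helsinki-NLP/opus-mt-de-en")

        def backtranslate(sentences):
            # English -> German -> English; the round trip yields paraphrases
            # that can serve as synthetic sentences for augmentation.
            pivot = translate(sentences, *en_de)
            return translate(pivot, *de_en)

        print(backtranslate(["The company acquired the startup in 2019."]))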

    Latency-driven performance in data centres

    Data centre based cloud computing has revolutionised the way businesses use computing infrastructure. Instead of building their own data centres, companies rent computing resources and deploy their applications on cloud hardware. Providing customers with well-defined application performance guarantees is of paramount importance to ensure transparency and to build a lasting collaboration between users and cloud operators. A user's application performance is subject to the constraints of the resources it has been allocated and to the impact of the network conditions in the data centre. In this dissertation, I argue that application performance in data centres can be improved through cluster scheduling of applications informed by predictions of application performance for a given network latency and by measurements of the current network latency between hosts in the data centre. Firstly, I show how to use the Precision Time Protocol (PTP), through the open-source software implementation PTPd, to measure network latency and packet loss in data centres. I propose PTPmesh, which uses PTPd, as a cloud network monitoring tool for tenants. Furthermore, I conduct a measurement study using PTPmesh in different cloud providers, finding that network latency variability in data centres is still common. Normal latency values in data centres are on the order of tens or hundreds of microseconds, while unexpected events, such as network congestion or packet loss, can lead to latency spikes on the order of milliseconds. Secondly, I show that network latency matters for certain distributed applications: even tens or hundreds of microseconds of added latency can significantly reduce their performance. I propose a methodology to determine the impact of network latency on the performance of distributed applications by injecting artificial delay into the network of an experimental setup. Based on the experimental results, I build functions that predict the performance of an application for a given network latency. Given the network latency variability observed in data centres, an application's performance is determined by its placement within the data centre. Thirdly, I propose latency-driven, application-performance-aware cluster scheduling as a way to provide performance guarantees to applications. I introduce NoMora, a cluster scheduling architecture that places applications by combining predictions of application performance, dependent upon network latency, with dynamic network latency measurements taken between pairs of hosts in data centres. Moreover, I show that NoMora improves application performance by choosing better placements than other scheduling policies.
    Funding: MEASUREMENT FOR EUROPE: TRAINING AND RESEARCH FOR INTERNET COMMUNICATIONS SCIENCE, European Commission FP7 Marie Curie Innovative Training Networks (ITN); ENDEAVOUR, European Commission Horizon 2020 (H2020) Industrial Leadership (IL)
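    To make the scheduling idea above concrete, the sketch below fits a performance-versus-latency curve from hypothetical delay-injection measurements and then picks the host pair with the best predicted performance. The data points, the quadratic fit, and the host names are illustrative assumptions, not NoMora's actual model or measurements.

        import numpy as np

        # Hypothetical delay-injection results: added latency (microseconds) vs. application throughput.
        latency_us = np.array([0.0, 50.0, 100.0, 200.0, 400.0, 800.0])
        throughput = np.array([100.0, 96.0, 91.0, 82.0, 65.0, 41.0])

        # Fit a simple curve that predicts performance for a given network latency.
        perf_model = np.poly1d(np.polyfit(latency_us, throughput, deg=2))

        # Current pairwise latency measurements between candidate hosts (e.g. reported by PTPmesh).
        pair_latency_us = {("h1", "h2"): 85.0, ("h1", "h3"): 310.0, ("h2", "h3"): 140.0}

        # Place the application on the host pair with the highest predicted performance.
        best_pair = max(pair_latency_us, key=lambda pair: perf_model(pair_latency_us[pair]))
        print(best_pair, perf_model(pair_latency_us[best_pair]))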

    "Colearning" - Collaborative Open Learning through OER and Social Media

    This chapter introduces the concept of coLearning and discusses how open learning networks can collaboratively produce, share, and reuse OER through social media.
    COLEARNING OBJECTIVES: The aim of this investigation is to identify new forms of collaboration, as well as strategies that can be used to make the production and adaptation processes of OER explicit enough for anyone in a social network to contribute.
    REUSABILITY: This open content is an adapted version of a paper by the same authors for the OCW Conference 2012. This chapter can be reused by:
    • Educators who would like to create reusable OER (images, videos, maps, units)
    • Learners who are interested in tools for reusing and adapting OER
    • Content developers who are looking for different media to enrich OER
    • Social network users who would like to produce and share open media content

    Fresh studies in Rio Grande Valley history

    Jim Wells, George Parr, Pepe Martin, and Gene Falcón: the spirit of "El Patrón" along the Rio Grande of South Texas / Billy Hathorn -- The other underground railroad / Rolando Avila -- Frank Ellis Ferree, humanitarian / Norman Rozeff -- Chip Dameron's Rio Grande Valley: center of a narrowing universe / Ronny Noor -- Historia de la educación superior en la ciudad de H. Matamoros, Tamaulipas / Miguel Sesis Botti y Maria Elena Flores Montalvo -- The quest for a public library for Brownsville / Anthony K. Knopp and Alma Ortiz Knopp -- Las Palomas Wildlife Management Area: a hidden natural jewel of the Rio Grande Valley / Noe E. Perez -- La Beulah: remembering the eye of the storm / Manuel F. Medrano -- Migración en Matamoros: un laboratorio de la complejidad migratoria en la frontera México-Estados Unidos / Cirila Quintero Ramirez -- Coyotes en acción: relatos de traficantes de migrantes en Reynosa / Oscar Misael Hernandez-Hernandez -- Border Walls, DREAMers and Trump: politics, policy and banality of evil / Terence M. Garrett and Paul J. Pope -- The social, political, and environmental forces contributing to the immigration crisis at the Texas-Mexico border / Mitchell A. Kaplan -- Las violencias sociales y la impartición de la justicia en Valle Hermoso, Tamaulipas / Arturo Zarate Ruiz -- U.S.-Mexico border spillover violence, 2010–2019 / Antonio N. Zavaleta -- La fiebre polca, a poem / Susana Nevarez Marquez.

    The quality of relational exchanges and its impact on marketing strategies - a buyer's perspective in food industry in Slovenia


    The Murray Ledger and Times, July 6, 1991
