6,152 research outputs found
Bank Networks from Text: Interrelations, Centrality and Determinants
In the wake of the still ongoing global financial crisis, bank
interdependencies have come into focus in trying to assess linkages among banks
and systemic risk. To date, such analysis has largely been based on numerical
data. By contrast, this study attempts to gain further insight into bank
interconnections by tapping into financial discourse. We present a
text-to-network process, which has its basis in co-occurrences of bank names
and can be analyzed quantitatively and visualized. To quantify bank importance,
we propose an information centrality measure to rank and assess trends of bank
centrality in discussion. For qualitative assessment of bank networks, we put
forward a visual, interactive interface for better illustrating network
structures. We illustrate the text-based approach on European Large and Complex
Banking Groups (LCBGs) during the ongoing financial crisis by quantifying bank
interrelations and centrality from discussion in 3M news articles, spanning
2007Q1 to 2014Q3.Comment: Quantitative Finance, forthcoming in 201
Recommended from our members
Improving Computer Network Operations Through Automated Interpretation of State
Networked systems today are hyper-scaled entities that provide core functionality for distributed services and applications spanning personal, business, and government use. It is critical to maintain correct operation of these networks to avoid adverse business outcomes. The advent of programmable networks has provided much needed fine-grained network control, enabling providers and operators alike to build some innovative networking architectures and solutions. At the same time, they have given rise to new challenges in network management. These architectures, coupled with a multitude of devices, protocols, virtual overlays on top of physical data-plane etc. make network management a highly challenging task. Existing network management methodologies have not evolved at the same pace as the technologies and architectures. Current network management practices do not provide adequate solutions for highly dynamic, programmable environments. We have a long way to go in developing management methodologies that can meaningfully contribute to networks becoming self-healing entities. The goal of my research is to contribute to the design and development of networks towards transforming them into self-healing entities.
Network management includes a multitude of tasks, not limited to diagnosis and troubleshooting, but also performance engineering and tuning, security analysis etc. This research explores novel methods of utilizing network state to enhance networking capabilities. It is constructed around hypotheses based on careful analysis of practical deficiencies in the field. I try to generate real-world impact with my research by tackling problems that are prevalent in deployed networks, and that bear practical relevance to the current state of networking. The overarching goal of this body of work is to examine various approaches that could help enhance network management paradigms, providing administrators with a better understanding of the underlying state of the network, thus leading to more informed decision-making. The research looks into two distinct areas of network management, troubleshooting and routing, presenting novel approaches to accomplishing certain goals in each of these areas, demonstrating that they can indeed enhance the network management experience
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification
Decoding algorithms for complex natural language tasks
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.Includes bibliographical references (p. 79-85).This thesis focuses on developing decoding techniques for complex Natural Language Processing (NLP) tasks. The goal of decoding is to find an optimal or near optimal solution given a model that defines the goodness of a candidate. The task is challenging because in a typical problem the search space is large, and the dependencies between elements of the solution are complex. The goal of this work is two-fold. First, we are interested in developing decoding techniques with strong theoretical guarantees. We develop a decoding model based on the Integer Linear Programming paradigm which is guaranteed to compute the optimal solution and is capable of accounting for a wide range of global constraints. As an alternative, we also present a novel randomized algorithm which can guarantee an arbitrarily high probability of finding the optimal solution. We apply these methods to the task of constructing temporal graphs and to the task of title generation. Second, we are interested in carefully investigating the relations between learning and decoding. We build on the Perceptron framework to integrate the learning and decoding procedures into a single unified process. We use the resulting model to automatically generate tables-of-contents, structures with deep hierarchies and rich contextual dependencies. In all three natural language tasks, our experimental results demonstrate that theoretically grounded and stronger decoding strategies perform better than existing methods. As a final contribution, we have made the source code for these algorithms publicly available for the NLP research community.by Pawan Deshpande.M.Eng
Searching and mining in enriched geo-spatial data
The emergence of new data collection mechanisms in geo-spatial applications paired with a heightened tendency of users to volunteer information provides an ever-increasing flow of data of high volume, complex nature, and often associated with inherent uncertainty. Such mechanisms include crowdsourcing, automated knowledge inference, tracking, and social media data repositories. Such data bearing additional information from multiple sources like probability
distributions, text or numerical attributes, social context, or multimedia content can be called multi-enriched. Searching and mining this abundance of information holds many challenges, if all of the data's potential is to be released.
This thesis addresses several major issues arising in that field, namely path queries using multi-enriched data, trend mining in social media data, and handling uncertainty in geo-spatial data. In all cases, the developed methods have made significant contributions and have appeared in or were
accepted into various renowned international peer-reviewed venues.
A common use of geo-spatial data is path queries in road networks where traditional methods optimise results based on absolute and ofttimes singular metrics, i.e., finding the shortest paths based on distance or the best trade-off between distance and travel time. Integrating additional aspects like qualitative or social data by enriching the data model with knowledge derived from sources as mentioned above allows for queries that can be issued to fit a broader scope of needs or preferences.
This thesis presents two implementations of incorporating multi-enriched data into road networks. In one case, a range of qualitative data sources is evaluated to gain knowledge about user preferences which is subsequently matched with locations represented in a road network and integrated into its
components. Several methods are presented for highly customisable path queries that incorporate a wide spectrum of data.
In a second case, a framework is described for resource distribution with reappearance in road networks to serve one or more clients, resulting in paths that provide maximum gain based on a probabilistic evaluation of available resources. Applications for this include finding parking spots.
Social media trends are an emerging research area giving insight in user sentiment and important topics. Such trends consist of bursts of messages concerning a certain topic within a time frame, significantly deviating from the average appearance frequency of the same topic. By investigating the dissemination of such trends in space and time, this thesis presents methods to classify trend archetypes to predict future dissemination of a trend.
Processing and querying uncertain data is particularly demanding given the additional knowledge required to yield results with probabilistic guarantees. Since such knowledge is not always available and queries are not easily scaled to larger datasets due to the #P-complete nature of the problem, many existing approaches reduce the data to a deterministic representation of its underlying model to eliminate uncertainty. However, data uncertainty can also provide valuable insight into the nature of the data that cannot be represented in a deterministic manner.
This thesis presents techniques for clustering uncertain data as well as query processing, that take the additional information from uncertainty models into account while preserving scalability using a sampling-based approach, while previous approaches could only provide one of the two. The given solutions enable the application of various existing clustering techniques or query types to a
framework that manages the uncertainty.Das Erscheinen neuer Methoden zur Datenerhebung in räumlichen Applikationen gepaart mit einer erhöhten Bereitschaft der Nutzer, Daten über sich preiszugeben, generiert einen stetig steigenden Fluss von Daten in großer Menge, komplexer Natur, und oft gepaart mit inhärenter Unsicherheit. Beispiele für solche Mechanismen sind Crowdsourcing, automatisierte Wissensinferenz, Tracking, und Daten aus sozialen Medien. Derartige Daten, angereichert mit mit zusätzlichen Informationen aus verschiedenen Quellen wie Wahrscheinlichkeitsverteilungen, Text- oder numerische Attribute, sozialem Kontext, oder Multimediainhalten, werden als multi-enriched bezeichnet. Suche und Datamining in dieser weiten Datenmenge hält viele Herausforderungen bereit, wenn das gesamte Potenzial der Daten genutzt werden soll.
Diese Arbeit geht auf mehrere große Fragestellungen in diesem Feld ein, insbesondere Pfadanfragen in multi-enriched Daten, Trend-mining in Daten aus sozialen Netzwerken, und die Beherrschung von Unsicherheit in räumlichen Daten. In all diesen Fällen haben die entwickelten Methoden signifikante Forschungsbeiträge geleistet und wurden veröffentlicht oder angenommen zu diversen renommierten internationalen, von Experten begutachteten Konferenzen und Journals.
Ein gängiges Anwendungsgebiet räumlicher Daten sind Pfadanfragen in Straßennetzwerken, wo traditionelle Methoden die Resultate anhand absoluter und oft auch singulärer Maße optimieren, d.h., der kürzeste Pfad in Bezug auf die Distanz oder der beste Kompromiss zwischen Distanz und Reisezeit. Durch die Integration zusätzlicher Aspekte wie qualitativer Daten oder Daten aus sozialen Netzwerken als Anreicherung des Datenmodells mit aus diesen Quellen abgeleitetem Wissen werden Anfragen möglich, die ein breiteres Spektrum an Anforderungen oder Präferenzen erfüllen.
Diese Arbeit präsentiert zwei Ansätze, solche multi-enriched Daten in Straßennetze einzufügen. Zum einen wird eine Reihe qualitativer Datenquellen ausgewertet, um Wissen über Nutzerpräferenzen zu generieren, welches darauf mit Örtlichkeiten im Straßennetz abgeglichen und in das Netz integriert wird. Diverse Methoden werden präsentiert, die stark personalisierbare Pfadanfragen ermöglichen, die ein weites Spektrum an Daten mit einbeziehen.
Im zweiten Fall wird ein Framework präsentiert, das eine Ressourcenverteilung im Straßennetzwerk modelliert, bei der einmal verbrauchte Ressourcen erneut auftauchen können. Resultierende Pfade ergeben einen maximalen Ertrag basieren auf einer probabilistischen Evaluation der verfügbaren Ressourcen. Eine Anwendung ist die Suche nach Parkplätzen.
Trends in sozialen Medien sind ein entstehendes Forscchungsgebiet, das Einblicke in Benutzerverhalten und wichtige Themen zulässt. Solche Trends bestehen aus großen Mengen an Nachrichten zu einem bestimmten Thema innerhalb eines Zeitfensters, so dass die Auftrittsfrequenz signifikant über den durchschnittlichen Level liegt. Durch die Untersuchung der Fortpflanzung solcher Trends in Raum und Zeit präsentiert diese Arbeit Methoden,
um Trends nach Archetypen zu klassifizieren und ihren zukünftigen Weg vorherzusagen.
Die Anfragebearbeitung und Datamining in unsicheren Daten ist besonders herausfordernd, insbesondere im Hinblick auf das notwendige Zusatzwissen, um Resultate mit probabilistischen Garantien zu erzielen. Solches Wissen ist nicht immer verfügbar und Anfragen lassen sich aufgrund der \P-Vollständigkeit des Problems nicht ohne Weiteres auf größere Datensätze skalieren. Dennoch kann Datenunsicherheit wertvollen Einblick in die Struktur der Daten liefern, der mit deterministischen Methoden nicht erreichbar wäre. Diese Arbeit präsentiert Techniken zum Clustering unsicherer Daten sowie zur Anfragebearbeitung, die die Zusatzinformation aus dem Unsicherheitsmodell in Betracht ziehen, jedoch gleichzeitig die Skalierbarkeit des Ansatzes auf große Datenmengen sicherstellen
- …