Search CORE

20 research outputs found

Topic Extraction from Two Hundred Million Tweets related to the East Japan Great Earthquake

Author: 橋本隆子
Publication venue
Publication date: 01/03/2016
Field of study

Chiba University of Commerce Academic Repository / 千葉商科大学学術リポジトリ

チョウ　コウソク　カテゴリカル　トクチョウ　センタク　シュホウ　ヲ　モチイタ　コウ　ジゲン　データ　ノ　トクチョウ　ヅケ

Author: Hashimoto Takako
Isogami Sadao
Kidokoro Hiroyasu
Kuboyama Tetsuji
Murakami Toshio
Shin Yoshihiro
久保山哲二
城所弘泰
村上登志男
橋本隆子
申吉浩
磯上貞雄
Publication venue: 学習院大学計算機センター
Publication date: 01/01/2017
Field of study

研究報

GLIM IR Institution Repository

Hybrid intelligence for data mining

Author: Chu Wing Yan
Publication venue: UNSW, Sydney
Publication date: 01/01/2016
Field of study

Today, enormous amount of data are being recorded in all kinds of activities. This sheer size provides an excellent opportunity for data scientists to retrieve valuable information using data mining techniques. Due to the complexity of data in many neoteric problems, one-size-fits-all solutions are seldom able to provide satisfactory answers. Although the studies of data mining have been active, hybrid techniques are rarely scrutinized in detail. Currently, not many techniques can handle time-varying properties while performing their core functions, neither do they retrieve and combine information from heterogeneous dimensions, e.g., textual and numerical horizons. This thesis summarizes our investigations on hybrid methods to provide data mining solutions to problems involving non-trivial datasets, such as trajectories, microblogs, and financial data. First, time-varying dynamic Bayesian networks are extended to consider both causal and dynamic regularization requirements. Combining with density-based clustering, the enhancements overcome the difficulties in modeling spatial-temporal data where heterogeneous patterns, data sparseness and distribution skewness are common. Secondly, topic-based methods are proposed for emerging outbreak and virality predictions on microblogs. Complicated models that consider structural details are popular while others might have taken overly simplified assumptions to sacrifice accuracy for efficiency. Our proposed virality prediction solution delivers the benefits of both worlds. It considers the important characteristics of a structure yet without the burden of fine details to reduce complexity. Thirdly, the proposed topic-based approach for microblog mining is extended for sentiment prediction problems in finance. Sentiment-of-topic models are learned from both commentaries and prices for better risk management. Moreover, previously proposed, supervised topic model provides an avenue to associate market volatility with financial news yet it displays poor resolutions at extreme regions. To overcome this problem, extreme topic model is proposed to predict volatility in financial markets by using supervised learning. By mapping extreme events into Poisson point processes, volatile regions are magnified to reveal their hidden volatility-topic relationships. Lastly, some of the proposed hybrid methods are applied to service computing to verify that they are sufficiently generic for wider applications

UNSWorks

千葉商大紀要第53巻第2号全1冊

Author
Publication venue: 千葉商科大学国府台学会
Publication date: 31/03/2016
Field of study

Chiba University of Commerce Academic Repository / 千葉商科大学学術リポジトリ

Understanding Events:A Diversity-driven Human-Machine Approach

Author: Inel Oana
Publication venue
Publication date: 09/03/2022
Field of study

VU Research Portal

Advances in database technology - EDBT 2016: 19th International Conference on Extending Database Technology, Bordeaux, France, March 15-18, 2016 : proceedings

Author
Publication venue: University of Konstanz, University Library
Publication date: 01/01/2016
Field of study

Digitale Bibliothek Thüringen

An Initial Framework Assessing the Safety of Complex Systems

Author: Burton Simon
Garnett Philip
McDermid John Alexander
Weaver Rob
Publication venue: Zenodo
Publication date: 07/12/2020
Field of study

Trabajo presentado en la Conference on Complex Systems, celebrada online del 7 al 11 de diciembre de 2020.Atmospheric blocking events, that is large-scale nearly stationary atmospheric pressure patterns, are often associated with extreme weather in the mid-latitudes, such as heat waves and cold spells which have significant consequences on ecosystems, human health and economy. The high impact of blocking events has motivated numerous studies. However, there is not yet a comprehensive theory explaining their onset, maintenance and decay and their numerical prediction remains a challenge. In recent years, a number of studies have successfully employed complex network descriptions of fluid transport to characterize dynamical patterns in geophysical flows. The aim of the current work is to investigate the potential of so called Lagrangian flow networks for the detection and perhaps forecasting of atmospheric blocking events. The network is constructed by associating nodes to regions of the atmosphere and establishing links based on the flux of material between these nodes during a given time interval. One can then use effective tools and metrics developed in the context of graph theory to explore the atmospheric flow properties. In particular, Ser-Giacomi et al. [1] showed how optimal paths in a Lagrangian flow network highlight distinctive circulation patterns associated with atmospheric blocking events. We extend these results by studying the behavior of selected network measures (such as degree, entropy and harmonic closeness centrality)at the onset of and during blocking situations, demonstrating their ability to trace the spatio-temporal characteristics of these events.This research was conducted as part of the CAFE (Climate Advanced Forecasting of sub-seasonal Extremes) Innovative Training Network which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 813844

Digital.CSIC

White Rose Research Online

Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

Author: Foltynová Pavla
Ornerová Kateřina
Publication venue: International Society for Scientometrics and Informetrics
Publication date: 01/01/2019
Field of study

Nowadays, open science is a hot topic on all levels and also is one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may a great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institute

Univerzitní repozitář Masarykovy univerzity

DRONE DELIVERY OF CBNRECy – DEW WEAPONS Emerging Threats of Mini-Weapons of Mass Destruction and Disruption (WMDD)

Author: Carter Candice
Harding Troy
Hood John Paul
Jackson Mark
Lonstein Wayne
Mai Randall
McCreight Robert
Monnik Mike
Mumm Hans
Nichols Randall K.
Sincavage Suzanne
Slofer William
Publication venue: 'New Prairie Press'
Publication date: 09/05/2022
Field of study

Drone Delivery of CBNRECy – DEW Weapons: Emerging Threats of Mini-Weapons of Mass Destruction and Disruption (WMDD) is our sixth textbook in a series covering the world of UASs and UUVs. Our textbook takes on a whole new purview for UAS / CUAS/ UUV (drones) – how they can be used to deploy Weapons of Mass Destruction and Deception against CBRNE and civilian targets of opportunity. We are concerned with the future use of these inexpensive devices and their availability to maleficent actors. Our work suggests that UASs in air and underwater UUVs will be the future of military and civilian terrorist operations. UAS / UUVs can deliver a huge punch for a low investment and minimize human casualties.https://newprairiepress.org/ebooks/1046/thumbnail.jp

Kansas State University

Recommended from our members

Efficient Machine Teaching Frameworks for Natural Language Processing

Author: Karamanolakis Ioannis
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2022
Field of study

The past decade has seen tremendous growth in potential applications of language technologies in our daily lives due to increasing data, computational resources, and user interfaces. An important step to support emerging applications is the development of algorithms for processing the rich variety of human-generated text and extracting relevant information. Machine learning, especially deep learning, has seen increasing success on various text benchmarks. However, while standard benchmarks have static tasks with expensive human-labeled data, real-world applications are characterized by dynamic task specifications and limited resources for data labeling, thus making it challenging to transfer the success of supervised machine learning to the real world. To deploy language technologies at scale, it is crucial to develop alternative techniques for teaching machines beyond data labeling. In this dissertation, we address this data labeling bottleneck by studying and presenting resource-efficient frameworks for teaching machine learning models to solve language tasks across diverse domains and languages. Our goal is to (i) support emerging real-world problems without the expensive requirement of large-scale manual data labeling; and (ii) assist humans in teaching machines via more flexible types of interaction. Towards this goal, we describe our collaborations with experts across domains (including public health, earth sciences, news, and e-commerce) to integrate weakly-supervised neural networks into operational systems, and we present efficient machine teaching frameworks that leverage flexible forms of declarative knowledge as supervision: coarse labels, large hierarchical taxonomies, seed words, bilingual word translations, and general labeling rules. First, we present two neural network architectures that we designed to leverage weak supervision in the form of coarse labels and hierarchical taxonomies, respectively, and highlight their successful integration into operational systems. Our Hierarchical Sigmoid Attention Network (HSAN) learns to highlight important sentences of potentially long documents without sentence-level supervision by, instead, using coarse-grained supervision at the document level. HSAN improves over previous weakly supervised learning approaches across sentiment classification benchmarks and has been deployed to help inspections in health departments for the discovery of foodborne illness outbreaks. We also present TXtract, a neural network that extracts attributes for e-commerce products from thousands of diverse categories without using manually labeled data for each category, by instead considering category relationships in a hierarchical taxonomy. TXtract is a core component of Amazon’s AutoKnow, a system that collects knowledge facts for over 10K product categories, and serves such information to Amazon search and product detail pages. Second, we present architecture-agnostic machine teaching frameworks that we applied across domains, languages, and tasks. Our weakly-supervised co-training framework can train any type of text classifier using just a small number of class-indicative seed words and unlabeled data. In contrast to previous work that use seed words to initialize embedding layers, our iterative seed word distillation (ISWD) method leverages the predictive power of seed words as supervision signals and shows strong performance improvements for aspect detection in reviews across domains and languages. We further demonstrate the cross-lingual transfer abilities of our co-training approach via cross-lingual teacher-student (CLTS), a method for training document classifiers across diverse languages using labeled documents only in English and a limited budget for bilingual translations. Not all classification tasks, however, can be effectively addressed using human supervision in the form of seed words. To capture a broader variety of tasks, we present weakly-supervised self-training (ASTRA), a weakly-supervised learning framework for training a classifier using more general labeling rules in addition to labeled and unlabeled data. As a complete set of accurate rules may be hard to obtain all in one shot, we further present an interactive framework that assists human annotators by automatically suggesting candidate labeling rules. In conclusion, this thesis demonstrates the benefits of teaching machines with different types of interaction than the standard data labeling paradigm and shows promising results for new applications across domains and languages. To facilitate future research, we publish our code implementations and design new challenging benchmarks with various types of supervision. We believe that our proposed frameworks and experimental findings will influence research and will enable new applications of language technologies without the costly requirement of large manually labeled datasets

Columbia University Academic Commons