113 research outputs found
Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling
As the foundation of current natural language processing methods, pre-trained
language models have achieved excellent performance. However, the black-box
structure of the deep neural network in pre-trained language models seriously
limits the interpretability of the language modeling process. After revisiting
the coupled requirement of deep neural representation and semantic logic of
language modeling, a Word-Context-Coupled Space (W2CSpace) is proposed by
introducing the alignment processing between uninterpretable neural
representation and interpretable statistical logic. Moreover, a clustering
process is also designed to connect the word- and context-level semantics.
Specifically, an associative knowledge network (AKN), considered interpretable
statistical logic, is introduced in the alignment process for word-level
semantics. Furthermore, the context-relative distance is employed as the
semantic feature for the downstream classifier, which is greatly different from
the current uninterpretable semantic representations of pre-trained models. Our
experiments for performance evaluation and interpretability analysis are conducted
on several types of datasets, including SIGHAN, Weibo, and ChnSenti, and a
novel evaluation strategy for the interpretability of machine learning models
is proposed for the first time. According to the experimental results, our language
model achieves better performance and highly credible interpretability
compared to related state-of-the-art methods.
Comment: Accepted at ACL 2023, Findings
An Adversarial Multi-Task Learning Method for Chinese Text Correction with Semantic Detection
Text correction, especially semantic correction in widely used scenarios, is
in strong demand for improving the fluency and writing efficiency of text. An
adversarial multi-task learning method is proposed to enhance
the modeling and detection ability of character polysemy in Chinese sentence
context. Two models, the masked language model and the scoring language
model, are introduced as a pair of learning tasks that are not only coupled
but also adversarial. Moreover, the Monte Carlo tree search strategy and a policy
network are introduced to accomplish the efficient Chinese text correction task
with semantic detection. Experiments are conducted on three datasets against
five comparable methods, and the results show that our method achieves good
performance on the Chinese text correction task with better semantic
rationality.
Comment: Published at the 31st International Conference on Artificial Neural
Networks
A study of evacuation efficiency of a hopper-shape exit by using mice under high competition
The exit is the bottleneck of an evacuation from a room, and the flow rate through an exit is believed to depend on its width. A series of experiments were conducted in a bi-dimensional container where mice were driven to pass through two kinds of exit of identical width, i.e., a conventional exit and a hopper-shape exit. The evacuation efficiency of the two exits was experimentally compared by using mice under competition. The results showed that a hopper-shape exit reduces the escape time by 25% compared with a conventional exit. Further study was conducted with a column placed in front of each of the two exits. The presence of a column in front of the conventional exit increases the escape time by 22.5%. In contrast, the placement of a column in front of the hopper-shape exit reduces the escape time by 48%. The study showed that the escape efficiency could be greatly improved by appropriately redesigning the configuration of the exit.
OpenDigger: Data Mining and Information Service System for Open Collaboration Digital Ecosystem
The widespread development and adoption of open-source software have built an
ecosystem for open development and collaboration. In this ecosystem,
individuals and organizations collaborate to create high-quality software that
can be used by everyone. Social collaboration platforms like GitHub have
further facilitated large-scale, distributed, and fine-grained code
collaboration and technical interactions. Countless developers contribute code,
review code, report bugs, and propose new features on these platforms every
day, generating a massive amount of valuable behavioral data from the open
collaboration process. This paper presents the design and implementation of
OpenDigger, a comprehensive data mining and information service system for open
collaboration in the digital ecosystem. The goal is to build a data
infrastructure for the open-source domain and promote the continuous
development of the open-source ecosystem. The metrics and analysis models in
the OpenDigger system can mine various knowledge from the macro to micro levels
in the open-source digital ecosystem. Through a unified information service
interface, OpenDigger provides various open-source information services to
different user groups, including governments, enterprises, foundations, and
individuals. For this novel information service system in the open-source
ecosystem, the paper demonstrates the effectiveness of the metrics and models
in OpenDigger through several real-world scenarios, including products, tools,
applications, and courses. It showcases the significant and diverse practical
applications of the metrics and models in both algorithmic and business
aspects.
Comment: in Chinese language
OpenPerf: A Benchmarking Framework for the Sustainable Development of the Open-Source Ecosystem
Benchmarking involves designing scientific test methods, tools, and
frameworks to quantitatively and comparably assess specific performance
indicators of certain test subjects. With the development of artificial
intelligence, AI benchmarking datasets such as ImageNet and DataPerf have
gradually become consensus standards in both academic and industrial fields.
However, constructing a benchmarking framework remains a significant challenge
in the open-source domain due to the diverse range of data types, the wide
array of research issues, and the intricate nature of collaboration networks.
This paper introduces OpenPerf, a benchmarking framework designed for the
sustainable development of the open-source ecosystem. This framework defines 9
benchmarking tasks in open-source research, encompassing 3 data types:
time series, text, and graphics, and addresses 6 research problems including
regression, classification, recommendation, ranking, network building, and
anomaly detection. Based on the above tasks, we implemented 3 data science task
benchmarks, 2 index-based benchmarks, and 1 standard benchmark. Notably, the
index-based benchmarks have been adopted by the China Electronics
Standardization Institute as evaluation criteria for open-source community
governance. Additionally, we have developed a comprehensive toolkit for
OpenPerf, which not only offers robust data management, tool integration, and
user interface capabilities but also adopts a Benchmarking-as-a-Service (BaaS)
model to serve academic institutions, industries, and foundations. Through its
application in renowned companies and institutions such as Alibaba, Ant Group,
and East China Normal University, we have validated OpenPerf's pivotal role in
the healthy evolution of the open-source ecosystem.
Locale-varying relationships between tourism development and retail property prices in a shopping destination
Existing literature has inadequately examined the nexus between tourism and property prices. Additionally, it mainly focuses on hotels and housing, thereby overlooking other property categories (e.g., retail properties). The relationship between tourism development and retail property prices in shopping destinations (e.g., Hong Kong and Singapore) may hinge on the locale. More specifically, the relationship may differ between the tourist precinct or popular tourism shopping area (PTSA) and the unpopular tourism shopping area (UTSA). This study examines locale-varying relationships between tourism development (measured by tourist volume and tourism expenditure) and retail property prices from 2002Q1 to 2014Q4 in Hong Kong using standard and error-correction-model-based (ECM-based) Granger causality tests. Results of standard Granger causality tests indicate that tourism development Granger-causes the increase in retail property prices in the PTSA but not in the UTSA. Moreover, results of ECM-based Granger causality tests further verify the robustness and plausibility of the tourism-led growth (in retail property prices) hypothesis in the PTSA. In other words, we find that tourism development measures can be used to better predict changes in retail property prices in the PTSA than simply referring to the price history.
Voltage control strategy of a high-permeability photovoltaic distribution network based on cluster division
The use of distributed photovoltaics (PVs) on a large scale often causes voltage over-limit problems in distribution networks. Centralized control cannot respond quickly to the randomness of distributed photovoltaics, and local control has difficulty achieving overall coordination. To address these issues, this paper proposes a collaborative optimization voltage control strategy for distributed photovoltaic clusters based on an improved community algorithm. First, by improving the community algorithm, the division of reactive and active clusters, considering the power balance and node coupling degree, is realized. Then, the cluster-coordinated voltage control strategy is proposed by making full use of the power control ability of a photovoltaic inverter. Finally, a voltage regulation ability evaluation index is proposed to assess the node regulation ability within the cluster and select key nodes. This effectively reduces the number of control nodes. The simulation analysis of the improved IEEE 69 distribution network shows that the proposed voltage control strategy can mitigate the issue of voltage over-limit in distribution networks with high-permeability distributed photovoltaic access and enhance the photovoltaic consumption capacity.
Highly efficient room-temperature nonvolatile magnetic switching by current in Fe3GaTe2 thin flakes
Effectively tuning the magnetic state with current is essential for novel
spintronic devices. Magnetic van der Waals (vdW) materials have shown superior
properties for the applications of magnetic information storage based on the
efficient spin torque effect. However, for most known vdW ferromagnets,
ferromagnetic transition temperatures below room temperature strongly
impede their applications, and a room-temperature vdW spintronic device with
low energy consumption is still a long-sought goal. Here, we realize highly
efficient room-temperature nonvolatile magnetic switching by current in a
single-material device based on vdW ferromagnet Fe3GaTe2. Moreover, the
switching current density and power dissipation are about 300 and 60000 times
smaller, respectively, than those of conventional spin-orbit-torque devices based
on magnet/heavy-metal heterostructures. These findings mark important progress in
the applications of magnetic vdW materials in the fields of spintronics and
magnetic information storage.
Comment: 18 pages, 4 figures