30 research outputs found
Exact Single-Source SimRank Computation on Large Graphs
SimRank is a popular measurement for evaluating the node-to-node similarities
based on the graph topology. In recent years, single-source and top-k SimRank
queries have received increasing attention due to their applications in web
mining, social network analysis, and spam detection. However, a fundamental
obstacle in studying SimRank has been the lack of ground truths. The only exact
algorithm, Power Method, is computationally infeasible on graphs with more than
nodes. Consequently, no existing work has evaluated the actual
trade-offs between query time and accuracy on large real-world graphs. In this
paper, we present ExactSim, the first algorithm that computes the exact
single-source and top-k SimRank results on large graphs. With high
probability, this algorithm produces ground truths with a rigorous theoretical
guarantee. We conduct extensive experiments on real-world datasets to
demonstrate the efficiency of ExactSim. The results show that ExactSim provides
the ground truth for any single-source SimRank query with a precision up to 7
decimal places within a reasonable query time. Comment: ACM SIGMOD 202
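For reference, the Power Method that the abstract names as the only exact baseline can be sketched as follows. This is a minimal illustration of the standard SimRank recurrence, not the ExactSim algorithm. The decay factor `c = 0.6`, the iteration count, the toy graph, and the function names `simrank_power_method` and `single_source_top_k` are all assumptions made for illustration. The all-pairs table it builds grows quadratically with the number of nodes, which is why this approach does not scale to large graphs.

```python
def simrank_power_method(in_neighbors, c=0.6, iterations=10):
    """All-pairs SimRank by fixed-point iteration (illustrative sketch).

    in_neighbors maps each node to the list of nodes that point to it.
    c is the decay factor (0.6 is an assumed value for this example).
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {u: {v: 1.0 if u == v else 0.0 for v in nodes} for u in nodes}
    for _ in range(iterations):
        new_sim = {u: {} for u in nodes}
        for u in nodes:
            for v in nodes:
                if u == v:
                    new_sim[u][v] = 1.0
                    continue
                in_u, in_v = in_neighbors[u], in_neighbors[v]
                if not in_u or not in_v:
                    # By definition, similarity is 0 if either node has no in-neighbors.
                    new_sim[u][v] = 0.0
                    continue
                # Average similarity over all pairs of in-neighbors, scaled by c.
                total = sum(sim[a][b] for a in in_u for b in in_v)
                new_sim[u][v] = c * total / (len(in_u) * len(in_v))
        sim = new_sim
    return sim


def single_source_top_k(sim, source, k):
    """Return the k nodes most similar to source, excluding source itself."""
    ranked = sorted(
        ((v, s) for v, s in sim[source].items() if v != source),
        key=lambda pair: (-pair[1], str(pair[0])),
    )
    return ranked[:k]


# Toy graph (hypothetical): node "a" points to both "b" and "c".
graph = {"a": [], "b": ["a"], "c": ["a"]}
sim = simrank_power_method(graph)
top = single_source_top_k(sim, "b", 1)
```

In this toy graph, "b" and "c" share their only in-neighbor, so their similarity equals the decay factor, while "a" has no in-neighbors and is similar only to itself.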
Inefficiency of K-FAC for Large Batch Size Training
In stochastic optimization, using large batch sizes during training can
leverage parallel resources to produce faster wall-clock training times per
training epoch. However, for both training loss and testing error, recent
results analyzing large batch Stochastic Gradient Descent (SGD) have found
sharp diminishing returns, beyond a certain critical batch size. In the hopes
of addressing this, it has been suggested that the Kronecker-Factored
Approximate Curvature (K-FAC) method allows for greater scalability to
large batch sizes, for non-convex machine learning problems such as neural
network optimization, as well as greater robustness to variation in model
hyperparameters. Here, we perform a detailed empirical analysis of large batch
size training for both K-FAC and SGD,
evaluating performance in terms of both wall-clock time and aggregate
computational cost. Our main results are twofold: first, we find that both
K-FAC and SGD do not have ideal scalability behavior beyond a certain
batch size, and that K-FAC does not exhibit improved large-batch
scalability behavior, as compared to SGD; and second, we find that
K-FAC, in addition to requiring more hyperparameters to tune, suffers
from similar hyperparameter sensitivity behavior as does SGD. We discuss
extensive results using ResNet and AlexNet on CIFAR-10 and SVHN,
respectively, as well as more general implications of our findings
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Transformer based architectures have become de-facto models used for a range
of Natural Language Processing tasks. In particular, the BERT based models
achieved significant accuracy gain for GLUE tasks, CoNLL-03 and SQuAD. However,
BERT based models have a prohibitive memory footprint and latency. As a result,
deploying BERT based models in resource constrained environments has become a
challenging task. In this work, we perform an extensive analysis of fine-tuned
BERT models using second order Hessian information, and we use our results to
propose a novel method for quantizing BERT models to ultra low precision. In
particular, we propose a new group-wise quantization scheme, and we use a
Hessian based mix-precision method to compress the model further. We
extensively test our proposed method on BERT downstream tasks of SST-2, MNLI,
CoNLL-03, and SQuAD. We can achieve comparable performance to baseline with at
most performance degradation, even with ultra-low precision
quantization down to 2 bits, corresponding up to compression of the
model parameters, and up to compression of the embedding table as
well as activations. Among all tasks, we observed the highest performance loss
for BERT fine-tuned on SQuAD. By probing into the Hessian based analysis as
well as visualization, we show that this is related to the fact that current
training/fine-tuning strategy of BERT does not converge for SQuAD
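The idea behind group-wise quantization can be illustrated with a short sketch: instead of sharing one quantization range across a whole weight tensor, each group of weights gets its own range. The code below uses plain min-max uniform quantization as a stand-in. It is not the authors' scheme, and the function names, the group size, and the example weights are assumptions chosen only to show why per-group ranges reduce error at very low bit widths.

```python
def quantize_group(values, bits):
    """Uniform min-max quantization of one group to 2**bits levels."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate group: every value maps to code 0 and is restored exactly.
        return [0] * len(values), lo, 0.0
    scale = (hi - lo) / (2 ** bits - 1)
    codes = [round((x - lo) / scale) for x in values]
    return codes, lo, scale


def dequantize_group(codes, lo, scale):
    """Map integer codes back to approximate real values."""
    return [lo + q * scale for q in codes]


def groupwise_quantize(weights, group_size, bits):
    """Quantize and dequantize a flat weight list, one range per group."""
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, lo, scale = quantize_group(group, bits)
        restored.extend(dequantize_group(codes, lo, scale))
    return restored


def max_abs_error(original, restored):
    return max(abs(a - b) for a, b in zip(original, restored))


# Hypothetical weights: two groups with very different magnitudes.
weights = [0.0, 0.1, 0.2, 0.3, 10.0, 20.0, 30.0, 40.0]
err_grouped = max_abs_error(weights, groupwise_quantize(weights, group_size=4, bits=2))
err_single = max_abs_error(weights, groupwise_quantize(weights, group_size=8, bits=2))
```

With a single shared range, the four small weights all collapse onto one 2-bit level. With one range per group of four, each group is reconstructed almost exactly in this example.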
Social media mining under the COVID-19 context: Progress, challenges, and opportunities
Social media platforms allow users worldwide to create and share information, forging vast sensing networks that
allow information on certain topics to be collected, stored, mined, and analyzed in a rapid manner. During the
COVID-19 pandemic, extensive social media mining efforts have been undertaken to tackle COVID-19 challenges
from various perspectives. This review summarizes the progress of social media data mining studies in the
COVID-19 contexts and categorizes them into six major domains, including early warning and detection, human
mobility monitoring, communication and information conveying, public attitudes and emotions, infodemic and
misinformation, and hatred and violence. We further document essential features of publicly available COVID-19
related social media data archives that will benefit research communities in conducting replicable and reproducible studies. In addition, we discuss seven challenges in social media analytics associated with their potential
impacts on derived COVID-19 findings, followed by our visions for the possible paths forward in regard to social
media-based COVID-19 investigations. This review serves as a valuable reference that recaps social media mining
efforts in COVID-19 related studies and provides future directions along which the information harnessed from
social media can be used to address public health emergencies
Identifying the Conformational Isomers of Single-Molecule Cyclohexane at Room Temperature
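Conformational isomerism is a fundamental problem in chemistry. For flexible molecules such as cyclohexane, however, interconversion between conformers is extremely fast at room temperature, so ensemble-based characterization methods (such as NMR) capture only the averaged contribution of all conformations. To address the challenge of quantitatively analyzing and characterizing the conformations of flexible molecules at room temperature, the research groups of Prof. Wenjing Hong (洪文晶) and Prof. Haiping Xia (夏海平) at the College of Chemistry and Chemical Engineering achieved electrical characterization and ratio identification of the two chair conformations of cyclohexane at room temperature. In addition, the confinement imposed on the molecule by the nanoelectrode gap allows the twist-boat intermediate, which is extremely unstable at the macroscopic scale, to exist stably at the single-molecule scale, which offers an important characterization method for studying unstable intermediates.
This work was carried out under the joint supervision of Prof. Wenjing Hong and Prof. Haiping Xia of the College of Chemistry and Chemical Engineering. Chun Tang (唐淳), an iChEM direct-track PhD student, and Yongxiang Tang (唐永翔), a graduate student in the Department of Chemical Engineering, are co-first authors of the paper. Associate Professor Jia Shi (师佳) and Associate Research Fellow Junyang Liu (刘俊扬) provided guidance, and postdoctoral researcher Zhixin Chen (陈志昕), PhD student Lijue Chen (陈李珏), and graduate students Yiling Ye (叶艺玲), Zhewei Yan (严哲玮), and Longyi Zhang (张珑漪) took part in the work.
Abstract: Isomerism reflects the ubiquitous fact that molecules with the same molecular formula can adopt different structures. The interconversion between conformational isomers of flexible molecules is quite fast owing to the low barriers of around 10 kcal mol−1, so ensemble methods record an averaged signal contributed by all the possible isomers. Identifying the conformational isomers of flexible molecules at room temperature therefore poses a substantial challenge. Here, we develop a single-molecule approach to identify the conformational isomers of cyclohexane at room temperature through single-molecule electrical characterization. By noise analysis and feature extraction of the conductance of single-molecule junctions, we quantitatively identified two chair isomers of cyclohexane at room temperature, whereas such identification is only feasible at low temperatures by ensemble characterization. The strategy of applying the single-molecule approach to identify conformational isomers paves the way to investigating the isomerization of flexible molecules beyond the ensemble methods.
This work was supported by the National Natural Science Foundation of China (nos. 21722305, 21673195, 21703188, and U1705254), the National Key R&D Program of China (2017YFA0204902), China Postdoctoral Science Foundation (no. 2017M622060), and the Fundamental Research Funds for Xiamen University (20720190002). It was also supported by the State Key Laboratory of Physical Chemistry of Solid Surfaces and the Collaborative Innovation Center of Chemistry for Energy Materials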
Deep learning assisted diagnosis system: improving the diagnostic accuracy of distal radius fractures
Objectives: To explore an intelligent detection technology based on deep learning algorithms to assist the clinical diagnosis of distal radius fractures (DRFs), and to compare it with human performance to verify the feasibility of this method. Methods: A total of 3,240 patients (fracture: n = 1,620, normal: n = 1,620) were included in this study, with a total of 3,276 wrist joint anteroposterior (AP) X-ray films (1,639 fractured, 1,637 normal) and 3,260 wrist joint lateral X-ray films (1,623 fractured, 1,637 normal). We divided the patients into a training set, a validation set, and a test set in a ratio of 7:1.5:1.5. The deep learning models were developed using the data from the training and validation sets, and their effectiveness was then evaluated using the data from the test set. The diagnostic performance of the deep learning models was evaluated using receiver operating characteristic (ROC) curves and area under the curve (AUC), accuracy, sensitivity, and specificity, and was compared with that of medical professionals. Results: The deep learning ensemble model had excellent accuracy (97.03%), sensitivity (95.70%), and specificity (98.37%) in detecting DRFs. For the AP view, the accuracy was 97.75%, the sensitivity 97.13%, and the specificity 98.37%; for the lateral view, the accuracy was 96.32%, the sensitivity 94.26%, and the specificity 98.37%. When counted per wrist joint, the accuracy was 97.55%, the sensitivity 98.36%, and the specificity 96.73%. On these measures, the ensemble model outperformed both the orthopedic attending physician group and the radiology attending physician group. Conclusion: This deep learning ensemble model has excellent performance in detecting DRFs on plain X-ray films. Using this artificial intelligence model as a second expert to assist clinical diagnosis is expected to improve the accuracy of diagnosing DRFs and enhance clinical work efficiency
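The accuracy, sensitivity, and specificity figures quoted above follow the standard definitions based on confusion-matrix counts. The sketch below shows those definitions only. The counts in the example are hypothetical and are not taken from the study, and the function name `diagnostic_metrics` is an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary diagnostic metrics from confusion-matrix counts.

    tp: fractures correctly flagged      fn: fractures missed
    tn: normal films correctly cleared   fp: normal films wrongly flagged
    """
    sensitivity = tp / (tp + fn)   # share of true fractures that were detected
    specificity = tn / (tn + fp)   # share of normal films that were cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = diagnostic_metrics(tp=90, fn=10, tn=95, fp=5)
```

Sensitivity and specificity are computed on the fractured and normal films separately, so a model can score high on one and low on the other. Accuracy pools both groups.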
Electric-Field-Induced Connectivity Switching in Single-Molecule Junctions
Summary: The manipulation of molecule-electrode interaction is essential for the fabrication of molecular devices and determines the connectivity from electrodes to molecular components. Although the connectivity of molecular devices could be controlled by molecular design to place anchor groups in different positions of molecule backbones, the reversible switching of such connectivities remains challenging. Here, we develop an electric-field-induced strategy to switch the connectivity of single-molecule junctions reversibly, leading to the manipulation of different connectivities in the same molecular backbone. Our results offer a new concept of single-molecule manipulation and provide a feasible strategy to regulate molecule-electrode interaction
Tunable near-infrared epsilon-near-zero and plasmonic properties of Ag-ITO co-sputtered composite films
A series of co-sputtered silver-indium tin oxide (Ag-ITO) films is systematically fabricated. By tuning the atomic ratio of silver, the composite films are shown to have different microstructures with a limited silver amount (<3 at.%). Two stages of film morphology change are proposed to describe the different states and growth mechanisms. The introduction of silver significantly improves the preferred orientations of the In2O3 component. Remarkably, the dielectric permittivity of Ag-ITO films is highly adjustable, allowing the cross-over wavelength λ_c to be changed by more than 300 nm through rapid post-annealing, and thus resulting in tunable epsilon-near-zero and plasmonic properties in the near-infrared region. Lower imaginary permittivity compared with pure metal films, as well as larger tunability in λ_c than pure ITO films, suggest the potential of Ag-ITO films as alternative near-infrared plasmonic materials. An extended Maxwell-Garnett model is applied for effective medium approximation, and the red-shifting of the epsilon-near-zero region with increasing silver content is well fitted. Angle-variable prism coupling is carried out to reveal the surface plasmon polariton (SPP) features of our films at the optical communication wavelength. Broad dips in reflectance curves around 52–56° correspond to the SPP in Ag-ITO films.