57 research outputs found
CKG: Dynamic Representation Based on Context and Knowledge Graph
Recently, neural language representation models pre-trained on large corpora
can capture rich co-occurrence information and be fine-tuned on downstream
tasks to improve performance. As a result, they have achieved
state-of-the-art results on a wide range of language tasks. However, there
exists other valuable semantic information, such as similar, opposite, or other
possible meanings, in external knowledge graphs (KGs). We argue that entities in
KGs could be used to enhance the correct semantic meaning of language
sentences. In this paper, we propose a new method, CKG: Dynamic Representation
Based on Context and Knowledge Graph. On the one
hand, CKG can extract rich semantic information from large corpora. On the other
hand, it can make full use of inside information, such as co-occurrence in large
corpora, and outside information, such as similar entities in KGs. We conduct
extensive experiments on a wide range of tasks, including QQP, MRPC, SST-5,
SQuAD, CoNLL 2003, and SNLI. The experimental results show that CKG achieves a SOTA
score of 89.2 on SQuAD, compared with SAN (84.4), ELMo (85.8), and BERT (88.5)
Does Faith Has Impact on Investment Return: Evidence From REITs
This paper investigates whether faith has an impact on investment returns. Specifically, we choose Shariah compliance and REIT investment for the purpose of investigation. Synthetic Shariah-compliant portfolios are constructed under various interpretations of compliance. We compare the performance of the Shariah-compliant portfolios with a US Equity REIT portfolio during 1993-2017 by examining the abnormal returns using the CAPM and the Carhart four-factor model. We find no evidence of underperformance or outperformance of the Shariah-compliant investments. This also holds during the financial crisis periods, as confirmed by the sub-sample analysis. Our findings suggest that a Shariah-compliant REIT investor faces no cost or gain in his investments as a result of his faith
Can rainmakers justify their pay? The role of investment banks in REIT M&As
This study explicitly rejects the prima facie proposition that the top-tier investment banks are capable of delivering supernormal value creation to the shareholders of a REIT acquirer in a corporate acquisition. Using the event study method, we find that REIT acquirers advised by market-leading investment banks suffer an average cumulative abnormal return of −4.41% following the M&A announcement, whereas REIT acquirers advised by non-top-tier investment banks only suffer an average cumulative abnormal return of −1.49%. The evidence shows that the contemporary practice of employing investment banks based on the prestige of the advisory firms could potentially result in value-destroying M&As for the REIT acquirers
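As a rough illustration of the quantity reported above, the following minimal sketch computes a cumulative abnormal return (CAR) as the sum of daily abnormal returns over an event window. The abstract does not say which expected-return model or window length the authors used, so the function names and the sample figures here are hypothetical and serve only to show the arithmetic.

```python
from typing import List, Sequence


def abnormal_returns(actual: Sequence[float], expected: Sequence[float]) -> List[float]:
    """Abnormal return per day: realised return minus the model's expected return."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must cover the same event window")
    return [a - e for a, e in zip(actual, expected)]


def cumulative_abnormal_return(actual: Sequence[float], expected: Sequence[float]) -> float:
    """CAR: the sum of abnormal returns over the event window."""
    return sum(abnormal_returns(actual, expected))


# Hypothetical three-day event window (not data from the study).
actual = [-0.020, -0.015, 0.004]    # realised acquirer returns
expected = [0.001, 0.002, 0.001]    # returns predicted by some benchmark model (assumed)
car = cumulative_abnormal_return(actual, expected)
print(f"CAR = {car:.2%}")
```

A negative CAR, as in the abstract, means the acquirer's shares returned less around the announcement than the benchmark model predicted.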
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Transformer-based models, such as BERT, have revolutionized various language
tasks, but still struggle with large file classification due to their input
limit (e.g., 512 tokens). Despite several attempts to alleviate this
limitation, no method consistently excels across all benchmark datasets,
primarily because they can only extract partial essential information from the
input file. Additionally, they fail to adapt to the varied properties of
different types of large files. In this work, we tackle this problem from the
perspective of correlated multiple instance learning. The proposed approach,
LaFiCMIL, serves as a versatile framework applicable to various large file
classification tasks covering binary, multi-class, and multi-label
classification tasks, spanning various domains including Natural Language
Processing, Programming Language Processing, and Android Analysis. To evaluate
its effectiveness, we employ eight benchmark datasets pertaining to Long
Document Classification, Code Defect Detection, and Android Malware Detection.
Leveraging BERT-family models as feature extractors, our experimental results
demonstrate that LaFiCMIL achieves new state-of-the-art performance across all
benchmark datasets. This is largely attributable to its capability of scaling
BERT up to nearly 20K tokens, running on a single Tesla V-100 GPU with 32G of
memory.
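To make the input-limit problem concrete, here is a minimal sketch of splitting a long token sequence into chunks that each fit a 512-token encoder, so that a file becomes a "bag" of instances in the multiple-instance-learning sense. This is not the authors' implementation: the function name `split_into_instances` and the fixed-size, non-overlapping chunking are assumptions made for illustration, and the abstract does not describe how LaFiCMIL forms or correlates its instances.

```python
from typing import List


def split_into_instances(token_ids: List[int], max_len: int = 512) -> List[List[int]]:
    """Split a token sequence into consecutive chunks of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]


# A hypothetical file of 20,000 token ids (placeholder values, not real tokens).
bag = split_into_instances(list(range(20_000)))
print(len(bag), len(bag[0]), len(bag[-1]))
```

Each chunk could then be encoded separately by a BERT-family model, with the per-chunk features aggregated into one file-level prediction.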
A Pre-Trained BERT Model for Android Applications
The automation of an increasingly large number of software engineering tasks
is becoming possible thanks to Machine Learning (ML). One foundational building
block in the application of ML to software artifacts is the representation of
these artifacts (e.g., source code or executable code) into a form that is
suitable for learning. Many studies have leveraged representation learning,
delegating to ML itself the job of automatically devising suitable
representations. Yet, in the context of Android problems, existing models are
either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted
for one specific downstream task (e.g., smali2vec). Our work is part of a new
line of research that investigates effective, task-agnostic, and fine-grained
universal representations of bytecode to mitigate both of these
limitations. Such representations aim to capture information relevant to
various low-level downstream tasks (e.g., at the class-level). We are inspired
by the field of Natural Language Processing, where the problem of universal
representation was addressed by building Universal Language Models, such as
BERT, whose goal is to capture abstract semantic information about sentences,
in a way that is reusable for a variety of tasks. We propose DexBERT, a
BERT-like Language Model dedicated to representing chunks of DEX bytecode, the
main binary format used in Android applications. We empirically assess whether
DexBERT is able to model the DEX language and evaluate the suitability of our
model in two distinct class-level software engineering tasks: Malicious Code
Localization and Defect Prediction. We also experiment with strategies to deal
with the problem of catering to apps having vastly different sizes, and we
demonstrate one example of using our technique to investigate what information
is relevant to a given task
Does faith has impact on investment return: evidence from REITs
This paper investigates whether faith has an impact on investment returns. Specifically, we choose Shariah compliance and REIT investment for the purpose of investigation. Synthetic Shariah-compliant portfolios are constructed under various interpretations of compliance. We compare the performance of the Shariah-compliant portfolios with a US Equity REIT portfolio during 1993–2017 by examining the abnormal returns using the CAPM and the Carhart four-factor model. We find no evidence of underperformance or outperformance of the Shariah-compliant investments. This also holds during the financial crisis periods, as confirmed by the sub-sample analysis. Our findings suggest that a Shariah-compliant REIT investor faces no cost or gain in his investments as a result of his faith
DexBERT: Effective, Task-Agnostic and Fine-Grained Representation Learning of Android Bytecode
Peer reviewed. The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable representations and selections of the most relevant features. Yet, in the context of Android problems, existing models are either limited to coarse-grained whole-app level (e.g., apk2vec) or conducted for one specific downstream task (e.g., smali2vec). Thus, the produced representation may turn out to be unsuitable for fine-grained tasks or cannot generalize beyond the task that they have been trained on. Our work is part of a new line of research that investigates effective, task-agnostic, and fine-grained universal representations of bytecode to mitigate both of these limitations. Such representations aim to capture information relevant to various low-level downstream tasks (e.g., at the class-level). We are inspired by the field of Natural Language Processing, where the problem of universal representation was addressed by building Universal Language Models, such as BERT, whose goal is to capture abstract semantic information about sentences, in a way that is reusable for a variety of tasks. We propose DexBERT, a BERT-like Language Model dedicated to representing chunks of DEX bytecode, the main binary format used in Android applications.
We empirically assess whether DexBERT is able to model the DEX language and evaluate the suitability of our model in three distinct class-level software engineering tasks: Malicious Code Localization, Defect Prediction, and Component Type Classification. We also experiment with strategies to deal with the problem of catering to apps having vastly different sizes, and we demonstrate one example of using our technique to investigate what information is relevant to a given task
Emergence of Fatal PRRSV Variants: Unparalleled Outbreaks of Atypical PRRS in China and Molecular Dissection of the Unique Hallmark
Porcine reproductive and respiratory syndrome (PRRS) is a severe viral disease in pigs, causing great economic losses worldwide each year. The causative agent of the disease, PRRS virus (PRRSV), is a member of the family Arteriviridae. Here we report our investigation of the unparalleled large-scale outbreaks in China in 2006 of an originally unknown disease, the so-called “high fever” disease, which was in essence PRRS and which spread to more than 10 provinces (autonomous cities or regions) and affected over 2,000,000 pigs with about 400,000 fatal cases. Unlike typical PRRS, the “high fever” disease also infected numerous adult sows. This atypical PRRS pandemic was initially identified as a hog cholera-like disease manifesting neurological symptoms (e.g., shivering), high fever (40–42°C), erythematous blanching rash, etc. Autopsies combined with immunological analyses clearly showed that multiple organs were infected by highly pathogenic PRRSVs, with severe pathological changes observed. Whole-genome analysis of the isolated viruses revealed that these PRRSV isolates are grouped into Type II and are highly homologous to HB-1, a Chinese strain of PRRSV (96.5% nucleotide identity). More importantly, we observed a unique molecular hallmark in these viral isolates, namely a discontinuous deletion of 30 amino acids in nonstructural protein 2 (NSP2). Taken together, this is the first comprehensive report documenting the 2006 epidemic of atypical PRRS in China and identifying the 30-amino-acid deletion in NSP2, a novel determining factor for virulence which may be implicated in the high pathogenicity of PRRSV, and it will stimulate further study using the infectious cDNA clone technique
Prediction and Analysis of Dew Point Indirect Evaporative Cooler Performance by Artificial Neural Network Method
The artificial neural network method has been widely applied to the performance prediction of fillers and evaporative coolers, but its application to dew point indirect evaporative coolers is rare. To fill this research gap, a novel performance prediction model for the dew point indirect evaporative cooler, based on a back propagation (BP) neural network, was established using Matlab2018. Simulation based on test data from the moderately humid region of Yulin City (Shaanxi Province, China) finds that the root mean square error of the evaporation efficiency of the back propagation model is 3.1367 and the r2 is 0.9659, which is within the acceptable error range. However, the relative error of one data point (sample 7) is somewhat large, at close to 10%. In order to improve the accuracy of the back propagation model, an optimized model based on particle swarm optimization was established. The relative error of the optimized model is generally smaller than that of the BP neural network, especially for sample 7. It is concluded that the optimized artificial neural network is more suitable for solving the performance prediction problem of dew point indirect evaporative cooling units
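The three accuracy measures named in this abstract can be sketched as follows. This assumes that "r2" denotes the usual coefficient of determination and that relative error is taken per sample against the measured value. The abstract does not state either definition explicitly, and the sample values below are hypothetical rather than taken from the Yulin test data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed meaning of 'r2')."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def relative_error_percent(actual: float, predicted: float) -> float:
    """Relative error of one sample, in percent of the measured value."""
    return abs(actual - predicted) / abs(actual) * 100.0


# Hypothetical measured vs. predicted evaporation efficiencies (%).
measured = [60.0, 70.0, 80.0]
predicted = [62.0, 69.0, 80.0]
print(rmse(measured, predicted), r_squared(measured, predicted))
print(relative_error_percent(measured[0], predicted[0]))
```

RMSE and r2 summarise the whole test set, so a single outlying sample such as the abstract's sample 7 can still show a large relative error while the aggregate figures look acceptable.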
Packet-Forwarding Algorithm in DTN Based on the Pheromone of Destination Node
The pheromone can be used for target tracking and, consequently, for packet forwarding to the destination node in delay-tolerant networking. In this study, an initiative community model, which contains all of the nodes that can receive the core node's pheromone, is proposed by simulating pheromone production and diffusion. We can set the distance of the edge node to the core to be less than five hops by establishing the appropriate spread coefficient ω. Packet forwarding is then converted into the process of tracing the pheromone of the destination node. A set of simulation results shows that the proposed initiative community model can effectively increase the delivery ratio and reduce delay when the community structure is relatively stable