
    Heterogeneous Entity Matching with Complex Attribute Associations using BERT and Neural Networks

    Across various domains, data from different sources, such as Baidu Baike and Wikipedia, often appear in distinct forms. Current entity matching methods focus predominantly on homogeneous data, in which attributes share the same structure and attribute values are concise. This orientation makes it hard to handle data with diverse formats. Moreover, prevailing approaches aggregate the similarity of attribute values between corresponding attributes to determine entity similarity, yet they often overlook the intricate interrelationships between attributes, where one attribute may have multiple associations. Simple pairwise attribute comparison fails to use the wealth of information contained within entities. To address these challenges, we introduce a novel entity matching model, the Entity Matching Model for Capturing Complex Attribute Relationships (EMM-CCAR), built upon pre-trained models. Specifically, the model transforms the matching task into a sequence matching problem to mitigate the impact of varying data formats. By introducing attention mechanisms, it also identifies complex relationships between attributes, emphasizing the degree of matching among multiple attributes rather than one-to-one correspondences. With EMM-CCAR, we address both data heterogeneity and intricate attribute interdependencies. Compared with the prevalent DER-SSM and Ditto approaches, our model improves F1 scores by approximately 4% and 1%, respectively, providing a robust solution for handling attribute complexity in entity matching.
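The step of turning entity matching into a sequence matching problem can be illustrated with a minimal sketch. It is not the EMM-CCAR implementation: the `[COL]`/`[VAL]` tagging scheme, the function names `serialize` and `make_pair`, and the two sample records are all assumptions made for illustration. The point is only that two records with different schemas can be flattened into one token sequence that a pre-trained sequence-pair model could consume.

```python
from typing import Dict


def serialize(entity: Dict[str, str]) -> str:
    """Flatten an attribute dict into one string, tagging names and values.

    The [COL]/[VAL] markers are an assumed tagging convention, not
    something the abstract specifies.
    """
    parts = []
    for name, value in entity.items():
        parts.append(f"[COL] {name} [VAL] {value}")
    return " ".join(parts)


def make_pair(left: Dict[str, str], right: Dict[str, str]) -> str:
    """Join two serialized entities into a single sequence-pair input."""
    return f"[CLS] {serialize(left)} [SEP] {serialize(right)} [SEP]"


# Hypothetical records describing the same person with different schemas:
# one attribute on the left ("born") relates to two on the right.
a = {"name": "Alan Turing", "born": "1912, London"}
b = {"title": "Alan Turing", "birth_year": "1912", "birth_place": "London"}

pair = make_pair(a, b)
print(pair)
```

Because both records become plain sequences, neither side needs a shared schema. In this toy pair, the `born` attribute on one side relates to both `birth_year` and `birth_place` on the other, which is the kind of one-to-many association the abstract says attention is meant to capture.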


    People often travel with friends. Some prefer to design their travel plan together, while others prefer to plan individually. Prior studies mostly focus on the individual decision context. This paper investigates collaborative customization in the joint decision and joint consumption context, and discusses the effect of the information presentation format (attribute-based vs. bundle-based) on tourists' decisions and behaviour. We also consider the relationship effect. Finally, the potential theoretical contributions and practical implications are discussed.

    Stochastic Efficiency Analysis with a Reliability Consideration

    Stochastic Data Envelopment Analysis (DEA) models have been introduced in the literature to assess the performance of operating entities with random input and output data. A stochastic DEA model with a reliability constraint is proposed in this study that maximizes the lower bound of an entity's efficiency score with some pre-selected probability. We define the concept of stochastic efficiency and develop a solution procedure. The economic interpretations of the stochastic efficiency index are presented when the inputs and outputs of each entity follow a multivariate joint normal distribution.
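As a hedged illustration of what a reliability constraint does under normality, consider a generic textbook identity rather than the model proposed in the study. Assume a scalar score $Z \sim N(\mu, \sigma^2)$ with $\sigma > 0$; the symbols $Z$, $\mu$, $\sigma$, $\varphi$ and $\alpha$ are introduced here for illustration only. Requiring a lower bound $\varphi$ to hold with probability at least $1-\alpha$ reduces to a deterministic condition:

```latex
\[
P(Z \ge \varphi) \ge 1-\alpha
\;\iff\;
\Phi\!\left(\frac{\varphi-\mu}{\sigma}\right) \le \alpha
\;\iff\;
\varphi \le \mu + \sigma\,\Phi^{-1}(\alpha)
      = \mu - \sigma\,\Phi^{-1}(1-\alpha),
\]
```

where $\Phi$ is the standard normal distribution function. Under these assumptions, the largest lower bound that holds with reliability $1-\alpha$ is $\mu - \sigma\,\Phi^{-1}(1-\alpha)$, so a higher required reliability or a larger variance lowers the guaranteed score.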