48 research outputs found

    MinoanER: Schema-Agnostic, Non-Iterative, Massively Parallel Resolution of Web Entities

    Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of entity descriptions published in the Web of Data. To address them, we propose the MinoanER framework that simultaneously fulfills full automation, support of highly heterogeneous entities, and massive parallelization of the ER process. MinoanER leverages a token-based similarity of entities to define a new metric that derives the similarity of neighboring entities from the most important relations, as they are indicated only by statistics. A composite blocking method is employed to capture different sources of matching evidence from the content, neighbors, or names of entities. The search space of candidate pairs for comparison is compactly abstracted by a novel disjunctive blocking graph and processed by a non-iterative, massively parallel matching algorithm that consists of four generic, schema-agnostic matching rules that are quite robust with respect to their internal configuration. We demonstrate that the effectiveness of MinoanER is comparable to existing ER tools over real KBs exhibiting low Variety, but it outperforms them significantly when matching KBs with high Variety. Comment: Presented at EDBT 2019

    Digital Student Conference Platform Implementation: The case study of the “Research Project” course

    Today, during the ‘fourth industrial revolution’, which is led by the Internet and the digital ecosystem it creates, schools are expected to develop not only the functional skills of literacy and numeracy but also general knowledge. The apparent inadequacy of the standardized education system to respond to the needs and interests of 21st-century students urges researchers to adopt new forms of teaching, as meaningful and high-quality teaching requires a more active use of innovative educational methods and tools. With the rapid development of IT globally, there is a tendency to utilize the capabilities of e-learning as a mode of distance learning, since it can function both independently of and in conjunction with conventional teaching. The varied applications of Web 2.0 tools create new possibilities in the educational sector. They provide the ability to develop innovative educational methods that transform students from passive recipients of information into knowledge creators through active involvement in the learning process, often within a modern interactive environment. This study presents the results of the implementation of a teaching intervention that used a flexible, student-centered web system developed as a complement to the ‘Research Project’ course during the first term of the 2015-2016 school year. The ultimate goal of this effort was to highlight and consequently incorporate the use of a digital platform for student conferences, which we implemented in schools as a means of research, learning, and skill development. The students had the opportunity to participate in a digital community which employed distance learning tools for communication, cooperation, and learning during a digital conference in which they had leading roles as writers and reviewers. The initial results of the pilot study indicated that the use of the digital platform increased the interest of students, supported the development of various skills and contributed to the overall improvement of the teaching and learning process.

    End-to-End Entity Resolution for Big Data: A Survey

    One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (and mostly in traditional settings), in this survey, we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods in order to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.

    Simplifying Entity Resolution on Web Data with Schema-agnostic, Non-iterative Matching

    Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of descriptions published in the Web of Data. To address them, we propose the MinoanER framework that fulfills full automation and support of highly heterogeneous entities. MinoanER leverages a token-based similarity of entities to define a new metric that derives the similarity of neighboring entities from the most important relations, indicated only by statistics. For high efficiency, similarities are computed from a set of schema-agnostic blocks and processed in a non-iterative way that involves four threshold-free heuristics. We demonstrate that the effectiveness of MinoanER is comparable to existing ER tools over real KBs exhibiting low heterogeneity in terms of entity types and content. Yet, MinoanER outperforms state-of-the-art ER tools when matching highly heterogeneous KBs.

    Coverage-Based Summaries for RDF KBs

    As more and more data become available as linked data, the need for efficient and effective methods for their exploration becomes apparent. Semantic summaries try to extract meaning from data while reducing its size. State-of-the-art structural semantic summaries focus primarily on the graph structure of the data, trying to maximize the summary’s utility for query answering, i.e. the query coverage. In this poster paper, we present an algorithm that tries to maximize this query coverage using ideas borrowed from result diversification. The key idea of our algorithm is that, instead of focusing only on the “central” nodes, node selection is also pushed to the perimeter of the graph. Our experiments show the potential of our algorithm and demonstrate the considerable advantages gained for answering larger fragments of user queries.

    Interfacility transfers in a non-trauma system setting: an assessment of the Greek reality

    Background: Quality assessment of any trauma system involves the evaluation of the transferring patterns. This study aims to assess interfacility transfers in the absence of a formal trauma system setting and to estimate the benefits from implementing a more organized structure. Methods: The 'Report of the Epidemiology and Management of Trauma in Greece' is a one-year project of trauma patient reporting throughout the country. It provided data concerning the patterns of interfacility transfers. We compared the transferred patient group to the non-transferred patient group. Information reviewed included patient and injury characteristics, need for an operation, Intensive Care Unit (ICU) admittance and mortality. Analysis employed descriptive statistics and the Chi-square test. Interfacility transfers were then assessed according to each health care facility's availability of five requirements: Computed Tomography scanner, ICU, neurosurgeon, orthopedic surgeon and vascular surgeon. Results: Data on 8,524 patients were analyzed; 86.3% were treated at the same facility, whereas 13.7% were transferred. Transferred patients tended to be younger, male, and more severely injured than non-transferred patients. Moreover, they were admitted to ICU more often and had a higher mortality rate, but underwent surgery less often than non-transferred patients. Of all transfers, 34.3% were from facilities with none of the five requirements, whereas 12.4% were from those with one requirement. Low-level facilities, with up to three requirements, transferred 43.2% of their transfer volume to units of equal resources. Conclusion: Trauma management in Greece results in a high number of transfers. Patients are frequently transferred between low-level facilities. Better coordination could lead to improved outcomes and lower cost.

    Dedifferentiated Paratesticular Liposarcoma with Osseous Metaplasia

    Paratesticular liposarcoma is a rare tumour of the genitourinary tract but the most common of all sarcomas in adults. The dedifferentiated variant occurs in only 10% of liposarcoma cases. The typical clinical presentation is similar to an inguinal hernia or a benign lipoma. We present the case of a dedifferentiated paratesticular liposarcoma with osseous metaplasia of the spermatic cord, in a male who presented with acute scrotum.