1,621 research outputs found

    Semantic retrieval of trademarks based on conceptual similarity

    Trademarks are signs of high reputational value and therefore require protection. This paper studies conceptual similarity between trademarks, which occurs when two or more trademarks evoke identical or analogous semantic content. It advances the state of the art by proposing a semantics-based computational approach for comparing trademarks for conceptual similarity. A trademark retrieval algorithm is developed that employs natural language processing techniques and an external knowledge source in the form of a lexical ontology. The search and indexing technique uses a similarity distance derived from Tversky's theory of similarity. The proposed retrieval algorithm is validated using two resources: a trademark database of 1,400 disputed cases and a database of 378,943 company names. The accuracy of the algorithm is estimated using measures from two different domains: the R-precision score, which is commonly used in information retrieval, and human judgment (collective human opinion), which is used in human-machine systems.
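
The abstract above says the similarity distance is derived from Tversky's theory of similarity but does not give the feature sets or weights. As a minimal sketch, assuming each trademark is reduced to a set of concept labels (the labels and the `alpha`/`beta` weights below are hypothetical, not taken from the paper), the standard Tversky index and a distance derived from it can be written as:

```python
def tversky_index(a, b, alpha=0.5, beta=0.5):
    """Tversky index between two feature collections.

    alpha and beta weight the features unique to a and to b.
    alpha = beta = 1 gives the Jaccard index; alpha = beta = 0.5 gives Dice.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Two empty feature sets share nothing, so treat them as dissimilar.
    return common / denom if denom else 0.0


def tversky_distance(a, b, alpha=0.5, beta=0.5):
    """Distance derived from the index: 0 means identical feature sets."""
    return 1.0 - tversky_index(a, b, alpha, beta)


# Hypothetical concept sets, e.g. as they might come from a lexical ontology.
mark_1 = {"feline", "animal", "speed"}
mark_2 = {"feline", "animal", "strength"}
similarity = tversky_index(mark_1, mark_2)
```

Unequal `alpha` and `beta` make the measure asymmetric, which is the property that distinguishes Tversky's model from purely geometric distances.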

    Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

    Mathematical formulae represent complex semantic information in a concise form. Especially in Science, Technology, Engineering, and Mathematics, mathematical formulae are crucial to communicate information, e.g., in scientific papers, and to perform computations using computer algebra systems. Enabling computers to access the information encoded in mathematical formulae requires machine-readable formats that can represent both the presentation and content, i.e., the semantics, of formulae. Exchanging such information between systems additionally requires conversion methods for mathematical representation formats. We analyze how the semantic enrichment of formulae improves the format conversion process and show that considering the textual context of formulae reduces the error rate of such conversions. Our main contributions are: (1) providing an openly available benchmark dataset for the mathematical format conversion task consisting of a newly created test collection, an extensive, manually curated gold standard and task-specific evaluation metrics; (2) performing a quantitative evaluation of state-of-the-art tools for mathematical format conversions; (3) presenting a new approach that considers the textual context of formulae to reduce the error rate for mathematical format conversions. Our benchmark dataset facilitates future research on mathematical format conversions as well as research on many problems in mathematical information retrieval. Because we annotated and linked all components of formulae, e.g., identifiers, operators and other entities, to Wikidata entries, the gold standard can, for instance, be used to train methods for formula concept discovery and recognition. Such methods can then be applied to improve mathematical information retrieval systems, e.g., for semantic formula search, recommendation of mathematical content, or detection of mathematical plagiarism. Comment: 10 pages, 4 figures

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

    Content Based Image Retrieval (CBIR) by Statistical Methods

    An image retrieval system is a computer system for browsing, searching and retrieving images from a huge image database. The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large image databases, a specified number of images similar in visual and semantic content to a so-called query image. The researchers developed a new retrieval mechanism that is based mainly on two procedures. The first procedure extracts statistical features of both the original and the traditional image using the histogram and statistical characteristics (mean, standard deviation). The second procedure relies on the T-test to measure the independence between more than one image (correlation coefficient, T-test, level of significance, decision). Experimental tests found that the proposed T-test retrieval technique is more powerful than the classical retrieval system.
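
The two procedures in the abstract above (histogram statistics, then a correlation and T-test step) can be illustrated with a small sketch. This is not the authors' implementation: the bin count, the flat list-of-grey-levels image representation, and the use of the t statistic for a Pearson correlation are all assumptions made here for illustration.

```python
import math


def histogram(pixels, bins=8, max_value=256):
    """Count grey levels (0 .. max_value - 1) into equal-width bins."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        counts[min(int(p / width), bins - 1)] += 1
    return counts


def mean_std(values):
    """Mean and population standard deviation of a sequence."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, sx = mean_std(x)
    my, sy = mean_std(y)
    if sx == 0 or sy == 0:
        return 0.0  # a constant sequence carries no correlation information
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (sx * sy)


def correlation_t(r, n):
    """t statistic for testing r = 0 with n paired observations (n > 2)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

In a retrieval setting, one would compute the histogram of the query image and of each database image, then rank database images by the resulting statistic; the significance threshold used for the final decision is not stated in the abstract.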

    Multi-faceted Assessment of Trademark Similarity

    Trademarks are intellectual property assets with potentially high reputational value. Their infringement may lead to lost revenue, lower profits and damage to brand reputation. The test normally conducted to check whether a trademark is highly likely to infringe existing, already registered trademarks is called a likelihood of confusion test. One of the most influential factors in this test is establishing similarity in appearance, meaning or sound. However, even though the trademark registration process suggests a multi-faceted similarity assessment, relevant research in expert systems mainly focuses on computing individual aspects of similarity between trademarks. Therefore, this paper contributes to the knowledge in this field by proposing a method that, similar to the way people perceive trademarks, blends together the three fundamental aspects of trademark similarity and produces an aggregated score based on the individual visual, semantic and phonetic assessments. In particular, semantic similarity is a new aspect, which has not been considered by other researchers in approaches aimed at providing decision support in trademark similarity assessment. Another specific scientific contribution of this paper is the innovative integration, using a fuzzy engine, of three independent assessments, which collectively provide a more balanced and human-centered view on potential infringement problems. In addition, the paper introduces the concept of degree of similarity, since the line between similar and dissimilar trademarks is not always easy to define, especially when blending three very different assessments. The work described in the paper is evaluated using a database comprising 1,400 trademarks compiled from a collection of real legal cases of trademark disputes. The evaluation involved two experiments. The first experiment employed information retrieval measures to test the classification accuracy of the proposed method, while the second used collective human opinion to examine correlations between the scoring/rating and ranking produced by the proposed method and human judgment. In the first experiment, the proposed method improved the F-score, precision and accuracy of classification by 12.5%, 35% and 8.3%, respectively, against the best score computed using individual similarity. In the second experiment, the proposed method produced a perfect positive Spearman rank correlation score of 1.00 in the ranking task and a pairwise Pearson correlation score of 0.92 in the rating task. The test of significance conducted on both scores rejected the null hypotheses of the experiment and showed that both scores correlated well with collective human judgment. The combined overall assessment could add value to existing support systems and be beneficial for both trademark examiners and trademark applicants. The method could be further used in addressing recent cyberspace phenomena related to trademark infringement such as customer hijacking and cybersquatting. Keywords: trademark assessment, trademark infringement, trademark retrieval, degree of similarity, fuzzy aggregation, semantic similarity, phonetic similarity, visual similarity

    The magic words: Using computers to uncover mental associations for use in magic trick design

    This work was supported by EPSRC grant number EP/J50029X/1

    Detecting Pronunciation Similarity of Word Trademarks Using a CNN-Siamese Network

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, August 2022. 조성준.
    Recently, as the number of registered trademarks has rapidly increased, research on determining trademark similarity with machine learning has been actively conducted. Similarity of trademarks is judged based on shape, meaning, and pronunciation. In the case of pronunciation, judging similarity is limited because the standards for similarity are ambiguous and spellings do not correspond to pronunciation in many cases. On the other hand, the performance of converting text into speech has improved remarkably due to the recent development of speech synthesis technology. In this paper, we propose a deep learning framework that automatically determines the pronunciation similarity of trademarks using speech data generated with speech synthesis technology. First, the trademark text is synthesized into speech and converted into a log Mel spectrogram, and feature learning is then performed through a convolutional neural network with a triplet loss. To compare the proposed method with previous studies, the trademark text dataset provided by AIhub was used, and the proposed method outperformed the previous studies.
    Contents: Chapter 1 Introduction; Chapter 2 Related Work; Chapter 3 Proposed Method (3.1 Model Architecture; 3.2 Evaluation Metric); Chapter 4 Datasets (4.1 Train dataset; 4.2 Test dataset; 4.3 Speech dataset; 4.4 Preprocessing); Chapter 5 Experimental Results (5.1 Experiment 1: Compare different input types; 5.2 Experiment 2: Compare signal processing methods; 5.3 Experiment 3: Compare backbone networks; 5.4 Experiment 4: Compare baseline models); Chapter 6 Conclusion; Bibliography; Abstract in Korean; Acknowledgements
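
The abstract above mentions feature learning with a triplet loss but gives no formula. As a minimal sketch, assuming Euclidean distance between embedding vectors and a hypothetical margin of 1.0 (the thesis may use a different distance, margin, or triplet mining scheme), the standard per-triplet loss looks like this:

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it is the shortfall.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings; in the thesis these would come from a CNN
# applied to log Mel spectrograms of synthesized trademark speech.
well_separated = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violating = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Here the anchor and positive would be trademarks judged similar in pronunciation and the negative a dissimilar one; training pushes the loss toward zero across many such triples.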

    An Historical Epistemology of Perception in the Use of Mobile Computers

    Recently, the interaction between humans and mobile computers, as a part of the broader problem of technology use in Human-Computer studies, has received some research attention. Researchers have explained mobile technology use in terms of rhythms, negotiation, contextual influences and boundary control. However, these explanations do not provide a sufficiently cognitive account of mobile technology use. To supplement existing explanations, the use of mobile computers is explained in terms of the historical epistemology of perception. In this epistemology, perception is regarded as a mode of human action that is endowed with goal-orientation and teleological consciousness. A cognitive-based explanation of mobile technology use will enhance our understanding of the mediating role of technology representations and of how human mobility and mobile work filter these representations in mobile computing. The explanations provide guidelines for research, design and integration of mobile technologies in mobile activities.

    Finding Functional Gene Relationships Using the Semantic Gene Organizer (SGO)

    Understanding functional gene relationships is a major challenge in bioinformatics and computational biology. Currently, many approaches extract gene relationships from the biomedical literature via term co-occurrence models. However, many genes that are experimentally identified as related have not previously been studied together. As a result, many automated models fail to help researchers understand the nature of the relationships. In this work, the scheme used to mine genomic data is Latent Semantic Indexing (LSI). LSI performs a singular-value decomposition (SVD) to produce a low-rank approximation of the data set. Effectively, it allows queries to be interpreted in a more concept-based space and can reveal gene relationships that would ordinarily be overlooked by other models.
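
The LSI step described above, an SVD followed by a low-rank approximation, can be sketched with NumPy. The term-by-gene count matrix, the term and gene names, and the choice of rank k = 2 below are hypothetical toy values, not data from the SGO work.

```python
import numpy as np


def lsi_embed(term_doc, k):
    """Rank-k LSI embeddings, one row per document (column of term_doc)."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Scale the top-k right singular vectors by their singular values.
    return (np.diag(s[:k]) @ vt[:k]).T


def cosine(x, y):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    return float(x @ y / (nx * ny)) if nx and ny else 0.0


# Hypothetical toy data: rows are terms, columns are gene "documents".
terms = ["apoptosis", "caspase", "kinase", "phosphorylation"]
genes = ["geneA", "geneB", "geneC"]
counts = np.array([
    [2, 0, 0],  # apoptosis
    [1, 1, 0],  # caspase
    [0, 0, 3],  # kinase
    [0, 0, 2],  # phosphorylation
], dtype=float)

docs = lsi_embed(counts, k=2)
raw_ab = cosine(counts[:, 0], counts[:, 1])  # similarity on raw term counts
lsi_ab = cosine(docs[0], docs[1])            # similarity in the concept space
lsi_ac = cosine(docs[0], docs[2])
```

In this toy matrix geneA and geneB share only one term, so their raw cosine similarity is moderate, while in the rank-2 space they collapse onto the same concept axis and geneC stays orthogonal to both. Real corpora are far noisier, and the choice of k is usually tuned empirically.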