1,697 research outputs found

    ViTS: Video tagging system from massive web multimedia collections

    The popularization of multimedia content on the Web has created the need to automatically understand, index and retrieve it. In this paper we present ViTS, an automatic Video Tagging System which learns from videos, their web context and comments shared on social networks. ViTS analyses massive multimedia collections by Internet crawling, and maintains a knowledge base that updates in real time with no need for human supervision. As a result, each video is indexed with a rich set of labels and linked with other related content. ViTS is an industrial product under exploitation with a vocabulary of over 2.5M concepts, capable of indexing more than 150k videos per month. We compare the quality and completeness of our tags with those in the YouTube-8M dataset, and we show how ViTS enhances the semantic annotation of the videos with a larger number of labels (10.04 tags/video), with an accuracy of 80.87%.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Lack of standards in evaluating YouTube health videos

    This paper is a systematised literature review of YouTube research in health, with the aim of identifying the different keyword search strategies, retrieval strategies and scoring systems used to assess video content. A total of 176 peer-reviewed papers about video content analysis and video evaluation were extracted from the PubMed database. Concerning keyword search strategy, 16 papers (9.09%) reported that search terms were obtained from tools like Google Trends or other sources. In just one paper, a librarian was included in the research team. Manual retrieval is a common technique, and just four studies (2.27%) reported using a different methodology. Manual retrieval also makes the results dependent on the YouTube algorithm and consequently biased. Most other methodologies for analysing video content are based on written medical guidelines instead of video, because a standard methodology is lacking. For several reasons, reliability cannot be verified. In addition, because studies cannot be repeated, the results cannot be verified and compared. This paper reports some guidelines to improve research on YouTube, including guidelines to avoid YouTube dependencies and scoring system issues.

    Ideas Matchmaking for Supporting Innovators and Entrepreneurs

    In this paper we show a system able to crawl content from the Web related to entrepreneurship and technology, to be matched with ideas proposed by users in the Innovvoice platform. We argue that such a service is a valuable component of an ideabator platform, supporting innovators and prospective entrepreneurs.

    Annotation of multimedia learning materials for semantic search

    Multimedia is the main source for online learning materials, such as videos, slides and textbooks, and its volume is growing with the popularity of online programs offered by universities and Massive Open Online Courses (MOOCs). The increasing amount of multimedia learning resources available online makes it very challenging to browse through the materials or find where a specific concept of interest is covered. To enable semantic search on lecture materials, their content must be annotated and indexed. Manual annotation of learning materials such as videos is tedious and does not scale to the growing quantity of online materials. One of the most commonly used methods for annotating learning videos is to index the video based on the transcript obtained by converting its audio track into text. Existing speech-to-text systems require extensive training, especially for non-native English speakers, and are known to have low accuracy. This dissertation proposes instead to index the slides based on keywords extracted from the textbook index and the presentation slides. Two types of lecture videos are generally used (classroom recordings made with a regular camera, and slide presentation screen captures made with specific software), and their quality varies widely. Screen capture videos generally have good quality and sometimes come with metadata, but the metadata is often unreliable, so image processing techniques are used to segment the videos. Because learning videos have a static slide background, detecting shot boundaries is challenging. This dissertation presents a comparative analysis of state-of-the-art techniques to determine the feature descriptors best suited to detecting transitions in a learning video. The videos are indexed with keywords obtained from the slides, and a correspondence is established by segmenting the video temporally, using feature descriptors to match and align the video segments with the presentation slides converted into images. Classroom recordings made with regular video cameras often have poor illumination, with objects partially or totally occluded. For such videos, slide localization techniques based on segmentation and heuristics are presented to improve the accuracy of transition detection. A region-prioritized ranking mechanism is proposed that integrates the keyword's location in the presentation into the ranking of the slides when searching for a slide that covers a given keyword. This helps return the most relevant results first. With the increasing size of course materials gathered online, a user looking to understand a given concept can be overwhelmed. The standard way of learning and the concept of “one size fits all” are no longer the best way for millennials to learn. A personalized concept recommendation based on the user’s background knowledge is presented. Finally, the contributions of this dissertation have been integrated into the Ultimate Course Search (UCS), a tool for effective search of course materials. UCS integrates presentations, lecture videos and textbook content into a single platform with topic-based search capabilities and easy navigation of lecture materials.
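The region-prioritized ranking idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the actual regions, weights, or scoring formula, so the region names (`title`, `body`, `notes`), the weights, and the functions `score_slide` and `rank_slides` below are all hypothetical assumptions chosen only to show how keyword location could influence slide ranking.

```python
# Hypothetical weights: a keyword in the title counts more than one in the body.
# These values are assumptions for illustration, not taken from the dissertation.
REGION_WEIGHTS = {"title": 3.0, "body": 1.0, "notes": 0.5}


def score_slide(slide, keyword):
    """Sum region-weighted occurrence counts of the keyword in one slide."""
    kw = keyword.lower()
    score = 0.0
    for region, text in slide.items():
        count = text.lower().split().count(kw)
        score += REGION_WEIGHTS.get(region, 0.0) * count
    return score


def rank_slides(slides, keyword):
    """Return ids of slides that mention the keyword, best score first."""
    scored = [(score_slide(s, keyword), sid) for sid, s in slides.items()]
    scored = [(sc, sid) for sc, sid in scored if sc > 0]
    # Sort by descending score, breaking ties by slide id for stable output.
    return [sid for sc, sid in sorted(scored, key=lambda p: (-p[0], p[1]))]


slides = {
    "s1": {"title": "Introduction", "body": "a recursion example appears later"},
    "s2": {"title": "Recursion", "body": "recursion calls itself"},
    "s3": {"title": "Loops", "body": "for and while"},
}
print(rank_slides(slides, "recursion"))
```

In this toy data, the slide whose title contains the keyword ranks above the slide that mentions it only in the body, and slides without the keyword are omitted.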

    Quality of information about oral cancer in Brazilian Portuguese available on Google, Youtube, and Instagram

    This study evaluates the quality of oral cancer information in Brazilian Portuguese on Google, YouTube, and Instagram. The first 100 links from each platform made up the initial sample. The websites and Instagram posts were evaluated using the JAMA benchmarks, the Discern instrument, and the Flesch readability index (Flesch Reading Ease). The presence of the Health on the Net (HON) code was also recorded for websites. The usefulness of each video on YouTube was classified as not useful, slightly useful, moderately useful, or very useful. Thirty-four websites, 39 Instagram posts, and 57 videos were evaluated, of which 18 (33.3%) websites and 19 (48.7%) Instagram posts covered only 2 of the 4 JAMA benchmarks. For the Discern instrument, 20 (37%) and 18 (33.3%) websites exhibited low and moderate reliability, respectively, while 26 (66.7%) Instagram posts were of low reliability. Both the websites and the Instagram posts were difficult to read. Only three websites exhibited the HONcode. Forty-one (71.9%) videos on YouTube were moderately useful. Information on oral cancer on the Internet in Brazilian Portuguese is of low quality. Thus, educational and governmental institutions have a responsibility to produce and point to reliable sources of information for the population.

    Solutions to Detect and Analyze Online Radicalization : A Survey

    Online Radicalization (also called Cyber-Terrorism, Extremism, Cyber-Racism or Cyber-Hate) is widespread and has become a major and growing concern for society, governments and law enforcement agencies around the world. Research shows that various Internet platforms, which offer a low barrier to publishing content, allow anonymity, provide exposure to millions of users and enable very quick and widespread diffusion of messages, are being misused for malicious intent. Examples include YouTube (a popular video sharing website), Twitter (an online micro-blogging service), Facebook (a popular social networking website), online discussion forums and the blogosphere. Such platforms are being used to form hate groups and racist communities, spread extremist agendas, incite anger or violence, promote radicalization, recruit members and create virtual organizations and communities. Automatic detection of online radicalization is a technically challenging problem because of the vast amount of data, unstructured and noisy user-generated content, dynamically changing content and adversarial behavior. Several solutions have been proposed in the literature to combat and counter cyber-hate and cyber-extremism. In this survey, we review solutions to detect and analyze online radicalization. We review 40 papers published at 12 venues from June 2003 to November 2011. We present a novel classification scheme to classify these papers. We analyze these techniques, perform trend analysis, discuss limitations of existing techniques and identify research gaps.

    A Search Engine for Youtube and TikTok Videos

    In this project we created a search engine for online videos. We developed two versions of the search engine. The data required for the search engine's operation came from YouTube and TikTok and was gathered via web scraping with the help of a web crawler. In the first version of the search engine, the user can run SQL queries over data stored in tables of a local PostgreSQL database, while in the second version, the user can search for a word or text in documents stored in a Lucene inverted index. These documents contain data retrieved from the PostgreSQL database.
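The inverted-index search mentioned in the abstract above can be illustrated with a small, library-free sketch. This is not Lucene and not the project's code: the functions `tokenize`, `build_index`, and `search`, along with the sample video titles, are hypothetical and only show the general idea of mapping each term to the documents that contain it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term of the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data standing in for scraped video titles.
videos = {
    "yt1": "Cooking pasta at home",
    "tt1": "Pasta recipe in 30 seconds",
    "yt2": "Home workout routine",
}
index = build_index(videos)
print(search(index, "home pasta"))
```

A production index such as Lucene adds text analysis, relevance scoring, and on-disk storage on top of this basic term-to-document mapping.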