63 research outputs found

    ResearchFlow: Understanding the Knowledge Flow between Academia and Industry

    Understanding, monitoring, and predicting the flow of knowledge between academia and industry is of critical importance for a variety of stakeholders, including governments, funding bodies, researchers, investors, and companies. To this purpose, we introduce ResearchFlow, an approach that integrates semantic technologies and machine learning to quantify the diachronic behaviour of research topics across academia and industry. ResearchFlow exploits the novel Academia/Industry DynAmics (AIDA) Knowledge Graph to characterize each topic according to the frequency in time of the related i) publications from academia, ii) publications from industry, iii) patents from academia, and iv) patents from industry. This representation is then used to produce several analytics regarding the academia/industry knowledge flow and to forecast the impact of research topics on industry. We applied ResearchFlow to a dataset of 3.5M papers and 2M patents in Computer Science and highlighted several interesting patterns. We found that 89.8% of the topics first emerge in academic publications, which typically precede industrial publications by about 5.6 years and industrial patents by about 6.6 years. However, this does not mean that academia always dictates the research agenda: our analysis also shows that industrial trends tend to influence academia more than academic trends affect industry. We evaluated ResearchFlow on the task of forecasting the impact of research topics on the industrial sector and found that its granular characterization of topics significantly improves performance over alternative solutions.
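    To make the four-signal topic representation concrete, here is a minimal Python sketch of the characterization and of the academia-to-industry lag computation. The TopicSignals structure, the emergence threshold, and the yearly counts are illustrative assumptions, not the AIDA Knowledge Graph's actual schema.

```python
# A minimal sketch (Python 3.10+) of the four-signal topic representation.
# Yearly counts would come from the AIDA Knowledge Graph; these are toy values.
from dataclasses import dataclass, field


@dataclass
class TopicSignals:
    """Yearly frequencies of a research topic across the four document types."""
    academic_pubs: dict[int, int] = field(default_factory=dict)
    industry_pubs: dict[int, int] = field(default_factory=dict)
    academic_patents: dict[int, int] = field(default_factory=dict)
    industry_patents: dict[int, int] = field(default_factory=dict)


def emergence_year(signal: dict[int, int], threshold: int = 1) -> int | None:
    """First year in which the topic reaches the activity threshold."""
    years = sorted(y for y, count in signal.items() if count >= threshold)
    return years[0] if years else None


def academia_to_industry_lag(topic: TopicSignals) -> int | None:
    """Years between the first academic publication and the first industrial patent."""
    start = emergence_year(topic.academic_pubs)
    end = emergence_year(topic.industry_patents)
    if start is None or end is None:
        return None
    return end - start


# A topic first seen in academic papers in 2008 and in industrial patents
# in 2014 yields a lag of 6 years, close to the 6.6-year average reported above.
topic = TopicSignals(academic_pubs={2008: 3, 2009: 12},
                     industry_patents={2014: 2, 2015: 9})
print(academia_to_industry_lag(topic))  # -> 6
```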

    B!SON: A Tool for Open Access Journal Recommendation

    Finding a suitable open access journal in which to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders’ conditions, and the risk of predatory publishers. To help with these challenges, we introduce B!SON, a web-based journal recommendation system. It was developed based on a systematic requirements analysis, is built on open data, gives publisher-independent recommendations, and works across domains. It suggests open access journals based on the title, abstract, and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.
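    As an illustration of the general idea, the following hedged sketch ranks candidate journals by textual similarity between the user's abstract and each journal's previously published abstracts. The journal corpus and the plain TF-IDF scoring are toy assumptions, not B!SON's actual pipeline, which also draws on titles and references.

```python
# A hedged sketch of content-based journal matching in the spirit of B!SON.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical per-journal text profiles built from published abstracts.
journal_abstracts = {
    "Journal of Web Semantics": "knowledge graphs ontologies linked data reasoning",
    "Information Processing & Management": "information retrieval ranking evaluation user studies",
    "PeerJ Computer Science": "machine learning software open science reproducibility",
}


def recommend(user_abstract: str, top_k: int = 2) -> list[tuple[str, float]]:
    """Rank journals by cosine similarity of TF-IDF vectors to the user's abstract."""
    names = list(journal_abstracts)
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(journal_abstracts.values())
    query = vectorizer.transform([user_abstract])
    scores = cosine_similarity(query, matrix).ravel()
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)[:top_k]


print(recommend("We build a knowledge graph of linked research data"))
```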

    Interrogating the politics and performativity of web archives

    Since the mid-1990s institutions such as national libraries and the Internet Archive have been ‘archiving the Web’ through the harvesting, collection and preservation of ‘web objects’ (e.g. websites, web pages, social media) in web archives [55]. Much of the focus of the web archiving community has been on the continued development of technologies and practices for web collection development [38], with increased attention in recent years on facilitating the scholarly use of web archives [25, 24, 61]. This research will take a step back to consider the place of web archives in light of ‘the archival turn’ and emergent questions over the ever-expansive role of the archive and the Web in everyday life. First coined by Stoler [81], ‘the archival turn’ denotes a shift from ‘archive as source’ to ‘archive as subject,’ signalling wide-ranging epistemological questions concerning the role of the archive (and the archivist) in shaping and legitimising knowledge and particular ways of knowing. This research proposes to re-situate web archives as places of knowledge and cultural production in their own right, by implicating both the web archivist and the technologies in the shaping of the ‘politics of ephemerality’ [82] that lead to the creation and maintenance of web archives. This study will identify key underlying assumptions about what the Web is (e.g. a ‘Web of Documents,’ ‘abstract information space’), what of the contemporary Web is (or isn’t) being archived, and the relative affordances for web archival practice and scholarly use. Furthermore, drawing on critical approaches to information, Science and Technology Studies and Web Science, this research will engage with the performativity of web archiving, the practices of selection, collection and classification, and the possible implications for a socio-technical understanding of web archives.

    Using Natural Language Processing and Artificial Intelligence to Explore the Nutrition and Sustainability of Recipes and Food

    Copyright © 2021 van Erp, Reynolds, Maynard, Starke, Ibáñez Martín, Andres, Leite, Alvarez de Toledo, Schmidt Rivera, Trattner, Brewer, Adriano Martins, Kluczkovski, Frankowska, Bridle, Levy, Rauber, Tereza da Silva and Bosma. In this paper, we discuss the use of natural language processing and artificial intelligence to analyze nutritional and sustainability aspects of recipes and food. We present the state of the art and some use cases, followed by a discussion of challenges. Our perspective on addressing these is that, while they typically have a technical nature, they nevertheless require an interdisciplinary approach combining natural language processing and artificial intelligence with expert domain knowledge to create practical tools and comprehensive analyses for the food domain.

    Funding: Research Councils UK; the University of Manchester; the University of Sheffield; the STFC Food Network+ and the HEFCE Catalyst-funded N8 AgriFood Resilience Programme with matched funding from the N8 group of Universities; the AHRC-funded US-UK Food Digital Scholarship Network (AH/S012591/1); the STFC GCRF-funded project “Trends in greenhouse gas emissions from Brazilian foods using GGDOT” (ST/S003320/1); the STFC-funded project “Piloting Zooniverse for food, health and sustainability citizen science” (ST/T001410/1); the STFC Food Network+ scoping project “Piloting Zooniverse to help us understand citizen food perceptions”; the ESRC, via the University of Sheffield Social Sciences Partnerships, Impact and Knowledge Exchange fund, for “Recipe environmental impact calculator”; Research England, via the University of Sheffield QR Strategic Priorities Fund projects “Cooking as part of a Sustainable Food System – creating a wider evidence base for policy makers” and “Food based citizen science in the UK as a policy tool”; the N8 AgriFood-funded project “Greenhouse Gas and Dietary choices Open-source Toolkit (GGDOT) hacknights”; the Brunel University internal Research England GCRF QR Fund; the University of Manchester GCRF QR Visiting Researcher Fellowship; and the National Institute of Informatics, Japan.
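    As a concrete illustration of the kind of practical tool the paper envisions, the sketch below parses free-text ingredient lines and aggregates rough nutrition and emissions estimates. The regex grammar and the per-gram values are hypothetical toy assumptions; real systems would use full NLP parsers and curated databases such as USDA food composition tables.

```python
# A toy ingredient-parsing and footprint-estimation sketch; values are illustrative.
import re

# Hypothetical kilocalories and grams of CO2-equivalent per gram of ingredient.
NUTRIENT_TABLE = {
    "flour": {"kcal_per_g": 3.6, "co2e_per_g": 0.9},
    "butter": {"kcal_per_g": 7.2, "co2e_per_g": 12.0},
}

# Matches lines like "200 g flour" or "0.1 kg butter".
LINE_PATTERN = re.compile(r"(?P<qty>\d+(?:\.\d+)?)\s*(?P<unit>g|kg)\s+(?P<name>\w+)")


def parse_ingredient(line: str):
    """Extract (ingredient name, quantity in grams) from one recipe line."""
    match = LINE_PATTERN.match(line.strip().lower())
    if not match:
        return None
    grams = float(match["qty"]) * (1000 if match["unit"] == "kg" else 1)
    return match["name"], grams


def recipe_footprint(lines):
    """Sum calorie and emission estimates over all parseable ingredients."""
    totals = {"kcal": 0.0, "co2e_g": 0.0}
    for line in lines:
        parsed = parse_ingredient(line)
        if parsed and parsed[0] in NUTRIENT_TABLE:
            name, grams = parsed
            totals["kcal"] += grams * NUTRIENT_TABLE[name]["kcal_per_g"]
            totals["co2e_g"] += grams * NUTRIENT_TABLE[name]["co2e_per_g"]
    return totals


print(recipe_footprint(["200 g flour", "0.1 kg butter"]))
# -> {'kcal': 1440.0, 'co2e_g': 1380.0}
```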

    MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing

    With the proliferation of public web archives, it is becoming more important to better profile their contents, both to understand their immense holdings and to support routing of requests in Memento aggregators. A memento is a past version of a web page, and a Memento aggregator is a tool or service that aggregates mementos from many different web archives. To save resources, a Memento aggregator should only poll the archives that are likely to have a copy of the requested Uniform Resource Identifier (URI). Using the Crawler Index (CDX), we generate profiles of the archives that summarize their holdings and use them to inform routing of the Memento aggregator’s URI requests. Additionally, we use full-text search (when available) or sample URI lookups to build an understanding of an archive’s holdings. Previous work in profiling ranged from using full URIs (no false positives, but large profiles) to using only top-level domains (TLDs) (smaller profiles, but many false positives). This work explores strategies between these two extremes.

    For evaluation we used CDX files from Archive-It, UK Web Archive, Stanford Web Archive Portal, and Arquivo.pt. Moreover, we used web server access log files from the Internet Archive’s Wayback Machine, UK Web Archive, Arquivo.pt, LANL’s Memento Proxy, and ODU’s MemGator server. In addition, we utilized a historical dataset of URIs from DMOZ. In early experiments with various URI-based static profiling policies, we successfully identified about 78% of the URIs that were not present in the archive with less than 1% relative cost as compared to the complete knowledge profile, and 94% of such URIs with less than 10% relative cost, without any false negatives. In another experiment we found that we can correctly route 80% of the requests while maintaining about 0.9 recall by discovering only 10% of the archive holdings and generating a profile that costs less than 1% of the complete knowledge profile.

    We created MementoMap, a framework that allows web archives and third parties to express the holdings and/or voids of an archive of any size with varying levels of detail to fulfil various application needs. Our archive profiling framework enables tools and services to predict and rank archives where mementos of a requested URI are likely to be present. In static profiling policies, we predefined for each policy the maximum depth of the host and path segments of URIs used as URI keys. This gave us a good baseline for evaluation, but it was not suitable for merging profiles with different policies. Later, we introduced a more flexible means of representing URI keys that uses wildcard characters to indicate whether a URI key was truncated. Moreover, we developed an algorithm to roll up URI keys dynamically at arbitrary depths when sufficient archiving activity is detected under certain URI prefixes. In an experiment with dynamic profiling of archival holdings, we found that a MementoMap of less than 1.5% relative cost can correctly identify the presence or absence of 60% of the lookup URIs in the corresponding archive without any false negatives (i.e., 100% recall). In addition, we separately evaluated archival voids based on the most frequently accessed resources in the access logs and found that we could have avoided more than 8% of the false positives without introducing any false negatives.

    We defined a routing score that can be used for Memento routing. Using a cut-off threshold on this routing score, we achieved over 96% accuracy at about 89% recall, and about 68% accuracy at 99% recall, which translates to about 72% savings in wasted lookup requests in our Memento aggregator. Moreover, when routing to the top-k archives by routing score, choosing only the topmost archive missed only about 8% of the sample URIs that are present in at least one archive, while selecting the top two archives missed less than 2% of these URIs. We also evaluated a machine learning-based routing approach, which resulted in overall better accuracy but poorer recall, due to the low prevalence of the sample lookup URI dataset in the different web archives.

    We contributed various algorithms, such as a space- and time-efficient approach to ingesting large lists of URIs to generate MementoMaps and a Random Searcher Model to discover samples of the holdings of web archives. We contributed numerous tools to support various aspects of web archiving and replay, such as MemGator (a Memento aggregator), InterPlanetary Wayback (a novel archival replay system), Reconstructive (a client-side request-rerouting ServiceWorker), and AccessLog Parser. Moreover, this work yielded a draft file format specification called Unified Key Value Store (UKVS), which we use for the serialization and dissemination of MementoMaps. It is a flexible and extensible file format that allows easy interaction with Unix text processing tools and can be used in many applications beyond MementoMaps.
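    To illustrate the static profiling policies described above, the following sketch derives a truncated, SURT-like URI key with a trailing wildcard marking truncation. The parameter names and the exact key syntax here are illustrative assumptions rather than the MementoMap specification.

```python
# A hedged sketch of a static URI-key policy: reverse the host into
# SURT-like form, keep at most max_host host segments and max_path path
# segments, and append a wildcard when anything was truncated.
from urllib.parse import urlsplit


def uri_key(uri: str, max_host: int = 3, max_path: int = 1) -> str:
    parts = urlsplit(uri)
    host_segments = parts.hostname.split(".")[::-1]  # e.g. ["com", "example", "www"]
    path_segments = [s for s in parts.path.split("/") if s]
    truncated = len(host_segments) > max_host or len(path_segments) > max_path
    key = ",".join(host_segments[:max_host]) + ")/"
    if path_segments[:max_path]:
        key += "/".join(path_segments[:max_path])
    return key + ("/*" if truncated else "")


print(uri_key("https://www.example.com/a/b/c.html"))
# -> com,example,www)/a/*
```

    Keys like these can be aggregated per archive: the more lookups that share a truncated prefix, the cheaper the profile, at the cost of a higher false positive rate, which is exactly the trade-off the static policies baseline explores.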

    Divide-and-Learn: A Random Indexing Approach to Attribute Inference Attacks in Online Social Networks

    We present a Divide-and-Learn machine learning methodology to investigate a new class of attribute inference attacks against Online Social Network (OSN) users. Our methodology analyzes commenters’ preferences related to some of a user’s publications (e.g., posts or pictures) to infer sensitive attributes of that user. For classification performance, we tune Random Indexing (RI) to compute several embeddings for textual units (e.g., word, emoji), each one depending on a specific attribute value. RI guarantees the comparability of the generated vectors for the different values. To validate the approach, we consider three Facebook attributes: gender, age category, and relationship status, which are highly relevant for targeted advertising or privacy-threatening applications. Using an XGBoost classifier, we show that we can infer Facebook users’ attributes from commenters’ reactions to their publications with AUC from 94% to 98%, depending on the attribute.
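    A minimal sketch of Random Indexing in this spirit: each textual unit receives a fixed sparse ternary index vector, and the embedding for an attribute value is the sum of the index vectors of the units observed with it. The dimensions, seeding scheme, and example contexts below are illustrative assumptions, not the paper's tuned configuration.

```python
# A minimal Random Indexing sketch with deterministic, hash-seeded index vectors.
import hashlib

import numpy as np

DIM, NONZERO = 512, 8  # embedding width and number of nonzero (+/-1) entries


def index_vector(unit: str) -> np.ndarray:
    """Deterministic sparse ternary index vector for a word or emoji."""
    seed = int.from_bytes(hashlib.sha1(unit.encode()).digest()[:4], "big")
    rng = np.random.default_rng(seed)
    vec = np.zeros(DIM)
    positions = rng.choice(DIM, size=NONZERO, replace=False)
    vec[positions] = rng.choice([-1.0, 1.0], size=NONZERO)
    return vec


def attribute_embedding(units: list[str]) -> np.ndarray:
    """Embedding for one attribute value: sum of its observed units' vectors."""
    total = np.zeros(DIM)
    for unit in units:
        total += index_vector(unit)
    return total


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical commenter vocabularies for two values of one attribute.
value_a = attribute_embedding(["bro", "match", "goal", "team"])
value_b = attribute_embedding(["lovely", "recipe", "goal"])

# Because both embeddings live in the same space, similarity scores for any
# unit are directly comparable across attribute values, which is the property
# the abstract highlights; such scores could then feed a downstream classifier.
print(cosine(index_vector("goal"), value_a), cosine(index_vector("goal"), value_b))
```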