Research outputs

CORE vision and mission

If you use CORE, CORE services or CORE data in your research, we kindly request you cite this publication.

Knoth, P. and Zdrahal, Z. (2012) CORE: Three Access Levels to Underpin Open Access, D-Lib Magazine, 18, 11/12, Corporation for National Research Initiatives.
This paper sets the vision for creating the CORE service: developing a global content aggregation of all open access research literature (on top of the OAI-PMH protocol for metadata harvesting and other protocols). It sets the mission to develop the three access levels (access at the granularity of papers; analytical access; access to raw data) via CORE.

CORE aggregation approaches and infrastructure

Knoth, P. (2013). From Open Access Metadata to Open Access Content: Two Principles for Increased Visibility of Open Access Content. Open Repositories 2013, Charlottetown, Prince Edward Island, Canada.
This paper describes the two principles that should be followed to ensure that content can be properly harvested from repositories. This paper could be of great interest to repository managers.

Knoth, P. (2013) CORE: Aggregation Use Cases for Open Access. Demo at Joint Conference on Digital Libraries (JCDL 2013), Indianapolis, Indiana, United States.
This paper describes the use cases that must be supported by open access aggregators, establishes the CORE use case and demonstrates the benefits of open access content aggregators.

Knoth, P. and Pontika, N. (2016). Aggregating Research Papers from Publishers' Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?, Workshop: INTEROP2016 at 10th Language Resources and Evaluation Conference.
This paper describes the technical challenges relating to machine interfaces, the interoperability issues in obtaining open access content and the complications of achieving harmonisation across repositories’ and publishers’ systems.

Knoth, P. and Zdrahal, Z. (2011) CORE: Connecting Repositories in the Open Access Domain. CERN workshop on Innovations in Scholarly Communication (OAI7), Geneva, Switzerland.

Knoth, P., Robotka, V. and Zdrahal, Z. (2011). Connecting Repositories in the Open Access Domain using Text Mining and Semantic Data. International Conference on Theory and Practice of Digital Libraries 2011 (TPDL 2011), Berlin, Germany.
This paper describes the CORE system in its early stages with a focus on the original idea of the CORE recommender.

CORE Recommender

Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S., Pontika, N. and Bayer, V. (2017). Towards effective research recommender systems for repositories. Open Repositories 2017.
In this paper, we argue why and how the integration of recommender systems for research can enhance the functionality and user experience in repositories. We present the latest technical innovations in the CORE Recommender, which provides research article recommendations across the global network of repositories and journals. The CORE Recommender has been recently redeveloped and released into production in the CORE system and has also been deployed in several third-party repositories. We explain the design choices of this unique system and the evaluation processes we have in place to continue raising the quality of the provided recommendations. By drawing on our experience, we discuss the main challenges in offering a state-of-the-art recommender solution for repositories. We highlight two of the key limitations of the current repository infrastructure with respect to developing research recommender systems: 1) the lack of a standardised protocol and capabilities for exposing anonymised user-interaction logs, which represent critically important input data for recommender systems based on collaborative filtering and 2) the lack of a voluntary global sign-on capability in repositories, which would enable the creation of personalised recommendation and notification solutions based on past user interactions.

Knoth, P., Novotny, J. and Zdrahal, Z. (2010). Automatic generation of inter-passage links based on semantic similarity, The 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China.
This paper describes the algorithm that was used in the original version of the CORE recommender.

CORE Repositories Dashboard

Pontika, N., Knoth, P., Cancellieri, M. and Pearce, S. (2016). Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories, LIBER Quarterly, 25, 4.
This paper presents the CORE Repositories Dashboard, a tool designed primarily for repository managers. It describes how the Dashboard improves the quality of the harvested papers, advances the collaboration between repository managers and CORE, enables straightforward management of their collections and enhances the transparency of the harvested content.

CORE and Download Statistics

Pearce, S. and Pontika, N. (2016). Integration of the IRUS-UK Statistics in the CORE Repositories Dashboard. Poster. Open Repositories 2016 (OR2016).
This poster presents the integration of the IRUS-UK service with the CORE Repositories Dashboard tool, which enables repository managers to access reliable download statistics for the full-text papers harvested by CORE.

Knoth, P., Anastasiou, L. and Pearce, S. (2014). My repository is being aggregated: a blessing or a curse?, Open Repositories 2014 (OR 2014), Helsinki, Finland.
This paper describes the collaboration between aggregators and repositories in terms of sharing download usage statistics.

A more extensive list of our papers follows:

Herrmannova, Drahomira ; Knoth, Petr and Patton, Robert (2018). Analyzing Citation-Distance Networks for Evaluating Publication Impact. In: 11th edition of the Language Resources and Evaluation Conference, May 7, 2018 - May 12, 2018, Miyazaki, Japan.
Herrmannova, Drahomira ; Patton, Robert M.; Knoth, Petr and Stahl, Christopher G. (2018). Do citations and readership identify seminal publications? Scientometrics (Early Access).
Pride, David and Knoth, Petr (2017). Incidental or influential? – A decade of using text-mining for citation function classification. In: 16th International Society of Scientometrics and Informetrics Conference, 16-20 October 2017, Wuhan.
Knoth, Petr ; Gooch, Phil and Jack, Kris (2017). What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers. Lecture Notes in Computer Science, 10450 pp. 287–299.
Pontika, Nancy ; Knoth, Petr ; Anastasiou, Lucas ; Charalampous, Aristotelis ; Cancellieri, Matteo ; Pearce, Samuel and Bayer, Vaclav (2017). The uptake of the CORE recommender in repositories. OpenRepositories2017.
Knoth, Petr ; Anastasiou, Lucas ; Basile, Giorgio ; Pearce, Samuel and Pontika, Nancy (2017). Machine accessibility of Open Access scientific publications from publisher systems via ResourceSync. OAI10.
Knoth, Petr ; Anastasiou, Lucas ; Charalampous, Aristotelis ; Cancellieri, Matteo ; Pearce, Samuel ; Pontika, Nancy and Bayer, Vaclav (2017). Towards effective research recommender systems for repositories. In: Open Repositories 2017, 26 -30 June 2017, Brisbane, Australia.
Oudenhoven, Martine and Pontika, Nancy (2017). Learning about text and data mining: The future of Open Science. Open Science Conference, Berlin, Germany.
Cancellieri, Matteo ; Pontika, Nancy ; Pearce, Samuel ; Anastasiou, Lucas and Knoth, Petr (2017). Building scalable digital library ingestion pipelines using microservices. In: MTSR 2017: 11th International Conference on Metadata and Semantics Research, 28 November - 1 December 2017, Tallinn, Estonia.
Knoth, Petr and Khadka, Anita (2017). Can we do better than co-citations? Bringing Citation Proximity Analysis from idea to practice in research articles recommendation. In: Proceedings of the 2nd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017) (Mayr, Philipp; Chandrasekaran, Muthu Kumar and Jaidka, Kokil eds.), pp. 14–25.
Pearce, Samuel and Pontika, Nancy (2016). Integration of the IRUS-UK Statistics in the CORE Repositories Dashboard. Open Repositories.
Orth, Astrid; Pontika, Nancy and Ball, David (2016). FOSTER’s Open Science Training Tools and Best Practices. In: Positioning and Power in Academic Publishing: Players, Agents and Agendas: Proceedings of the 20th International Conference on Electronic Publishing (Loizides, Fernando and Schmidt, Birgit eds.), IOS Press, pp. 135–141.
Knoth, Petr and Pontika, Nancy (2016). Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not? In: INTEROP2016 (Eckart de Castilho, Richard; Ananiadou, Sophia; Margoni, Thomas; Peters, Wim and Piperidis, Stelios eds.), 23 May 2016.
Pontika, Nancy ; Knoth, Petr ; Cancellieri, Matteo and Pearce, Samuel (2016). Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories. LIBER Quarterly, 25(4) pp. 172–188.
Pontika, Nancy (2015). Open Access: What’s in it for me as an early career researcher? JCOM : Journal of Science Communication, 14(4), article no. C04.
Herrmannova, Drahomira and Knoth, Petr (2015). Semantometrics in Coauthorship Networks: Fulltext-based Approach for Analysing Patterns of Research Collaboration. D-Lib Magazine, 21(11/12).
Evaluating Weekly Predictions of At-Risk Students at The Open University: Results and Issues
Linking Textual Resources to Support Information Discovery
Kuzilek, Jakub ; Hlosta, Martin ; Herrmannova, Drahomira ; Zdrahal, Zdenek and Wolff, Annika (2015). OU Analyse: analysing at-risk students at The Open University. Learning Analytics Review, LAK15-1 pp. 1–16.
Pontika, Nancy and Rozenberga, Dace (2015). Developing strategies to ensure compliance with funders’ open access policies. Insights, 28(1) pp. 32–36.
Pontika, Nancy and Knoth, Petr (2015). Open Science Taxonomy. FOSTER.
Fostering Open Science to Research using a Taxonomy and an eLearning Portal
Knoth, Petr and Herrmannova, Drahomira (2014). Towards Semantometrics: A New Semantic Similarity Based Measure for Assessing a Research Publication's Contribution. D-Lib Magazine, 20(11/12).
Kats, Pavel ; Knoth, Petr ; Mamakis, Georgios ; Mielnicki, Marcin ; Muhr, Markus and Werla, Marcin (2014). Design of Europeana Cloud technical infrastructure. In: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL), pp. 491–492.
Knoth, Petr ; Anastasiou, Lucas and Pearce, Samuel (2014). My repository is being aggregated: a blessing or a curse? In: Open Repositories 2014 (OR2014), 9-13 June 2014, Helsinki, Finland.
Hlosta, Martin ; Herrmannova, Drahomira ; Vachova, Lucie; Kuzilek, Jakub ; Zdrahal, Zdenek and Wolff, Annika (2014). Modelling student online behaviour in a virtual learning environment. In: Machine Learning and Learning Analytics workshop at The 4th International Conference on Learning Analytics and Knowledge (LAK14), 24-28 March 2014, Indianapolis, Indiana, USA.
Developing predictive models for early detection of at-risk students on distance learning modules
Knoth, Petr and Herrmannova, Drahomira (2013). Simple yet effective methods for cross-lingual link discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2. In: NTCIR-10 Evaluation of Information Access Technologies, 18 - 21 June 2013, Tokyo, Japan, pp. 39–46.
Knoth, Petr (2013). From open access metadata to open access content: two principles for increased visibility of open access content. In: Open Repositories 2013, 8 - 12 July 2013, Charlottetown, Prince Edward Island, Canada.
Knoth, Petr and Zdrahal, Zdenek (2013). CORE: aggregation use cases for open access. In: 2nd International Workshop on Mining Scientific Publications (WOSP 2013), 26 July 2013, Indianapolis, IN.
Predicting student performance from combined data sources
Guest editorial
Visual search for supporting content exploration in large document collections
Knoth, Petr and Zdrahal, Zdenek (2012). CORE: three access levels to underpin open access. D-Lib Magazine, 18(11/12).
Knoth, Petr and Zdrahal, Zdenek (2011). Mining cross-document relationships from text. In: The First International Conference on Advances in Information Mining and Management (IMMM 2011), 23 - 28 Oct 2011, Barcelona, Spain.
Knoth, Petr and Zdrahal, Zdenek (2011). CORE: connecting repositories in the open access domain. In: CERN Workshop on Innovations in Scholarly Communication (OAI7), 22-24 June 2011, Geneva, Switzerland.
Herrmannova, Drahomira (2011). Social Network Integration into an Information Portal: Social Media in Practice. Germany: LAP Lambert Academic Publishing.
KMI, The Open University at NTCIR-9 CrossLink: Cross-Lingual Link Discovery in Wikipedia using explicit semantic analysis
Knoth, Petr ; Robotka, Vojtech and Zdrahal, Zdenek (2011). Connecting repositories in the open access domain using text mining and semantic data. In: Research and Advanced Technology for Digital Libraries, pp. 483–487.
Knoth, Petr ; Zilka, Lukas and Zdrahal, Zdenek (2011). Using Explicit Semantic Analysis for Cross-Lingual Link Discovery. In: 5th International Workshop on Cross Lingual Information Access: Computational Linguistics and the Information Need of Multilingual Societies (CLIA) at The 5th International Joint Conference on Natural Language Processing (IJC-NLP 2011), 08 - 13 Nov 2011, Chiang Mai, Thailand.
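Knoth, Petr ; Novotny, Jan and Zdrahal, Zdenek (2010). Automatic generation of inter-passage links based on semantic similarity. In: The 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China.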
Facilitating cross-language retrieval and machine translation by multilingual domain ontologies
EUROGENE: multilingual retrieval and machine translation applied to human genetics
Knoth, Petr ; Sova, Jan and Zdrahal, Zdenek (2010). Eurogene - the First pan-European learning service in the field of Genetics. In: Znalosti (Knowledge) 2010, 3-5 Feb 2010, Jindrichuv Hradec, Czech Republic.
Fernandez, Miriam ; Sabou, Marta ; Knoth, Petr and Motta, Enrico (2010). Predicting the quality of semantic relations by applying Machine Learning classifiers. In: EKAW 2010 - Knowledge Engineering and Knowledge Management by the Masses, 11-15 Oct 2010, Lisbon, Portugal.
Schmidt, Marek ; Knoth, Petr and Smrz, Pavel (2009). Information extraction in the KiWi Project. In: Znalosti 2009, 4-6 Feb 2009, Bratislava, Slovakia.
Zdrahal, Zdenek ; Knoth, Petr ; Collins, Trevor and Mulholland, Paul (2009). Reasoning across multilingual learning resources in human genetics. In: The International 2009 ICL Conference on Interactive Computer Aided Learning, 23-25 Sep 2009, Villach, Austria.
Towards a framework for comparing automatic term recognition methods
Knoth, Petr (2009). Semantic annotation of multilingual learning objects based on a domain ontology. In: Doctoral consortium Workshop at The Fourth European Conference on Technology Enhanced Learning (EC-TEL 2009), 29 Sep - 02 Oct 2009, Nice, France.
Knoth, Petr (2008). Extraction of semantic relations from texts. In: Conference and Student EEICT 2008, 24 Apr 2008, Brno, Czech Republic.
Opsomer, Rob; Knoth, Petr ; van Polen, Freek; Trapman, Jantine and Wiering, Marco (2008). Categorizing children: automated text classification of CHILDES files. In: The 20th Belgian-Netherlands Conference on Artificial Intelligence (BNAIC 2008), 30 - 31 Oct 2008, Enschede, The Netherlands.