19 research outputs found
Ranking methods and impact metrics for scientific publications
The constantly increasing number of scientific publications affects researchers, students, academic hiring officials and search engines alike in discerning the high-impact works among them. Therefore, there is a need to develop methods to rank scientific papers. Despite a prolific literature on query-independent (or static) paper ranking algorithms, which aim to rank papers based on their impact, no systematic review of the field has been conducted. The existing literature falls short in defining impact, often failing to distinguish between short-term and long-term scientific impact. Further, no extensive experimental evaluation of the various proposed methods has been conducted. This thesis examines impact-based paper ranking in terms of methods, search engine applications, and its relation to paper abstract readability. In short, the contributions of the thesis are as follows:
• Long-term and short-term impact are formally defined, and the various ranking and evaluation approaches encountered so far in the literature are examined and classified.
• An extensive experimental evaluation is conducted to identify which proposed mechanisms perform best in ranking by short- and long-term impact.
• Motivated by the observed improvement margin in ranking based on short-term impact, a novel method is proposed, building on recent advances in network science.
• The development of specialized and general academic search engines enhanced with short- and long-term impact-based ranking methods is presented.
• Finally, paper abstract readability and its relation to paper impact is examined.
The constant growth in the volume of scientific publications makes it difficult to discern the most important among them, affecting researchers, students, academic hiring officials, and search engines alike. For this reason, there is a need to develop mechanisms for ranking (or prioritizing) scientific publications.
Although there is a rich literature on query-independent ranking (also known as static ranking) of scientific publications, which aims to describe methods for ranking publications based on their impact, no methodical and systematic review of the subject has been conducted so far. Specifically, the current literature has gaps in the definition of impact, and no distinction is made between long-term and short-term scientific impact. Moreover, the relation of publication impact to other textual characteristics, such as readability, is not examined. In addition, no extensive experimental evaluation of the individual methods proposed in the literature has been carried out. The subject of this doctoral thesis is the examination of impact-based ranking of publications, of the methods proposed in the literature, of their application in real academic search engines, and of the correlation between publication impact and readability. In summary:
• Long-term and short-term impact are formally defined, and the approaches formulated in the current literature are examined and categorized.
• An extensive experimental evaluation is conducted to investigate which mechanisms are most effective in producing publication rankings based on these two types of impact.
• As the evaluation results reveal room for improvement in ranking by short-term impact, a new ranking method is proposed, influenced by recent advances in network science and incorporating modifications to the PageRank method. An extensive experimental evaluation demonstrates the effectiveness of the new method compared to other current state-of-the-art techniques.
• The development of specialized as well as general academic search engines that make use of the ranking methods examined is also presented.
• Finally, the relation between the readability of scientific publication abstracts and their impact is examined.
Querying scientific databases considering the historic evolution of data
117 pages. The aim of this work was to develop an application for displaying the changes in the data recorded for microRNA biomolecules. The application is integrated as an extension into software available through the DIANA website, which was developed in cooperation with the research center "Alexander Fleming". MicroRNA molecules constitute an important field of biological research, as they are involved in gene expression and are consequently implicated in the manifestation of diseases. As the data produced by research on these molecules grows, new versions of the databases recording them are released. Between versions of these databases, the recorded data may change. This creates difficulties for research, since a researcher cannot obtain a complete overview of the knowledge that exists about a specific microRNA molecule. Within the scope of this thesis, (a) a database was constructed that encodes the full evolution of the microRNA data recorded by the mirbase database (www.mirbase.org), and (b) a web application was implemented that displays the evolution graphs of microRNA data from version to version, using the above database.
The aim of this thesis is to implement a web application for presenting changes recorded in multiple versions of microRNA databanks. The application has been integrated into the DIANA web application toolkit, which is a set of tools for analyzing microRNA data, and has been developed in cooperation with the research center "Alexander Fleming". MicroRNA molecules constitute an important branch of biological research, since their function is correlated with gene expression, and thus, microRNAs are related to diseases and treatments. As new data emerges from research concerning these biological molecules, new versions of databases recording them are issued.
It is possible that recorded data differs between various versions of these databases. As a result, there can be difficulties in research, since a researcher cannot have a complete overview of all recorded knowledge concerning a specific microRNA. The aim of this thesis is (i) to implement a database containing information about the evolution of microRNA data recorded in the mirbase archives (www.mirbase.org) and (ii) to implement a web application that presents evolution graphs for recorded microRNA data.
Ilias I. Kanellos
How many pigs within a group need to be sick to lead to a diagnostic change in the group’s behavior?
Piloting topic-aware research impact assessment features in BIP! Services
Various research activities rely on citation-based impact indicators. However, these indicators are usually globally computed, hindering their proper interpretation in applications like research assessment and knowledge discovery. In this work, we advocate for the use of topic-aware categorical impact indicators to alleviate the aforementioned problem. In addition, we extend BIP! Services to support those indicators and showcase their benefits in real-world research activities.
Comment: 5 pages, 2 figures
BIP! DB: A Dataset of Impact Measures for Scientific Publications
<p>This dataset contains citation-based impact indicators (a.k.a. "measures") for ~153M distinct DOIs that correspond to scientific articles. In particular, for each DOI, we have calculated the following indicators (organized in categories based on the semantics of the impact aspect that they best capture):</p>
<p><strong>Influence indicators</strong> (i.e., indicators of the "total" impact of each article; how established the article is in general)</p>
<p><i>Citation Count:</i> The total number of citations of the article, the most well-known influence indicator.</p>
<p><i>PageRank score:</i> An influence indicator based on PageRank [1], a popular network analysis method. PageRank estimates the influence of each article based on its centrality in the whole citation network. It alleviates some issues of the Citation Count indicator (e.g., two articles with the same number of citations can have significantly different PageRank scores if the aggregated influence of the articles citing them is very different - the article receiving citations from more influential articles will get a larger score).</p>
<p><strong>Popularity indicators</strong> (i.e., indicators of the "current" impact of each article; how popular the article is currently)</p>
<p><i>RAM score:</i> A popularity indicator based on the RAM [2] method. It is essentially a Citation Count where recent citations are considered as more important. This type of "time awareness" alleviates problems of methods like PageRank, which are biased against recently published articles (new articles need time to receive a number of citations that can be indicative for their impact).</p>
<p><i>AttRank score:</i> A popularity indicator based on the AttRank [3] method.
AttRank alleviates PageRank's bias against recently published papers by incorporating an attention-based mechanism, akin to a time-restricted version of preferential attachment, to explicitly capture a researcher's preference to read papers which received a lot of attention recently.</p>
<p><strong>Impulse indicators</strong> (i.e., indicators of the initial momentum that the article receives after its publication)</p>
<p><i>Incubation Citation Count (3-year CC):</i> This impulse indicator is a time-restricted version of the Citation Count, where the time window length is fixed for all papers and the time window depends on the publication date of the paper, i.e., only citations within 3 years of each paper's publication are counted.</p>
<p>More details about the aforementioned impact indicators, the way they are calculated and their interpretation can be found <a href="https://bip.imsi.athenarc.gr/site/indicators">here</a> and in the respective references (e.g., in [5]).</p>
<p>From version 5.1 onward, the impact indicators are calculated in two levels:</p>
<ul><li>The <strong>DOI level</strong> (assuming that each DOI corresponds to a distinct scientific article).</li><li>The <strong>OpenAIRE-id level</strong> (leveraging DOI synonyms based on OpenAIRE's deduplication algorithm [4] - each distinct article has its own OpenAIRE id).</li></ul>
<p>Previous versions of the dataset only provided the scores at the DOI level.</p>
<p>Also, from version 7 onward, for each article in our files we also offer an impact class, which informs the user about the percentile into which the article score belongs compared to the impact scores of the rest of the articles in the database.
The impact classes are: C1 (in top 0.01%), C2 (in top 0.1%), C3 (in top 1%), C4 (in top 10%), and C5 (in bottom 90%).</p>
<p>Finally, before version 10, the calculation of the impact scores (and classes) was based on a citation network having one node for each article with a distinct DOI that we could find in our input data sources. However, from version 10 onward, the nodes are deduplicated using the most recent version of the <a href="https://graph.openaire.eu/docs/graph-production-workflow/deduplication/research-products">OpenAIRE article deduplication algorithm</a>. This enabled a correction of the scores (more specifically, we avoid counting citation links multiple times when they are made by multiple versions of the same article). As a result, each node in the citation network we build is a deduplicated article having a distinct OpenAIRE id. We still report the scores at DOI level (i.e., we assign a score to each of the versions/instances of the article), however these DOI-level scores are just the scores of the respective deduplicated nodes propagated accordingly (i.e., all versions of the same deduplicated article will receive the same scores). We have removed a small number of instances (having a DOI) that were assigned (by error) to multiple deduplicated records in the OpenAIRE Graph.</p>
<p>For each calculation level (DOI / OpenAIRE-id) we provide five (5) compressed CSV files (one for each measure/score provided) where each line follows the format "identifier <tab> score <tab> class". The parameter setting of each measure is encoded in the corresponding filename. For more details on the different measures/scores see our extensive experimental study [5] and the configuration of AttRank in the original paper [3]. Files for the OpenAIRE-ids case contain the keyword "openaire_ids" in the filename.</p>
<p>From version 9 onward, we also provide topic-specific impact classes for DOI-identified publications.
In particular, we associated those articles with 2nd level concepts from OpenAlex (284 in total); we chose to keep only the three most dominant concepts for each publication, based on their confidence score, and only if this score was greater than 0.3. Then, for each publication and impact measure, we compute its class within its respective concepts. Finally, we provide the "topic_based_impact_classes.txt" file, where each line follows the format "identifier <tab> concept <tab> pagerank_class <tab> attrank_class <tab> 3-cc_class <tab> cc_class".</p>
<p>The data used to produce the citation network on which we calculated the provided measures have been gathered from the OpenAIRE Graph v6.0.1, including data from (a) the OpenCitations' COCI dataset (Jan-2023 version), (b) a MAG [6,7] snapshot from Dec-2021, and (c) a Crossref snapshot from May-2023 (before version 10, these sources were gathered independently). The union of all distinct citations that could be found in these sources has been considered. In addition, versions later than v.10 leverage the filtering rules described <a href="https://graph.openaire.eu/docs/graph-production-workflow/aggregation/non-compatible-sources/doiboost/#crossref-filtering">here</a> to remove from the dataset DOIs with problematic metadata.</p>
<p>References:</p>
<p>[1] L. Page, S. Brin, R. Motwani, and T. Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.</p>
<p>[2] Rumi Ghosh, Tsung-Ting Kuo, Chun-Nan Hsu, Shou-De Lin, and Kristina Lerman. 2011. Time-Aware Ranking in Dynamic Citation Networks. In Data Mining Workshops (ICDMW). 373–380.</p>
<p>[3] I. Kanellos, T. Vergoulis, D. Sacharidis, T. Dalamagas, Y. Vassiliou: Ranking Papers by their Short-Term Scientific Impact. CoRR abs/2006.00951 (2020)</p>
<p>[4] P. Manghi, C. Atzori, M. De Bonis, A. Bardi, Entity deduplication in big data graphs for scholarly communication, Data Technologies and Applications (2020).</p>
<p>[5] I.
Kanellos, T. Vergoulis, D. Sacharidis, T. Dalamagas, Y. Vassiliou: Impact-Based Ranking of Scientific Publications: A Survey and Experimental Evaluation. TKDE 2019 (early access)</p>
<p>[6] Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246. DOI=http://dx.doi.org/10.1145/2740908.2742839</p>
<p>[7] K. Wang et al., "A Review of Microsoft Academic Services for Science of Science Studies", Frontiers in Big Data, 2019, doi: 10.3389/fdata.2019.00045</p>
<p>Find our Academic Search Engine built on top of these data <a href="https://bip.imsi.athenarc.gr/">here</a>. Further note that we also provide all calculated scores through <a href="https://bip-api.imsi.athenarc.gr/documentation">BIP! Finder's API</a>.</p>
<p><i>Terms of use:</i> These data are provided "as is", without any warranties of any kind. The data are provided under the Creative Commons Attribution 4.0 International license.</p>
<p>More details about BIP! DB can be found in our relevant peer-reviewed publication:</p>
<p><i>Thanasis Vergoulis, Ilias Kanellos, Claudio Atzori, Andrea Mannocci, Serafeim Chatzopoulos, Sandro La Bruzzo, Natalia Manola, Paolo Manghi: BIP! DB: A Dataset of Impact Measures for Scientific Publications. WWW (Companion Volume) 2021: 456-460</i></p>
<p>We kindly request that any published research that makes use of BIP! DB cite the above article.</p>
BIP! DB: A Dataset of Impact Measures for Research Products
<p>This dataset contains citation-based impact indicators (a.k.a. "measures") for ~168.8M distinct PIDs (persistent identifiers) that correspond to research products (scientific publications, datasets, etc.). In particular, for each PID, we have calculated the following indicators (organized in categories based on the semantics of the impact aspect that they best capture):</p>
<p><strong>Influence indicators</strong> (i.e., indicators of the "total" impact of each research product; how established it is in general)</p>
<p><em>Citation Count:</em> The total number of citations of the product, the most well-known influence indicator.</p>
<p><em>PageRank score:</em> An influence indicator based on PageRank [1], a popular network analysis method. PageRank estimates the influence of each product based on its centrality in the whole citation network. It alleviates some issues of the Citation Count indicator (e.g., two products with the same number of citations can have significantly different PageRank scores if the aggregated influence of the products citing them is very different - the product receiving citations from more influential products will get a larger score).</p>
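<p>As an illustration of the effect described above, the core of PageRank can be sketched in a few lines. This is a toy power-iteration sketch on a four-paper citation network; the damping factor, tolerance, and example graph are assumptions chosen for illustration, not the BIP! DB production configuration.</p>

```python
# Toy PageRank via power iteration on a tiny citation network.
# Parameters (damping, tolerance) are illustrative assumptions.

def pagerank(citations, damping=0.85, tol=1e-10, max_iter=200):
    """citations: dict mapping each paper to the list of papers it cites."""
    nodes = set(citations) | {p for refs in citations.values() for p in refs}
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for src, refs in citations.items():
            if refs:
                # distribute src's score evenly over the papers it cites
                share = damping * score[src] / len(refs)
                for dst in refs:
                    nxt[dst] += share
            else:
                # dangling paper: spread its score uniformly over all nodes
                for v in nodes:
                    nxt[v] += damping * score[src] / n
        if max(abs(nxt[v] - score[v]) for v in nodes) < tol:
            return nxt
        score = nxt
    return score

# C is cited by A and B; D is cited only once, but by the influential C.
# D ends up with the highest score: citations from influential papers
# weigh more, which is exactly the effect described above.
scores = pagerank({"A": ["C"], "B": ["C"], "C": ["D"], "D": []})
```

<p>A production computation over hundreds of millions of nodes would of course use a sparse, out-of-core implementation, but the fixed point being approximated is the same.</p>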
<p><strong>Popularity indicators</strong> (i.e., indicators of the "current" impact of each research product; how popular the product is currently)</p>
<p><em>RAM score:</em> A popularity indicator based on the RAM [2] method. It is essentially a Citation Count where recent citations are considered as more important. This type of "time awareness" alleviates problems of methods like PageRank, which are biased against recently published products (new products need time to receive a number of citations that can be indicative for their impact).</p>
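<p>The idea of such time-aware counting can be sketched as follows. The sketch assumes a simple exponential decay with base gamma; the exact weighting scheme and parameter values are those of the RAM paper [2], not this illustration.</p>

```python
# Time-weighted citation count: each citation contributes
# gamma ** (current_year - citing_year), so recent citations count more.
# gamma and the toy data are illustrative assumptions.

def ram_score(citing_years, current_year, gamma=0.5):
    """citing_years: publication years of the products citing this one."""
    return sum(gamma ** (current_year - year) for year in citing_years)

# Both products have a plain Citation Count of 3, but very different
# popularity: recent citations dominate the time-aware score.
hot = ram_score([2022, 2023, 2023], current_year=2023)   # 0.5 + 1 + 1 = 2.5
cold = ram_score([2015, 2016, 2017], current_year=2023)  # about 0.03
```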
<p><em>AttRank score:</em> A popularity indicator based on the AttRank [3] method. AttRank alleviates PageRank's bias against recently published products by incorporating an attention-based mechanism, akin to a time-restricted version of preferential attachment, to explicitly capture a researcher's preference to examine products which received a lot of attention recently.</p>
<p><strong>Impulse indicators</strong> (i.e., indicators of the initial momentum that the research product received right after its publication)</p>
<p><em>Incubation Citation Count (3-year CC):</em> This impulse indicator is a time-restricted version of the Citation Count, where the time window length is fixed for all products and the time window depends on the publication date of the product, i.e., only citations within 3 years of each product's publication are counted.</p>
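<p>A minimal sketch of this indicator, assuming an inclusive 3-year window boundary (the exact boundary convention is an assumption here):</p>

```python
# Incubation Citation Count: count only citations made within the first
# `window` years after publication. The inclusive boundary and the toy
# data are illustrative assumptions.

def incubation_cc(pub_year, citing_years, window=3):
    return sum(1 for year in citing_years if 0 <= year - pub_year <= window)

# A product from 2010 cited in 2010, 2011, 2013, 2019 and 2020:
# only the first three citations fall inside the 3-year window.
early_momentum = incubation_cc(2010, [2010, 2011, 2013, 2019, 2020])
```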
<p>More details about the aforementioned impact indicators, the way they are calculated and their interpretation can be found <a href="https://bip.imsi.athenarc.gr/site/indicators">here</a> and in the respective references (e.g., in [5]).</p>
<p>From version 5.1 onward, the impact indicators are calculated in two levels:</p>
<ul>
<li>The <strong>PID level</strong> (assuming that each PID corresponds to a distinct research product).</li>
<li>The <strong>OpenAIRE-id level</strong> (leveraging PID synonyms based on OpenAIRE's deduplication algorithm [4] - each distinct product has its own OpenAIRE id).</li>
</ul>
<p>Previous versions of the dataset only provided the scores at the PID level.</p>
<p>From version 12 onward, two types of PIDs are included in the dataset: DOIs and PMIDs (before that version, only DOIs were included). </p>
<p>Also, from version 7 onward, for each product in our files we also offer an impact class, which informs the user about the percentile into which the product score belongs compared to the impact scores of the rest of the products in the database. The impact classes are: C1 (in top 0.01%), C2 (in top 0.1%), C3 (in top 1%), C4 (in top 10%), and C5 (in bottom 90%).</p>
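<p>The class assignment can be sketched as a rank-percentile mapping. The tie handling and exact percentile convention below are assumptions for illustration, not the production logic:</p>

```python
# Map impact scores to classes C1 (top 0.01%) ... C5 (bottom 90%) by
# rank percentile. Tie-breaking is an illustrative assumption.

def impact_classes(scores):
    """scores: dict identifier -> score; returns identifier -> class."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    classes = {}
    for rank, pid in enumerate(ranked, start=1):
        top = rank / n  # fraction of products at or above this rank
        if top <= 0.0001:
            classes[pid] = "C1"
        elif top <= 0.001:
            classes[pid] = "C2"
        elif top <= 0.01:
            classes[pid] = "C3"
        elif top <= 0.1:
            classes[pid] = "C4"
        else:
            classes[pid] = "C5"
    return classes

# With only 100 products, the best one can do no better than the top 1%:
demo = impact_classes({f"doi:{i}": float(i) for i in range(100)})
```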
<p>Finally, before version 10, the calculation of the impact scores (and classes) was based on a citation network having one node for each product with a distinct PID that we could find in our input data sources. However, from version 10 onward, the nodes are deduplicated using the most recent version of the <a href="https://graph.openaire.eu/docs/graph-production-workflow/deduplication/research-products">OpenAIRE article deduplication algorithm</a>. This enabled a correction of the scores (more specifically, we avoid counting citation links multiple times when they are made by multiple versions of the same product). As a result, each node in the citation network we build is a deduplicated product having a distinct OpenAIRE id. We still report the scores at PID level (i.e., we assign a score to each of the versions/instances of the product), however these PID-level scores are just the scores of the respective deduplicated nodes propagated accordingly (i.e., all versions of the same deduplicated product will receive the same scores). We have removed a small number of instances (having a PID) that were assigned (by error) to multiple deduplicated records in the OpenAIRE Graph.</p>
<p>For each calculation level (PID / OpenAIRE-id) we provide five (5) compressed CSV files (one for each measure/score provided) where each line follows the format "identifier <tab> score <tab> class". The parameter setting of each measure is encoded in the corresponding filename. For more details on the different measures/scores see our extensive experimental study [5] and the configuration of AttRank in the original paper [3]. Files for the OpenAIRE-ids case contain the keyword "openaire_ids" in the filename.</p>
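<p>Under the documented line format, one of these files can be read as sketched below. The filename is hypothetical, and gzip is an assumption about the compression used:</p>

```python
# Load one per-measure file, assuming lines of the form
# "identifier <tab> score <tab> class" and gzip compression.
import csv
import gzip

def load_measure(path):
    """Return {identifier: (score, impact_class)} for one measure file."""
    rows = {}
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        for identifier, score, impact_class in csv.reader(handle, delimiter="\t"):
            rows[identifier] = (float(score), impact_class)
    return rows

# Hypothetical filename; real filenames encode each measure's parameters
# (and contain "openaire_ids" for the OpenAIRE-id level files):
# attrank = load_measure("attrank_openaire_ids.txt.gz")
```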
<p>From version 9 onward, we also provide topic-specific impact classes for PID-identified products. In particular, we associated those products with 2nd level concepts from OpenAlex; we chose to keep only the three most dominant concepts for each product, based on their confidence score, and only if this score was greater than 0.3. Then, for each product and impact measure, we compute its class within its respective concepts. Finally, we provide the "topic_based_impact_classes.txt" file, where each line follows the format "identifier <tab> concept <tab> pagerank_class <tab> attrank_class <tab> 3-cc_class <tab> cc_class".</p>
<p>The data used to produce the citation network on which we calculated the provided measures have been gathered from the OpenAIRE Graph v7.0.0, including data from (a) OpenCitations' COCI & POCI dataset, (b) MAG [6,7], and (c) Crossref. The union of all distinct citations that could be found in these sources has been considered. In addition, versions later than v.10 leverage the filtering rules described <a href="https://graph.openaire.eu/docs/graph-production-workflow/aggregation/non-compatible-sources/doiboost/#crossref-filtering">here</a> to remove from the dataset PIDs with problematic metadata.</p>
<p>References:</p>
<p>[1] L. Page, S. Brin, R. Motwani, and T. Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.</p>
<p>[2] Rumi Ghosh, Tsung-Ting Kuo, Chun-Nan Hsu, Shou-De Lin, and Kristina Lerman. 2011. Time-Aware Ranking in Dynamic Citation Networks. In Data Mining Workshops (ICDMW). 373–380.</p>
<p>[3] I. Kanellos, T. Vergoulis, D. Sacharidis, T. Dalamagas, Y. Vassiliou: Ranking Papers by their Short-Term Scientific Impact. CoRR abs/2006.00951 (2020)</p>
<p>[4] P. Manghi, C. Atzori, M. De Bonis, A. Bardi, Entity deduplication in big data graphs for scholarly communication, Data Technologies and Applications (2020).</p>
<p>[5] I. Kanellos, T. Vergoulis, D. Sacharidis, T. Dalamagas, Y. Vassiliou: Impact-Based Ranking of Scientific Publications: A Survey and Experimental Evaluation. TKDE 2019 (early access)</p>
<p>[6] Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246. DOI=http://dx.doi.org/10.1145/2740908.2742839</p>
<p>[7] K. Wang et al., "A Review of Microsoft Academic Services for Science of Science Studies", Frontiers in Big Data, 2019, doi: 10.3389/fdata.2019.00045 </p>
<p>Find our Academic Search Engine built on top of these data <a href="https://bip.imsi.athenarc.gr/">here</a>. Further note that we also provide all calculated scores through <a href="https://bip-api.imsi.athenarc.gr/documentation">BIP! Finder's API</a>.</p>
<p><em>Terms of use:</em> These data are provided "as is", without any warranties of any kind. The data are provided under the Creative Commons Attribution 4.0 International license.</p>
<p>More details about BIP! DB can be found in our relevant peer-reviewed publication:</p>
<p><em>Thanasis Vergoulis, Ilias Kanellos, Claudio Atzori, Andrea Mannocci, Serafeim Chatzopoulos, Sandro La Bruzzo, Natalia Manola, Paolo Manghi: BIP! DB: A Dataset of Impact Measures for Scientific Publications. WWW (Companion Volume) 2021: 456-460</em></p>
<p>We kindly request that any published research that makes use of BIP! DB cite the above article.</p>
Climate change but not unemployment explains the changing suicidality in Thessaloniki Greece (2000-2012)
INTRODUCTION:
Recently, there has been a debate concerning the etiology behind attempted and completed suicides. The aim of the current study was to search for possible correlations between the rates of attempted and completed suicide and climate variables and regional unemployment per year in the county of Thessaloniki, Macedonia, northern Greece, for the years 2000-2012.
MATERIAL AND METHODS:
The regional rates of suicide and attempted suicide as well as regional unemployment were available from previous publications of the authors. The climate variables were calculated from the daily E-OBS gridded dataset, which is based on observational data.
RESULTS:
Only the male suicide rates correlate significantly with high mean annual temperature, but not with unemployment. The multiple linear regression analysis results suggest that temperature is the only variable that determines male suicides and explains 51% of their variance. Unemployment fails to contribute significantly to the model. There seems to be a seasonal distribution for attempts, with mean rates being higher for the period from May to October, and the rates clearly correlate with temperature. The highest mean rates were observed during May and August and the lowest during December and February. Multiple linear regression analysis suggests that temperature also determines the female attempts rate, although the explained variance, while significant, is very low (3-5%).
CONCLUSION:
Climate variables, and specifically high temperature, correlate with both suicide and attempted suicide rates, but in a different way for males and females. The climate effect was stronger than the effect of unemployment.
mirPub: a database for searching microRNA publications
Summary: Identifying, amongst the millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules, or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that intuitively illustrates the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues, and access to TarBase 6.0 data to oversee genes related to miRNA publications.