
97 research outputs found

Leveraging full-text article exploration for citation analysis

Authors: Baralis E.; Cagliero L.; La Quatra M.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Scientific articles often include in-text citations quoting from external sources. When the cited source is an article, the citation context can be analyzed by exploring the article's full text. To quickly access the key information, researchers are often interested in identifying the sections of the cited article that are most pertinent to the text surrounding the citation in the citing article. This paper first performs a data-driven analysis of the correlation between the textual content of the sections of the cited article and the text snippet where the citation is placed. The results of the correlation analysis show that the title and abstract of the cited article are likely to include content highly similar to the citing snippet. However, the subsequent sections of the paper often include cited text snippets as well. Hence, there is a need to understand the extent to which exploring the full text of the cited article would help to gain insights into the citing snippet, considering also that full-text access could be restricted. To this end, we then propose a classification approach that automatically predicts whether the cited snippets in the full text of the paper contain a significant amount of new content beyond the abstract and title. The proposed approach could support researchers in leveraging full-text article exploration for citation analysis. The experiments conducted on real scientific articles show promising results: the classifier has a 90% chance of correctly distinguishing between the full-text exploration and the title-and-abstract-only cases.
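
The abstract above compares a citing snippet with the sections of the cited article. A minimal sketch of that kind of comparison follows, assuming a plain bag-of-words cosine similarity; the abstract does not state which similarity measure the authors used, and the function names, section names, and toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def rank_sections(citing_snippet, sections):
    """Return (section name, similarity) pairs, most similar first."""
    query = bag_of_words(citing_snippet)
    scores = {name: cosine(query, bag_of_words(body))
              for name, body in sections.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy cited article (hypothetical content, for illustration only).
sections = {
    "abstract": "We propose a graph-based summarizer for news documents.",
    "methods": "Frequent itemsets are mined from sentences to build a correlation graph.",
    "results": "ROUGE scores improve over the baseline on the benchmark.",
}
snippet = "A correlation graph is built from frequent itemsets mined from sentences."

ranking = rank_sections(snippet, sections)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case the snippet matches the methods section best, which is the situation in which title and abstract alone would not be enough.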

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Open Access Repository

Predicting student academic performance by means of associative classification

Authors: Baralis E.; Cagliero L.; Canale L.; Farinetti L.; Venuto E.
Publication venue: MDPI AG
Publication date: 01/01/2021

The Learning Analytics community has recently paid particular attention to the early prediction of learners’ performance. An established approach entails training classification models on past learner-related data in order to predict the exam success rate of a student well before the end of the course. Early predictions allow teachers to put in place targeted actions, e.g., supporting at-risk students to avoid exam failures or course dropouts. Although several machine learning and data mining solutions have been proposed to learn accurate predictors from past data, the interpretability and explainability of the best performing models are often limited. Therefore, in most cases, the reasons behind classifiers’ decisions remain unclear. This paper proposes an Explainable Learning Analytics solution to analyze learner-generated data acquired by our technical university, which relies on a blended learning model. It adopts classification techniques to predict early the success rate of about 5000 students who were enrolled in the first-year courses of our university. It proposes to apply associative classifiers at different time points and to explore the characteristics of the models that led to the assignment of pass or fail success rates. Thanks to their inherent interpretability, associative models can be manually explored by domain experts with the twofold aim of validating classifier outcomes through local rule-based explanations and identifying at-risk/successful student profiles by interpreting the global rule-based model. The results of an in-depth empirical evaluation demonstrate that associative models (i) perform as well as the best performing classification models, and (ii) give relevant insights into the per-student success rate assignments.
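
To illustrate what an associative classifier looks like, here is a simplified sketch: it mines "itemset -> label" rules that meet support and confidence thresholds, then classifies with the best-ranked matching rule, which also serves as the local explanation. This is not the authors' algorithm; the thresholds, feature names (`quizzes=high`, `videos=low`), and toy records are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(records, labels, min_support=0.3, min_confidence=0.7, max_len=2):
    """Mine rules (itemset -> label) meeting support and confidence thresholds."""
    n = len(records)
    itemset_counts = Counter()
    rule_counts = Counter()
    for items, label in zip(records, labels):
        for k in range(1, max_len + 1):
            for itemset in combinations(sorted(items), k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1
    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Rank by confidence, then support, then itemset for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0]))
    return rules


def classify(items, rules, default):
    """Return (label, matching antecedent) of the best-ranked matching rule."""
    for itemset, label, _, _ in rules:
        if set(itemset) <= set(items):
            return label, itemset
    return default, ()


# Toy learner records (hypothetical features, for illustration only).
records = [
    {"quizzes=high", "videos=high"},
    {"quizzes=high", "videos=low"},
    {"quizzes=low", "videos=low"},
    {"quizzes=low", "videos=high"},
    {"quizzes=high", "videos=high"},
    {"quizzes=low", "videos=low"},
]
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]

rules = mine_rules(records, labels)
prediction = classify({"quizzes=high", "videos=low"}, rules, default="fail")
print(prediction)
```

Because every prediction is traced back to one readable rule, a domain expert can inspect both the single decision and the whole rule list.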

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

GraphSum: discovering correlations among multiple terms for graph-based summarization

Authors: Baralis E.; Cagliero L.; Fiori A.; Mahoto N. A.
Publication venue: Elsevier BV
Publication date: 01/01/2013

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Generalized association rule mining with constraints

Authors: BARALIS E.; CAGLIERO L.; CERQUITELLI T.; GARZA P.
Publication venue: Elsevier BV
Publication date

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Multi-document summarization based on the Yago ontology

Authors: Baralis E.; Cagliero L.; Fiori A.; Jabeen S.; Shah S.
Publication venue: Elsevier BV
Publication date: 01/01/2013

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Occupational exposure to vibrations: some considerations with reference to the recently issued regulations

Authors: BARALIS L; CIGNA C; PATRUCCO M.; SAVOCA D
Publication venue: Fiordo s.r.l.
Publication date

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Machine learning supported next-maintenance prediction for industrial vehicles

Authors: Baralis E.; Cagliero L.; Loti R.; Mellia M.; Mishra S.; Salvatori L.; Vassio L.
Publication venue: CEUR-WS
Publication date

Industrial and construction vehicles require tight periodic maintenance operations. Their schedule depends on vehicle characteristics and usage. The latter can be accurately monitored through various on-board devices, enabling the application of Machine Learning techniques to analyze vehicle usage patterns and design predictive analytics. This paper presents a data-driven application to automatically schedule the periodic maintenance operations of industrial vehicles. It aims to predict, for each vehicle and date, the actual remaining days until the next maintenance is due. Our Machine Learning solution is designed to address the following challenges: (i) the non-stationarity of the per-vehicle utilization time series, which limits the effectiveness of classic scheduling policies, and (ii) the potential lack of historical data for those vehicles that have recently been added to the fleet, which hinders the learning of accurate predictors from past data. Preliminary results collected in a real industrial scenario demonstrate the effectiveness of the proposed solution on heterogeneous vehicles. The system we propose here is currently being deployed, enabling further testing and tuning.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

NEMICO: Mining network data through cloud-based data mining techniques

Authors: Baralis E.; Cagliero L.; Cerquitelli T.; Chiusano S.; Garza P.; Grimaudo L.; Pulvirenti F.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date

Thanks to the rapid advances in Internet-based applications, data acquisition and storage technologies, petabyte-sized network data collections are becoming more and more common, thus prompting the need for scalable data analysis solutions. By leveraging today’s ubiquitous many-core computer architectures and the increasingly popular cloud computing paradigm, the applicability of data mining algorithms to these large volumes of network data can be scaled up to gain interesting insights. This paper proposes NEMICO, a comprehensive Big Data mining system targeted to network traffic flow analyses (e.g., traffic flow characterization, anomaly detection, multiple-level pattern mining). NEMICO comprises new approaches that contribute to a paradigm shift in distributed data mining by addressing the most challenging issues related to Big Data, such as data sparsity, horizontal scaling, and parallel computation.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Heterogeneous industrial vehicle usage predictions: A real case

Authors: Amparore E.; Baralis E.; Cagliero L.; Loti R.; Markudova D.; Mellia M.; Salvatori L.; Vassio L.
Publication venue: CEUR-WS
Publication date

Predicting future vehicle usage based on the analysis of CAN bus data is a popular data mining application. Many of the usage indicators, like the utilization hours, are non-stationary time series. To predict their values, recent approaches based on Machine Learning combine multiple data features describing engine status, travels, and roads. While most of the proposed solutions address car and truck usage prediction, a smaller body of work has been devoted to industrial and construction vehicles, which are usually characterized by more complex and heterogeneous usage patterns. This paper describes a real case study performed on a 4-year CAN bus dataset collecting usage data about 2,250 construction vehicles of various types and models. We apply a statistics-based approach to select the most discriminating data features. Separately for each vehicle, we train regression algorithms on historical data enriched with contextual information. The achieved results demonstrate the effectiveness of the proposed solution.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

A method to define the priority for maintenance and repair works of Italian motorway tunnels

Authors: Baralis M; Barbero M; Barla M; Insana A; Marchiondelli A; Mele P; Milan L; Rosso E; Selleri A; Tripoli L; Zilli L
Publication venue: IOP Publishing
Publication date: 01/01/2021

The construction of motorways in Italy dates back to 1921 and continues today. Along them there is a large number of tunnels, many of which have been in service for more than 50 years and have experienced various levels of decay due to aging. An extensive assessment and inspection plan is under way, aimed at highlighting situations where maintenance and repair works are needed to guarantee continued service in safe and functional conditions. Due to the number of tunnels, the need arises to classify them and define priorities for intervention on the basis of a first assessment and of a robust, science-based tool to guide the investments. This paper describes the methodology that was developed by the Authors for this purpose, assessing the attention level of every tunnel. The method relies on a quantitative approach that quantifies the risk based on five risk factors, each composed of a number of relevant parameters. Their relative interaction, which guided the scores assigned to each parameter, was assessed by applying the Rock Engineering System [2]. A number of examples of existing tunnels are shown to illustrate the application of the method and to draw conclusions about its validity and reliability.
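
As a rough illustration of how a Rock Engineering System interaction matrix can yield parameter weights, the sketch below uses the commonly described cause/effect formulation: cell (i, j) codes the influence of parameter i on parameter j, the row sum is the "cause", the column sum is the "effect", and the weight is the normalized sum of the two. The abstract gives neither the matrix nor the scoring scheme, so the 3x3 matrix, the 0 to 4 coding, and the function name are assumptions.

```python
def res_weights(matrix):
    """Compute cause, effect, and normalized weight for each parameter.

    matrix[i][j] codes the influence of parameter i on parameter j;
    diagonal cells hold the parameters themselves and are ignored.
    """
    n = len(matrix)
    cause = [sum(matrix[i][j] for j in range(n) if j != i) for i in range(n)]
    effect = [sum(matrix[j][i] for j in range(n) if j != i) for i in range(n)]
    total = sum(cause) + sum(effect)
    weights = [(c + e) / total for c, e in zip(cause, effect)]
    return cause, effect, weights


# Toy interaction matrix with hypothetical codes from 0 (none) to 4 (critical).
interaction = [
    [0, 2, 1],
    [3, 0, 4],
    [0, 1, 0],
]

cause, effect, weights = res_weights(interaction)
for i, (c, e, w) in enumerate(zip(cause, effect, weights)):
    print(f"parameter {i}: cause={c}, effect={e}, weight={w:.3f}")
```

A parameter with a high cause-plus-effect sum interacts strongly with the rest of the system, so it receives a larger share of the score.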

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)