1,174 research outputs found

    Prediction of student success: A smart data-driven approach

    Predicting students' academic performance is a topic within Educational Data Mining, which aims to extract useful information and new patterns from educational data. Understanding the drivers of student success may help educators develop pedagogical methods and provide a tool for personalized feedback and advice. To improve students' academic performance and create a decision support solution for higher education institutions, this dissertation proposes a methodology that uses educational data mining to compare prediction models of student success. The data cover master's students at ISCTE, a Portuguese university, for the academic years 2012 to 2022. The study also examines which factors are the strongest predictors of student success. The PyCaret library was used to compare the performance of several algorithms. Factors proposed to influence success include, for example, the student's gender, previous educational background, the existence of a special statute, and the parents' educational degree. The analysis revealed that the Light Gradient Boosting Machine Classifier (LGBMC) had the best performance, with an accuracy of 87.37%, followed by the Gradient Boosting Classifier (accuracy = 85.11%) and the Adaptive Boosting Classifier (accuracy = 83.37%). Hyperparameter tuning improved the performance of all the algorithms. Feature importance analysis revealed that the factors with the greatest impact on student success were the average grade, the duration of the master's, and the gap between degrees, i.e., the number of years between the last degree and the start of the master's.

    Relevance Feedback Search Based on Automatic Annotation and Classification of Texts

    The idea behind Relevance Feedback Search (RFBS) is to build search queries as an iterative and interactive process in which they are gradually refined based on the results of the previous search round. This can be helpful in situations where the end user cannot easily formulate their information needs at the outset as a well-focused query, or more generally as a way to filter and focus search results. This paper concerns (1) a framework that integrates keyword extraction and unsupervised classification into the RFBS paradigm and (2) the application of this framework to the legal domain as a use case. We focus on the Natural Language Processing (NLP) methods underlying the framework and application, where an automatic annotation tool is used for extracting document keywords as ontology concepts, which are then transformed into word embeddings to form vectorial representations of the texts. An unsupervised classification system that employs similar techniques is also used in order to classify the documents into broad thematic classes. This classification functionality is evaluated using two different datasets. As the use case, we describe an application perspective in the semantic portal LawSampo - Finnish Legislation and Case Law on the Semantic Web. This online demonstrator uses a dataset of 82145 sections in 3725 statutes of Finnish legislation and another dataset that comprises 13470 court decisions
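The iterative refinement described in this abstract can be illustrated with a minimal sketch. It uses the classical Rocchio update on plain vectors, which is an assumption made for illustration and not necessarily the method used in the paper. The function names, the weights `alpha`, `beta` and `gamma`, and the toy two-dimensional document vectors are all hypothetical.

```python
from math import sqrt


def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant documents and away from non-relevant ones."""
    dim = len(query)

    def centroid(vectors):
        # Mean vector of a list of documents; a zero vector if the list is empty.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i] for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document names ordered from most to least similar to the query."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


# Hypothetical document embeddings (two dimensions for readability).
docs = {"tax": [1.0, 0.0], "family": [0.0, 1.0], "mixed": [0.7, 0.7]}
query = [1.0, 0.2]

first_round = rank(query, docs)
# The user marks "family" as relevant and "tax" as not relevant.
refined = rocchio_update(query, [docs["family"]], [docs["tax"]])
second_round = rank(refined, docs)
```

Each search round feeds the user's judgments back into the query, so the ranking in `second_round` shifts toward documents resembling those marked relevant.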

    Information Technology and Lawyers. Advanced Technology in the Legal Domain, from Challenges to Daily Routine

    Discrimination and the Effects of Drug Testing on Black Employment

    Nearly half of U.S. employers test job applicants and workers for drugs. I use variation in the timing and nature of drug testing regulation to study discrimination against blacks related to perceived drug use. Black employment in the testing sector is suppressed in the absence of testing, consistent with ex ante discrimination on the basis of drug use perceptions. Adoption of pro-testing legislation increases black employment in the testing sector by 7–30 percent and relative wages by 1.4–13.0 percent, with the largest shifts among low-skilled black men. Results suggest that employers substitute white women for blacks in the absence of testing

    Citizen Science for Citizen Access to Law

    This paper sits at the intersection of citizen access to law, legal informatics and plain language. The paper reports the results of a joint project of the Cornell University Legal Information Institute and the Australian National University which collected thousands of crowdsourced assessments of the readability of law through the Cornell LII site. The aim of the project is to enhance accuracy in the prediction of the readability of legal sentences. The study requested readers on legislative pages of the LII site to rate passages from the United States Code and the Code of Federal Regulations and other texts for readability and other characteristics. The research provides insight into who uses legal rules and how they do so. The study enables conclusions to be drawn as to the current readability of law and spread of readability among legal rules. The research is intended to enable the creation of a dataset of legal rules labelled by human judges as to readability. Such a dataset, in combination with machine learning, will assist in identifying factors in legal language which impede readability and access for citizens. As far as we are aware, this research is the largest ever study of readability and usability of legal language and the first research which has applied crowdsourcing to such an investigation. The research is an example of the possibilities open for enhancing access to law through engagement of end users in the online legal publishing environment for enhancement of legal accessibility and through collaboration between legal publishers and researchers

    Gathering Danger: The Urgent Need to Regulate Toxic Substances That Can Bioaccumulate

    Is Algorithmic Affirmative Action Legal?

    This Article is the first to comprehensively explore whether algorithmic affirmative action is lawful. It concludes that both statutory and constitutional antidiscrimination law leave room for race-aware affirmative action in the design of fair algorithms. Along the way, the Article recommends some clarifications of current doctrine and proposes the pursuit of formally race-neutral methods to achieve the admittedly race-conscious goals of algorithmic affirmative action. The Article proceeds as follows. Part I introduces algorithmic affirmative action. It begins with a brief review of the bias problem in machine learning and then identifies multiple design options for algorithmic fairness. These designs are presented at a theoretical level, rather than in formal mathematical detail. It also highlights some difficult truths that stakeholders, jurists, and legal scholars must understand about accuracy and fairness trade-offs inherent in fairness solutions. Part II turns to the legality of algorithmic affirmative action, beginning with the statutory challenge under Title VII of the Civil Rights Act. Part II argues that voluntary algorithmic affirmative action ought to survive a disparate treatment challenge under Ricci and under the antirace-norming provision of Title VII. Finally, Part III considers the constitutional challenge to algorithmic affirmative action by state actors. It concludes that at least some forms of algorithmic affirmative action, to the extent they are racial classifications at all, ought to survive strict scrutiny as narrowly tailored solutions designed to mitigate the effects of past discrimination

    A Right to Access Implies A Right to Know: An Open Online Platform for Research on the Readability of Law

    The widespread availability of legal materials online has opened the law to a new and greatly expanded readership. These new readers need the law to be readable by them when they encounter it. However, the available empirical research supports a conclusion that legislation is difficult to read if not incomprehensible to most citizens. We review approaches that have been used to measure the readability of text including readability metrics, cloze testing and application of machine learning. We report the creation and testing of an open online platform for readability research. This platform is made available to researchers interested in undertaking research on the readability of legal materials. To demonstrate the capabilities of the platform, we report its initial application to a corpus of legislation. Linguistic characteristics are extracted using the platform and then used as input features for machine learning using the Weka package. Wide divergences are found between sentences in a corpus of legislation and those in a corpus of graded reading material or in the Brown corpus (a balanced corpus of English written genres). Readability metrics are found to be of little value in classifying sentences by grade reading level (noting that such metrics were not designed to be used with isolated sentences)
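As an illustration of the kind of readability metric this abstract refers to, the following minimal sketch computes the Flesch-Kincaid grade level. The abstract does not name the metrics it evaluated, so the choice of formula is an assumption. The syllable counter is a rough vowel-group heuristic, and both function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, with at least one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Made-up example sentences, not taken from any corpus mentioned above.
simple = "The cat sat on the mat."
legal = ("Notwithstanding any other provision of this subchapter, "
         "the Administrator shall promulgate regulations.")

simple_grade = flesch_kincaid_grade(simple)
legal_grade = flesch_kincaid_grade(legal)
```

Such formulas depend only on sentence length and syllable counts, which is consistent with the abstract's caution that they were not designed for isolated sentences.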