A Data Science Course for Undergraduates: Thinking with Data

Author: Baumer Ben
Publication date: 18/03/2015

Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be non-traditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students to a variety of techniques to analyze small, neat, and clean data sets. However, whether they pursue more formal training in statistics or not, many of these students will end up working with data that are considerably more complex, and will need facility with statistical computing techniques. More importantly, these students require a framework for thinking structurally about data. We describe an undergraduate course in a liberal arts environment that provides students with the tools necessary to apply data science. The course emphasizes modern, practical, and useful skills that cover the full data analysis spectrum, from asking an interesting question to acquiring, managing, manipulating, processing, querying, analyzing, and visualizing data, as well as communicating findings in written, graphical, and oral forms.

Comment: 21 pages total including supplementary material

arXiv.org e-Print Archive

Smith College: Smith ScholarWorks

Open Science in Software Engineering

Author: A Rowhani-Farid
B Saunders
C Lambert
D Graziotin
D Mendez
DE Knuth
EW Dijkstra
G Eysenbach
JP Bolam
JW Houghton
K Dickersin
L Prechelt
ML Head
NL Kerr
R Core Team
S Auer
S Chacon
S Childs
T Boisseau
V Van den Eynden
W Koehler
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Open science describes the movement of making any research artefact available to the public and includes, but is not limited to, open access, open data, and open source. While open science is becoming generally accepted as a norm in other scientific disciplines, in software engineering we are still struggling to adapt open science to the particularities of our discipline, rendering progress in our scientific community cumbersome. In this chapter, we reflect upon the essentials of open science for software engineering, including what open science is, why we should engage in it, and how we should do it. We particularly draw from our experiences as conference chairs implementing open science initiatives and as researchers actively engaging in open science to critically discuss challenges and pitfalls, and to address more advanced topics such as how and under which conditions to share preprints, what infrastructure and licence model to use, or how to do it within the limitations of different reviewing models, such as double-blind reviewing. Our hope is to help establish a common ground and to contribute to making open science a norm in software engineering as well.

Comment: Camera-Ready Version of a Chapter published in the book on Contemporary Empirical Methods in Software Engineering; fixed layout issue with side-note

arXiv.org e-Print Archive

Crossref

Juelich Shared Electronic Resources

Manajemen User Mikrotik Berbasis Telegram Bot (Telegram Bot-Based Mikrotik User Management)

Author: Prihanto Agus
Suqma Ananda Adhe
Publication venue: 'Universitas Negeri Surabaya'
Publication date: 13/10/2021

Smartphones are important tools for supporting people's work, because they give quick and easy access to information from one place to another, even over long distances. One example of a smartphone service is social media, which serves as a medium for telecommunication and information. Various social media applications are available on smartphones, one of which is Telegram. Telegram includes a Telegram bot feature. Telegram bots are now being developed to monitor Mikrotik routers and to issue Mikrotik user management commands. To do this, an administrator must be connected to the router's network in order to manage hotspot users. To address this problem, an administrator who manages hotspot users can use a Telegram Bot without being on the same network as the Mikrotik router. When a user logs in, the Telegram Bot displays the IP address, MAC address, and username of the user who logged in to the hotspot. Likewise, when the administrator performs user management, issuing hotspot user management commands through the Telegram bot takes fewer steps and less time than doing so through Winbox, and the administrator can also perform user management and monitoring remotely. The results show that running Mikrotik hotspot user management commands via Telegram is more effective and efficient than using Winbox.
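
As a rough illustration of the login notification described in this abstract, the sketch below formats a message containing the IP address, MAC address, and username, and builds a request for the Telegram Bot API `sendMessage` method. The function names, the message layout, and the sample values are assumptions made for this example; the abstract does not say how the authors' system obtains login events from the router or how it formats its messages, and no request is actually sent here.

```python
def format_login_notice(ip: str, mac: str, username: str) -> str:
    """Build the notification text (layout is hypothetical, not from the paper)."""
    return (
        "Hotspot login\n"
        f"User: {username}\n"
        f"IP: {ip}\n"
        f"MAC: {mac}"
    )


def build_send_message_request(token: str, chat_id: int, text: str):
    """Return the URL and form fields for a Telegram Bot API sendMessage call.

    Nothing is sent; the caller would POST `payload` to `url`.
    """
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    payload = {"chat_id": chat_id, "text": text}
    return url, payload


if __name__ == "__main__":
    # Placeholder values for illustration only.
    notice = format_login_notice("10.0.0.5", "AA:BB:CC:DD:EE:FF", "guest01")
    url, payload = build_send_message_request("TOKEN", 12345, notice)
    print(url)
    print(payload["text"])
```

Keeping message formatting separate from request construction means the notification text can be checked without network access.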

Online Electronic Journal Portal Universitas Negeri Surabaya

Instruction-Following Evaluation for Large Language Models

Author: Basu Sujoy
Brahma Siddhartha
Hou Le
Lu Tianjian
Luan Yi
Mishra Swaroop
Zhou Denny
Zhou Jeffrey
Publication date: 14/11/2023

One core capability of Large Language Models (LLMs) is to follow natural language instructions. However, the evaluation of such abilities is not standardized: Human evaluations are expensive, slow, and not objectively reproducible, while LLM-based auto-evaluation is potentially biased or limited by the ability of the evaluator LLM. To overcome these issues, we introduce Instruction-Following Eval (IFEval) for large language models. IFEval is a straightforward and easy-to-reproduce evaluation benchmark. It focuses on a set of "verifiable instructions" such as "write in more than 400 words" and "mention the keyword of AI at least 3 times". We identified 25 types of those verifiable instructions and constructed around 500 prompts, with each prompt containing one or more verifiable instructions. We show evaluation results of two widely available LLMs on the market. Our code and data can be found at https://github.com/google-research/google-research/tree/master/instruction_following_eva
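
To make the idea of a "verifiable instruction" concrete, here is a minimal sketch of programmatic checks for the two examples quoted in this abstract. It is not the IFEval implementation: the function names, the whitespace-based word count, and the case-insensitive whole-word keyword match are all assumptions chosen for illustration.

```python
import re
from typing import Callable, Iterable


def has_more_than_n_words(response: str, n: int) -> bool:
    """Check 'write in more than n words' using a simple whitespace split."""
    return len(response.split()) > n


def mentions_keyword_at_least(response: str, keyword: str, times: int) -> bool:
    """Check 'mention the keyword at least `times` times' (whole word, any case)."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, response, flags=re.IGNORECASE)) >= times


def follows_all(response: str, checks: Iterable[Callable[[str], bool]]) -> bool:
    """A prompt may carry several instructions; all of them must pass."""
    return all(check(response) for check in checks)


if __name__ == "__main__":
    response = "AI is useful. AI helps people. Many teams adopt AI."
    checks = [
        lambda r: has_more_than_n_words(r, 400),
        lambda r: mentions_keyword_at_least(r, "AI", 3),
    ]
    print(follows_all(response, checks))
```

Because each check is a deterministic function of the response text, the same response always yields the same verdict, which is the reproducibility property the abstract emphasizes.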

arXiv.org e-Print Archive

Code generation based on inference and controlled natural language input

Author: Dittmer Howard R.
Publication venue: DePaul University
Publication date: 21/04/2023

Over time the level of abstraction embodied in programming languages has continued to grow. Paradoxically, most programming languages still require programmers to conform to the language's rigid constructs. These constructs have been implemented in the name of efficiency for the computer. However, the continual increase in computing power allows us to consider techniques not so limited. To this end, we have created CABERNET, a Controlled Natural Language (CNL) based approach to program creation. CABERNET allows programmers to use a simple outline-based syntax. This syntax enables increased programmer efficiency. CNLs have previously been used to document requirements. We have taken this approach beyond the typical application of creating requirements documents to creating functional programs. Using heuristics and inference to analyze and determine the programmer's intent, the CABERNET toolchain can create functional mobile applications. This approach allows programs to align with how humans think rather than how computers process information. Using customizable templates, a CABERNET application can be processed to run on multiple run-time environments. Since processing a CABERNET program file results in a native application program, performance is maintained. This research explores whether a CNL-based programming tool can provide a readable, flexible, extensible, and easy-to-learn development methodology. To answer this question, we compared sample applications created in Swift, SwiftUI, and a prototype of the CABERNET toolchain. The CABERNET implementations were consistently shorter than those produced in the other two languages. In addition, users surveyed consistently found the CABERNET samples easier to understand.

Via Sapientiae: The Institutional Repository at DePaul University