Discovering Job Preemptions in the Open Science Grid
The Open Science Grid (OSG) is a worldwide computing system that facilitates
distributed computing for scientific research. It can distribute a
computationally intensive job to geo-distributed clusters and process the job's
tasks in parallel. For compute clusters on the OSG, physical resources may be
shared between OSG and the cluster's local user-submitted jobs, with local jobs
preempting OSG-based ones. As a result, job preemptions occur frequently in
OSG, sometimes significantly delaying job completion time.
We have collected job data from OSG over a period of more than 80 days. We
present an analysis of the data, characterizing the preemption patterns and
different types of jobs. Based on our observations, we have grouped OSG jobs into 5
categories and analyzed the runtime statistics for each category. We further
choose different statistical distributions to estimate the probability density
function of job runtime for different classes.
Comment: 8 page
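The distribution-selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which distributions the authors considered or how they compared them, so the two candidates here (exponential and log-normal), the maximum-likelihood fitting, and the function names `fit_exponential`, `fit_lognormal`, and `best_fit` are all assumptions made for illustration only.

```python
import math
import statistics


def fit_exponential(runtimes):
    """Maximum-likelihood fit of an exponential distribution (assumed candidate)."""
    rate = 1.0 / statistics.fmean(runtimes)
    loglik = sum(math.log(rate) - rate * x for x in runtimes)
    return {"rate": rate}, loglik


def fit_lognormal(runtimes):
    """Maximum-likelihood fit of a log-normal distribution (assumed candidate).

    Requires strictly positive runtimes that are not all identical.
    """
    logs = [math.log(x) for x in runtimes]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # the MLE uses the population standard deviation
    loglik = sum(
        -math.log(x * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
        for x in runtimes
    )
    return {"mu": mu, "sigma": sigma}, loglik


def best_fit(runtimes):
    """Return the candidate with the highest log-likelihood and its parameters."""
    candidates = {
        "exponential": fit_exponential(runtimes),
        "lognormal": fit_lognormal(runtimes),
    }
    name = max(candidates, key=lambda n: candidates[n][1])
    return name, candidates[name][0]


if __name__ == "__main__":
    # Hypothetical runtimes (seconds) for one job category.
    sample = [100.0, 110.0, 90.0, 105.0, 95.0]
    print(best_fit(sample))
```

Running the same comparison separately on each job category would give one fitted density per category. Comparing raw log-likelihoods is only fair here because both candidates have few parameters; a penalized criterion would be the more careful choice with richer families.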
Bridging the biodiversity data gaps: Recommendations to meet users’ data needs
A strong case has been made for freely available, high quality data on species occurrence, in order to track changes in biodiversity. However, one of the main issues surrounding the provision of such data is that sources vary in quality, scope, and accuracy. Therefore, publishers of such data must face the challenge of maximizing quality, utility and breadth of data coverage, in order to make such data useful to users. Here, we report a number of recommendations that stem from a content need assessment survey conducted by the Global Biodiversity Information Facility (GBIF). Through this survey, we aimed to distil the main user needs regarding biodiversity data. We find a broad range of recommendations from the survey respondents, principally concerning issues such as data quality, bias, and coverage, and extending ease of access. We recommend a candidate set of actions for the GBIF that fall into three classes: 1) addressing data gaps, data volume, and data quality, 2) aggregating new kinds of data for new applications, and 3) promoting ease-of-use and providing incentives for wider use. Addressing the challenge of providing high quality primary biodiversity data can potentially serve the needs of many international biodiversity initiatives, including the new 2020 biodiversity targets of the Convention on Biological Diversity, the emerging global biodiversity observation network (GEO BON), and the new Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES).
A Method to Improve the Early Stages of the Robotic Process Automation Lifecycle
The robotic automation of processes is of much interest to
organizations. A common use case is to automate the repetitive manual
tasks (or processes) that are currently done by back-office staff
through some information system (IS). The lifecycle of any Robotic Process
Automation (RPA) project starts with the analysis of the process
to automate. This is a very time-consuming phase, which in practical
settings often relies on the study of process documentation. Such documentation
is typically incomplete or inaccurate, e.g., some documented
cases never occur, occurring cases are not documented, or documented
cases differ from reality. Deploying robots designed on such a shaky basis in a
production environment entails a high risk. This paper
describes and evaluates a new proposal for the early stages of an RPA
project: the analysis of a process and its subsequent design. The idea is to
leverage the knowledge of back-office staff, which starts by monitoring
them in a non-invasive manner. This is done through a screen-mouse-key-logger,
i.e., a sequence of images, mouse actions, and key actions
are stored along with their timestamps. The log which is obtained in
this way is transformed into a UI log through image-analysis techniques
(e.g., fingerprinting or OCR) and then transformed into a process model
by the use of process discovery algorithms. We evaluated this method for
two real-life, industrial cases. The evaluation shows clear and substantial
benefits in terms of accuracy and speed. This paper presents the method,
along with a number of limitations that need to be addressed such that
it can be applied in wider contexts.
Ministerio de Economía y Competitividad TIN2016-76956-C3-2-
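The last step of the pipeline described in this abstract, turning a UI log into a process model, can be sketched in a few lines. The abstract does not name the process discovery algorithm used, so this example assumes the simplest building block, a directly-follows graph. The `UIEvent` record, its fields, and the activity labels are hypothetical, and the image-analysis step that would produce such labels from screenshots is not shown.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UIEvent:
    """One row of a hypothetical UI log (field names are assumptions)."""
    case_id: str      # which process instance the event belongs to
    timestamp: float  # when the mouse or key action was captured
    activity: str     # label derived from the screenshot, e.g. via OCR


def directly_follows(events):
    """Count how often activity a is immediately followed by activity b."""
    traces = {}
    for ev in sorted(events, key=lambda e: (e.case_id, e.timestamp)):
        traces.setdefault(ev.case_id, []).append(ev.activity)
    graph = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            graph[(a, b)] += 1
    return graph


if __name__ == "__main__":
    log = [
        UIEvent("c1", 1.0, "open form"),
        UIEvent("c1", 2.5, "type name"),
        UIEvent("c1", 4.0, "click submit"),
        UIEvent("c2", 1.2, "open form"),
        UIEvent("c2", 3.0, "type name"),
        UIEvent("c2", 3.9, "click cancel"),
    ]
    for (a, b), count in directly_follows(log).items():
        print(f"{a} -> {b}: {count}")
```

Edges that appear in only a few cases correspond to the rarely occurring variants that, according to the abstract, process documentation tends to get wrong.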
- …