6 research outputs found
A Comprehensive Survey of Regression Based Loss Functions for Time Series Forecasting
Time series forecasting has been an active area of research due to its many
applications, including network usage prediction, resource allocation,
anomaly detection, and predictive maintenance. Numerous papers published
in the last five years have proposed diverse objective loss functions
to address cases such as biased data, long-term forecasting, multicollinear
features, etc. In this paper, we have summarized 14 well-known regression loss
functions commonly used for time series forecasting and listed out the
circumstances where their application can aid in faster and better model
convergence. We have also demonstrated how certain categories of loss functions
perform well across all data sets and can be considered as a baseline objective
function in circumstances where the distribution of the data is unknown. Our
code is available at GitHub:
https://github.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow
Comment: 13 pages, 23 figures
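As a rough illustration of what a regression loss computes, the sketch below implements three widely used ones (mean squared error, mean absolute error, and Huber loss) in plain Python. The abstract does not list which 14 losses the survey covers, so these three are picked only as familiar examples. The function names and the `delta` default are assumptions made for this sketch and are not taken from the authors' repository.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: penalizes large errors quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true: Sequence[float], y_pred: Sequence[float], delta: float = 1.0) -> float:
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


# Hypothetical forecast with one large miss on the last step.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 6.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), huber(y_true, y_pred))
```

On this toy series the single outlier dominates the squared error far more than the absolute or Huber error, which is the usual reason for preferring a robust loss when the data contain spikes.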
Leveraging Generative AI Models for Synthetic Data Generation in Healthcare: Balancing Research and Privacy
The widespread adoption of electronic health records and digital healthcare
data has created a demand for data-driven insights to enhance patient outcomes,
diagnostics, and treatments. However, using real patient data presents privacy
and regulatory challenges, including compliance with HIPAA and GDPR. Synthetic
data generation using generative AI models such as GANs and VAEs offers a
promising solution to balance valuable data access and patient privacy
protection. In this paper, we examine generative AI models for creating
realistic, anonymized patient data for research and training, explore synthetic
data applications in healthcare, and discuss its benefits, challenges, and
future research directions. Synthetic data has the potential to revolutionize
healthcare by providing anonymized patient data while preserving privacy and
enabling versatile applications.
Comment: 4 pages, 3 figures
Auto-labelling of Bug Report using Natural Language Processing
The exercise of detecting similar bug reports in bug tracking systems is
known as duplicate bug report detection. Having prior knowledge of a bug
report's existence reduces efforts put into debugging problems and identifying
the root cause. Rule and Query-based solutions recommend a long list of
potential similar bug reports with no clear ranking. In addition, triage
engineers are less motivated to spend time going through an extensive list.
Consequently, this deters the use of duplicate bug report retrieval solutions.
In this paper, we have proposed a solution using a combination of NLP
techniques. Our approach considers unstructured and structured attributes of a
bug report like summary, description and severity, impacted products,
platforms, categories, etc. It uses a custom data transformer, a deep neural
network, and a non-generalizing machine learning method to retrieve existing
identical bug reports. We have performed numerous experiments with significant
data sources containing thousands of bug reports and showcased that the
proposed solution achieves a high retrieval accuracy of 70% for recall@5.
Comment: 7 pages, 11 figures
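Retrieval accuracy for a ranked list of candidates is commonly reported as recall at a cutoff k. The sketch below shows one common definition: the fraction of the known duplicates that appear among the top k retrieved reports. The abstract does not say which definition the authors used, so this is an assumption, and the function name and bug identifiers are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k ranked results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


# Hypothetical ranking returned for one new bug report.
ranked = ["bug-7", "bug-2", "bug-9", "bug-4"]
known_duplicates = {"bug-2", "bug-4"}

print(recall_at_k(ranked, known_duplicates, k=2))
print(recall_at_k(ranked, known_duplicates, k=4))
```

Averaging this value over all query reports gives a single score for the retrieval system at the chosen cutoff.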
Distributed Kafka Clusters: A Novel Approach to Global Message Ordering
In contemporary distributed systems, logs are produced at an astounding rate,
generating terabytes of data within mere seconds. These logs, containing
pivotal details like system metrics, user actions, and diverse events, are
foundational to the system's consistent and accurate operations. Precise log
ordering becomes indispensable to avert potential ambiguities and discordances
in system functionalities. Apache Kafka, a prevalent distributed message queue,
offers significant solutions to various distributed log processing challenges.
However, it has an inherent limitation: while Kafka ensures the in-order
delivery of messages within a single partition to the consumer, it falls short
in guaranteeing a global order for messages spanning multiple partitions. This
research delves into innovative methodologies to achieve global ordering of
messages within a Kafka topic, aiming to bolster the integrity and consistency
of log processing in distributed systems. Our code is available on GitHub.
Comment: 6 pages, 6 figures
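The abstract does not describe the authors' method, so the following is only a generic illustration of the problem: if each partition is already ordered and every message carries a timestamp that is comparable across partitions (an assumption), a consumer can rebuild a single global order with a k-way merge. The `Message` type and `merge_partitions` function are hypothetical names for this sketch, and no Kafka client is involved.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    timestamp: int   # assumed comparable across partitions
    partition: int
    payload: str


def merge_partitions(partitions: Iterable[Iterable[Message]]) -> Iterator[Message]:
    """Merge per-partition ordered streams into one globally ordered stream.

    Ties on timestamp are broken by partition number so the result is deterministic.
    """
    return heapq.merge(*partitions, key=lambda m: (m.timestamp, m.partition))


# Two hypothetical partitions, each ordered internally.
p0 = [Message(1, 0, "a"), Message(4, 0, "d")]
p1 = [Message(2, 1, "b"), Message(3, 1, "c"), Message(5, 1, "e")]

ordered = [m.payload for m in merge_partitions([p0, p1])]
print(ordered)
```

The merge only works on data already received, so a live consumer would also need a rule for how long to wait for slow partitions before emitting a message.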
aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-PyTorch: Final Release
This repository compares the performance of 8 different regression loss functions used in Time Series Forecasting using Temporal Fusion Transformers
Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies
Flare frequency distributions represent a key approach to addressing one of
the largest problems in solar and stellar physics: determining the mechanism
that counter-intuitively heats coronae to temperatures that are orders of
magnitude hotter than the corresponding photospheres. It is widely accepted
that the magnetic field is responsible for the heating, but there are two
competing mechanisms that could explain it: nanoflares or Alfvén waves. To
date, neither can be directly observed. Nanoflares are, by definition,
extremely small, but their aggregate energy release could represent a
substantial heating mechanism, presuming they are sufficiently abundant. One
way to test this presumption is via the flare frequency distribution, which
describes how often flares of various energies occur. If the slope of the power
law fitting the flare frequency distribution is above a critical threshold,
as established in prior literature, then there should be a
sufficient abundance of nanoflares to explain coronal heating. We performed
600 case studies of solar flares, made possible by an unprecedented number
of data analysts via three semesters of an undergraduate physics laboratory
course. This allowed us to include two crucial, but nontrivial, analysis
methods: pre-flare baseline subtraction and computation of the flare energy,
which requires determining flare start and stop times. We aggregated the
results of these analyses into a statistical study to determine the slope of
the power law. That slope is below the critical threshold, suggesting that Alfvén
waves are an important driver of coronal heating.
Comment: 1,002 authors, 14 pages, 4 figures, 3 tables, published by The
Astrophysical Journal on 2023-05-09, volume 948, page 7
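The role of a critical slope can be seen from a short calculation. This is a sketch of the standard argument, with symbols that do not appear in the abstract: $E$ is flare energy, $\alpha$ is the power-law index, $A$ is a normalization constant, and $E_{\min}$, $E_{\max}$ are the limits of the energy range. The threshold value $\alpha = 2$ that falls out is the commonly quoted one and is not stated in the text above.

```latex
% Flare frequency distribution modeled as a power law
\frac{dN}{dE} = A\,E^{-\alpha}

% Total energy released by flares with energies between E_min and E_max
P = \int_{E_{\min}}^{E_{\max}} E\,\frac{dN}{dE}\,dE
  = \frac{A}{2-\alpha}\left(E_{\max}^{\,2-\alpha} - E_{\min}^{\,2-\alpha}\right),
  \qquad \alpha \neq 2

% For alpha > 2 the E_min term dominates: the smallest flares carry most of the energy.
% For alpha < 2 the E_max term dominates: the largest flares carry most of the energy.
```

Under this argument, a slope below 2 means the smallest events cannot supply most of the energy, which matches the abstract's reading of a below-threshold slope as evidence against nanoflare-dominated heating.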