Characterizing and Predicting Email Deferral Behavior
Email triage involves going through unhandled emails and deciding what to do
with them. This familiar process can become increasingly challenging as the
number of unhandled emails grows. During a triage session, users commonly defer
handling emails that they cannot immediately deal with to later. These deferred
emails are often related to tasks that are postponed until the user has more
time or the right information to deal with them. In this paper, through
qualitative interviews and a large-scale log analysis, we study when and what
enterprise email users tend to defer. We found that users are more likely to
defer emails when handling them involves replying, reading carefully, or
clicking on links and attachments. We also learned that the decision to defer
emails depends on many factors such as the user's workload and the importance of
the sender. Our qualitative results suggested that deferring is very common,
and our quantitative log analysis confirms that 12% of triage sessions and 16%
of daily active users had at least one deferred email on weekdays. We also
discuss several deferral strategies such as marking emails as unread and
flagging that are reported by our interviewees, and illustrate how such
patterns can also be observed in user logs. Inspired by the characteristics of
deferred emails and contextual factors involved in deciding if an email should
be deferred, we train a classifier for predicting whether a recently triaged
email is actually deferred. Our experimental results suggest that deferral can
be classified with modest effectiveness. Overall, our work provides novel
insights about how users handle their emails and how deferral can be modeled
Evolution of Conversations in the Age of Email Overload
Email is a ubiquitous communications tool in the workplace and plays an
important role in social interactions. Previous studies of email were largely
based on surveys and limited to relatively small populations of email users
within organizations. In this paper, we report results of a large-scale study
of more than 2 million users exchanging 16 billion emails over several months.
We quantitatively characterize the replying behavior in conversations within
pairs of users. In particular, we study the time it takes the user to reply to
a received message and the length of the reply sent. We consider a variety of
factors that affect the reply time and length, such as the stage of the
conversation, user demographics, and use of portable devices. In addition, we
study how increasing load affects emailing behavior. We find that as users
receive more email messages in a day, they reply to a smaller fraction of them,
using shorter replies. However, their responsiveness remains intact, and they
may even reply to emails faster. Finally, we predict the time to reply, length
of reply, and whether the reply ends a conversation. We demonstrate
considerable improvement over the baseline in all three prediction tasks,
showing the significant role that the factors we uncover play in
determining replying behavior. We rank these factors based on their predictive
power. Our findings have important implications for understanding human
behavior and designing better email management applications for tasks like
ranking unread emails.
Comment: 11 pages, 24th International World Wide Web Conferenc
An automated email classification system for the Ashesi Support Center
Applied project submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of Bachelor of Science degree in Computer Science, April 2019
The widespread usage of the internet has made email an indispensable tool for communication
within organizations. Today, email is used by support centers as one of the mediums for
providing solutions to the daily internal problems organizations face. An example is the Ashesi
Support Center which is the hub for solutions for all problems and questions relating to IT,
facilities, logistics, and other issues on the Ashesi University campus. In dealing with problems,
the Ashesi Support Center classifies emails as either an IT-related issue or an operations-related
issue. However, the support center does not have a way to automatically classify the emails.
Hence, support personnel manually sift through the emails to group them. This can be a
cumbersome process considering the support center receives over 40 emails daily during peak
periods. Harnessing the power of machine learning, a classification model is built to
automatically group the emails the Ashesi Support Center receives.
SchedMail: Sender-Assisted Message Delivery Scheduling to Reduce Time-Fragmentation
Although early efforts aimed at dealing with large volumes of email focused on filtering out spam, there is growing interest in prioritizing non-spam emails, with the objective of reducing information overload and time fragmentation experienced by recipients. However, most existing approaches place the burden of classifying emails exclusively on the recipients' side, either directly or through recipients' email service mechanisms. This disregards the fact that senders typically know more about the nature of the contents of outgoing messages before the messages are read by recipients. This thesis presents mechanisms, collectively called SchedMail, which can be added to popular email clients to shift a part of the user effort and computational resources required for email prioritization to the senders' side. In particular, senders declare the urgency of their messages, and recipients specify policies about when different types of messages should be delivered. Recipients also judge the accuracy of sender-side urgency, which becomes the basis for learned reputations of senders; these reputations are then used to interpret urgency declarations from the recipients' perspectives. To experimentally evaluate the proposed mechanisms, a proof-of-concept prototype was implemented based on the popular open-source email client K-9 Mail. By comparing the number of email interruptions experienced by recipients, with and without SchedMail, the thesis concludes that SchedMail can effectively reduce recipients' time fragmentation, without placing demands on email protocols or adding significant computational overhead
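The interplay of sender-declared urgency, recipient delivery policies, and learned sender reputations described in this abstract can be illustrated with a small sketch. It is not the SchedMail implementation: the two urgency levels, the `hold_minutes` policy, the `trust_threshold`, and the Laplace-smoothed reputation formula are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical urgency levels; the abstract does not enumerate them.
URGENT, NORMAL = "urgent", "normal"


@dataclass
class Recipient:
    # Assumed policy: minutes to hold each urgency class before delivery.
    hold_minutes: dict = field(default_factory=lambda: {URGENT: 0, NORMAL: 60})
    # Assumed cut-off below which an "urgent" declaration is not trusted.
    trust_threshold: float = 0.5
    # sender -> [accurate judgments, total judgments]
    judgments: dict = field(default_factory=dict)

    def judge(self, sender: str, accurate: bool) -> None:
        """Record whether the sender's declared urgency was accurate."""
        counts = self.judgments.setdefault(sender, [0, 0])
        counts[0] += int(accurate)
        counts[1] += 1

    def reputation(self, sender: str) -> float:
        """Laplace-smoothed accuracy (an assumption); unknown senders get 0.5."""
        accurate, total = self.judgments.get(sender, [0, 0])
        return (accurate + 1) / (total + 2)

    def effective_urgency(self, sender: str, declared: str) -> str:
        """Interpret the declaration from the recipient's perspective."""
        if declared == URGENT and self.reputation(sender) < self.trust_threshold:
            return NORMAL  # downgrade declarations from low-reputation senders
        return declared

    def delivery_time(self, sender: str, declared: str,
                      received_at: datetime) -> datetime:
        """Apply the recipient's policy to the interpreted urgency."""
        urgency = self.effective_urgency(sender, declared)
        return received_at + timedelta(minutes=self.hold_minutes[urgency])
```

Under these assumptions, an urgent message from a sender with no history is delivered immediately, while a sender whose urgency declarations were twice judged inaccurate has later "urgent" messages held like normal ones.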
Attributes of Personal Electronic Records
The purpose of this article is to identify the key attributes of personal electronic records in order to develop systems that may enable people to manage them in the home. As more personal information becomes electronic, this is increasingly necessary. Personal electronic records were identified and categorised using interviews and virtual guided tours. Three main attributes were identified: primary user-subjective categories; attributes which identify the circumstances that give rise to the records; and attributes which describe the legal validity of each record. In addition to providing an improved understanding of personal electronic records in the home, these attributes are developed into a set of potential metadata fields
Detecting the Intent of Email Using Embeddings, Deep Learning and Transfer Learning
Throughout the years, several strategies and tools were proposed and developed to help users cope with the problem of email overload, but each of these solutions had its own limitations and, in some cases, contributed to further problems. One major theme that encapsulates many of these solutions is automatically classifying emails into predefined categories (e.g., Finance, Sport, Promotion, etc.) and then moving or tagging the incoming email to that particular category. In general, these solutions have two main limitations: 1) they need to adapt to changing user behavior; 2) they require handcrafted feature engineering, which in turn needs a lot of time, effort, and domain knowledge to produce acceptable performance. This dissertation aims to explore the email phenomenon and provide a scalable solution that addresses the above limitations. Our proposed system requires no handcrafted feature engineering and utilizes Speech Act Theory to design a classification system that detects whether an email requires an action (i.e. to do) or no action (i.e. to read). We can automate both the feature extraction and the classification phases by using our own word embeddings, trained on the entire Enron Email dataset, to represent the input. Then, we use a convolutional layer to capture local tri-gram features, followed by an LSTM layer to consider the meaning of a given feature (trigrams) concerning some “memory” of words that could occur much earlier in the email. Our system detects the email intent with 89% accuracy, outperforming other related works.
In developing this system, we followed the concept of Occam’s razor (i.e. law of parsimony). It is a problem-solving principle stating that entities should not be multiplied without necessity. Chapter four presents our efforts to simplify the above-proposed model by dropping the use of the CNN layer and showing that fine-tuning a pre-trained Language Model on the Enron email dataset can achieve comparable results. To the best of our knowledge, this is the first attempt at using transfer learning to develop a deep learning model in the email domain. Finally, we showed that we could even drop the LSTM layer by representing each email’s sentences using contextual word/sentence embeddings. Our experimental results using three different types of embeddings: context-free word embeddings (word2vec and GloVe), contextual word embeddings (ELMo and BERT), and sentence embeddings (DAN-based Universal Sentence Encoder and Transformer-based Universal Sentence Encoder) suggest that using ELMo embeddings produces the best result. We achieved an accuracy of 90.10%, compared with word2vec (82.02%), BERT (58.08%), DAN-based USE (86.66%), and Transformer-based USE (88.16%)
Re-finding Tweets: An Analysis of the Personal Information Management Practice of Re-finding in the Context of the Social Media Platform Twitter
This thesis examines the information behavior of social media users from the perspective of personal information management, focusing on re-finding behavior, that is, locating information that has already been seen. The social media platform Twitter serves as the object of study. The aim of the thesis is to observe, document, describe, and interpret user behavior when re-finding tweets, and to develop design proposals that support Twitter users with this information need. The research strategy is a sequential mixed-methods design, which enables the successive collection and analysis of qualitative (subjective) and quantitative (objective) data in two large studies, a survey and a log study, and ultimately makes it possible, by combining and discussing the individual results, to draw a holistic picture of re-finding behavior on Twitter. The thesis shows that users very often feel the need to return to tweets they have already seen. Although Twitter focuses on real-time information, it has an archival character, since older messages are also frequently retrieved and personal tweets have a longer life cycle than one would expect. Re-finding strategies, especially orienteering behavior, that have already been identified in other personal information management contexts, such as email or the use of file managers, also occur when re-finding tweets. Re-finding can be a complex task that leaves users frustrated. In addition, users have difficulty judging whether tweets might be relevant in the future. Appropriately trained algorithms can support users in re-finding tweets.