Human Document Classification Using Bags of Words
Humans are remarkably adept at classifying text documents into categories. For instance, while reading a news story, we are rapidly able to assess whether it belongs to the domain of finance, politics or sports. Automating this task would have applications for content-based search or filtering of digital documents. To this end, it is interesting to investigate the nature of information humans use to classify documents. Here we report experimental results suggesting that this information might, in fact, be quite simple. Using a paradigm of progressive revealing, we determined classification performance as a function of number of words. We found that subjects are able to achieve similar classification accuracy with or without syntactic information across a range of passage sizes. These results have implications for models of human text-understanding and also allow us to estimate what level of performance we can expect, in principle, from a system without requiring a prior step of complex natural language processing
From Data Extraction to Data Leaking. Data-activism in Italian and Spanish anti-corruption campaigns
This article investigates how activists employ Information and Communication Technologies (ICTs) and engage with data-activism in grassroots struggles against corruption. Based on a comparative research design that triangulates three qualitative data sources — in-depth interviews, movements' documents and participatory platforms — the article analyses two campaigns: Riparte il Futuro in Italy and 15MpaRato in Spain. In so doing, the article casts light on how activists engage with digital data, revealing how their employment is connected to and consistent with the type of organizational structure and communication strategy of the campaign. Moreover, the article evaluates how activists engage with three specific digital data-related practices — digital data creation, data usage and data transformation. Finally, the article illustrates that grasping the features of digital data-related practices also reflects how activists perceive and enact distinct ideas of active citizenship and data transparency in their fight against corruption
Yours ever (well, maybe): Studies and signposts in letter writing
Electronic mail and other digital communications technologies seemingly threaten to end the era of handwritten and typed letters, now affectionately seen as part of snail mail. In this essay, I analyze a group of popular and scholarly studies about letter writing, including examples of pundits critiquing the use of e-mail, etiquette manuals advising why the handwritten letter still possesses value, historians and literary scholars studying the role of letters in the past and what it tells us about our present attitudes about digital communications technologies, and futurists predicting how we will function as personal archivists maintaining every document including e-mail. These are useful guideposts for archivists, providing both a sense of the present and the past in the role, value and nature of letters and their successors. They also provide insights into how such documents should be studied, expanding our gaze beyond the particular letters, to the tools used to create them and the traditions dictating their form and function. We also can discern a role for archivists, both for contributing to the literature about documents and in using these studies and commentaries, suggesting not a new disciplinary realm but opportunities for new interdisciplinary work. Examining a documentary form makes us more sensitive to both the innovations and traditions as it shifts from the analog to the digital; we can learn not to be caught up in hysteria or nostalgia about one form over another and archivists can learn about what they might expect in their labors to document society and its institutions. At one time, paper was part of an innovative technology, with roles very similar to the Internet and e-mail today. It may be that the shifts are far less revolutionary than is often assumed. Reading such works also suggests, finally, that archivists ought to rethink how they view their own knowledge and how it is constructed and used. © 2010 Springer Science+Business Media B.V
Journalistic freedom and the surveillance of journalists post-Snowden
A paradigmatic shift is sometimes revealed by an unanticipated and extraordinary event, and so it was with Edward Snowden in 2013. A National Security Agency (NSA) contractor, Snowden was so appalled at the exponential expansion of covert digital surveillance that he decided it was his moral duty to inform the public, indeed the world. This he did from a hotel room in Hong Kong when he gave a small group of selected journalists access to 1.7 million classified documents taken from the NSA. These documents revealed the global snooping capabilities of the NSA and its ‘Five Eyes’ intelligence agency partners (ASIO in Australia, CSE in Canada, GCSB in New Zealand, and GCHQ in the United Kingdom). The Five Eyes can vacuum up just about all digital communications anywhere, anytime, and much else besides if they are so minded. Many who take a deep interest in signals intelligence thought these Anglo-Saxon agencies had probably increased their capabilities since 9/11, but even they were shocked when Snowden revealed the sheer scale – it far exceeded any estimate of capability
Global Heuristic Search on Encrypted Data (GHSED)
Important documents are being kept encrypted on remote servers. Retrieving such data requires efficient search methods that can locate documents without knowledge of their content. This paper describes a technique called global heuristic search on encrypted data (GHSED) for searching encrypted files that are stored on an untrusted server using public key encryption, retrieving the files that match a given search pattern without revealing any information about the original files. The GHSED technique is intended to satisfy the following properties: (1) Provable security: the untrusted server cannot learn anything about the plaintext given only the ciphertext. (2) Controlled searching: the untrusted server cannot search for a word without the user's authorization. (3) Hidden queries: the user may ask the untrusted server to search for a secret word without revealing the word to the server. (4) Query isolation: the untrusted server learns nothing about the plaintext beyond the search result
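The idea of controlled searching and hidden queries can be illustrated with a minimal sketch. This is not the GHSED technique itself, whose details are not given in the abstract: it swaps in a much simpler symmetric scheme based on keyed hashes (HMAC) instead of public key encryption, and it does not meet the provable-security property, since equal words produce equal tokens. The function names `trapdoor`, `build_index` and `server_search` are hypothetical, and encryption of the document bodies is assumed to happen separately.

```python
import hashlib
import hmac
import secrets


def trapdoor(key: bytes, word: str) -> bytes:
    """Derive an opaque search token for a word; requires the user's secret key."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, documents: dict[str, str]) -> dict[str, set[bytes]]:
    """Client side: map each document id to the set of tokens for its words."""
    return {
        doc_id: {trapdoor(key, word) for word in text.split()}
        for doc_id, text in documents.items()
    }


def server_search(index: dict[str, set[bytes]], token: bytes) -> list[str]:
    """Server side: compare opaque tokens only; the key and the word stay unknown."""
    return sorted(doc_id for doc_id, tokens in index.items() if token in tokens)


# Illustrative data (assumed, not from the paper).
key = secrets.token_bytes(32)
docs = {"a": "quarterly budget report", "b": "holiday travel plans"}
index = build_index(key, docs)

# The user authorizes one search by handing the server a single token.
hits = server_search(index, trapdoor(key, "budget"))
```

In this sketch the server can only test tokens that the key holder has issued (controlled searching) and never sees the query word (hidden queries). A scheme aiming at the four properties listed above would need considerably more machinery than this.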
Opportunity Creation in Innovation Networks: Interactive Revealing Practices
Innovating in networks with partners that have diverse knowledge is challenging. The challenges stem from the fact that the commonly used knowledge protection mechanisms often are neither available nor suitable in early stage exploratory collaborations. This article focuses on how company participants in heterogeneous industry networks share private knowledge while protecting firm-specific appropriation. We go beyond the prevailing strategic choice perspectives to discuss interactive revealing practices that sustain joint opportunity creation in the fragile phase of early network formation. Center for Business, Technology and La
Information seeking in the Humanities: physicality and digitality
This paper presents a brief overview of a research project that is examining the information seeking practices of humanities scholars. The results of this project are being used to develop digital resources to better support these work activities. Initial findings from a recent set of interviews are offered, revealing the importance of physical artefacts in the humanities scholars’ research processes and the limitations of digital resources. Finally, further work that is soon to be undertaken is summarised, and it is hoped that after participation in this workshop these ideas will be refined