Template Mining for Information Extraction from Digital Documents
published or submitted for publication
Digital Technologies in Humanities
The presentation outlines the key issues related to the application of digital technologies in humanities scholarship with a special focus on the role of open-source software in this area. Although the application of computers in humanities scholarship dates back to the mid-20th century and spans a wide range of outputs and practices from concordance indices, text tagging, quantitative methods in history and archaeology, to modern-day digital humanities, it is still often inferred that the poor uptake of digital technologies in humanities and the prevalence of print culture have to do with the poor computer skills of humanities scholars and their lack of interest in digital services and infrastructures. At the same time, it is also argued that major services, databases and infrastructures are designed for science and technology, while failing to meet the specific needs of humanities scholars (e.g. multilingual and multi-alphabet support, complex publishing requirements, variety of outputs beyond journal articles and their visibility, etc.). The major areas of development in digital technologies for humanities include text encoding, text and data mining, natural language processing, semantic tools, visualization tools, publishing management software, library and repository software, and web publishing software. The corpus of available solutions is diversified but it is also marked by the lack of interoperability and coordination among the active projects, which is a significant challenge for long-term sustainability. As an area of scholarship that is by far less likely to engender profit than science and technology, humanities rely on a considerably smaller research community and are less attractive for investors and IT developers, which is another crucial sustainability challenge. This is one of the reasons why open-source software plays an important role in humanities-related digital technologies.
Bearing in mind the fear of proprietary lock-in, which has followed recent research infrastructure acquisitions by commercial publishers, and efforts towards creating open and interoperable international infrastructures (esp. European Open Science Cloud), it is reasonable to expect that the role of open-source software will be even greater in future
SciTech News Volume 71, No. 1 (2017)
Columns and Reports: From the Editor
Division News: Science-Technology Division; Chemistry Division; Engineering Division; Aerospace Section of the Engineering Division; Architecture, Building Engineering, Construction and Design Section of the Engineering Division
Reviews: Sci-Tech Book News Reviews
Advertisements: IEEE
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems
The role of human factors in stereotyping behavior and perception of digital library users: A robust clustering approach
To deliver effective personalization for digital library users, it is necessary to identify which human factors are most relevant in determining the behavior and perception of these users. This paper examines three key human factors: cognitive styles, levels of expertise and gender differences, and utilizes three individual clustering techniques: k-means, hierarchical clustering and fuzzy clustering to understand user behavior and perception. Moreover, robust clustering, capable of correcting the bias of individual clustering techniques, is used to obtain a deeper understanding. The robust clustering approach produced results that highlighted the relevance of cognitive style for user behavior, i.e., cognitive style dominates and justifies each of the robust clusters created. We also found that perception was mainly determined by the level of expertise of a user. We conclude that robust clustering is an effective technique to analyze user behavior and perception
Text Analytics for Android Project
Most advanced text analytics and text mining tasks include text classification, text clustering, ontology building, concept/entity extraction, summarization, deriving patterns within structured data, production of granular taxonomies, sentiment and emotion analysis, entity relation modelling, and interpretation of the output. Existing text analytics and text mining tools cannot develop text material alternatives (perform a multivariant design), perform multiple criteria analysis, automatically select the most effective variant according to different aspects (citation index of papers (Scopus, ScienceDirect, Google Scholar) and authors (Scopus, ScienceDirect, Google Scholar), Top 25 papers, impact factor of journals, supporting phrases, document name and contents, density of keywords), or calculate utility degree and market value. The Text Analytics for Android Project, however, can perform these functions. To the best of the authors' knowledge, these functions have not been implemented previously, making this the first attempt to do so. The Text Analytics for Android Project is briefly described in this article