60,262 research outputs found
Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy
Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. This paper presents background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), then briefly reviews Swanson's ideas and discusses recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is also mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd
Using Robust PCA to estimate regional characteristics of language use from geo-tagged Twitter messages
Principal component analysis (PCA) and related techniques have been
successfully employed in natural language processing. Text mining applications
in the age of online social media (OSM) face new challenges due to
properties specific to these use cases (e.g. spelling issues specific to texts
posted by users, the presence of spammers and bots, service announcements,
etc.). In this paper, we employ a Robust PCA technique to separate typical
outliers and highly localized topics from the low-dimensional structure present
in language use in online social networks. Our focus is on identifying
geospatial features among the messages posted by the users of the Twitter
microblogging service. Using a dataset which consists of over 200 million
geolocated tweets collected over the course of a year, we investigate whether
the information present in word usage frequencies can be used to identify
regional features of language use and topics of interest. Using the PCA pursuit
method, we are able to identify important low-dimensional features, which
constitute smoothly varying functions of the geographic location.
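The abstract does not state the optimization problem behind "PCA pursuit". As a hedged sketch, the formulation usually given in the Robust PCA literature (Principal Component Pursuit) is shown below. Here $M$ is assumed to be a location-by-word frequency matrix with $n_1$ rows and $n_2$ columns, $L$ the low-rank part (smooth regional structure), and $S$ the sparse part (outliers and highly localized topics). These symbols and the default weight $\lambda$ are assumptions for illustration, not taken from the paper.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M,
\qquad \lambda = \frac{1}{\sqrt{\max(n_1, n_2)}}
```

The nuclear norm $\|L\|_{*}$ (sum of singular values) encourages low rank, while the entrywise $\ell_1$ norm $\|S\|_{1}$ encourages sparsity, which is what allows bot or spam activity to be separated from the broad geographic patterns.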
SciTech News Volume 71, No. 1 (2017)
Columns and Reports: From the Editor
Division News: Science-Technology Division; Chemistry Division; Engineering Division; Aerospace Section of the Engineering Division; Architecture, Building Engineering, Construction and Design Section of the Engineering Division
Reviews: Sci-Tech Book News Reviews
Advertisements: IEEE
Alaska-Canada Rail Link Economic Benefits
Construction of the 1,740 km Alaska-Canada Rail Link (ACRL) between Fort Nelson, BC and Delta Junction, Alaska, which will join the North American rail system to the Alaska Railroad, will result in tremendous economic benefits for Canada and the US. The ACRL will provide valuable additional east-west rail capacity and tidewater access to the Pacific, benefitting not only the Yukon and Eastern Alaska regions, into which it will introduce rail transport for the first time, but also both countries as a whole. The economic benefits of ACRL construction are consistent with the Canadian government’s desire to promote Northern development and are comparable in significance to those of the Canadian Pacific Railway in the 1880s and the St. Lawrence Seaway in the 1950s. Construction of the ACRL alone will bring unprecedented economic stimulus to the region in terms of job creation, wages and income tax revenue over multiple years. Table 7-1 below summarizes the benefits from ACRL construction for the Yukon, BC and Canada as a whole. However, these estimates are conservative, as they exclude benefits associated with pre-construction activities, post-construction railway operation, sales taxes and corporate taxes, as well as all such benefits that will accrue to Alaska and the US.
How Much Does the UK Invest in Intangible Assets?
We attempt to replicate for the UK the Corrado, Hulten and Sichel (2005, 2006) work on spending on intangible assets in the US. Their work suggests private sector expenditure (investment) on intangibles is about 13% (11%) of US GDP in 1998-2000, with intangible investment about equal to tangible capital investment. Our work, using a similar method, suggests the UK private sector spent, in 2004, about £127bn on intangibles, which is about 11% of UK GDP. The implied investment figure is around £116bn (10% of GDP), which is about equal to UK investment in tangible assets. Of the £127bn expenditure (in round numbers), about 15% is spent on software, about 10% on scientific R&D, almost 20% on non-scientific R&D (design, product development etc.), about 14% on branding, about 20% on training and the rest on organisational capital. Keywords: Intangible assets, R&D, Training, Organisational capital, Investment
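The round-number shares quoted above can be turned into approximate amounts with a few lines of arithmetic. The sketch below is illustrative only: it treats "almost 20%" as exactly 20, assigns whatever is left to organisational capital, and uses hypothetical names (`SHARES_PCT`, `spending_breakdown`) that do not come from the paper.

```python
# Figures quoted in the abstract (UK private sector, 2004).
TOTAL_SPEND_BN = 127      # total spending on intangibles, in £bn
SPEND_SHARE_OF_GDP = 11   # that spending as a percentage of UK GDP

# Approximate shares of spending, in per cent ("in round numbers").
# Assumption: "almost 20%" for non-scientific R&D is entered as 20.
SHARES_PCT = {
    "software": 15,
    "scientific R&D": 10,
    "non-scientific R&D": 20,
    "branding": 14,
    "training": 20,
}


def spending_breakdown(total_bn, shares_pct):
    """Return £bn per category; the remainder goes to organisational capital."""
    breakdown = {name: total_bn * pct / 100 for name, pct in shares_pct.items()}
    remainder_pct = 100 - sum(shares_pct.values())
    breakdown["organisational capital"] = total_bn * remainder_pct / 100
    return breakdown


breakdown = spending_breakdown(TOTAL_SPEND_BN, SHARES_PCT)

# GDP implied by "£127bn is about 11% of UK GDP".
implied_gdp_bn = TOTAL_SPEND_BN * 100 / SPEND_SHARE_OF_GDP

for name, amount in breakdown.items():
    print(f"{name}: about £{amount:.1f}bn")
print(f"Implied UK GDP: about £{implied_gdp_bn:.0f}bn")
```

Because the inputs are rounded percentages, the outputs are only indicative; under these assumptions the residual left for organisational capital is roughly a fifth of total spending.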
i-JEN: Visual interactive Malaysia crime news retrieval system
Supporting crime news investigation involves a mechanism to help monitor the current and past status of criminal events. We believe this could be well facilitated by focusing on the user interfaces and the crime event model. In this paper we discuss the development of the Visual Interactive Malaysia Crime News Retrieval System (i-JEN) and describe the approach, the planned user studies, the system architecture and future plans. Our main objectives are to construct crime-based events; investigate the use of crime-based events in improving classification and clustering; develop an interactive crime news retrieval system; visualize crime news in an effective and interactive way; integrate these components into a usable and robust system; and evaluate the usability and system performance. The system will serve as a news monitoring system which aims to automatically organize, retrieve and present crime news in a way that supports effective monitoring, searching, and browsing for the target user groups of the general public, news analysts, and policemen or crime investigators. The study will contribute to a better understanding of crime data consumption in the Malaysian context, as well as a developed system with visualisation features to address crime data and the eventual goal of combating crime.
Application of Text Analytics in Public Service Co-Creation: Literature Review and Research Framework
The public sector faces several challenges, such as a number of external and
internal demands for change, citizens' dissatisfaction and frustration with
public sector organizations, that need to be addressed. An alternative to the
traditional top-down development of public services is co-creation of public
services. Co-creation promotes collaboration between stakeholders with the aim
to create better public services and achieve public values. At the same time,
data analytics has been fuelled by the availability of immense amounts of
textual data. Whilst both co-creation and Text Analytics (TA) have been used in
the private sector, we study existing works on the application of TA
techniques on text data to support public service co-creation. We
systematically review 75 of the 979 papers that focus directly or indirectly on
the application of TA in the context of public service development. In our
review, we analyze the TA techniques, the public service they support, public
value outcomes, and the co-creation phase they are used in. Our findings
indicate that the TA implementation for co-creation is still in its early
stages and thus still limited. Our research framework promotes the concept and
stimulates the strengthening of the role of Text Analytics techniques to
support public sector organisations and their use of the co-creation process. From
policy-makers' and public administration managers' standpoints, our findings
and the proposed research framework can be used as a guideline in developing a
strategy for designing co-created and user-centred public services.
- …