Origin and emergence of entrepreneurship as a research field
This paper seeks to map out the emergence and evolution of entrepreneurship as an independent field in the social science literature from the early 1990s to 2009. Our analysis indicates that entrepreneurship grew steadily during the 1990s but truly emerged as a legitimate academic discipline in the latter part of the 2000s. The field has been dominated by researchers from Anglo-Saxon countries over the past 20 years, with particularly strong representation from the US, UK, and Canada. The results from our structural analysis, which is based on a core document approach, point to five large knowledge clusters and a further 16 sub-clusters. We characterize the clusters by their cognitive structure and assess the strength of the relationships between these clusters. In addition, a list of the most cited articles is presented and discussed.
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct a contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that the topics
inferred by performing topic analysis on the contextual representations are more
meaningful than those inferred from the plain representation of the documents. The
proposed approach of introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis.
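The two context models described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`type_context_vector`, `cosine`, `dependency_graph`), the input formats, and the use of cosine similarity are all assumptions made for illustration. The first helper abstracts each identifier to its declared type and counts types per document; the last one records data-flow edges between identifiers.

```python
from collections import Counter
import math


def type_context_vector(identifiers):
    """Abstract each (name, declared_type) pair to its type and count types.

    Hypothetical input format: the abstract does not say how identifiers
    and their types are extracted from the source code.
    """
    return Counter(declared_type for _, declared_type in identifiers)


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def dependency_graph(assignments):
    """Build a data-dependency graph from (target, sources) pairs.

    Nodes are identifiers; an edge source -> target means that data flows
    from the source identifier into the target identifier.
    """
    graph = {}
    for target, sources in assignments:
        graph.setdefault(target, set())
        for source in sources:
            graph.setdefault(source, set()).add(target)
    return graph


# Usage with made-up identifiers and types.
doc_a = type_context_vector([("buf", "Buffer"), ("view", "TextArea"), ("tmp", "Buffer")])
doc_b = type_context_vector([("b2", "Buffer")])
similarity = cosine(doc_a, doc_b)
flow = dependency_graph([("total", ["price", "qty"])])
```

Two documents that use differently named identifiers of the same types still come out as similar here, which is the kind of information a plain token-based representation would lose.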
Using noun phrases extraction for the improvement of hybrid clustering with text- and citation-based components. The example of “Information Systems Research”
The hybrid clustering approach combining lexical and link-based similarities suffered for a long time from the different properties of the underlying networks. We propose a method based on noun phrase extraction using natural language processing to improve the measurement of the lexical component. Term shingles of different lengths are created from each of the extracted noun phrases. Hybrid networks are built based on a weighted combination of the two types of similarities with seven different weights. We conclude that removing all single term shingles provides the best results at the level of computational feasibility, comparability with bibliographic coupling and also in a community detection application.
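The shingling and weighting steps mentioned in this abstract can be sketched as follows. This is an illustrative sketch only: the abstract does not specify the similarity measure or the form of the combination, so the Jaccard overlap and the linear weighting used here are stand-in assumptions, and all function names are hypothetical.

```python
def shingles(noun_phrase, min_len=1):
    """Return all contiguous word sequences of a noun phrase.

    Setting min_len=2 drops the single term shingles, the variant the
    abstract reports as giving the best results.
    """
    words = noun_phrase.lower().split()
    n = len(words)
    return {
        " ".join(words[i:i + k])
        for k in range(min_len, n + 1)
        for i in range(n - k + 1)
    }


def jaccard(a, b):
    """Set overlap used here as a stand-in lexical similarity (assumption)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def hybrid_similarity(lexical, citation, weight):
    """Linear mix of the two components (assumed form).

    weight is the share given to the lexical component, between 0 and 1.
    """
    return weight * lexical + (1 - weight) * citation


# Usage with made-up noun phrases and a made-up citation similarity.
doc_a = shingles("hybrid clustering approach", min_len=2)
doc_b = shingles("hybrid clustering", min_len=2)
lexical = jaccard(doc_a, doc_b)
combined = hybrid_similarity(lexical, citation=0.5, weight=0.25)
```

Repeating the last call over a list of candidate weights would give one hybrid network per weight, in the spirit of the seven weights the abstract mentions.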
Contextualization of topics - browsing through terms, authors, journals and cluster allocations
This paper builds on an innovative Information Retrieval tool, Ariadne. The
tool has been developed as an interactive network visualization and browsing
tool for large-scale bibliographic databases. It allows users to gain
insights into a topic by contextualizing a search query (Koopman et al., 2015).
In this paper, we apply the Ariadne tool to a far smaller dataset of 111,616
documents in astronomy and astrophysics. Labeled as the Berlin dataset, these
data have been used by several research teams to apply and later compare
different clustering algorithms. The question driving this team effort is how to
delineate topics. This paper contributes to this challenge in two different
ways. First, we produce one of the different cluster solutions and second, we
use Ariadne (the method behind it, and the interface - called LittleAriadne) to
display cluster solutions of the different group members. By providing a tool
that allows the visual inspection of the similarity of article clusters
produced by different algorithms, we present a complementary approach to other
possible means of comparison. More particularly, we discuss how we can - with
LittleAriadne - browse through the network of topical terms, authors, journals
and cluster solutions in the Berlin dataset and compare cluster solutions as
well as see their context.
Comment: proceedings of the ISSI 2015 conference (accepted - …