Application of Text (Idea) Mining to Internet Surveys: Electronic Capture of the Structure of Ideas and Semantic Concepts
This paper demonstrates a quick and efficient method of assessing the ideation structure of a group of people via the Internet and text mining. Data collection over the Internet is increasing; although the Internet has made access easier, discovering what people think remains a challenge. Email was used to contact survey respondents and direct them to a web site, where open-ended questions requiring typewritten text responses were asked. Conceptual (ideation) structure was obtained via an algorithm similar to those suggested by Quillian [7] and Bonnet [1]. To discover ideation structure, a modified Hopfield neural network based text-mining algorithm was used to obtain the statistical weights of ideas and concepts and the weights of their joint occurrences with other ideas and concepts. Applying neural network technologies to text allows the analysis of open-ended responses without incurring the expensive, time-consuming and error-prone task of manually reading the open-ended comments. The Internet, via email contact and web-based survey input, dramatically speeds the process.
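As an illustration of the kind of co-occurrence weighting the abstract describes (a minimal sketch, not the authors' Hopfield-network algorithm; the responses and concept list are invented for the example):

```python
from collections import Counter
from itertools import combinations

def concept_weights(responses, concepts):
    """For each concept, its relative frequency across responses;
    for each concept pair, the relative frequency of joint occurrence."""
    single = Counter()
    joint = Counter()
    for text in responses:
        present = {c for c in concepts if c in text.lower()}
        single.update(present)
        joint.update(combinations(sorted(present), 2))
    n = len(responses)
    w = {c: count / n for c, count in single.items()}
    wj = {pair: count / n for pair, count in joint.items()}
    return w, wj

responses = [
    "The price is too high",
    "High price but great quality",
    "Quality could be better",
]
w, wj = concept_weights(responses, ["price", "quality"])
```

Here "price" and "quality" each appear in two of three responses, and co-occur in one, so the joint weight is lower than either single weight; a Hopfield-style network would use such weights as connection strengths between concept nodes.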
Mind your step! How profiling location reveals your identity - and how you prepare for it
Location-based services (LBS) are services that position your mobile phone in order to provide some context-based service for you. Some of these services - called "location tracking" applications - need frequent updates of the current position to decide whether a service should be initiated. Thus, Internet-based systems will continuously collect and process the location in relation to the personal context of an identified customer. This paper will present the concept of location as part of a person's identity. I will conceptualize location in information systems and relate it to concepts such as privacy, geographical information systems and surveillance. The talk will present how knowledge of a person's private life and identity can be enhanced with data mining technologies applied to location profiles and movement patterns. Finally, some first concepts for protecting location information are presented.
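To illustrate how readily a location trace reveals identity-relevant facts, a minimal sketch (the trace, the coarse grid rounding, and the "most visited cell is likely home" heuristic are all invented for the example; the paper does not specify an algorithm):

```python
from collections import Counter

def most_visited(positions):
    """Snap (lat, lon) fixes to a coarse grid and return the most
    frequent cell - a crude proxy for a user's home or workplace."""
    cells = Counter((round(lat, 2), round(lon, 2)) for lat, lon in positions)
    return cells.most_common(1)[0][0]

trace = [
    (52.5201, 13.4049),  # two nearby fixes in Berlin
    (52.5198, 13.4042),
    (48.8566, 2.3522),   # one fix in Paris
]
home = most_visited(trace)
```

Even this toy profiling step shows why frequent position updates, accumulated over time, become personally identifying data.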
Exploring the Use of Twitter Opinion Mining (TOM) in Marketing Courses
This paper discusses the use of social media mining (more specifically, Twitter opinion mining) in marketing courses in order to help students understand current marketing events or phenomena. Social media mining refers to "the use of basic concepts and principal algorithms suitable for investigating massive social media data; it discusses theories and methodologies from different disciplines and encompasses the tools to formally represent, measure, model, and mine meaningful patterns from large-scale social media data" (Zafarani et al. 2014, p. 16). Social media sites such as Facebook, Twitter and Instagram provide opportunities to explore consumer preferences, opinions and behaviors through the examination of user-generated content (UGC). In a business world dominated by the Internet and social media, it is relevant for marketing educators to prepare students to explore, analyze and understand consumer insights through social media mining, and to translate such insights into actionable intelligence that increases the effectiveness of a firm's marketing efforts.
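A minimal lexicon-based sketch of the kind of opinion scoring students might start from (the word lists and tweets are invented; a real TOM exercise would use a proper sentiment toolkit):

```python
# Tiny hand-made opinion lexicons (illustrative only).
POSITIVE = {"love", "great", "awesome", "good"}
NEGATIVE = {"hate", "bad", "awful", "poor"}

def opinion_score(tweet):
    """Count positive words minus negative words:
    > 0 positive, < 0 negative, 0 neutral."""
    words = tweet.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

tweets = ["I love this new ad campaign", "awful customer service, really bad"]
scores = [opinion_score(t) for t in tweets]  # [1, -2]
```

Aggregating such scores over tweets mentioning a brand gives students a first, hands-on look at consumer sentiment before moving to large-scale tooling.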
Review the challenges of using big data in the supply chain
The rapid growth of computer networks and Internet-based technologies, together with the growth of the data and information required by their users and consumers, has led to the emergence of new concepts in this field. Big data is one of these concepts, and it has attracted researchers in various fields of business in recent years. From the outside, it is fair to assume that the more data a company or organization has, the better, because the company in question will have a larger amount of data for mining and, as a result, more accurate results. However, this is not always the case, because learning how to manage big data effectively has become a very challenging task for many businesses around the world. Working with big data involves collecting data from information sources, exploring and analyzing the data, modeling the data based on the desired features, and providing data security measures. For this reason, this paper examines the challenges of working with big data, the big data revolution in general, and big data mining in the business supply chain as a fundamental business process.
Text Analytics for Android Project
The most advanced text analytics and text mining tasks include text classification, text clustering, ontology building, concept/entity extraction, summarization, deriving patterns within structured data, production of granular taxonomies, sentiment and emotion analysis, document summarization, entity relation modelling, and interpretation of the output. Existing text analytics and text mining tools cannot develop text material alternatives (perform a multivariant design), perform multiple criteria analysis, automatically select the most effective variant according to different aspects (citation index of papers and authors in Scopus, ScienceDirect and Google Scholar; Top 25 papers; impact factor of journals; supporting phrases; document name and contents; density of keywords), or calculate utility degree and market value. However, the Text Analytics for Android Project can perform the aforementioned functions. To the best of our knowledge, these functions have not been previously implemented; this is thus the first attempt to do so. The Text Analytics for Android Project is briefly described in this article.
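Of the aspects listed, keyword density is the easiest to make concrete; a minimal sketch (not the project's actual implementation):

```python
def keyword_density(text, keyword):
    """Occurrences of a keyword per 100 words - a toy density measure."""
    words = text.lower().split()
    hits = sum(w == keyword.lower() for w in words)
    return 100.0 * hits / len(words) if words else 0.0

density = keyword_density("data mining turns data into knowledge", "data")
```

Here "data" occurs twice in six words, giving a density of about 33 per 100 words; a multi-aspect ranking would combine scores like this with citation and impact-factor aspects.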
Detecting Important Life Events on Twitter Using Frequent Semantic and Syntactic Subgraphs
Identifying global events from social media has been the focus of much research in recent years. However, the identification of personal life events poses new requirements and challenges that have received relatively little research attention. In this paper we explore a new approach to life event identification, in which we expand social media posts into both semantic and syntactic networks of content. Frequent graph patterns are mined from these networks and used as features to enrich life-event classifiers. Results show that our approach significantly outperforms the best performing baseline in accuracy (by 4.48 percentage points) and F-measure (by 4.54 percentage points) when used to identify five major life events identified from the psychology literature: Getting Married, Having Children, Death of a Parent, Starting School, and Falling in Love. In addition, our results show that, while semantic graphs are effective at discriminating the theme of the post (e.g. the topic of marriage), syntactic graphs help identify whether the post describes a personal event (e.g. someone getting married).
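A minimal sketch of the feature-extraction idea, reduced to the simplest possible subgraphs (single edges); the example posts, the edge representation, and the support threshold are invented, and the paper mines richer semantic and syntactic patterns:

```python
from collections import Counter

def frequent_edge_features(graphs, min_support=2):
    """Edges (the simplest subgraphs) occurring in >= min_support graphs."""
    counts = Counter()
    for edges in graphs:
        counts.update(set(edges))  # count each edge once per graph
    return {e for e, c in counts.items() if c >= min_support}

def featurize(edges, features):
    """Binary feature vector over the mined frequent edges."""
    return [int(f in set(edges)) for f in sorted(features)]

# Toy "syntactic graphs": one edge list per post.
posts = [
    [("get", "married"), ("we", "get")],
    [("get", "married"), ("they", "got")],
    [("have", "baby")],
]
feats = frequent_edge_features(posts)  # {("get", "married")}
```

The resulting binary vectors would be appended to a post's ordinary features before training the life-event classifier.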
A Survey to Fix the Threshold and Implementation for Detecting Duplicate Web Documents
The drastic growth of the information accessible on the World Wide Web has made the employment of automated tools a necessity for locating information resources of interest, and for tracking and analyzing them. Web mining is the branch of data mining that deals with the analysis of the World Wide Web. Concepts from various areas such as data mining, Internet technology, the World Wide Web and, more recently, the Semantic Web can be regarded as the origins of web mining. Web mining can be defined as the procedure of discovering hidden yet potentially beneficial knowledge from the data accessible on the web. Web mining comprises three sub-areas: web content mining, web structure mining, and web usage mining. Web content mining is the process of mining knowledge from web pages and other web objects. Web structure mining is the process of mining knowledge about the link structure connecting web pages and other web objects. Web usage mining is the process of mining the usage patterns created by users accessing web pages.
Search engines are the chief gateways for accessing information on the web, and search engine technology has developed alongside the World Wide Web. The ability to locate content of particular interest amidst a huge heap of pages has made businesses beneficial and productive. Search engines respond to queries by employing web crawling, which populates an indexed repository of web pages. Crawler programs construct a confined repository of the segment of the web that they visit by navigating the web graph and retrieving pages.
There are two main types of crawling, namely generic and focused crawling. Generic crawlers crawl documents and links of diverse topics. Focused crawlers limit the number of pages crawled with the aid of previously obtained specialized knowledge. Systems that index, mine, and otherwise analyze pages (such as search engines) take their inputs from the repositories of web pages built by web crawlers. The drastic growth of the Internet and the growing need to integrate heterogeneous data are accompanied by the issue of near-duplicate data. Even if near-duplicate documents are not bitwise identical, they are remarkably similar. Duplicate and near-duplicate web pages increase index storage space and serving costs and slow down search, which annoys users and causes huge problems for web search engines. Hence it is essential to design algorithms to detect such pages.
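A common baseline for near-duplicate detection is word-shingling with a Jaccard-similarity threshold; a minimal sketch (not necessarily the scheme surveyed here, and the example documents and the 0.8 threshold are invented):

```python
def shingles(text, k=3):
    """The set of k-word shingles of a document."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def near_duplicate(doc1, doc2, threshold=0.8):
    """Flag two documents as near-duplicates above a similarity threshold."""
    return jaccard(shingles(doc1), shingles(doc2)) >= threshold

d1 = "the quick brown fox jumps over the lazy dog"
d2 = "the quick brown fox jumps over a lazy dog"
```

A one-word change shifts three of the seven shingles in each document, so the Jaccard similarity drops to 0.4; picking the threshold that separates near-duplicates from merely similar pages is exactly the fixing problem the survey title refers to.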
In-Close, a fast algorithm for computing formal concepts
This paper presents an algorithm, called In-Close, that uses incremental closure and matrix searching to quickly compute all formal concepts in a formal context. In-Close is based, conceptually, on the well-known algorithm Close-By-One. The serial version of a recently published algorithm (Krajca, 2008) was shown to be on the order of 100 times faster than several well-known algorithms, and timings of other algorithms in reviews suggest that none of them are faster than Krajca's. This paper compares In-Close to Krajca's algorithm, discussing computational methods, data requirements and memory considerations. From experiments using several public data sets and random data, this paper shows that In-Close is on the order of 20 times faster than Krajca's algorithm. In-Close is small, straightforward, requires no matrix pre-processing and is simple to implement.
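For very small contexts, all formal concepts can be computed by brute-force closure of attribute subsets; a minimal sketch of what In-Close computes (In-Close itself uses incremental closure and is far faster; the toy animal context is invented):

```python
from itertools import combinations

def extent(context, attrs):
    """Objects possessing every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

def intent(context, objs):
    """Attributes shared by every object in objs."""
    if not objs:
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objs))

def formal_concepts(context):
    """All (extent, intent) pairs, found by closing every attribute subset."""
    attrs = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            objs = extent(context, set(subset))
            closed = intent(context, objs)  # the closure of the subset
            concepts.add((frozenset(objs), frozenset(closed)))
    return concepts

# Toy formal context: objects mapped to their attribute sets.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "fish": {"swims"},
}
cs = formal_concepts(context)
```

This context yields four concepts, including ({duck}, {flies, swims}); the brute force is exponential in the number of attributes, which is exactly why closure-based algorithms such as Close-By-One and In-Close matter in practice.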