
    Keyword enhanced web structure mining for business intelligence

    The study proposed a method of keyword-enhanced Web structure mining that combines the ideas of Web content mining with Web structure mining. The method was used to mine data on business competition among a group of DSLAM (digital subscriber line access multiplexer) companies. Specifically, the keyword DSLAM was incorporated into queries that searched for co-links between pairs of company Websites. The resulting co-link matrix was analyzed using multidimensional scaling (MDS) to map business competition positions. The study shows that the proposed method improves upon the previous method of Web structure mining alone by producing a more accurate map of business competition in the DSLAM sector.
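    To make the analysis concrete, the sketch below shows the MDS step on a toy co-link matrix. The company names and counts are invented placeholders, and the conversion from co-link counts to dissimilarities is an assumption; the paper does not specify its exact transform.

```python
# Sketch: map co-link counts between company websites to a 2-D
# "competition map" with multidimensional scaling (MDS).
# Names and counts are illustrative placeholders, not real query results.
import numpy as np
from sklearn.manifold import MDS

companies = ["AlphaDSL", "BetaNet", "GammaCom", "DeltaLink"]  # hypothetical

# Symmetric co-link counts, e.g. hits for: links-to(A) AND links-to(B) AND "DSLAM"
colinks = np.array([
    [0, 40, 25,  5],
    [40, 0, 30, 10],
    [25, 30, 0,  8],
    [5, 10,  8,  0],
], dtype=float)

# Assumed conversion: more co-links -> more similar -> smaller dissimilarity.
dissim = 1.0 - colinks / colinks.max()
np.fill_diagonal(dissim, 0.0)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)

# Nearby points on the map suggest closely competing companies.
for name, (x, y) in zip(companies, coords):
    print(f"{name:10s} ({x:+.2f}, {y:+.2f})")
```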

    Data-driven Job Search Engine Using Skills and Company Attribute Filters

    According to one online report, more than 200 million unique users search for jobs online every month. This incredibly large and fast-growing demand has enticed software giants such as Google and Facebook to enter this space, which was previously dominated by companies such as LinkedIn, Indeed and CareerBuilder. Recently, Google released its "AI-powered Jobs Search Engine", "Google For Jobs", while Facebook released "Facebook Jobs" within its platform. These current job search engines and platforms allow users to search for jobs based on general filters such as job title, date posted, experience level, company and salary. However, they have severely limited filters relating to skill sets such as C++, Python and Java, and to company-related attributes such as employee size, revenue, technographics and micro-industries. These specialized filters can help applicants and companies connect at a very personalized, relevant and deeper level. In this paper we present a framework that provides an end-to-end "Data-driven Jobs Search Engine". In addition, users can also receive potential contacts of recruiters and senior positions for connection and networking opportunities. The high-level implementation of the framework is as follows: 1) collect job postings data in the United States, 2) extract meaningful tokens from the postings data using ETL pipelines, 3) normalize the data set to link company names to their specific company websites, 4) extract and rank the skill sets, 5) link the company names and websites to their respective company-level attributes with the EVERSTRING Company API, 6) run user-specific search queries on the database to identify relevant job postings, and 7) rank the job search results. This framework offers a highly customizable and highly targeted search experience for end users.
    Comment: 8 pages, 10 figures, ICDM 201
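    Step 4, extracting and ranking skill sets, is simple to illustrate. Below is a minimal sketch that matches postings against a fixed skill vocabulary and ranks skills by how many postings mention them; the vocabulary and postings are invented, and the paper's actual ETL pipeline and the EVERSTRING Company API are not reproduced here.

```python
# Sketch of step 4: extract known skill tokens from job postings and
# rank them by how many postings mention each skill.
# Skill vocabulary and postings are illustrative assumptions.
import re
from collections import Counter

SKILLS = {"c++", "python", "java", "sql", "spark"}  # assumed vocabulary

postings = [
    "Backend engineer: Python, SQL, and some Spark experience required.",
    "Systems developer with strong C++ and Python skills.",
    "Data engineer: SQL, Spark, Java.",
]

def extract_skills(text: str) -> set[str]:
    """Tokenize a posting and keep only tokens in the skill vocabulary."""
    tokens = set(re.findall(r"[a-zA-Z+#]+", text.lower()))
    return tokens & SKILLS

counts = Counter()
for post in postings:
    counts.update(extract_skills(post))

# Rank skills by document frequency across postings.
for skill, n in counts.most_common():
    print(f"{skill}: {n} postings")
```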

    Computer-based library or computer-based learning?

    Traditionally, libraries have played the role of repository of published information resources and, more recently, gateway to online subscription databases. The library online catalog and digital library interface serve an intermediary function, helping users locate information resources available through the library. With competition from Web search engines and free Web portals of various kinds, the library has to step up and play a more active role as guide and coach, helping users make use of information resources for learning or to accomplish particular tasks. It is no longer sufficient for computer-based library systems to provide just search and access functions. They must provide the functionality and environment to support learning and become computer-based learning systems. This paper examines the kinds of learning support that can be incorporated in library online catalogs and digital libraries, including: 1) enhanced support for information browsing and synthesis through linking by shared metadata, references and concepts; 2) visualization of related information; 3) adoption of Library 2.0 and social technologies; and 4) adoption of Library 3.0 technologies, including intelligent processing and text mining.
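    As a rough illustration of point 1, linking records by shared metadata can be as simple as an inverted index from subject terms to records, from which a "related items" list falls out. The records and subject terms below are invented for the sketch.

```python
# Sketch of point 1: link catalog records that share subject metadata,
# so a record view can surface "related items" for browsing and synthesis.
# Records and subject terms are illustrative assumptions.
from collections import defaultdict

records = {
    "rec1": {"information retrieval", "text mining"},
    "rec2": {"text mining", "machine learning"},
    "rec3": {"library science", "information retrieval"},
}

# Inverted index: subject term -> records carrying it.
by_subject = defaultdict(set)
for rec, subjects in records.items():
    for s in subjects:
        by_subject[s].add(rec)

def related(rec: str) -> set[str]:
    """Records sharing at least one subject term with `rec`."""
    out = set()
    for s in records[rec]:
        out |= by_subject[s]
    out.discard(rec)
    return out

print(related("rec1"))  # {'rec2', 'rec3'}
```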

    Extracting corpus specific knowledge bases from Wikipedia

    Thesauri are useful knowledge structures for assisting information retrieval. Yet their production is labor-intensive, and few domains have comprehensive thesauri that cover domain-specific concepts and contemporary usage. One approach, which has been attempted without much success for decades, is to seek statistical natural language processing algorithms that work on free text. Instead, we propose to replace costly professional indexers with thousands of dedicated amateur volunteers, namely those who are producing Wikipedia. This vast, open encyclopedia represents a rich tapestry of topics and semantics and a huge investment of human effort and judgment. We show how this can be directly exploited to provide WikiSauri: manually defined yet inexpensive thesaurus structures that are specifically tailored to expose the topics, terminology and semantics of individual document collections. We also offer concrete evidence of the effectiveness of WikiSauri for assisting information retrieval.
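    One plausible way to picture the idea: match terms that occur in a document collection against Wikipedia article titles, treating redirects as synonym sets. The sketch below uses a tiny hand-made stand-in for the title/redirect data and is only an approximation of how WikiSauri are actually built.

```python
# Sketch: build a corpus-specific thesaurus by matching collection terms
# against Wikipedia article titles, treating redirects as synonyms.
# The title/redirect table is a tiny assumed stand-in for a Wikipedia dump.
wiki_redirects = {  # canonical title -> known redirects (assumed data)
    "Information retrieval": {"IR (information retrieval)", "Document retrieval"},
    "Text mining": {"Text data mining", "Text analytics"},
}

corpus_terms = {"document retrieval", "text analytics", "query expansion"}

def build_thesaurus(terms, redirects):
    """Map each corpus term to its canonical Wikipedia concept, if any."""
    lookup = {}
    for title, alts in redirects.items():
        for name in {title, *alts}:
            lookup[name.lower()] = title
    return {t: lookup[t] for t in terms if t in lookup}

print(build_thesaurus(corpus_terms, wiki_redirects))
# e.g. {'document retrieval': 'Information retrieval',
#       'text analytics': 'Text mining'}  ("query expansion" has no match)
```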

    Impliance: A Next Generation Information Management Appliance

    "While database technology has been remarkably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data, unstructured as well as structured, in a uniform way; (c) achieving scale-out by exploiting simple, massively parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises.
    Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, US
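    Idea (c) amounts to partition-the-data-and-scan. The toy sketch below illustrates the general principle of scale-out through simple parallel scans followed by a merge step; it is an illustration of the technique only, not Impliance's design.

```python
# Toy sketch of idea (c): scale out a query by partitioning the data and
# scanning partitions in parallel, then merging partial results.
# The data and predicate are invented; this is not Impliance's implementation.
from concurrent.futures import ProcessPoolExecutor

def scan_partition(partition):
    """Partial aggregate over one partition: count records matching a predicate."""
    return sum(1 for rec in partition if rec % 3 == 0)

if __name__ == "__main__":
    data = list(range(1_000_000))
    nparts = 8
    # Round-robin partitioning across workers.
    partitions = [data[i::nparts] for i in range(nparts)]

    with ProcessPoolExecutor(max_workers=nparts) as pool:
        partials = list(pool.map(scan_partition, partitions))

    print("matching records:", sum(partials))  # merge step
```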

    Enhance the Competitive Intelligence Capabilities in Company using Web Mining Technique

    In this era of globalization, a company needs a strategy to keep its edge over competitors. One way is to enhance the company's competitive intelligence (CI), since CI is the part of a company's strategic management that focuses on the external business environment. In the business world, a company must attend not only to CI but also to decision support systems (DSS), because a DSS is an important piece of a company's management system. With the development of technology, a company can now find information about competitors and others on the Web, so this paper provides an overview of how the Web can be used for competitive intelligence. Not all of the collected information is useful, however; the company must also select appropriate sources to obtain the best results for decision making. One related work proposes a framework for decision support systems based on Web mining techniques. This paper provides an enhancement of that framework aimed at strengthening competitive intelligence.

    An Introduction to Social Semantic Web Mining & Big Data Analytics for Political Attitudes and Mentalities Research

    The social web has become a major repository of social and behavioral data that is of exceptional interest to the social science and humanities research community. Computer science has only recently developed the technologies and techniques that allow for harvesting, organizing and analyzing such data and that provide knowledge and insights into the structure and behavior of people online. These techniques include social web mining, conceptual and social network analysis and modeling, tag clouds, topic maps, folksonomies, complex network visualizations, modeling of processes on networks, agent-based models of social network emergence, speech recognition, computer vision, natural language processing, opinion mining and sentiment analysis, recommender systems, user profiling and semantic wikis. Each of these techniques is briefly introduced, example studies are given, and possible research directions in the field of political attitudes and mentalities are suggested. Finally, challenges for future studies are discussed.
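    Of the techniques listed, opinion mining is among the easiest to sketch. The following toy example scores short posts with a hand-made sentiment lexicon; the lexicon and posts are invented, and real studies use far richer models than word counting.

```python
# Sketch: lexicon-based sentiment scoring of short social-web posts,
# one of the opinion-mining techniques surveyed.
# Lexicon and posts are illustrative assumptions.
POS = {"good", "great", "support", "hope"}
NEG = {"bad", "corrupt", "failure", "angry"}

def score(post: str) -> int:
    """Positive minus negative lexicon hits; >0 positive, <0 negative."""
    words = post.lower().split()
    return sum(w in POS for w in words) - sum(w in NEG for w in words)

posts = [
    "Great speech, real hope for reform",
    "Another corrupt deal, what a failure",
]
for p in posts:
    print(score(p), p)
```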