26,410 research outputs found
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather the large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential for cross-fertilization, i.e., the
possibility of reusing Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-based System
Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis
The Market Blended Insight project has the objective of improving UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It deals with both semi-structured data and unstructured text on the web, annotating the extracted data and then translating it according to the backend schema.
A literature survey of methods for analysis of subjective language
Subjective language is used to express attitudes and opinions towards things, ideas and people. While content- and topic-centred natural language processing is now part of everyday life, analysis of the subjective aspects of natural language has until recently been largely neglected by the research community. The explosive growth of personal blogs, consumer opinion sites and social network applications in recent years has, however, created increased interest in subjective language analysis. This paper provides an overview of recent research conducted in the area.
Sentiment Analysis Using Collaborated Opinion Mining
Opinion mining and sentiment analysis have emerged as a field of study since
the widespread adoption of the World Wide Web and the internet. Opinion mining
refers to the extraction of those lines or phrases in large volumes of raw data
that express an opinion. Sentiment analysis, on the other hand, identifies the
polarity of the opinion being extracted. In this paper we propose performing
sentiment analysis in collaboration with opinion extraction, summarization, and
tracking of student records. The paper modifies the existing algorithm in order
to obtain the collaborated opinion about the students. The resultant opinion is
represented as very high, high, moderate, low or very low. The paper is based
on a case study in which teachers give their remarks about the students; by
applying the proposed sentiment analysis algorithm, the opinion is extracted and
represented.
Comment: 5 pages, 6 figures
Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts
The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly accessible and can therefore be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention and detection. However, the search needs and behaviours of these investigators are not well documented in the literature. To address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigators' search behaviours and the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter in their domain. This lays the foundation for future research into the varied domain of Open-Source Intelligence gathering.
A unified view of data-intensive flows in business intelligence systems: a survey
Data-intensive flows are central processes in today's business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today's research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.
Peer Reviewed. Postprint (author's final draft)
Research Directions, Challenges and Issues in Opinion Mining
The rapid growth of the Internet and the availability of user reviews on the web for almost any product have created a need for an effective system to analyze web reviews. Such reviews are useful to some extent and hold promise for both customers and product manufacturers. For any popular product, the number of reviews can run into the hundreds or even thousands. This makes it difficult for a customer to analyze them and decide whether or not to purchase the product. Mining such product reviews or opinions is termed opinion mining, which is broadly classified into two main categories, namely facts and opinions. Though there are several approaches to opinion mining, deciding on the recommendation provided by the system remains a challenge. In this paper, we analyze the basics of opinion mining and the challenges, pros and cons of past opinion mining systems, and we provide some directions for future research, focusing on the challenges and issues.
THE OPTIMIZATION OF THE INTERNAL AND EXTERNAL REPORTING IN FINANCIAL ACCOUNTING: ADOPTING XBRL INTERNATIONAL STANDARD
More and more enterprises, especially the listed companies, have adopted new accounting norms and regulations (IFRS or US GAAP, Basel II and, in perspective, SURFI), manifesting interest for publishing financial reports using a standard format able to considerably improve their communication, data collection in the receiving units, control and analysis of financial information. When switching to the new accounting rules specified in international or regional standards and norms, regulatory and control bodies recommend the XBRL format for financial reporting, with recognition of the regional jurisdiction. Our paper makes a review of the literature, presents the XBRL specific elements and proposes possible solutions for internal and external financial reporting of an enterprise. Finally, it concludes on the benefits of adopting XBRL at national level in a potential XBRL Romania project.
Keywords: accounting norms, financial reporting, XBRL, taxonomy, XBRL jurisdiction.