Analysis and Exploring of different recent trends in processing of Big data

Author: Kumar Dr. Praveen
Kumar Praveen
Publication venue: IJRE Publisher

Big Data is a new concept in the information-technology-driven computing world: it manages large and complex datasets through data mining tools and functional models. Research suggests that tapping the potential of this data can benefit businesses, scientific disciplines and the public sector, contributing to economic gains as well as development in every sphere. The need is to develop efficient systems that can exploit this potential to the maximum, keeping in mind the current challenges associated with its analysis, structure, scale, timeliness and privacy. Data mining is a knowledge-driven discovery process through which relevant information about a particular subject can be extracted. Big datasets arrive with high velocity and volume, so extracting useful information from them demands dedicated research techniques. As the complexity of data increases, implementing big data methodologies becomes more challenging. Enterprises face the challenge of processing these huge chunks of data, and have found that none of the existing centralized architectures can handle this volume efficiently. This research highlights the concept of Big Data, its present place in industry, the hidden truths behind its development, and its likely future in the coming years.

International Journal of Research and Engineering

Parsing Large XES Files for Discovering Process Models: A Big Data Problem

Author: Aponte Báez Yosvanys
Marco Such Manuel
Sánchez Alexander
Publication venue: IJARCSSE
Publication date: 01/01/2015

Process mining is a group of techniques for retrieving de facto models from system traces. Discovery algorithms can obtain mathematical models by exploiting the information contained in lists of activity events. Completeness of the traces is relevant for the accuracy of the final results, and noiseless traces are the ideal scenario. The performance of the algorithms is significantly reduced if the log files are not processed efficiently. XES is a logical model for process logs stored in data-centric XML files. In real processes the sizes of the logs increase exponentially. Parsing XES files is presented as a big data problem in real scenarios with dense traces. Lazy parsers and DOM models are not sufficient in scenarios with large volumes of data. We discuss this problem and how to use indexing techniques for retrieving useful information for process mining. An XES compression schema is also discussed for reducing the index construction time.
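To make the parsing problem in this abstract concrete, here is a minimal sketch of streaming an XES log and building a small in-memory index from each trace name to its activity sequence. It is not the authors' indexing or compression scheme, which the abstract does not describe. It only illustrates the streaming alternative to loading a whole DOM. The function names `index_xes` and `local_name`, the sample log, and the use of the `concept:name` key as the trace and activity identifier are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def index_xes(source):
    """Stream an XES log; return {trace name: [activity names in order]}.

    Hypothetical helper: assumes traces and events are identified by a
    <string key="concept:name" value="..."/> attribute element.
    """
    index = {}
    in_trace = in_event = False
    trace_name, activities = None, []
    for action, elem in ET.iterparse(source, events=("start", "end")):
        tag = local_name(elem.tag)
        if action == "start":
            if tag == "trace":
                in_trace, trace_name, activities = True, None, []
            elif tag == "event":
                in_event = True
        elif tag == "string" and in_trace and elem.get("key") == "concept:name":
            if in_event:
                activities.append(elem.get("value"))
            else:
                trace_name = elem.get("value")
        elif tag == "event":
            in_event = False
            elem.clear()  # drop the event's children once they are processed
        elif tag == "trace":
            index[trace_name] = activities
            in_trace = False
            elem.clear()  # release the finished trace's subtree
    return index


# Illustrative two-trace log (made-up data, not from the paper).
SAMPLE = b"""<log xmlns="http://www.xes-standard.org/">
  <trace>
    <string key="concept:name" value="case-1"/>
    <event><string key="concept:name" value="register"/></event>
    <event><string key="concept:name" value="approve"/></event>
  </trace>
  <trace>
    <string key="concept:name" value="case-2"/>
    <event><string key="concept:name" value="register"/></event>
  </trace>
</log>"""

index = index_xes(io.BytesIO(SAMPLE))
print(index)
```

Clearing each event and trace element after it is processed keeps memory use roughly proportional to one trace instead of the whole file. The index itself still lives in memory, so a log with very many traces would need an on-disk index, which is the kind of problem the abstract raises.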

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups

Author: Cambria Erik
Mondal Anupam
Sanli Ceyda
Publication date: 01/01/2017

Linguistic relations in oral conversations show how opinions are constructed and developed in a restricted time. The relations bond ideas, arguments, thoughts, and feelings, re-shape them during a speech, and finally build knowledge out of all information provided in the conversation. Speakers share a common interest to discuss. It is expected that each speaker's reply includes duplicated forms of words from previous speakers. However, linguistic adaptation is observed and evolves along a more complex path than just transferring slightly modified versions of common concepts. A conversation that aims at a benefit in the end shows an emergent cooperation inducing the adaptation. Not only cooperation but also competition drives the adaptation, or an opposite scenario, and one can capture the dynamic process by tracking how the concepts are linguistically linked. To uncover salient complex dynamic events in verbal communications, we attempt to discover self-organized linguistic relations hidden in a conversation with explicitly stated winners and losers. We examine open access data of the United States Supreme Court. Our understanding is crucial in big data research to guide how transition states in opinion mining and decision-making should be modeled, and how the knowledge required to guide the model should be pinpointed by filtering large amounts of data. Comment: Full paper, Proceedings of FLAIRS-2017 (30th Florida Artificial Intelligence Research Society), Special Track, Artificial Intelligence for Big Social Data Analysis

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

An end-to-end process model for supervised machine learning classification: from problem to deployment in information systems

Author: Hirt Robin
Koehl Niklas J.
Satzger Gerhard
Publication venue: Karlsruher Institut für Technologie (KIT)
Publication date: 01/01/2017

Extracting meaningful knowledge from (big) data represents a key success factor in many industries today. Supervised machine learning (SML) has emerged as a popular technique to learn patterns in complex data sets and to identify hidden correlations. When this insight is turned into action, business value is created. However, common data mining processes are generally not tailored to SML. In addition, they fall short of providing an end-to-end view that not only supports building a "one-off" model, but also covers its operational deployment within an information system. In this research-in-progress work we apply a Design Science Research (DSR) approach to develop an SML process model artifact that comprises model initiation, error estimation and deployment. In a first cycle, we evaluate the artifact in an illustrative scenario to demonstrate suitability. The results encourage us to further refine the approach and to prepare evaluations in concrete use cases. Thus, we move towards contributing a general process model that supports the systematic design of machine learning solutions to turn insights into continuous action.

Irish Universities

Cork Open Research Archive

Intelligent Management and Efficient Operation of Big Data

Author: Batista Fernando
Cardoso Elsa
Moura Jose
Nunes Luis
Publication date: 01/01/2015

This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous, and very often unstructured, large data sources; the enhancement of the performance of the processing and networking (cloud) infrastructures that are the most important foundational pillars of Big Data applications or services; and novel ways to efficiently manage network infrastructures with high-level composed policies for supporting the transmission of large amounts of data with distinct requisites (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated. Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201

arXiv.org e-Print Archive

Repositório Institucional do ISCTE-IUL

Communication Theoretic Data Analytics

Author: Chen Kwang-Cheng
Huang Shao-Lun
Poor H. Vincent
Zheng Lizhong
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/01/2015

Widespread use of the Internet and social networks drives the generation of big data, which is proving to be useful in a number of applications. To deal with explosively growing amounts of data, data analytics has emerged as a critical technology related to computing, signal processing, and information networking. In this paper, a formalism is considered in which data is modeled as a generalized social network, and communication theory and information theory are thereby extended to data analytics. First, the creation of an equalizer to optimize information transfer between two data variables is considered, and financial data is used to demonstrate the advantages. Then, an information coupling approach based on information geometry is applied for dimensionality reduction, with a pattern recognition example to illustrate the effectiveness. These initial trials suggest the potential of communication theoretic data analytics for a wide range of applications. Comment: Published in IEEE Journal on Selected Areas in Communications, Jan. 201

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

MOSDEN: A Scalable Mobile Collaborative Platform for Opportunistic Sensing Applications

Author: Georgakopoulos Dimitrios
Jayaraman Prem Prakash
Perera Charith
Zaslavsky Arkady
Publication date: 01/05/2014

Smartphones with embedded sensors have become an efficient enabler for various mobile applications, including opportunistic sensing. The hi-tech advances in smartphones are opening up a world of possibilities. This paper proposes a mobile collaborative platform called MOSDEN that enables and supports opportunistic sensing at run time. MOSDEN captures and shares sensor data across multiple apps, smartphones and users. MOSDEN supports the emerging trend of separating sensors from application-specific processing, storing and sharing. MOSDEN promotes reuse and re-purposing of sensor data, hence reducing the effort of developing novel opportunistic sensing applications. MOSDEN has been implemented on Android-based smartphones and tablets. Experimental evaluations validate the scalability and energy efficiency of MOSDEN and its suitability for real-world applications. The results of the evaluation and the lessons learned are presented and discussed in this paper. Comment: Accepted to be published in Transactions on Collaborative Computing, 2014. arXiv admin note: substantial text overlap with arXiv:1310.405

arXiv.org e-Print Archive

Online Research @ Cardiff

Directory of Open Access Journals

RMIT Research Repository