130,678 research outputs found
Analysis and Exploring of different recent trends in processing of Big data
Big Data is a whole new concept in information technology driven computing world that manages large and complex datasets through certain data mining processing tools and functional models. Research suggests that tapping the potential of this data can benefit businesses, scientific disciplines and the public sector – contributing to their economic gains as well as development in every sphere. The need is to develop efficient systems that can exploit this potential to the maximum, keeping in mind the current challenges associated with its analysis, structure, scale, timeliness and privacy As we all know the process of data mining is knowledge driven discovery process through which we can extract relevant information about a particular matter of subject. Big size datasets have numerous velocity and volume so it demands research techniques to extract useful information. As the complexity of data increases, there is more challenging work for us to implement big data methodologies. Enterprises face the challenge of processingthese huge chunks of data, and have found that none of the existing centralized architectures can efficiently handle this huge volume of data. This research highlights the concept of Big Data used in datasets, its present scenario place in industry, hidden truths behind development of big data and what will be the future of big data in coming years
Parsing Large XES Files for Discovering Process Models: A Big Data Problem
Process mining is a group of techniques for retrieving de-facto models using system traces. Discovering algorithms can obtain mathematical models exploiting the information contained into list of events of activities. Completeness of the traces is relevant for the accuracy of the final results. Noiseless traces appear as an ideal scenario. The performance of the algorithms is significant reduce if the log files are not processed efficiently. XES is a logical model for process logs stored in data centric xml files. In real processes the sizes of the logs increase exponentially. Parsing XES files is presented as a big data problem in real scenarios with dense traces. Lazy parsers and DOM models are not enough appropriate in scenarios with large volumes of data. We discuss this problematic and how to use indexing techniques for retrieving useful information for process mining. An XES compression schema is also discussed for reducing the index construction time
Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups
Linguistic relations in oral conversations present how opinions are
constructed and developed in a restricted time. The relations bond ideas,
arguments, thoughts, and feelings, re-shape them during a speech, and finally
build knowledge out of all information provided in the conversation. Speakers
share a common interest to discuss. It is expected that each speaker's reply
includes duplicated forms of words from previous speakers. However, linguistic
adaptation is observed and evolves in a more complex path than just
transferring slightly modified versions of common concepts. A conversation
aiming a benefit at the end shows an emergent cooperation inducing the
adaptation. Not only cooperation, but also competition drives the adaptation or
an opposite scenario and one can capture the dynamic process by tracking how
the concepts are linguistically linked. To uncover salient complex dynamic
events in verbal communications, we attempt to discover self-organized
linguistic relations hidden in a conversation with explicitly stated winners
and losers. We examine open access data of the United States Supreme Court. Our
understanding is crucial in big data research to guide how transition states in
opinion mining and decision-making should be modeled and how this required
knowledge to guide the model should be pinpointed, by filtering large amount of
data.Comment: Full paper, Proceedings of FLAIRS-2017 (30th Florida Artificial
Intelligence Research Society), Special Track, Artificial Intelligence for
Big Social Data Analysi
An end-to-end process model for supervised machine learning classification: from problem to deployment in information systems
Extracting meaningful knowledge from (big) data represents a key success factor in many industries today. Supervised machine learning (SML) has emerged as a popular technique to learn patterns in complex data sets and to identify hidden correlations. When this insight is turned into action, business value is created. However, common data mining processes are generally not tailored to SML. In addition, they fall short of providing an end-to-end view that not only supports building a ”one off” model, but also covers its operational deployment within an information system. In this research-in-progress work we apply a Design Science Research (DSR) approach to develop a SML process model artifact that comprises model initiation, error estimation and deployment. In a first cycle, we evaluate the artifact in an illustrative scenario to demonstrate suitability. The results encourage us to further refine the approach and to prepare evaluations in concrete use cases. Thus, we move towards contributing a general process model that supports the systematic design of machine learning solutions to turn insights into continuous action
Intelligent Management and Efficient Operation of Big Data
This chapter details how Big Data can be used and implemented in networking
and computing infrastructures. Specifically, it addresses three main aspects:
the timely extraction of relevant knowledge from heterogeneous, and very often
unstructured large data sources, the enhancement on the performance of
processing and networking (cloud) infrastructures that are the most important
foundational pillars of Big Data applications or services, and novel ways to
efficiently manage network infrastructures with high-level composed policies
for supporting the transmission of large amounts of data with distinct
requisites (video vs. non-video). A case study involving an intelligent
management solution to route data traffic with diverse requirements in a wide
area Internet Exchange Point is presented, discussed in the context of Big
Data, and evaluated.Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201
Communication Theoretic Data Analytics
Widespread use of the Internet and social networks invokes the generation of
big data, which is proving to be useful in a number of applications. To deal
with explosively growing amounts of data, data analytics has emerged as a
critical technology related to computing, signal processing, and information
networking. In this paper, a formalism is considered in which data is modeled
as a generalized social network and communication theory and information theory
are thereby extended to data analytics. First, the creation of an equalizer to
optimize information transfer between two data variables is considered, and
financial data is used to demonstrate the advantages. Then, an information
coupling approach based on information geometry is applied for dimensionality
reduction, with a pattern recognition example to illustrate the effectiveness.
These initial trials suggest the potential of communication theoretic data
analytics for a wide range of applications.Comment: Published in IEEE Journal on Selected Areas in Communications, Jan.
201
MOSDEN: A Scalable Mobile Collaborative Platform for Opportunistic Sensing Applications
Mobile smartphones along with embedded sensors have become an efficient
enabler for various mobile applications including opportunistic sensing. The
hi-tech advances in smartphones are opening up a world of possibilities. This
paper proposes a mobile collaborative platform called MOSDEN that enables and
supports opportunistic sensing at run time. MOSDEN captures and shares sensor
data across multiple apps, smartphones and users. MOSDEN supports the emerging
trend of separating sensors from application-specific processing, storing and
sharing. MOSDEN promotes reuse and re-purposing of sensor data hence reducing
the efforts in developing novel opportunistic sensing applications. MOSDEN has
been implemented on Android-based smartphones and tablets. Experimental
evaluations validate the scalability and energy efficiency of MOSDEN and its
suitability towards real world applications. The results of evaluation and
lessons learned are presented and discussed in this paper.Comment: Accepted to be published in Transactions on Collaborative Computing,
2014. arXiv admin note: substantial text overlap with arXiv:1310.405
- …