301,680 research outputs found

    Towards the cloudification of the social networks analytics

    In recent years, with the increase in available data from social networks and the rise of big data technologies, social data has emerged as one of the most profitable markets for companies seeking to increase their revenue. In addition, computational social scientists see such data as a vast ocean of information for studying modern human societies. Today, enterprises and researchers either develop their own mining tools in house or outsource their social media mining needs to specialised companies, with the consequent economic cost. In this paper, we present the first cloud computing service to facilitate the deployment of social media analytics applications, allowing data practitioners to use social mining tools as a service. The main advantage of this service is the ability to run different queries at the same time and combine their results in real time. We also introduce twearch, a prototype for developing Twitter mining algorithms as services in the cloud. Peer reviewed. Postprint (author's final draft).

    Technologies solutions and Oracle instruments used in the accomplishment of executive informatics systems (EIS)

    The database management system and the facilities it offers play a highly important role in the success and performance of an executive informatics system. From this point of view, the analysis takes into account the facilities for working with advanced databases and data warehouses; the implementation of OLAP and data mining functionality; the integration of data and applications coming from different sources; the way in which the extraction, transformation and loading of this data into the final warehouses takes place; the ease of administration; and the instruments offered for developing interfaces. One important point of this analysis is query performance, both on operational databases and when extracting data from data warehouses. Keywords: Executive Informatics System, OLAP, Data Mining

    Knowledge Discovery in Data Mining and Massive Data Mining

    Knowledge discovery is a process of nontrivial extraction of previously unknown and presently useful information. The rapid advancement of technology has increased the rate at which data is generated and distributed. The data generated by mobile applications, sensor applications, network monitoring, traffic management, weblogs, etc. can be referred to as a data stream. Data streams are massive in nature, and mining them is referred to as Massive Data Mining. The present work aims at knowledge discovery using both data mining and massive data mining techniques. The knowledge discovery process in the two techniques is compared by developing a classification model using a Naive Bayes classifier. The former case uses Edu-data, a dataset collected from the technical education system, and the latter case uses the Massive Online Analysis framework to generate the data streams. Data streams must be processed under very strict constraints of space and time using sophisticated techniques, and traditional data mining techniques are not advised for such massive data. Therefore the Massive Online Analysis framework is used to mine the data streams. The present work happens to be unique in the literature in
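As a rough illustration of the stream setting described in this abstract, the sketch below shows a Naive Bayes classifier that learns one example at a time and keeps only running counts, so memory does not grow with the number of examples seen. It is not the paper's implementation and does not use the Massive Online Analysis framework. The names `StreamingNaiveBayes`, `learn_one`, `predict_one` and `prequential_accuracy` are hypothetical, the features are assumed to be categorical, and the test-then-train evaluation loop is an assumption about how such a model might be scored.

```python
from collections import defaultdict
import math


class StreamingNaiveBayes:
    """Categorical Naive Bayes that learns one example at a time (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength
        self.class_counts = defaultdict(int)
        # feature_counts[label][feature_index][value] -> count
        self.feature_counts = defaultdict(
            lambda: defaultdict(lambda: defaultdict(int)))
        self.values_seen = defaultdict(set)  # distinct values per feature
        self.total = 0

    def learn_one(self, x, label):
        """Update the running counts with a single labelled example."""
        self.class_counts[label] += 1
        self.total += 1
        for i, value in enumerate(x):
            self.feature_counts[label][i][value] += 1
            self.values_seen[i].add(value)

    def predict_one(self, x):
        """Return the most probable label, or None before any training."""
        if not self.class_counts:
            return None
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / self.total)  # log prior
            for i, value in enumerate(x):
                count = self.feature_counts[label][i].get(value, 0)
                n_values = max(len(self.values_seen[i]), 1)
                # Smoothed log likelihood of this feature value
                score += math.log(
                    (count + self.alpha) / (n_label + self.alpha * n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def prequential_accuracy(stream, model):
    """Test-then-train: predict each example first, then learn from it."""
    correct = seen = 0
    for x, label in stream:
        prediction = model.predict_one(x)
        if prediction is not None:
            seen += 1
            correct += prediction == label
        model.learn_one(x, label)
    return correct / seen if seen else 0.0
```

A batch learner would need the whole dataset in memory before fitting; here each example can be discarded as soon as `learn_one` returns.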

    Trajectory data mining: A review of methods and applications

    The increasing use of location-aware devices has led to an increasing availability of trajectory data. As a result, researchers devoted their efforts to developing analysis methods including different data mining methods for trajectories. However, the research in this direction has so far produced mostly isolated studies and we still lack an integrated view of problems in applications of trajectory mining that were solved, the methods used to solve them, and applications using the obtained solutions. In this paper, we first discuss generic methods of trajectory mining and the relationships between them. Then, we discuss and classify application problems that were solved using trajectory data and relate them to the generic mining methods that were used and real world applications based on them. We classify trajectory-mining application problems under major problem groups based on how they are related. This classification of problems can guide researchers in identifying new application problems. The relationships between the methods together with the association between the application problems and mining methods can help researchers in identifying gaps between methods and inspire them to develop new methods. This paper can also guide analysts in choosing a suitable method for a specific problem. The main contribution of this paper is to provide an integrated view relating applications of mining trajectory data and the methods used

    Compiler and runtime support for shared memory parallelization of data mining algorithms

    Data mining techniques focus on finding novel and useful patterns or models from large datasets. Because of the volume of the data to be analyzed, the amount of computation involved, and the need for rapid or even interactive analysis, data mining applications require the use of parallel machines. We have been developing compiler and runtime support for developing scalable implementations of data mining algorithms. Our work encompasses shared memory parallelization, distributed memory parallelization, and optimizations for processing disk-resident datasets. In this paper, we focus on compiler and runtime support for shared memory parallelization of data mining algorithms. We have developed a set of parallelization techniques that apply across algorithms for a variety of mining tasks. We describe the interface of the middleware where these techniques are implemented. Then, we present compiler techniques for translating data parallel code to the middleware specification. Finally, we present a brief evaluation of our compiler using Apriori association mining and k-means clustering.
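The abstract does not say which parallelization techniques the middleware uses. As a sketch of one common shared memory pattern for k-means, the code below gives each worker a private copy of the per-cluster sums and counts and merges the copies once all workers finish, which avoids locking during the update loop. The function names `nearest`, `local_reduce` and `kmeans_step` are hypothetical. In CPython, threads will not speed up this pure-Python arithmetic because of the global interpreter lock, so the example shows the structure of the technique rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))


def local_reduce(chunk, centers):
    """Accumulate per-cluster sums and counts for one chunk, privately."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in chunk:
        k = nearest(point, centers)
        counts[k] += 1
        for d, value in enumerate(point):
            sums[k][d] += value
    return sums, counts


def kmeans_step(points, centers, workers=4):
    """One k-means iteration: parallel local reductions, then a serial merge."""
    chunks = [points[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda chunk: local_reduce(chunk, centers), chunks))
    dim = len(centers[0])
    new_centers = []
    for k, old in enumerate(centers):
        count = sum(counts[k] for _, counts in partials)
        if count == 0:
            new_centers.append(list(old))  # keep an empty cluster's center
            continue
        new_centers.append(
            [sum(sums[k][d] for sums, _ in partials) / count
             for d in range(dim)])
    return new_centers
```

Replicating the accumulators trades extra memory for lock-free updates; when the accumulators are large, locking a single shared copy may be the better choice.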

    Galaxy Zoo: Morphological Classification and Citizen Science

    We provide a brief overview of the Galaxy Zoo and Zooniverse projects, including a short discussion of the history of, and motivation for, these projects, as well as a review of the science these innovative internet-based citizen science projects have produced so far. We briefly describe the method of applying en-masse human pattern recognition capabilities to complex data in data-intensive research. We also discuss the lessons learned from developing and running these community-based projects, including thoughts on future applications of this methodology. This review is intended to give the reader a quick and simple introduction to the Zooniverse. Comment: 11 pages, 1 figure; to be published in Advances in Machine Learning and Data Mining for Astronom