65 research outputs found

    Distributed log analysis on the cloud using MapReduce

    In this paper we describe our work on designing a web-based, distributed data analysis system built on the popular MapReduce framework, deployed on a small cloud and developed specifically for analyzing web server logs. The log analysis system consists of several cluster nodes; it splits large log files across a distributed file system and processes them quickly using the MapReduce programming model. The cluster is created using an open source cloud infrastructure, which allows us to easily expand the computational power by adding new nodes. This gives us the ability to automatically resize the cluster according to the data analysis requirements. We implemented MapReduce programs for basic log analysis needs such as frequency analysis, error detection, and busy hour detection, as well as more complex analyses that require running several jobs. The system can automatically identify and analyze several web server log types, such as Apache, IIS, and Squid. We use open source projects for creating the cloud infrastructure and running MapReduce jobs.
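As a rough illustration of the busy hour detection mentioned in the abstract, the map and reduce steps might look like the sketch below. This is not the authors' implementation: it runs in a single process rather than on a cluster, it assumes Apache Common Log Format timestamps, and the names `map_hour`, `reduce_counts`, and `busy_hour` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Apache Common Log Format timestamps,
# e.g. [10/Oct/2000:13:55:36 -0700]. Other log types need other patterns.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def map_hour(line):
    """Map step: emit (hour, 1) for every log line with a parseable timestamp."""
    match = TIMESTAMP.search(line)
    if match:
        yield match.group(2), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def busy_hour(lines):
    """Return the hour of day (as a two-digit string) with the most requests."""
    pairs = (pair for line in lines for pair in map_hour(line))
    totals = reduce_counts(pairs)
    return max(totals, key=totals.get) if totals else None


# Hypothetical sample data, used only to exercise the sketch.
SAMPLE_LOG = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /a HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:58:01 -0700] "GET /b HTTP/1.0" 404 512',
    '127.0.0.1 - - [10/Oct/2000:14:02:10 -0700] "GET /c HTTP/1.0" 200 1024',
]
```

In a real deployment the framework would shuffle the emitted pairs by key between the two steps; here `reduce_counts` does the grouping and the summing in one pass.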

    Preparation of Improved Turkish DataSet for Sentiment Analysis in Social Media

    A public dataset with a variety of properties suitable for sentiment analysis [1], event prediction, trend detection, and other text mining applications is needed in order to perform such analysis studies successfully. The vast majority of data on social media is text-based, and machine learning methods cannot be applied directly to this raw data, since several preprocessing steps are required before the algorithms can be run. For example, different misspellings of the same word enlarge the word vector space unnecessarily, which reduces the success of the algorithm and increases the computational power required. This paper presents an improved Turkish dataset with an effective spelling correction algorithm based on Hadoop [2]. The collected data is stored on the Hadoop Distributed File System, and the text-based data is processed with the MapReduce programming model. This method is suitable for storing and processing large text-based social media data. In this study, movie reviews were automatically collected with Apache ManifoldCF (MCF) [3] and data clusters were created. Various methods, such as Levenshtein distance and fuzzy string matching, were compared in order to create a public dataset from the collected data. Experimental results show that the proposed algorithm successfully detects and corrects spelling errors, and that the resulting dataset can be used as an open source dataset in sentiment analysis studies.
    Comment: Presented at CMES201
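To make the spelling correction idea concrete, the sketch below computes Levenshtein distance and maps a misspelled word to its nearest entry in a vocabulary. It is only an illustration under stated assumptions: the abstract does not describe the paper's algorithm in detail, and the `correct` helper, the `max_distance` threshold, and the sample vocabulary are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def correct(word, vocabulary, max_distance=2):
    """Return the closest vocabulary word, or the input if nothing is close.

    max_distance is an assumed threshold, not a value from the paper.
    """
    if not vocabulary:
        return word
    best = min(vocabulary, key=lambda candidate: levenshtein(word, candidate))
    return best if levenshtein(word, best) <= max_distance else word


# Hypothetical vocabulary, used only to exercise the sketch.
VOCABULARY = ["film", "harika", "sinema"]
```

Collapsing variants such as "harka" and "harika" into one token is what keeps the word vector space from growing with every misspelling.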

    Sosyal Bilgiler ve Sınıf Öğretmeni Adaylarının Girişimcilik Düzeylerinin İncelenmesi

    The aim of this study is to investigate the entrepreneurship levels of social studies and classroom teacher candidates according to various variables. The study group of this research, in which a descriptive survey model was used, consists of a total of 435 teacher candidates (203 social studies and 232 classroom teacher candidates) studying at the Erciyes University Faculty of Education in the 2015-2016 academic year. The data were collected with a personal information form prepared by the researchers and the "University Students Entrepreneurship Scale" developed by Yılmaz & Sünbül (2009). The entrepreneurship levels of social studies and classroom teacher candidates show no statistically significant difference according to grade level, department, parents' education level, type of high school graduated from, or monthly family income. However, their entrepreneurship levels differ significantly according to gender (in favor of men) and place of residence (in favor of metropolitan areas).
    Keywords: Entrepreneurship, entrepreneur, candidate teacher, social studies teacher, classroom teacher.