65 research outputs found
Distributed log analysis on the cloud using MapReduce
In this paper we describe our work on designing a web-based, distributed data analysis system built on the popular MapReduce framework, deployed on a small cloud and developed specifically for analyzing web server logs. The log analysis system consists of several cluster nodes; it splits large log files across a distributed file system and processes them quickly using the MapReduce programming model. The cluster is created using an open source cloud infrastructure, which allows us to easily expand the computational power by adding new nodes. This gives us the ability to automatically resize the cluster according to the data analysis requirements. We implemented MapReduce programs for basic log analysis needs such as frequency analysis, error detection, and busy hour detection, as well as more complex analyses that require running several jobs. The system can automatically identify and analyze several web server log types such as Apache, IIS, and Squid. We use open source projects for creating the cloud infrastructure and running MapReduce jobs.
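To illustrate the kind of job this abstract describes, here is a minimal single-process sketch of MapReduce-style frequency analysis and busy hour detection. It is not the authors' implementation: the Apache Common Log Format layout, the sample lines, and all function names (`map_status`, `map_hour`, `reduce_counts`, `run_job`) are assumptions made for illustration, and a real deployment would run the map and reduce steps on a cluster rather than in one process.

```python
import re
from collections import defaultdict

# Assumed layout: Apache Common Log Format (the abstract does not specify one).
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "[^"]*" (\d{3}) \S+')

def map_status(line):
    """Map step: emit (status_code, 1) for each parsable log line."""
    match = LOG_PATTERN.match(line)
    if match:
        yield match.group(3), 1

def map_hour(line):
    """Map step: emit (hour_of_day, 1), used for busy hour detection."""
    match = LOG_PATTERN.match(line)
    if match:
        # Timestamp looks like 10/Oct/2000:13:55:36 -0700; field 1 is the hour.
        yield match.group(2).split(":")[1], 1

def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one key."""
    return key, sum(values)

def run_job(lines, mapper, reducer):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# Hypothetical sample data.
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing HTTP/1.0" 404 209',
    '10.0.0.1 - - [10/Oct/2000:14:02:10 -0700] "GET /about.html HTTP/1.0" 200 1500',
]

status_counts = run_job(SAMPLE_LINES, map_status, reduce_counts)
hour_counts = run_job(SAMPLE_LINES, map_hour, reduce_counts)
busy_hour = max(hour_counts, key=hour_counts.get)
```

Because the mapper is passed in as a parameter, the same `run_job` driver serves both analyses; a more complex analysis could chain several such jobs, feeding one job's output into the next.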
Preparation of Improved Turkish DataSet for Sentiment Analysis in Social Media
A public dataset, with a variety of properties suitable for sentiment
analysis [1], event prediction, trend detection and other text mining
applications, is needed in order to be able to successfully perform analysis
studies. The vast majority of data on social media is text-based, and machine
learning cannot be applied directly to this raw data, since several
preprocessing steps are required before the algorithms can be run. For example,
different misspellings of the same word enlarge the word vector space
unnecessarily, which reduces the success of the algorithm and increases the
computational power required.
This paper presents an improved Turkish dataset with an effective spelling
correction algorithm based on Hadoop [2]. The collected data is stored on the
Hadoop Distributed File System, and the text-based data is processed with the
MapReduce programming model. This method is suitable for storing and
processing large volumes of text-based social media data. In this study, movie
reviews have been automatically recorded with Apache ManifoldCF (MCF) [3] and
data clusters have been created. Various methods, such as Levenshtein and fuzzy
string matching, have been compared and proposed for creating a public dataset
from the collected data. Experimental results show that the proposed algorithm
successfully detects and corrects spelling errors, and that the resulting
dataset can be used as an open source dataset in sentiment analysis studies.
Comment: Presented at CMES201
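As a rough illustration of Levenshtein-based spelling correction, the sketch below computes edit distance with the standard dynamic-programming recurrence and maps a word to its nearest vocabulary entry. It is not the paper's algorithm: the function names, the `max_distance` threshold, and the tiny Turkish vocabulary are all assumptions chosen for the example.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]

def correct(word, vocabulary, max_distance=2):
    """Return the closest vocabulary word, or the input if none is close enough.

    max_distance is an assumed threshold, not a value taken from the paper.
    """
    if not vocabulary:
        return word
    best = min(vocabulary, key=lambda candidate: levenshtein(word, candidate))
    return best if levenshtein(word, best) <= max_distance else word

# Hypothetical vocabulary of correctly spelled words.
VOCABULARY = ["güzel", "film", "kötü"]
corrected = correct("güzl", VOCABULARY)
```

Collapsing misspellings onto one canonical form in this way is what keeps the word vector space from growing unnecessarily, as the abstract notes.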
An Investigation of the Entrepreneurship Levels of Social Studies and Classroom Teacher Candidates
The aim of this study is to investigate the entrepreneurship levels of social studies and classroom teacher candidates according to various variables. The study group of this research, in which a descriptive survey model was used, consists of a total of 435 teacher candidates (203 social studies and 232 classroom teacher candidates) studying at the Erciyes University Faculty of Education in the 2015-2016 academic year. The data were collected with a personal information form prepared by the researchers and the "University Students Entrepreneurship Scale" developed by Yılmaz & Sünbül (2009). The entrepreneurship levels of the social studies and classroom teacher candidates show no statistically significant difference according to grade level, department, parents' education level, type of high school graduated from, or monthly family income. They do show a statistically significant difference according to gender (in favor of men) and place of residence (in favor of metropolitan areas). Keywords: Entrepreneurship, entrepreneur, teacher candidate, social studies teacher, classroom teacher.
- …