    Magnetotelluric data in western Turkey: Dimensionality analysis using Mohr circles

    
    During the summer of 1996, wideband (0.003125-2000 s) magnetotelluric (MT) data were collected at 52 sites across western Turkey, with a site spacing of approximately 5 km. The region is dominated by an extensional regime resulting from the convergence of the African and Arabian plates against the Eurasian plate, and western Turkey is characterized by east-west oriented "continental escape" tectonics. To contribute to the knowledge of the geological structures along the profile, the dimensionality characteristics of the MT impedance tensors are computed, treating the real and imaginary parts of the tensor elements separately. The rotationally invariant parameters of central impedance (d(3)) and anisotropy angle (lambda), both good dimensionality indicators, are also computed. Pseudosections of these parameters reveal major geological structures in western Turkey, such as the zone between the Menderes Massif and the Bornova Flysch zone, the Izmir-Ankara Suture zone, the western part of the North Anatolian fault zone, and the grabens (Demirci, Gordes, and Bigadic) that are characteristic of the region. Interpretation of the full set of Mohr circles shows strong anisotropy and high central impedance anomalies, together with changes in geological strike direction at estimated depths of about 7-8 km, 15-20 km, and 35-40 km along the profile from south to north. These anomalies are indicative of changes in the thickness of the upper crust
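
    The abstract gives no formulas, so the following is only a minimal sketch of how a Mohr circle and two rotation invariants can be computed from one real (or imaginary) part of a 2x2 impedance tensor. It assumes a commonly used formulation in which the rotated element Z'xy is plotted against Z'xx; the function names, the sample tensor, and the identification of these quantities with the paper's d(3) and lambda are assumptions, not the authors' stated method.

```python
import math

def rotate(z, theta):
    """Return R z R^T for a real 2x2 tensor z and a rotation angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    r = ((c, s), (-s, c))
    rt = ((c, -s), (s, c))

    def matmul(a, b):
        return tuple(
            tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
            for i in range(2)
        )

    return matmul(matmul(r, z), rt)

def mohr_circle(z):
    """Circle traced by (Z'xy, Z'xx) as the axes rotate, plus two invariants.

    Assumed definitions (not taken from the abstract):
      centre            = ((Zxy - Zyx) / 2, (Zxx + Zyy) / 2)
      radius            = 0.5 * sqrt((Zxx - Zyy)^2 + (Zxy + Zyx)^2)
      central impedance = distance from the origin to the centre
      anisotropy angle  = asin(radius / central impedance)
    """
    (zxx, zxy), (zyx, zyy) = z
    centre = ((zxy - zyx) / 2.0, (zxx + zyy) / 2.0)
    radius = 0.5 * math.hypot(zxx - zyy, zxy + zyx)
    central = math.hypot(centre[0], centre[1])
    # The angle is undefined when the circle encloses the origin.
    if central > 0.0 and radius <= central:
        anisotropy = math.asin(radius / central)
    else:
        anisotropy = float("nan")
    return {
        "centre": centre,
        "radius": radius,
        "central_impedance": central,
        "anisotropy_angle": anisotropy,
    }

# Hypothetical real part of an impedance tensor, for illustration only.
z_real = ((0.5, 3.0), (-2.0, -0.2))
circle = mohr_circle(z_real)
```

    Under these definitions the centre, radius, central impedance, and anisotropy angle do not change when the tensor is rotated, which is what makes them usable as dimensionality indicators independent of the measurement axes.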

    A hybrid approach to dynamic enterprise data platform

    
    Today, corporations aim to make maximum use of the data produced by their business applications. One of the most important goals is to convert that data into commercial benefit as quickly as possible. For this purpose, it is critical to receive the data from source systems, process it, and use it to support business decisions. There are many approaches to the process of acquiring, processing, and making the data useful. In this study, we drew on most of the existing approaches and produced a hybrid solution. This solution can be integrated with new data sources very quickly and, by using open source software components, reduces the time required for data integration, preprocessing, deduplication, and entity mapping.

    Intelligent mapping for hotel records representing the same entity

    
    In the tourism sector, where the number of hotels grows day by day, dealing with every hotel individually is almost impossible. Travel agencies therefore work with online hotel providers that have agreements with many hotels around the world. While working with such providers spares agencies a major challenge, it replaces that challenge with another: duplicate hotel records coming from different sources. Duplicate records may carry identical information, or may carry different information while still representing the same hotel. To keep the hotel database consistent, records that represent the same hotel must be matched and merged. In this study, we propose a set of methods to solve this problem. Working on hotel records, we applied machine learning algorithms that use the similarity of categorical (string) and visual (image) data to candidate records that had undergone address enrichment and preprocessing, taking prior methods as a baseline. The proposed method matched 14,985 hotels in 132,287 rows of hotel data with 99.12% accuracy.
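
    As a rough illustration of the string-similarity side of this approach, the sketch below turns a pair of hotel records into similarity features that a classifier could consume. The field names (`name`, `address`, `city`), the sample records, and the choice of `difflib` and token Jaccard similarity are assumptions made for illustration; the abstract does not specify the features or algorithms used, and image similarity is not covered here.

```python
import re
from difflib import SequenceMatcher

def normalise(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())

def string_similarity(a, b):
    """Character-level similarity in [0, 1] between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

def token_jaccard(a, b):
    """Share of distinct tokens the two strings have in common."""
    ta, tb = set(normalise(a).split()), set(normalise(b).split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def pair_features(rec_a, rec_b):
    """Feature dictionary for one candidate pair (hypothetical field names)."""
    return {
        "name_sim": string_similarity(rec_a["name"], rec_b["name"]),
        "address_jaccard": token_jaccard(rec_a["address"], rec_b["address"]),
        "same_city": 1.0 if normalise(rec_a["city"]) == normalise(rec_b["city"]) else 0.0,
    }

# Hypothetical records from two different providers.
hotel_a = {"name": "Grand Hotel Izmir", "address": "Ataturk Cad. No 12, Konak", "city": "Izmir"}
hotel_b = {"name": "Grand Izmir Hotel", "address": "Ataturk Cad No:12 Konak", "city": "izmir"}
hotel_c = {"name": "Blue Sea Resort", "address": "Sahil Yolu 5, Bodrum", "city": "Mugla"}

features_ab = pair_features(hotel_a, hotel_b)
features_ac = pair_features(hotel_a, hotel_c)
```

    Feature vectors of this kind, computed for labelled matching and non-matching pairs, would be the input to whichever learning algorithm is chosen.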

    Near duplicate detection in relational databases

    
    As the amount of data grows, the number of duplicate records in relational databases grows with it. These duplicates can cause inconsistencies in the reports and analyses that use them. To minimize this problem, we aim to detect duplicate records with machine learning algorithms, using as features the similarities between records together with weights determined from domain expertise. As a result, 28,412 duplicate pairs were detected in 9,301,467 rows of data. The detected duplicates are removed from the data source, making the data more consistent.
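
    The idea of combining per-field similarities with expert-chosen weights can be sketched as follows. This is a simplified stand-in, not the study's method: a fixed threshold replaces the machine learning classifier, and the column names, weights, threshold, and sample rows are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical weights reflecting how much each column is trusted.
WEIGHTS = {"name": 0.5, "email": 0.3, "phone": 0.2}

def field_similarity(a, b):
    """Similarity in [0, 1] between two field values, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def weighted_similarity(row_a, row_b, weights=WEIGHTS):
    """Weighted average of the per-field similarities."""
    total = sum(weights.values())
    score = sum(w * field_similarity(row_a[f], row_b[f]) for f, w in weights.items())
    return score / total

def near_duplicate_pairs(rows, threshold=0.9):
    """Return index pairs whose weighted similarity reaches the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if weighted_similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs

# Hypothetical rows; the first two differ only by a typo and letter case.
rows = [
    {"name": "Ayse Yilmaz", "email": "ayse.yilmaz@example.com", "phone": "5551234567"},
    {"name": "Ayse Yilmas ", "email": "AYSE.YILMAZ@example.com", "phone": "5551234567"},
    {"name": "Mehmet Demir", "email": "mdemir@example.org", "phone": "5559876543"},
]
duplicates = near_duplicate_pairs(rows)
```

    Comparing every pair of rows is quadratic in the number of rows, so on a table with millions of rows the candidate pairs would have to be restricted before scoring.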