139 research outputs found

    Crime Prediction using Machine Learning with a Novel Crime Dataset

    Crime is an unlawful act that carries legal repercussions. Bangladesh has a high crime rate due to poverty, population growth, and many other socio-economic issues. For law enforcement agencies, understanding crime patterns is essential for preventing future criminal activity. For this purpose, these agencies need a structured crime database. This paper introduces a novel crime dataset that contains temporal, geographic, weather, and demographic data about 6574 crime incidents in Bangladesh. We manually gather crime news articles spanning seven years from a daily newspaper archive and extract basic features from the raw text. Using these basic features, we then consult standard providers of geo-location and weather data to obtain this information for the collected crime incidents. Furthermore, we collect demographic information from Bangladesh National Census data. All this information is combined into a standard machine learning dataset. In total, 36 features are engineered for the crime prediction task. Five supervised machine learning classification algorithms are then evaluated on this newly built dataset, and satisfactory results are achieved. We also conduct exploratory analysis on various aspects of the dataset. This dataset is expected to serve as the foundation for crime incidence prediction systems for Bangladesh and other countries. The findings of this study will help law enforcement agencies to forecast and contain crime as well as to ensure optimal resource allocation for crime patrol and prevention. Comment: 24 pages

    MOLECULAR CHARACTERIZATION OF MAJOR VECTOR MOSQUITOES OF BANGLADESH

    Mosquito-borne diseases are considered major contributors to vector-borne diseases, threatening more than eighty per cent of the global population. Pest management depends on proper identification techniques. The barcode region of the cytochrome oxidase subunit I (COI) gene of mitochondrial DNA has recently been proposed as a systematic tool for species definition in taxonomy and evolutionary studies. This work is the first attempt to identify the main vector mosquito species of Bangladesh based on the MT-COI gene. Eleven vector mosquito species were identified. AT content (69%) was found to be higher than GC content (31%) in the COI barcode region of the mosquitoes. The interspecific genetic divergence of medically important mosquitoes ranged from 0.01 to 0.21. Haplotype analysis revealed that Mansonia annulifera diverged most from its immediate ancestor, with the highest number of mutational steps (59). Phylogenetic analysis indicated that species belonging to the same family were in the same major clade. Overall, our findings contribute to a better method of identifying major vector mosquito species using COI genes and to implementing management measures against mosquito pests in Bangladesh.
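
The two sequence statistics quoted in this abstract (AT versus GC content, and pairwise divergence) can be illustrated with a minimal sketch. It assumes pre-aligned sequences of equal length and uses the uncorrected p-distance (the proportion of differing sites); the abstract does not say which distance model the authors used, and the function names and toy sequences are hypothetical.

```python
def at_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are A or T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def p_distance(a: str, b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Assumption: this is the simple proportion of differing sites, not
    necessarily the model used in the paper. Sites where either sequence
    has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy (made-up) aligned fragments, not real COI barcodes.
seq_a = "ATGCATGCAT"
seq_b = "ATGCATGTAT"
print(at_content(seq_a))         # share of A/T bases in seq_a
print(p_distance(seq_a, seq_b))  # share of differing sites
```

GC content is simply one minus the AT content when only unambiguous bases are counted, which matches the 69% and 31% split reported above.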

    Application of RFLP-PCR-Based Identification for Sand Fly Surveillance in an Area Endemic for Kala-Azar in Mymensingh, Bangladesh

    Mymensingh is the most endemic district for kala-azar in Bangladesh. Phlebotomus argentipes remains the only known vector, although a number of sand fly species are prevalent in this area. A genotyping method for sand flies distributed in a VL-endemic area was developed using PCR and restriction-fragment-length polymorphism (RFLP) analysis of the 18S rRNA gene of sand fly species. Using RFLP-PCR analysis with the AfaI and HinfI restriction enzymes, P. argentipes, P. papatasi, and Sergentomyia species could be identified. Among 1,055 female sand flies successfully analyzed individually for species identification, 64.4% were classified as Sergentomyia species and 35.6% were identified as P. argentipes; no P. papatasi was found. Although Leishmania infection within the sand flies was examined individually by targeting leishmanial minicircle DNA, none of the 1,055 sand flies examined were positive for Leishmania infection. RFLP-PCR could be a useful tool for taxonomic identification and Leishmania infection monitoring in endemic areas of Bangladesh.
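
The idea behind RFLP typing, that species-specific sequence differences create or remove restriction sites and therefore change the fragment-length pattern, can be sketched as a toy in-silico digest. The recognition sites used here (AfaI: GT^AC, HinfI: G^ANTC) are the commonly documented ones; the input sequences are made up, only the top strand of a linear molecule is considered, and nothing below reproduces the authors' actual 18S rRNA amplicons or banding patterns.

```python
import re

# Recognition pattern and cut offset (bases from the start of the site
# to the cut position on the top strand). Assumed from standard enzyme
# references, not from the abstract itself.
SITES = {
    "AfaI": ("GTAC", 2),         # GT^AC, blunt cut
    "HinfI": ("GA[ACGT]TC", 1),  # G^ANTC
}


def fragment_lengths(seq: str, enzyme: str) -> list[int]:
    """Lengths of fragments from digesting a linear sequence."""
    pattern, offset = SITES[enzyme]
    seq = seq.upper()
    # A lookahead finds every site, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={pattern})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons: one carries an AfaI site, the other does not.
print(fragment_lengths("AAAAGTACTTTT", "AfaI"))  # two fragments
print(fragment_lengths("AAAAGTTCTTTT", "AfaI"))  # uncut, one fragment
```

Two amplicons that differ at a single base inside a recognition site give different fragment patterns, which is what allows species to be told apart on a gel.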

    Bengali Fake Review Detection using Semi-supervised Generative Adversarial Networks

    This paper investigates the potential of semi-supervised Generative Adversarial Networks (GANs) to fine-tune pretrained language models in order to distinguish Bengali fake reviews from real reviews with only a small amount of annotated data. With the rise of social media and e-commerce, the ability to detect fake or deceptive reviews is becoming increasingly important in order to protect consumers from being misled by false information. Any machine learning model will have trouble identifying a fake review, especially for a low-resource language like Bengali. We demonstrate that the proposed semi-supervised GAN-LM architecture (a generative adversarial network on top of a pretrained language model) is a viable solution for classifying Bengali fake reviews: the experimental results suggest that, even with only 1024 annotated samples, BanglaBERT with semi-supervised GAN (SSGAN) achieved an accuracy of 83.59% and an F1-score of 84.89%, outperforming other pretrained language models (BanglaBERT generator, Bangla BERT Base, and Bangla-Electra) by almost 3%, 4%, and 10%, respectively, in terms of accuracy. The experiments were conducted on a manually labeled food review dataset consisting of a total of 6014 real and fake reviews collected from various social media groups. Researchers who have difficulty not just with recognizing fake reviews but with other classification problems owing to a lack of labeled data may find a solution in our proposed methodology.

    Evaluation of rK-39 strip test using urine for diagnosis of visceral leishmaniasis in an endemic area in Bangladesh

    Diagnosis of visceral leishmaniasis (VL) by demonstration of parasites in tissue smears obtained from bone marrow, spleen, or lymph nodes is risky, painful, and difficult. The rK-39 strip test is widely used for the diagnosis of VL using blood/serum samples in endemic countries. The aim of the study was to evaluate the rK-39 strip test using urine samples as a non-invasive means for the diagnosis of VL. The rK-39 strip test was performed using urine from 100 suspected VL cases along with 25 disease controls (malarial febrile cases) and 50 healthy controls (from endemic and non-endemic areas). All the suspected VL cases were positive with the rK-39 strip test using serum. The sensitivity and specificity of the rK-39 strip test using urine samples were 95% and 93.3%, respectively, compared to the serum-based rK-39 test. The findings suggest that the urine-based rK-39 test could be a practical and efficient tool for the diagnosis of VL patients in rural areas, particularly where resources are limited.
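
The reported sensitivity and specificity can be reproduced with a short worked sketch. The abstract gives only the group sizes (100 serum-positive cases, 75 controls) and the two percentages; the cell counts below (95 urine-positive cases, 70 urine-negative controls) are inferred as the counts consistent with those figures and are an assumption, not values stated by the authors.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of reference-positive samples that test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of reference-negative samples that test negative."""
    return true_neg / (true_neg + false_pos)


# Inferred counts (assumption): 100 cases, 25 + 50 = 75 controls.
cases_urine_pos, cases_urine_neg = 95, 5
controls_urine_neg, controls_urine_pos = 70, 5

print(round(100 * sensitivity(cases_urine_pos, cases_urine_neg), 1))
print(round(100 * specificity(controls_urine_neg, controls_urine_pos), 1))
```

With these counts, 95 of 100 gives 95% and 70 of 75 gives 93.3%, matching the abstract.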

    Transformer-Based Deep Learning Model for Stock Price Prediction: A Case Study on Bangladesh Stock Market

    In modern capital markets, the price of a stock is often considered to be highly volatile and unpredictable because of various social, financial, political, and other dynamic factors. With calculated and thoughtful investment, the stock market can yield a handsome profit with minimal capital investment, while an incorrect prediction can easily bring catastrophic financial loss to investors. This paper applies a recently introduced machine learning model, the Transformer, to predict the future price of stocks on the Dhaka Stock Exchange (DSE), the leading stock exchange in Bangladesh. The Transformer model has been widely leveraged for natural language processing and computer vision tasks but, to the best of our knowledge, has never been used for stock price prediction at DSE. The recent introduction of time2vec encoding to represent time series features has made it possible to employ the Transformer model for stock price prediction. This paper concentrates on the application of a Transformer-based model to predict the price movement of eight specific stocks listed on DSE based on their historical daily and weekly data. Our experiments demonstrate promising results and acceptable root mean squared error on most of the stocks. Comment: 16 Pages, 14 Figures (including some containing subfigures)
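
A minimal sketch of the two quantities named in this abstract may help: a time2vec-style encoding (one linear component plus sine components of a scalar time value) and the root mean squared error used to score predictions. In a real model the frequencies and phases are learned parameters; here they are fixed, hypothetical values, and none of this reproduces the authors' architecture or data.

```python
import math


def time2vec(tau: float, omegas: list[float], phis: list[float]) -> list[float]:
    """Encode a scalar time as [linear term, sin term, sin term, ...].

    Component 0 is omegas[0] * tau + phis[0]; every later component is
    sin(omega_i * tau + phi_i). Parameters are fixed for illustration.
    """
    if not omegas or len(omegas) != len(phis):
        raise ValueError("omegas and phis must be non-empty and equal length")
    linear = omegas[0] * tau + phis[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omegas[1:], phis[1:])]
    return [linear] + periodic


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root mean squared error between two equal-length series."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical day index and made-up closing prices.
print(time2vec(3.0, omegas=[0.1, 1.0, 0.5], phis=[0.0, 0.0, 0.2]))
print(rmse([100.0, 102.0, 101.0], [99.0, 103.5, 101.0]))
```

The linear term lets the encoding track trend, while the sine terms capture periodic patterns such as weekly cycles.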

    An effective hotel recommendation system through processing heterogeneous data

    Recommendation systems have recently gained a lot of popularity in various industries such as entertainment and tourism. They can act as filters of information by providing relevant suggestions to users through processing heterogeneous data from different networks. Many travelers and tourists routinely rely on textual reviews, numerical ratings, and points of interest to select hotels in cities worldwide. To attract more customers, online hotel booking systems typically rank their hotels based on recommendations from their customers. In this paper, we present a framework that ranks hotels by analyzing their customer reviews and nearby amenities, combining the scores generated from user reviews and surrounding facilities. We perform experiments using datasets from online hotel booking platforms such as TripAdvisor and Booking to evaluate the effectiveness and applicability of the proposed framework. We first store the keywords extracted from reviews and assign a weight to each considered unigram and bigram keyword; we then give a numerical score to each considered keyword. Finally, our proposed system aggregates the scores generated from the reviews with those from different categories of surrounding facilities. Experimental results confirm the effectiveness of the proposed recommendation framework.
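
The scoring steps described in this abstract (weighting unigram and bigram keywords, scoring a review, then combining with an amenity score) can be sketched roughly as follows. The keyword weights, the amenity score, the mixing parameter `alpha`, and all function names are hypothetical; the abstract does not specify how the authors derive weights or combine the two scores, so a simple weighted average is assumed.

```python
def ngrams(tokens: list[str], n: int) -> list[str]:
    """All contiguous n-token phrases, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def review_score(text: str, keyword_weights: dict[str, float]) -> float:
    """Sum the weights of every known unigram and bigram in the review.

    Tokenization is a plain lowercase whitespace split; punctuation
    handling is left out to keep the sketch short.
    """
    tokens = text.lower().split()
    candidates = ngrams(tokens, 1) + ngrams(tokens, 2)
    return sum(keyword_weights.get(phrase, 0.0) for phrase in candidates)


def combined_score(review: float, amenity: float, alpha: float = 0.5) -> float:
    """Weighted average of review and amenity scores (assumed scheme)."""
    return alpha * review + (1 - alpha) * amenity


# Hypothetical keyword weights and amenity score for one hotel.
weights = {"clean": 1.0, "friendly staff": 2.0, "noisy": -1.5}
score = review_score("Clean rooms and friendly staff", weights)
print(combined_score(score, amenity=1.0))
```

Hotels would then be ranked by sorting on the combined score; in practice the two inputs would need to be normalized to a common scale first.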