
    A Knowledge-Based Model For Context-Aware Smart Service Systems

    The advancement of the Internet of Things, big data, and mobile computing leads to the need for smart services that are context-aware and adaptable to their changing environments. Today, designing a smart service system is a complex task due to the lack of adequate model support for awareness in pervasive environments. In this paper, we present the concept of a context-aware smart service system and propose a knowledge model for such systems. The proposed model organizes domain and context-aware knowledge into knowledge components across three service levels: service, service system, and network of service systems. The knowledge model integrates all the information and knowledge related to smart services, knowledge components, and context awareness, and can play a key role in any framework, infrastructure, or application deploying smart services. To demonstrate the approach, two case studies of chatbots as context-aware smart services for customer support are presented.
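    As a hedged illustration of the three service levels described above, the sketch below models knowledge components attached to services in Python; all class and field names (KnowledgeComponent, SmartService, adapt, and so on) are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of the three service levels; names are illustrative,
# not the paper's API.

@dataclass
class KnowledgeComponent:
    """Bundles domain knowledge with the context attributes it depends on."""
    name: str
    domain_facts: Dict[str, str] = field(default_factory=dict)
    context_attributes: List[str] = field(default_factory=list)  # e.g. ["locale"]

@dataclass
class SmartService:
    """Level 1: a single service adapting to its observed context."""
    name: str
    knowledge: List[KnowledgeComponent] = field(default_factory=list)

    def adapt(self, context: Dict[str, str]) -> List[str]:
        # Select the knowledge components whose required context is available.
        return [k.name for k in self.knowledge
                if all(attr in context for attr in k.context_attributes)]

@dataclass
class ServiceSystem:
    """Level 2: a system composed of cooperating services."""
    services: List[SmartService] = field(default_factory=list)

@dataclass
class ServiceNetwork:
    """Level 3: a network of interoperating service systems."""
    systems: List[ServiceSystem] = field(default_factory=list)

# Usage: a support chatbot that activates its FAQ knowledge only when the
# user's locale is known.
faq = KnowledgeComponent("faq_answers", context_attributes=["locale"])
bot = SmartService("support_chatbot", knowledge=[faq])
print(bot.adapt({"locale": "en", "time": "09:00"}))  # ['faq_answers']
```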

    Mining high-utility itemsets with negative unit profits in vertically distributed databases

    High-utility itemset (HUI) mining is an important problem in the data mining literature that considers the utilities to businesses of items (such as profits and margins) discovered from transactional databases. Many algorithms mine HUIs by pruning candidates based on estimated utility and transaction-weighted utilization values, all aiming to reduce the search space. In this paper, we propose a method for mining HUIs with negative unit profits from vertically distributed databases. The method does not integrate the participants' local databases into a centralized database, and it scans each participant's database only once. Experiments show that the run-time of this method is better than that of mining on a centralized database.
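    To make the utility notion concrete, here is a minimal, centralized Python sketch of high-utility itemset mining with negative unit profits; the profit table, transactions, and threshold are invented, and unlike the paper's distributed method this toy enumerates itemsets exhaustively over a single database.

```python
from itertools import combinations

# External utility (unit profit) per item; note the negative profit of 'c'.
profit = {"a": 4, "b": 1, "c": -2}

# Transactions: item -> purchased quantity (internal utility).
db = [
    {"a": 2, "c": 3},
    {"a": 1, "b": 5, "c": 1},
    {"b": 2},
]

def utility(itemset, tx):
    """Utility of an itemset in one transaction (0 if not contained)."""
    if not all(i in tx for i in itemset):
        return 0
    return sum(profit[i] * tx[i] for i in itemset)

def mine_huis(min_util):
    """Enumerate all itemsets whose total utility meets the threshold."""
    items = sorted(profit)
    huis = []
    for r in range(1, len(items) + 1):
        for iset in combinations(items, r):
            total = sum(utility(iset, tx) for tx in db)
            if total >= min_util:
                huis.append((iset, total))
    return huis

# Itemsets containing 'c' lose utility because of its negative unit profit.
print(mine_huis(min_util=8))  # [(('a',), 12), (('a', 'b'), 9)]
```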

    Efficient streaming algorithms for maximizing monotone DR-submodular function on the integer lattice

    In recent years, the problem of maximizing submodular functions has attracted much interest from research communities. However, most submodular functions are specified as set functions, while recent advances have studied maximizing a diminishing-return submodular (DR-submodular) function on the integer lattice. Many publications show that the DR-submodular function has wide applications in optimization problems such as sensor placement, optimal budget allocation, social networks, and especially machine learning. In this research, we propose two streaming algorithms for the problem of maximizing a monotone DR-submodular function under a cardinality constraint. The two algorithms, called StrDRS1 and StrDRS2, have approximation ratios of (1/2 - epsilon) and (1 - 1/e - epsilon) with query complexities of O((n/epsilon) log(log(B/epsilon)) log k) and O((n/epsilon) log B), respectively. We conducted several experiments to investigate the performance of our algorithms on the budget allocation problem over the bipartite influence model, an instance of monotone submodular function maximization over the integer lattice. The experimental results indicate that our proposed algorithms not only provide solutions with a high objective value but also outperform the state-of-the-art algorithms in terms of both the number of queries and the running time.
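    The following is a simplified single-pass threshold sketch in the spirit of streaming DR-submodular maximization on the integer lattice; it is not StrDRS1 or StrDRS2, and the fixed threshold, toy objective, and parameter names are assumptions for illustration only.

```python
import math

def stream_maximize(elements, f, B, k, threshold):
    """Greedily raise coordinates of x (a lattice point) as elements stream
    by, whenever the marginal gain of one unit meets `threshold`.

    elements : iterable of coordinate ids, seen one at a time
    f        : monotone DR-submodular objective over dict lattice points
    B        : per-coordinate cap (box constraint)
    k        : cardinality budget, sum(x) <= k
    """
    x = {}
    for e in elements:
        while sum(x.values()) < k and x.get(e, 0) < B:
            gain = f({**x, e: x.get(e, 0) + 1}) - f(x)
            if gain < threshold:
                break  # DR property: further increments of e gain even less
            x[e] = x.get(e, 0) + 1
    return x

# Toy objective: concave-over-modular, hence monotone DR-submodular.
weights = {"u": 3.0, "v": 1.0, "w": 2.0}
f = lambda x: sum(math.sqrt(x.get(e, 0)) * weights[e] for e in weights)

print(stream_maximize(["u", "v", "w"], f, B=4, k=6, threshold=0.5))
# {'u': 4, 'v': 1, 'w': 1}
```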

    Estimating the size of open-source PHP-based apps by nonlinear regression models with various factors

    Prykhodko, S. B., Vorona, M. V. Estimating the size of open-source PHP-based apps by nonlinear regression models with various factors // Collection of Scientific Publications of NUK. – Mykolaiv : NUK, 2021. – No. 1 (484). – P. 92–98.
    Abstract. The problem of estimating software size in the early stage of a software project is important because the size estimate is used for predicting software development effort, including for open-source PHP-based apps. The purpose of the work is to increase the accuracy of early software size estimation for open-source PHP-based apps. The object of study is the process of estimating the software size of open-source PHP-based apps. The subject of study is three-factor nonlinear regression models with various factors for estimating the software size of open-source PHP-based apps. To build the three-factor nonlinear regression models, we use a technique based on multivariate normalizing transformations and prediction intervals. The models are constructed using the four-variate Johnson normalizing transformation for the SB family on a non-Gaussian data set of 44 apps hosted on GitHub. The data set was obtained using the PhpMetrics tool (https://phpmetrics.org/). The three-factor nonlinear regression models are built around class diagram metrics: the number of classes, the average number of methods per class, the sum of average afferent and average efferent coupling per class, and the mean DIT (depth of inheritance tree) per class. To compare the prediction accuracy of the models, we used well-known accuracy metrics: the multiple coefficient of determination R², the mean magnitude of relative error MMRE, and the prediction percentage at a relative error level of 0.25, PRED(0.25). The nonlinear regression model built around the number of classes, the average number of methods per class, and the mean DIT per class has a larger PRED(0.25) value and about the same R² and MMRE values as the model in which the third factor is the sum of average afferent and efferent coupling per class. The scientific novelty of the results is that the three-factor nonlinear regression model for estimating the software size of open-source PHP-based apps has been improved by introducing a new factor, the mean DIT per class, which increased the PRED(0.25) value by 8%. The practical importance of the results is that software realizing the constructed model has been developed in the sci-language for Scilab.
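    A small Python sketch of the three accuracy metrics named above (MMRE, PRED(0.25), R²), evaluated on hypothetical actual versus predicted app sizes; the numbers are invented for illustration.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def pred(actual, predicted, level=0.25):
    """Share of predictions whose relative error is within `level`."""
    hits = sum(abs(a - p) / a <= level for a, p in zip(actual, predicted))
    return hits / len(actual)

def r_squared(actual, predicted):
    """Multiple coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual    = [12000, 8500, 20300, 4700]   # hypothetical app sizes (LOC)
predicted = [11000, 9800, 19000, 6100]

print(f"MMRE       = {mmre(actual, predicted):.3f}")        # 0.150
print(f"PRED(0.25) = {pred(actual, predicted):.2f}")        # 0.75
print(f"R^2        = {r_squared(actual, predicted):.3f}")   # 0.952
```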

    A hedge-algebras-based classification reasoning method with multi-granularity fuzzy partitioning

    In recent years, many fuzzy rule-based classifier (FRBC) design methods have been proposed to improve the classification accuracy and interpretability of the resulting classification models. Most of them follow the fuzzy set theoretic approach, in which fuzzy classification rules are generated from grid partitions combined with pre-designed fuzzy partitions using fuzzy sets. Some mechanisms have been studied to generate fuzzy partitions from data automatically, such as discretization and granular computing. Even so, linguistic terms are assigned to fuzzy sets intuitively, because there is no formalism linking the inherent semantics of linguistic terms to fuzzy sets. In view of that trend, genetic methods for designing linguistic terms together with their (triangular and trapezoidal) fuzzy-set-based semantics for FRBCs, using hedge algebras as the mathematical formalism, have been proposed. Those hedge-algebras-based design methods utilize the semantically quantifying mapping values of linguistic terms to generate their fuzzy-set-based semantics, so as to make use of the fuzzy-set-based classification reasoning methods proposed in fuzzy set theoretic design methods for data classification. If there exists a classification reasoning method that relies merely on the semantic parameters of hedge algebras, the fuzzy-set-based semantics of the linguistic terms in fuzzy classification rule bases can be replaced by hedge-algebras-based semantics. This paper presents an FRBC design method following the hedge algebras approach by introducing a hedge-algebras-based classification reasoning method with multi-granularity fuzzy partitioning for data classification, so that the semantics of the linguistic terms in rule bases can be hedge-algebras-based semantics. Experimental results over 17 real-world datasets, compared to existing methods based on hedge algebras and state-of-the-art fuzzy set theoretic approaches, show that the proposed FRBC is an effective classifier and produces good results.
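    As a rough illustration of fuzzy-rule classification reasoning over a triangular partition, the Python sketch below fires weighted rules by the product of their antecedent memberships; the hedge-algebras machinery (semantically quantifying mappings, multi-granularity partitions) is abstracted away, and all linguistic terms, rule weights, and cut points are hypothetical.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with peak b on [a, c]."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# A coarse partition of a normalized attribute by three linguistic terms.
# In a hedge-algebras design, each peak would come from a semantically
# quantifying mapping rather than being fixed by hand.
terms = {
    "small":  (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "large":  (0.5, 1.0, 1.0),
}

# Rule base: (term per attribute) -> class label, with a rule weight.
rules = [
    ({"x1": "small",  "x2": "large"},  "class_A", 0.9),
    ({"x1": "medium", "x2": "medium"}, "class_B", 0.8),
]

def classify(sample):
    """Single-winner reasoning: each rule fires with the product of its
    antecedent memberships times its weight; the strongest rule labels."""
    best_label, best_strength = None, 0.0
    for antecedent, label, weight in rules:
        strength = weight
        for attr, term in antecedent.items():
            strength *= triangular(sample[attr], *terms[term])
        if strength > best_strength:
            best_label, best_strength = label, strength
    return best_label

print(classify({"x1": 0.2, "x2": 0.9}))  # class_A
```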