271 research outputs found

    Feature Set Selection for Improved Classification of Static Analysis Alerts

    Get PDF
    With the extreme growth of third-party cloud applications, the increased exposure of applications to the internet, and the impact of successful breaches, improving the security of the software being produced is imperative. Static analysis tools can alert developers to quality and security vulnerabilities in an application; however, they present developers and analysts with a high rate of false positives and unactionable alerts. This can lead to a loss of confidence in the scanning tools, possibly resulting in the tools not being used, and discontinued use of these tools may increase the likelihood of insecure software being released into production. Insecure software can be successfully attacked, compromising one or more information security principles such as confidentiality, availability, and integrity. Feature selection methods have the potential to improve the classification of static analysis alerts and thereby reduce false positive rates. Thus, the goal of this research effort was to improve the classification of static analysis alerts by proposing and testing a novel method leveraging feature selection. The proposed model was developed and subsequently tested on three open source PHP applications spanning several years. The results were compared to a classification model using all features to gauge the improvement achieved by the feature selection model. The model presented improved classification accuracy and reduced the false positive rate on a reduced feature set. This work contributes a real-world static analysis dataset based on three open source PHP applications. It also enhances an existing dataset generation framework to include additional predictive software features. The main contribution, however, is a feature selection methodology that may be used to discover optimal feature sets that increase the classification accuracy of static analysis alerts.
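    A minimal sketch of the kind of comparison this abstract describes - a classifier trained on all alert features versus one trained on a reduced, selected feature set, scored on accuracy and false positive rate - might look as follows in Python with scikit-learn. The data, feature counts, and the univariate selector here are illustrative assumptions, not the thesis's actual dataset or selection method.

        # Illustrative sketch only: synthetic stand-in for per-alert features
        # (alert type, code metrics, churn, ...) labelled actionable (1) or
        # false positive (0); the real study uses three open source PHP apps.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.metrics import accuracy_score, confusion_matrix
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=2000, n_features=40, n_informative=8,
                                   n_redundant=20, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        def false_positive_rate(y_true, y_pred):
            tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
            return fp / (fp + tn)

        # Baseline: classify using every available feature.
        base = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
        pred = base.predict(X_te)
        print("all features:      acc=%.3f  fpr=%.3f"
              % (accuracy_score(y_te, pred), false_positive_rate(y_te, pred)))

        # Reduced feature set chosen by a univariate filter (one of many options).
        selector = SelectKBest(mutual_info_classif, k=10).fit(X_tr, y_tr)
        sel = RandomForestClassifier(random_state=0).fit(selector.transform(X_tr), y_tr)
        pred = sel.predict(selector.transform(X_te))
        print("selected features: acc=%.3f  fpr=%.3f"
              % (accuracy_score(y_te, pred), false_positive_rate(y_te, pred)))

    Wrapper methods such as recursive feature elimination, or a search-based selector, could be swapped in for SelectKBest; the point is only that the all-features model and the reduced-feature model are evaluated on the same held-out alerts.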

    Predicting companies stock price direction by using sentiment analysis of news articles

    Full text link
    This paper summarizes our experience teaching several courses in the Computer Science department of Boston University's Metropolitan College over five years. A number of innovative teaching techniques are presented, and we specifically address the role of a project archive when designing a course. The paper explores survey results from every offering of these courses from 2014 to 2019. During each class, students participated in two distinct surveys: the first dealing with key learning outcomes and the second with the teaching techniques used. The paper makes several practical recommendations based on the analysis of the collected data. The research validates the value of a sound repository of technical term projects and the role such a repository plays in the effective teaching and learning of computer science courses. Published version

    Atas das Oitavas Jornadas de Informática da Universidade de Évora

    Get PDF
    Proceedings of the Eighth Jornadas de Informática da Universidade de Évora, held in March 2018.

    Framework of Six Sigma implementation analysis on SMEs in Malaysia for information technology services, products and processes

    Get PDF
    For the past two decades, the majority of Malaysia's IT companies have widely adopted a Quality Assurance (QA) approach as a basis for self-improvement and internal assessment in IT project management. Quality Control (QC) is a comprehensive top-down observation approach used to fulfill requirements for quality outputs, focusing on the evaluation of process outputs. However, in the Malaysian context, QC, and the combination of QA and QC, have not received significant attention as quality improvement approaches. This research study explores the possibility of integrating QC and QA+QC approaches through the Six Sigma quality management standard to provide tangible and measurable business results through continuous process improvement that boosts customer satisfaction. The research project adopted an exploratory case study approach on three Malaysian IT companies in the business areas of IT Process, IT Service and IT Product. Semi-structured interviews, online surveys, self-administered questionnaires, job observations, document analysis and on-the-job training are among the methods employed in these case studies. The collected data and viewpoints, along with findings from an extensive literature review, were used to benchmark quality improvement initiatives and best practices and to develop a Six Sigma framework for SMEs in the Malaysian IT industry. The project contributes to both the theory and practice of implementing and integrating Six Sigma in IT products, services and processes. The newly developed framework proved capable of supporting a general and fundamental start-up decision by demonstrating how companies with and without a formal QIM can integrate and implement Six Sigma practices to close the variation gap between QA and QC. The framework also allows companies with an existing QIM to migrate to a new one without having to drop what they already have: a new QIM is integrated that addresses most weaknesses of the current QIM while retaining most of its strengths in current business routines. The framework further explores how Six Sigma can be expanded to include secondary external factors that are critical to successful QIM implementation. A vital segment emphasizes Six Sigma as a QA+QC approach in IT processes; the ability to properly manage IT processes results in overall performance improvement for IT Products and IT Services. The developed Six Sigma implementation framework can serve as a baseline for SMEs to better manage, control and track business performance and product quality, and at the same time creates clearer insights and unbiased views of Six Sigma implementation in the IT industry, driving towards operational excellence.
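    As background for the "tangible and measurable business results" the framework targets, standard Six Sigma arithmetic converts observed defects into defects per million opportunities (DPMO) and an approximate sigma level. The Python sketch below shows that conventional calculation with made-up figures; it is context only, not part of the framework described above.

        # Standard DPMO / sigma-level arithmetic (illustrative figures only).
        from statistics import NormalDist

        def dpmo(defects, units, opportunities_per_unit):
            """Defects per million opportunities."""
            return defects / (units * opportunities_per_unit) * 1_000_000

        def sigma_level(dpmo_value, shift=1.5):
            """Short-term sigma level using the conventional 1.5-sigma shift."""
            return NormalDist().inv_cdf(1 - dpmo_value / 1_000_000) + shift

        d = dpmo(defects=87, units=5_000, opportunities_per_unit=3)
        print(f"DPMO = {d:.0f}, sigma level ~ {sigma_level(d):.2f}")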

    Advanced Information Systems and Technologies

    Get PDF
    This book comprises the proceedings of the VI International Scientific Conference "Advanced Information Systems and Technologies, AIST-2018". The papers cover issues related to system analysis and modeling, project management, information system engineering, intelligent data processing, computer networking and telecommunications, and modern methods and information technologies for sustainable development. They will be useful for students, graduate students, and researchers interested in computer science.

    Layout Optimization for Distributed Relational Databases Using Machine Learning

    Get PDF
    A common problem when running Web-based applications is how to scale up the database. The solution usually involves a skilled database administrator determining how to spread the database tables across computers that will work in parallel. Laying out database tables across multiple machines so that they act together as a single efficient database is hard, and automated methods are needed to reduce the time database administrators spend creating optimal configurations. We consider four operators that generate a search space of possible database layouts: 1) denormalizing, 2) horizontally partitioning, 3) vertically partitioning, and 4) fully replicating. Textbooks offer general advice that is useful for extreme cases - for instance, you should fully replicate a table if the ratio of inserts to selects is close to zero. But even this seemingly obvious rule will not necessarily lead to a speed-up once you take into account that some nodes might become a bottleneck, and there can be complex interactions between the four operators that make it even harder to predict the best course of action. Instead of relying on best practices for database layout, we need a system that collects empirical data on when these four operators are effective. We implemented a state-based search technique to try different operators and then used the empirically measured data to see whether any speed-up occurred. The cost of creating each physical database layout is potentially large, but it is necessary because we want ground truth about what is effective and under what conditions. After creating a dataset in which these four operators have been applied to produce different databases, we can employ machine learning to induce rules governing the physical design of the database across an arbitrary number of computer nodes. This learning process, in turn, allows the database placement algorithm to improve over time as it trains over a set of examples. The algorithm aims to learn: 1) What is a good database layout for a particular application given a query workload? and 2) Can the algorithm automatically improve its recommendations by using machine-learned rules to generalize when it makes sense to apply each operator? Considerable research has been done on parallelizing databases in which large amounts of data are shipped from one node to another to answer a single query. Because the cost of shipping data back and forth can be high, in this work we assume it may be more efficient to create a database layout in which each query can be answered by a single node. This assumption requires that all incoming query templates are known beforehand, a requirement easily satisfied for Web-based applications because users typically interact with the system through a web interface such as web forms. In this case, unseen queries are not necessarily answerable without possibly reconstructing the data on a single machine. Prior knowledge of the exact query templates allows us to select the best possible database table placements across multiple nodes. But in the case of improving the efficiency of a Web-based application, a web site provider might be willing to suffer the inconvenience of not being able to answer an arbitrary query if they are in turn provided with a system that runs more efficiently.
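    The search-and-measure loop the abstract outlines - propose a layout change with one of the four operators, build it, time the query workload, keep the change if it helps, and record the measurement as a training example for rule learning - could be sketched roughly as below. The Layout representation, the greedy strategy, and the measure_workload hook are assumptions made for illustration; the dissertation's actual implementation may differ.

        # Rough sketch of a state-based search over physical layouts.
        # measure_workload(layout) is an assumed callback that would build the
        # layout and return the measured runtime of the known query templates.
        from dataclasses import dataclass

        OPERATORS = ("denormalize", "horizontal_partition",
                     "vertical_partition", "replicate")

        @dataclass(frozen=True)
        class Layout:
            # One (table, operator) decision per table transformed so far.
            placements: tuple = ()

        def neighbours(layout, tables):
            """Apply each operator to each table not yet transformed."""
            done = {t for t, _ in layout.placements}
            for table in tables:
                if table in done:
                    continue
                for op in OPERATORS:
                    yield Layout(layout.placements + ((table, op),))

        def greedy_layout_search(tables, measure_workload, max_steps=10):
            """Hill-climb on measured workload time; every (layout, cost) pair
            doubles as a training example for the rule-learning step."""
            current, best_cost = Layout(), measure_workload(Layout())
            for _ in range(max_steps):
                scored = [(measure_workload(n), n) for n in neighbours(current, tables)]
                if not scored:
                    break
                cost, layout = min(scored, key=lambda pair: pair[0])
                if cost >= best_cost:
                    break  # no operator improved the measured runtime
                best_cost, current = cost, layout
            return current, best_cost

    A learner (for example, a decision tree over workload and table statistics) could then be trained on the recorded (layout, cost) pairs to generalize when each operator tends to pay off.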

    2019 EC3 July 10-12, 2019 Chania, Crete, Greece

    Get PDF