944 research outputs found

    WAQS : a web-based approximate query system

    The Web is often viewed as a gigantic database holding vast stores of information and providing ubiquitous accessibility to end-users. Since its inception, the Internet has experienced explosive growth in both the number of users and the amount of content available on it. However, searching for information on the Web has become increasingly difficult. Although query languages have long been part of database management systems, the standard query language, the Structured Query Language (SQL), is not well suited to Web content retrieval. In this dissertation, a new technique for document retrieval on the Web is presented. This technique is designed to allow detailed retrieval and hence reduce the number of matches returned by typical search engines. Its main objective is to allow queries to be based not just on keywords but also on the location of the keywords within the logical structure of a document. In addition, the technique provides approximate search capabilities based on the notions of Distance and Variable Length Don't Cares. The proposed techniques have been implemented in a system called the Web-Based Approximate Query System, which contains an SQL-like query language called the Web-Based Approximate Query Language. The Web-Based Approximate Query Language has also been integrated with EnviroDaemon, an environmental domain-specific search engine, providing EnviroDaemon with more detailed searching capabilities than keyword-based search alone. Implementation details, technical results and future work are presented in this dissertation.
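    A minimal sketch of the approximate-matching idea described above, assuming a query syntax where `*` is a Variable Length Don't Care and a simple edit-distance tolerance stands in for the Distance notion; the function names and query syntax are illustrative, not WAQS's actual implementation.

    ```python
    import re

    def vldc_to_regex(pattern: str) -> str:
        """Translate a VLDC pattern into a regex: '*' matches any span."""
        parts = [re.escape(p) for p in pattern.split("*")]
        return ".*".join(parts)

    def edit_distance(a: str, b: str) -> int:
        """Classic dynamic-programming Levenshtein distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def approx_match(pattern: str, text: str, max_dist: int = 1) -> bool:
        """Exact VLDC match, or every literal part near some word in text."""
        if re.search(vldc_to_regex(pattern), text):
            return True
        words = text.split()
        return all(any(edit_distance(part, w) <= max_dist for w in words)
                   for part in pattern.split("*") if part)
    ```

    The regex translation handles the structural wildcard, while the edit-distance fallback tolerates small spelling variations in the keywords themselves.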

    Design and Analysis of a Dynamically Configured Log-based Distributed Security Event Detection Methodology

    Military and defense organizations rely upon the security of data stored in, and communicated through, their cyber infrastructure to fulfill their mission objectives. It is essential to identify threats to the cyber infrastructure in a timely manner, so that mission risks can be recognized and mitigated. Centralized event logging and correlation is a proven method for identifying threats to cyber resources. However, centralized event logging is inflexible and does not scale well, because it consumes excessive network bandwidth and imposes significant storage and processing requirements on the central event log server. In this paper, we present a flexible, distributed event correlation system designed to overcome these limitations by distributing the event correlation workload across the network of event-producing systems. To demonstrate the utility of the methodology, we model and simulate centralized, decentralized, and hybrid log analysis environments over three accountability levels and compare their performance in terms of detection capability, network bandwidth utilization, database query efficiency, and configurability. The results show that, when compared to centralized event correlation, dynamically configured distributed event correlation provides increased flexibility, a significant reduction in network traffic in low- and medium-accountability environments, and a decrease in database query execution time in the high-accountability case.
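    The bandwidth argument above can be illustrated with a toy comparison: in the centralized case every raw event crosses the network, while in the distributed case each host correlates locally and forwards only compact alerts. The event fields, correlation rule, and threshold are hypothetical, not the paper's actual methodology.

    ```python
    from collections import Counter

    def correlate(evs, threshold=3):
        """Flag (type, user) pairs with repeated login failures."""
        counts = Counter((e["type"], e["user"]) for e in evs)
        return [k for k, n in counts.items()
                if k[0] == "login_fail" and n >= threshold]

    def centralized(events):
        """Ship every raw event to the server; correlate there."""
        return len(events), correlate(events)   # all events cross the network

    def distributed(events, threshold=3):
        """Each host correlates its own events; only alerts are forwarded."""
        per_host = {}
        for e in events:
            per_host.setdefault(e["host"], []).append(e)
        alerts = []
        for evs in per_host.values():
            alerts.extend(correlate(evs, threshold))
        return len(alerts), alerts              # only alerts cross the network

    events = [
        {"host": "h1", "type": "login_fail", "user": "root"},
        {"host": "h1", "type": "login_fail", "user": "root"},
        {"host": "h1", "type": "login_fail", "user": "root"},
        {"host": "h2", "type": "login_ok",   "user": "alice"},
    ]
    ```

    With this data, the centralized scheme sends four messages to the server while the distributed scheme sends one alert, which is the kind of traffic reduction the simulations quantify.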

    Cloud service discovery and analysis: a unified framework

    Over the past few years, cloud computing has become more and more attractive as a new computing paradigm, due to its high flexibility in provisioning on-demand computing resources that are consumed as services over the Internet. The issues around cloud service discovery have been considered by many researchers in recent years. However, because cloud services are highly dynamic and distributed, lack standardized description languages, are offered at different levels in diverse forms, and are non-transparent in nature, this research area has gained significant attention. Robust cloud service discovery approaches will not only assist the promotion and growth of cloud service customers and providers, but will also make a meaningful contribution to the acceptance and development of cloud computing. In this dissertation, we propose an automated approach to the discovery of cloud services. We have also conducted extensive experiments to validate the proposed approach. The results demonstrate its applicability and its capability to effectively identify and categorize cloud services on the Internet. First, we develop a novel approach to building a cloud service ontology. The ontology is initially built from the National Institute of Standards and Technology (NIST) cloud computing standard. We then add new concepts to the ontology by automatically analyzing real cloud services with a cloud service ontology algorithm. We also propose a cloud service categorization that uses term frequency to weigh cloud service ontology concepts and cosine similarity to measure the similarity between cloud services. The categorization algorithm groups cloud services into clusters for effective categorization. In addition, we use machine learning techniques to identify cloud services in real environments.
Our cloud service identifier is built using cloud service features extracted from real cloud service providers. We determine several features, such as a similarity function, the semantic ontology, cloud service descriptions and cloud service components, to be used in effectively identifying cloud services on the Web. We also build a unified model that exposes a cloud service's features to a search user, easing searching and comparison among a large number of cloud services by building cloud service profiles. Furthermore, we develop a cloud service discovery engine capable of automatically crawling the Web and collecting cloud services. The collected datasets include metadata on nearly 7,500 real-world cloud service providers and nearly 15,000 services (2.45 GB). The experimental results show that our approach i) is able to effectively build the cloud service ontology automatically, ii) is robust in identifying cloud services in real environments, and iii) is scalable in providing more details about cloud services. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201
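    The term-frequency weighting and cosine-similarity step described above can be sketched as follows; the service descriptions and concept labels are made-up examples, not the dissertation's ontology.

    ```python
    import math
    from collections import Counter

    def tf_vector(text: str) -> Counter:
        """Term-frequency vector over whitespace-tokenized lowercase terms."""
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        """Cosine similarity between two sparse term-frequency vectors."""
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    # Hypothetical ontology concepts and an unclassified service description.
    iaas = tf_vector("virtual machines storage compute infrastructure")
    saas = tf_vector("hosted email application software subscription")
    query = tf_vector("compute storage infrastructure service")

    # Assign the service to the closest ontology concept.
    best = max([("IaaS", iaas), ("SaaS", saas)],
               key=lambda c: cosine(query, c[1]))
    ```

    Clustering for categorization then amounts to grouping services whose pairwise cosine similarity exceeds a chosen threshold.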

    NewsView: A Recommender System for Usenet based on FAST Data Search

    This thesis combines aspects of two approaches to information access, information filtering and information retrieval, in an effort to improve the signal-to-noise ratio in interfaces to conversational data. These two ideas are blended into one system by augmenting a search engine indexing Usenet messages with concepts and ideas from recommender systems theory. My aim is to achieve a situation where the overall result relevance is improved by exploiting the qualities of both approaches. Important issues in this context are obtaining ratings, evaluating relevance rankings and the application of useful user profiles. An architecture called NewsView has been designed as part of the work on this thesis. NewsView describes a framework for interfaces to Usenet with information retrieval and information filtering concepts built into it, as well as extensive navigational possibilities within the data. My aim with this framework is to provide a testbed for user interface, information filtering and information retrieval issues and, most importantly, combinations of the three.
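    One simple way to blend the two approaches, offered here as an illustrative sketch rather than NewsView's actual ranking function, is a linear combination of a search engine's relevance score with a collaborative rating; the weighting scheme and scores below are assumptions.

    ```python
    def blended_score(relevance: float, rating: float, alpha: float = 0.7) -> float:
        """Linear blend: alpha weights retrieval relevance, (1 - alpha)
        the collaborative rating. Both inputs assumed normalized to [0, 1]."""
        return alpha * relevance + (1 - alpha) * rating

    # (message id, retrieval relevance, peer rating) -- made-up values.
    results = [
        ("msg-1", 0.90, 0.20),   # relevant but poorly rated by peers
        ("msg-2", 0.70, 0.95),   # slightly less relevant, highly rated
    ]
    ranked = sorted(results, key=lambda r: blended_score(r[1], r[2]),
                    reverse=True)
    ```

    Here the highly rated message overtakes the marginally more relevant one, which is exactly the kind of re-ranking a filtering component contributes on top of pure retrieval.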

    Real-Time Detection System for Suspicious URLs

    Twitter is prone to malicious tweets containing URLs for spam, phishing, and malware distribution. Conventional Twitter spam detection schemes utilize account features, such as the ratio of tweets containing URLs and the account creation date, or relation features in the Twitter graph. These detection schemes are ineffective against feature fabrication or consume much time and many resources. Conventional suspicious URL detection schemes utilize several features, including lexical features of URLs, URL redirection, HTML content, and dynamic behavior. However, evasion techniques such as time-based evasion and crawler evasion exist. In this paper, we propose WARNINGBIRD, a real-time suspicious URL detection system for Twitter. Our system investigates correlations of URL redirect chains extracted from several tweets. Because attackers have limited resources and usually reuse them, their URL redirect chains frequently share the same URLs. We develop methods to discover correlated URL redirect chains using the frequently shared URLs and to determine their suspiciousness. We collect numerous tweets from the Twitter public timeline and build a statistical classifier using them. Evaluation results show that our classifier accurately and efficiently detects suspicious URLs.
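    The shared-URL correlation idea can be sketched as follows: URLs that appear in many distinct redirect chains mark those chains as correlated and therefore suspicious. The chains, domains, and threshold are made-up examples, not WARNINGBIRD's actual features or classifier.

    ```python
    from collections import Counter
    from itertools import chain as iterchain

    redirect_chains = [
        ["http://t.co/a", "http://bit.ly/x", "http://evil.example/land"],
        ["http://t.co/b", "http://bit.ly/y", "http://evil.example/land"],
        ["http://t.co/c", "http://tinyurl.com/z", "http://evil.example/land"],
        ["http://t.co/d", "http://news.example/story"],
    ]

    def shared_urls(chains, min_count=3):
        """URLs appearing in at least min_count distinct chains."""
        counts = Counter(iterchain.from_iterable(set(c) for c in chains))
        return {u for u, n in counts.items() if n >= min_count}

    def suspicious_chains(chains, min_count=3):
        """Chains touching any frequently shared URL."""
        hot = shared_urls(chains, min_count)
        return [c for c in chains if hot & set(c)]
    ```

    Counting over `set(c)` rather than the raw chain ensures a URL repeated within one chain counts only once, so the threshold measures reuse across tweets.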

    Engineering an Open Web Syndication Interchange with Discovery and Recommender Capabilities

    Web syndication has become a popular means of delivering relevant information to people online, but the complexity of standards, algorithms and applications poses considerable challenges to engineers. This paper describes the design and development of a novel Web-based syndication intermediary called InterSynd, and a simple Web client as a proof of concept. We developed format-neutral middleware that sits between content sources and the user. Additional objectives were to add feed discovery and recommendation components to the intermediary. A search-based feed discovery module helps users find relevant feed sources. Implicit collaborative recommendations of new feeds are also made to the user. The syndication software uses open-standard XML technologies and free open-source libraries. Extensibility and re-configurability were explicit goals. The experience shows that a modular architecture can combine open source modules to build state-of-the-art syndication middleware and applications. The data produced by software metrics indicate the high degree of modularity retained.

    Machining-based coverage path planning for automated structural inspection

    The automation of robotically delivered nondestructive evaluation inspection shares many aims with traditional machining in manufacturing. This paper presents a new hardware and software system for automated thickness mapping of large-scale areas, with multiple obstacles, by employing computer-aided design (CAD)/computer-aided manufacturing (CAM)-inspired path planning to implement control of a novel mobile robotic thickness mapping inspection vehicle. A custom postprocessor provides the necessary translation from CAM numeric code through robotic kinematic control to combine and automate the overall process. The generalized steps to implement this approach for any mobile robotic platform are presented herein and applied, in this instance, to a novel thickness mapping crawler. The inspection capabilities of the system were evaluated in an indoor mock-inspection scenario, within a motion tracking cell, to provide quantitative performance figures for positional accuracy. Multiple thickness defects simulating corrosion features on a steel sample plate were combined with obstacles to be avoided during the inspection. A minimum thickness mapping error of 0.21 mm and a mean path error of 4.41 mm were observed for a 2 m² carbon steel sample of 10-mm nominal thickness. The potential of this automated approach has benefits in terms of repeatability of area coverage, obstacle avoidance, and reduced path overlap, all of which directly lead to increased task efficiency and reduced inspection time of large structural assets.
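    A toy version of CAM-style raster coverage with obstacle avoidance can be sketched on a grid; the grid dimensions and obstacle set are hypothetical, and real CAD/CAM planners work in continuous space with tool-path postprocessing rather than discrete cells.

    ```python
    def coverage_path(width, height, obstacles):
        """Boustrophedon (lawnmower) sweep over a grid, skipping obstacle
        cells: even rows left-to-right, odd rows right-to-left."""
        path = []
        for y in range(height):
            xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
            path.extend((x, y) for x in xs if (x, y) not in obstacles)
        return path

    # Two blocked cells standing in for an obstacle on the plate.
    obstacles = {(2, 1), (2, 2)}
    path = coverage_path(5, 4, obstacles)
    ```

    The serpentine ordering minimizes path overlap between adjacent passes, which is the property the paper's CAM-inspired planner exploits for efficient area coverage.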