202 research outputs found
Sentiment-Focused Web Crawling
TÜBİTAK EEEAG Project 01.05.201
Automating price matching on e-commerce websites using natural language processing : a postgraduate project dissertation presented in partial fulfilment of the requirements for the degree of Masters in Information Technology at Massey University, Auckland, New Zealand
With the development of the internet, online shopping has become an important part of
daily life. Global B2C e-commerce turnover grew by 24.0% to reach 1,943 billion dollars
in 2014. Customers face a large amount of information while shopping online, and
companies also need to capture information about their competitors. In the case
studied here, a company wanted a simple way to monitor the prices of equivalent
products on competitors' websites. Based on the development of the e-commerce
platform, and after analyzing the requirements of companies and customers, we
propose a framework for e-commerce website data extraction, data storage, and product
matching. We build a customized web crawler to crawl the products on e-commerce
websites and extract the product details for matching. Finally, we achieved an average
matching rate of 87.18% after applying an enhanced TF-IDF algorithm with weight adjustment.
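To make the matching step concrete, the following is a minimal sketch of TF-IDF title matching with a per-token weight multiplier. It is not the dissertation's algorithm: the tokenizer, the smoothed IDF formula, the `boost` mapping (a hypothetical stand-in for the "weight adjustment"), and the sample product titles are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Lowercase a product title and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", title.lower())


def tfidf_vectors(titles, boost=None):
    """Build one sparse TF-IDF vector (token -> weight) per title.

    `boost` maps a token to a multiplier; it is a hypothetical stand-in
    for the weight adjustment mentioned in the abstract.
    """
    boost = boost or {}
    docs = [Counter(tokenize(t)) for t in titles]
    n_docs = len(docs)
    doc_freq = Counter(token for doc in docs for token in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vec = {}
        for token, count in doc.items():
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + doc_freq[token])) + 1.0
            vec[token] = (count / total) * idf * boost.get(token, 1.0)
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query, candidates, boost=None):
    """Return the candidate title most similar to `query` and its score."""
    vectors = tfidf_vectors([query] + candidates, boost)
    scores = [cosine(vectors[0], v) for v in vectors[1:]]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Hypothetical titles from two shops.
competitor_titles = [
    "Acme X200 mouse wireless (black)",
    "Acme X300 wireless mouse black",
    "Globex wired keyboard",
]
match, score = best_match("Acme X200 wireless mouse black", competitor_titles)
```

Boosting tokens that look like model numbers (for example `boost={"x200": 3.0}`) is one plausible way to keep near-identical titles of different models apart.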
Random Projection in Deep Neural Networks
This work investigates the ways in which deep learning methods can benefit
from random projection (RP), a classic linear dimensionality reduction method.
We focus on two areas where, as we have found, employing RP techniques can
improve deep models: training neural networks on high-dimensional data and
initialization of network parameters. Training deep neural networks (DNNs) on
sparse, high-dimensional data with no exploitable structure implies a network
architecture with an input layer that has a huge number of weights, which often
makes training infeasible. We show that this problem can be solved by
prepending the network with an input layer whose weights are initialized with
an RP matrix. We propose several modifications to the network architecture and
training regime that make it possible to efficiently train DNNs with a learnable
RP layer on data with as many as tens of millions of input features and
training examples. In comparison to the state-of-the-art methods, neural
networks with an RP layer achieve competitive performance or improve the results
on several extremely high-dimensional real-world datasets. The second area
where the application of RP techniques can be beneficial for training deep
models is weight initialization. Setting the initial weights in DNNs to
elements of various RP matrices enabled us to train residual deep networks to
higher levels of performance.
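As an illustration of the first idea, the sketch below initializes the weights of an input layer with a Gaussian random projection matrix and applies it to a sparse input. The Gaussian construction, the `1/sqrt(k)` scaling, the dictionary representation of sparse vectors, and all function names are assumptions; the abstract does not say which RP matrices the authors use, and no training is shown here.

```python
import math
import random


def make_rp_matrix(n_features, n_components, seed=0):
    """Gaussian RP matrix with entries drawn from N(0, 1/n_components)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
            for _ in range(n_features)]


def project(sparse_x, rp_matrix):
    """Forward pass of a linear input layer on a sparse vector.

    `sparse_x` maps feature index -> value, so only the rows of the
    weight matrix that belong to non-zero features are touched.
    """
    n_components = len(rp_matrix[0])
    out = [0.0] * n_components
    for index, value in sparse_x.items():
        row = rp_matrix[index]
        for j in range(n_components):
            out[j] += value * row[j]
    return out


# 1000 input features reduced to 32 dimensions (sizes chosen arbitrarily).
weights = make_rp_matrix(n_features=1000, n_components=32, seed=1)
hidden = project({3: 1.0, 500: 2.0}, weights)
```

In a learnable RP layer these weights would then be updated by the optimizer like any other layer; a fixed RP layer would leave them frozen.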
FuzzTheREST - Intelligent Automated Blackbox RESTful API Fuzzer
In recent years, technology has become deeply intertwined with human life, impacting diverse fields. This relationship has evolved into a dependency, with software systems playing a pivotal role that requires a high level of trust. Today, a substantial portion of software is accessed through Application Programming Interfaces, particularly web APIs, which predominantly adhere to the Representational State Transfer (REST) architecture. However, this architectural choice introduces a wide range of potential vulnerabilities that are exposed at the network level. The significance of software testing becomes evident when considering the widespread use of software in daily tasks that affect personal safety and security, making the identification and assessment of faulty software of paramount importance. In this thesis, FuzzTheREST, a black-box RESTful API fuzzy testing framework, is introduced with the primary aim of addressing the challenges associated with understanding the context of each system under test and conducting comprehensive automated testing using diverse inputs. Operating from a black-box perspective, this fuzzer leverages Reinforcement Learning (RL) to efficiently uncover vulnerabilities in RESTful APIs by optimizing input values and combinations, relying on mutation methods for input exploration. The system's value is further enhanced by providing the user with a thoroughly documented vulnerability discovery process. This proposal stands out for its emphasis on explainability and the application of RL to learn the context of each API, thus eliminating the need for source code knowledge and expediting the testing process. The developed solution adheres rigorously to software engineering best practices and incorporates a novel Reinforcement Learning algorithm, comprising a customized environment for API Fuzzy Testing and a Multi-table Q-Learning Agent.
The quality and applicability of the developed tool are also assessed, based on the results achieved in two case studies involving the Petstore API and an Emotion Detection module that was part of the CyberFactory#1 European research project. The results demonstrate the tool's effectiveness in discovering vulnerabilities, having found 7 different vulnerabilities, and the agents' ability to learn different API contexts from API responses while maintaining reasonable code coverage levels.
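The idea of a Q-learning agent that keeps one Q-table per input parameter and chooses mutations from API responses can be sketched as follows. This is not FuzzTheREST's algorithm: the toy system under test (`toy_api`), the mutation set, the reward (1 for a server error, 0 otherwise), the use of the last status code as the state, and the hyperparameters are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical mutation operators for integer inputs.
MUTATIONS = {
    "keep": lambda v: v,
    "negate": lambda v: -v,
    "grow": lambda v: v * 1000 + 1,
    "zero": lambda v: 0,
}
ACTIONS = list(MUTATIONS)


def toy_api(params):
    """Hypothetical system under test; returns an HTTP-like status code."""
    if params["quantity"] < 0:
        return 500  # simulated unhandled server error
    if params["pet_id"] == 0:
        return 404
    return 200


def fuzz(episodes=200, steps=5, epsilon=0.2, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # One Q-table per input parameter: q[param][(state, action)] -> value.
    q = {name: defaultdict(float) for name in ("pet_id", "quantity")}
    findings = set()
    for _ in range(episodes):
        params = {"pet_id": 1, "quantity": 1}
        state = 200  # state = status code of the previous response
        for _ in range(steps):
            chosen = {}
            for name in params:
                if rng.random() < epsilon:  # explore
                    action = rng.choice(ACTIONS)
                else:  # exploit the current estimate
                    action = max(ACTIONS, key=lambda a: q[name][(state, a)])
                chosen[name] = action
                params[name] = MUTATIONS[action](params[name])
            status = toy_api(params)
            reward = 1.0 if status >= 500 else 0.0
            if status >= 500:
                findings.add((chosen["pet_id"], chosen["quantity"], status))
            # Standard Q-learning update, applied to each parameter's table.
            for name, action in chosen.items():
                best_next = max(q[name][(status, a)] for a in ACTIONS)
                key = (state, action)
                q[name][key] += alpha * (reward + gamma * best_next - q[name][key])
            state = status
    return q, findings


q_tables, findings = fuzz()
```

Each entry in `findings` records which mutation was applied to each parameter when a server error occurred, which is one simple way to keep the discovery process traceable for the user.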
Multi agent system for web database processing, on data extraction from online social networks.
In recent years, there has been a flood of continuously changing information
from a variety of web resources such as web databases, web sites,
web services and programs. Online Social Networks (OSNs) represent
such a field where huge amounts of information are being posted online
over time. Due to the nature of OSNs, which offer a productive source
for qualitative and quantitative personal information, researchers from
various disciplines contribute to developing methods for extracting data
from OSNs. However, there is limited research which addresses extracting
data automatically. To the best of the author's knowledge, there
is no research which focuses on tracking the real time changes of information
retrieved from OSN profiles over time and this motivated the
present work.
This thesis presents different approaches for automated Data Extraction
(DE) from OSN: crawler, parser, Multi Agent System (MAS) and Application
Programming Interface (API). Initially, a parser was implemented
as a centralized system to traverse the OSN graph and extract the
profile's attributes and list of friends from Myspace, the top OSN at that
time, by parsing the Myspace profiles and extracting the relevant tokens
from the parsed HTML source files. A Breadth First Search (BFS) algorithm
was used to travel across the generated OSN friendship graph
in order to select the next profile for parsing. The approach was implemented
and tested on two types of friends: top friends and all friends.
In case of top friends, 500 seed profiles have been visited; 298 public
profiles were parsed to get 2197 top friends profiles and 2747 friendship
edges, while in case of all friends, 250 public profiles have been parsed
to extract 10,196 friends' profiles and 17,223 friendship edges.
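The BFS traversal described above can be sketched as follows. The `fetch_friends` callback is hypothetical: it stands in for downloading and parsing a profile page, and the small in-memory graph replaces a real OSN. Returning `None` for private profiles is likewise an assumption.

```python
from collections import deque


def bfs_crawl(seed, fetch_friends, max_profiles=100):
    """Visit profiles breadth-first; return parsed profiles and friendship edges.

    `fetch_friends(profile_id)` is a hypothetical callback that returns the
    friend list of a public profile, or None when the profile is private.
    """
    parsed = []
    edges = set()
    seen = {seed}
    queue = deque([seed])
    while queue and len(parsed) < max_profiles:
        profile = queue.popleft()
        friends = fetch_friends(profile)
        if friends is None:  # private profile: nothing to parse
            continue
        parsed.append(profile)
        for friend in friends:
            # Store each undirected friendship edge once.
            edges.add(tuple(sorted((profile, friend))))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return parsed, edges


# Toy friendship graph; "c" is a private profile.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": None, "d": ["b"]}
parsed, edges = bfs_crawl("a", graph.get)
```

Because the queue is first-in, first-out, all direct friends of the seed are parsed before any friend of a friend, which matches the level-by-level order of BFS.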
This approach has two main limitations. The system is designed as
a centralized system that controlled and retrieved information of each
user's profile just once. This means that the extraction process will stop
if the system fails to process one of the profiles; either the seed profile
(first profile to be crawled) or its friends. To overcome this problem,
an Online Social Network Retrieval System (OSNRS) is proposed to
decentralize the DE process from OSN through using MAS. The novelty
of OSNRS is its ability to monitor profiles continuously over time.
The second challenge is that the parser had to be modified to cope with
changes in the profiles' structure. To overcome this problem, the proposed
OSNRS is improved through use of an API tool to enable OSNRS
agents to obtain the required fields of an OSN profile despite modifications
in the representation of the profile's source web pages. The experimental
work shows that using API and MAS simplifies and speeds up the
process of tracking a profile's history. It also helps security personnel,
parents, guardians, social workers and marketers in understanding the
dynamic behaviour of OSN users. This thesis proposes solutions for web
database processing on data extraction from OSNs by the use of parser
and MAS and discusses the limitations and improvements.
Taibah Universit
Adaptive and learning-based formation control of swarm robots
Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines global flocking maintenance, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
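For readers unfamiliar with particle swarm optimization, the following is a minimal sketch of the standard global-best PSO update on a toy objective. It does not reproduce the thesis's robot communication scheme; the objective function, the coefficients (`inertia`, `c1`, `c2`), the bounds, and the swarm size are all assumptions.

```python
import random


def pso(objective, dim, n_particles=20, iterations=100,
        inertia=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


solution, value = pso(sphere, dim=2)
```

In a robot swarm, each particle would correspond to a robot and the shared global best would be the information exchanged between robots; that mapping is an interpretation, not something the abstract specifies.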
Using MapReduce Streaming for Distributed Life Simulation on the Cloud
Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
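One generation of discrete Conway's life can be expressed as a map step and a reduce step, sketched below. This is a plain in-process illustration, not the authors' optimized algorithms: a real streaming job would exchange text lines through standard input and output, the dictionary grouping here merely stands in for the shuffle phase, and neither strip partitioning nor the continuous version is shown.

```python
from collections import defaultdict


def mapper(live_cells):
    """Map step: emit a marker for each live cell and a 1 to each neighbour."""
    for (x, y) in live_cells:
        yield (x, y), "alive"
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    yield (x + dx, y + dy), 1


def reducer(cell, values):
    """Reduce step: apply Conway's rules to one cell's grouped values."""
    alive = "alive" in values
    neighbours = sum(v for v in values if v != "alive")
    if neighbours == 3 or (alive and neighbours == 2):
        return cell
    return None


def life_step(live_cells):
    """Run map, group by key (stand-in for shuffle), then reduce."""
    groups = defaultdict(list)
    for key, value in mapper(live_cells):
        groups[key].append(value)
    survivors = (reducer(cell, values) for cell, values in groups.items())
    return {cell for cell in survivors if cell is not None}


blinker = {(0, 1), (1, 1), (2, 1)}  # horizontal line of three live cells
next_generation = life_step(blinker)
```

Only live cells and their neighbours ever produce keys, so the work per generation scales with the number of live cells rather than with the size of the lattice.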