
    AUTHENTICATION OF K NEAREST NEIGHBOR QUERY ON ROAD NETWORKS

    ABSTRACT This work focuses specifically on k-nearest-neighbor (kNN) query verification on road networks and designs verification schemes that support both distance verification and path verification. That is, the k resulting objects have the shortest distances to the query point among all objects in the database, and the path from the query point to each k-nearest-neighbor result is the valid shortest path on the network. To verify a kNN query result on a road network, a naïve solution would be to return the whole road network and the point of interest (POI) dataset to the client to show the correctness and completeness of the result.
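
    As a rough illustration of the computation that such verification schemes certify, the hedged Python sketch below runs Dijkstra's algorithm from the query point and returns the k nearest POIs together with their network distances and shortest paths; the cryptographic verification objects that the paper contributes are omitted, and the function names and toy network are illustrative only.

# Hedged sketch: baseline (unverified) kNN over a road network.
# A verification scheme would additionally ship proof objects
# (e.g., signed network fragments) alongside these distances and paths.
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbour, edge_weight), ...]}.
    Returns shortest distances and a predecessor map from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def knn_on_road_network(graph, query, pois, k):
    """Return the k POIs closest to `query` with their network
    distances and shortest paths (the facts a client would verify)."""
    dist, prev = dijkstra(graph, query)
    ranked = sorted((dist[p], p) for p in pois if p in dist)[:k]
    results = []
    for d, p in ranked:
        path, node = [p], p
        while node != query:
            node = prev[node]
            path.append(node)
        results.append((p, d, path[::-1]))
    return results

# Toy example: nodes are intersections, POIs are a subset of nodes.
g = {"A": [("B", 1.0), ("C", 4.0)], "B": [("A", 1.0), ("C", 1.5)],
     "C": [("B", 1.5), ("A", 4.0), ("D", 2.0)], "D": [("C", 2.0)]}
print(knn_on_road_network(g, "A", {"C", "D"}, k=1))  # -> [('C', 2.5, ['A', 'B', 'C'])]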

    A SECURED SEARCHING IN CLOUD DATA USING CRYPTOGRAPHIC TECHNIQUE

    ABSTRACT Cloud computing is the long-dreamed vision of computing as a utility, where users can remotely store their data in the cloud and enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources. Building trusted applications from untrusted components will be a major aspect of secure cloud computing, and many such applications require a proof of result correctness. Cloud computing has been envisioned as the next-generation architecture of the IT enterprise. As it becomes prevalent, more and more sensitive information is being centralized in the cloud. In contrast to traditional solutions, where IT services are under proper physical, logical and personnel controls, cloud computing moves the application software and databases to large data centers, where the management of the data and services may not be fully trustworthy. To protect data privacy, sensitive data usually have to be encrypted before outsourcing, which makes effective data utilization a very challenging task. With the advent of cloud computing, data owners are motivated to outsource their complex data management systems from local sites to commercial public clouds for greater flexibility and economic savings, but encrypting the data for privacy makes traditional data utilization based on plaintext keyword search obsolete. Multistep processing is commonly used for nearest neighbor (NN) and similarity search in applications involving high-dimensional data and/or costly distance computations. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. This paper explores various secured searching algorithms such as the Advanced Encryption Standard (AES), Secure Sockets Layer (SSL), and k-nearest neighbor (k-NN).
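
    As a hedged sketch of the exact-keyword-search setting described above (not the scheme evaluated in the paper), the Python fragment below encrypts file contents with AES-GCM from the third-party cryptography package and builds a deterministic HMAC-based keyword index, so the server can match exact keywords without seeing plaintext; all names and the toy documents are illustrative.

# Hedged sketch: files are AES-encrypted before outsourcing, and a
# deterministic keyword token (HMAC) lets the server match queries
# without seeing plaintext. Illustrative only, not the paper's scheme.
import os, hmac, hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

data_key = AESGCM.generate_key(bit_length=256)   # encrypts file contents
index_key = os.urandom(32)                       # derives keyword tokens

def keyword_token(word: str) -> bytes:
    # Deterministic token: identical keywords map to identical tokens,
    # which is exactly why such schemes only support *exact* matches.
    return hmac.new(index_key, word.lower().encode(), hashlib.sha256).digest()

def encrypt_file(doc_id: str, text: str, keywords: list[str]):
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key).encrypt(nonce, text.encode(), None)
    return {"id": doc_id, "nonce": nonce, "ct": ciphertext,
            "index": {keyword_token(w) for w in keywords}}

def server_search(store, token: bytes):
    # The server only sees opaque tokens and ciphertexts.
    return [rec["id"] for rec in store if token in rec["index"]]

store = [encrypt_file("doc1", "quarterly report", ["report", "finance"]),
         encrypt_file("doc2", "meeting notes", ["notes"])]
print(server_search(store, keyword_token("report")))   # -> ['doc1']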

    Dataset for creating pedotransfer functions to estimate organic liquid retention of soils

    Soil properties characterising pressure-saturation (P-S) relationships, such as fluid retention values or the fitting parameters of retention curves, are basic input parameters for simulating the behaviour and transport of nonaqueous phase liquids (NAPLs) in the subsurface. Recent investigations have shown the limited applicability of the commonly used estimation methods for predicting NAPL retention values in environmental practice. Alternatively, building pedotransfer functions (PTFs) based on the easily measurable properties of soils might give more accurate and reliable results for estimating the hydraulic properties of soils and enable the utilisation of the wide range of data incorporated in Hungarian and international datasets. In spite of the availability of several well-established PTFs for predicting the water retention of soils, only a limited amount of research has been done concerning the NAPL retention of soils. Thus, in our study, data from our recent NAPL and water retention measurements were collected into a dataset that also contains the basic soil properties. Relationships between basic soil properties and the fluid retention of soils with water or an organic liquid (Dunasol 180/220) were investigated with principal component analysis. The NAPL retention of the soil samples was determined with PTFs, based on basic soil properties and their derived values, and with a scaling method. The statistical analysis (SPSS 13.1) revealed that using PTFs could be a promising alternative and could give more accurate results than the scaling method for determining both the NAPL saturation and the volumetric NAPL retention values of soils.
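
    The abstract refers to fitting parameters of retention curves without naming the model; a commonly used parameterisation, given here only as an example and not necessarily the one fitted to this dataset, is the van Genuchten curve for the effective saturation S_e at capillary pressure head h:

\[
S_e(h) = \frac{\theta(h) - \theta_r}{\theta_s - \theta_r}
       = \left[ 1 + (\alpha h)^{n} \right]^{-m},
\qquad m = 1 - \frac{1}{n},
\]

    where theta_r and theta_s are the residual and saturated fluid contents and alpha and n are fitting parameters. For NAPLs, the curve is often obtained by rescaling the pressure head of the air-water curve with the ratio of the fluids' surface/interfacial tensions, which is presumably the kind of "scaling method" the PTF approach is compared against here.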

    Theoretical investigation of the structures of unsupported 38-atom CuPt clusters

    A genetic algorithm has been used to perform a global sampling of the potential energy surface in the search for the lowest-energy structures of unsupported 38-atom Cu–Pt clusters. Structural details of bimetallic Cu–Pt nanoparticles are analyzed as a function of their chemical composition and the parameters of the Gupta potential, which is used to mimic the interatomic interactions. The symmetrical weighting of all parameters used in this work strongly influences the chemical ordering patterns and, consequently, cluster morphologies. The most stable structures are those corresponding to potentials weighted toward Pt characteristics, leading to Cu–Pt mixing for a weighting factor of 0.7. This reproduces density functional theory (DFT) results for Cu–Pt clusters of this size. For several weighting factor values, the Cu30Pt8 cluster exhibits slightly higher relative stability. The copper-rich Cu32Pt6 cluster was reoptimized at the DFT level to validate the reliability of the empirical approach, which predicts a Pt@Cu core-shell segregated cluster. A general increase of interatomic distances is observed in the DFT calculations, which is greater in the Pt core. After cluster relaxation, structural changes are identified through the pair distribution function. For the majority of weighting factors and compositions, the truncated octahedron geometry is energetically preferred at the Gupta potential level of theory
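
    For reference, the Gupta (second-moment approximation) potential named above writes the energy of atom i as

\[
E_i = \sum_{j \neq i} A \, e^{-p \left( r_{ij}/r_0 - 1 \right)}
      \; - \; \sqrt{ \sum_{j \neq i} \xi^{2} \, e^{-2 q \left( r_{ij}/r_0 - 1 \right)} },
\qquad E_{\mathrm{tot}} = \sum_i E_i,
\]

    where A, xi, p, q and r_0 depend on the species pair (Cu-Cu, Pt-Pt or Cu-Pt). The "weighting factor" presumably interpolates the heteronuclear Cu-Pt parameters between the pure-metal values, e.g. p_CuPt = w p_Pt + (1 - w) p_Cu; the exact weighting scheme is an assumption here, not stated in the abstract.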

    Journal of Environmental Geography: Vol. VII, No. 1-2.


    Proceedings of 11th European Congress on Telepathology and 5th International Congress on Virtual Microscopy


    Development and application of distributed computing tools for virtual screening of large compound libraries

    In the current drug discovery process, the identification of new target proteins and potential ligands is tedious, expensive and time-consuming. The use of in silico techniques is therefore of utmost importance and has proved to be a valuable strategy for detecting complex structural and bioactivity relationships. The increasing demand for computational power in scientific fields and the timely analysis of the generated piles of data require innovative strategies for the efficient utilization of distributed computing resources in the form of computational grids. Such grids add a new aspect to the emerging information technology paradigm by providing and coordinating heterogeneous resources such as various organizations, people, computing, storage and networking facilities, as well as data, knowledge, software and workflows. The aim of this study was to develop a university-wide applicable grid infrastructure, UVieCo (University of Vienna Condor pool), which can be used for the implementation of standard structure- and ligand-based drug discovery applications using freely available academic software. Firewall and security issues were resolved with a virtual private network setup, whereas virtualization of the computer hardware was done using the CoLinux concept, which allows Linux-executable jobs to run inside Windows machines. The effectiveness of the grid was assessed by performance measurement experiments using sequential and parallel tasks. Subsequently, the association of expression/sensitivity profiles of ABC transporters with activity profiles of anticancer compounds was analyzed by mining the NCI (National Cancer Institute) dataset. The datasets generated in this analysis were used with ligand-based computational methods such as shape similarity and classification algorithms to identify P-glycoprotein (P-gp) substrates and separate them from non-substrates. While developing predictive classification models, the problem of the extremely imbalanced class distribution was addressed using the cost-sensitive bagging approach. Applicability domain experiments revealed that our model not only predicts NCI compounds well but can also be applied to drug-like molecules. The developed models were relatively simple but precise enough to be applicable to the virtual screening of large chemical libraries for the early identification of P-gp substrates, which can potentially be useful for removing compounds with poor ADMET properties in an early phase of drug discovery. Additionally, shape-similarity and self-organizing map techniques were used to screen an in-house database as well as a large vendor database for novel compounds that are similar to selective serotonin reuptake inhibitors (SSRIs) and can induce apoptosis. The retrieved hits possess novel chemical scaffolds and can be considered starting points for lead optimization studies. The work described in this thesis will be useful for creating a distributed computing environment using the available resources within an organization and can be applied to various tasks such as the efficient handling of imbalanced data classification problems or multistep virtual screening.
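
    A minimal sketch of the cost-sensitive bagging idea mentioned above is given below in Python with scikit-learn; it is not the thesis implementation, and the descriptor matrix X and labels y are random placeholders standing in for molecular descriptors of substrates and non-substrates. Bootstrap samples are drawn, each base decision tree is made cost-sensitive through balanced class weights, and predictions are combined by majority vote.

# Hedged sketch of cost-sensitive bagging for an imbalanced two-class
# problem (substrate vs. non-substrate); illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cost_sensitive_bagging(X, y, n_estimators=25, random_state=0):
    rng = np.random.default_rng(random_state)
    models, n = [], len(y)
    for _ in range(n_estimators):
        idx = rng.choice(n, size=n, replace=True)              # bootstrap sample
        clf = DecisionTreeClassifier(class_weight="balanced")  # cost-sensitive base learner
        clf.fit(X[idx], y[idx])
        models.append(clf)
    return models

def predict_vote(models, X):
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)             # majority vote

# Toy imbalanced data: 95 non-substrates, 5 substrates, 4 descriptors each.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = np.array([0] * 95 + [1] * 5)
X[y == 1] += 2.0                                               # shift minority class
ensemble = cost_sensitive_bagging(X, y)
print(predict_vote(ensemble, X[:3]), predict_vote(ensemble, X[-3:]))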

    Detecting cyberstalking from social media platform(s) using data mining analytics

    Cybercrime is an increasing activity that leads to cyberstalking, which makes the use of data mining algorithms to detect or prevent cyberstalking on social media platforms imperative for this study. The aim of this study was to determine the prevalence of cyberstalking on social media platforms, using Twitter as the case study. To achieve this objective, machine learning models that perform data mining, alongside security metrics, were used to detect cyberstalking on social media platforms; the derived security metrics were used to flag any suspicious cyberstalking content. Two datasets of detailed tweets were analysed using NVivo and R. The dominant occurrence of cyberstalking was assessed using fifteen unigrams identified from the preliminary dataset: “abuse”, “annoying”, “creep or creepy”, “fear”, “follow or followers”, “gender”, “harassment”, “messaging”, “relationships p/p”, “scared”, “stalker”, “technology”, “unwanted”, “victim”, and “violent”. Ordinal regression was used to analyse the use of the fifteen unigrams, which were categorised according to their degree of relationship or link to cyberstalking on Twitter. Moreover, two lightweight machine learning algorithms, k-nearest neighbour (KNN) and k-means clustering, were coded in R for the extraction, refinement, analysis and visualisation processes of this research, and their performance was used to showcase cyberstalking-indicative content. Results showed that emotional terms such as “bad”, “sad” and “hate” were attached to the unigrams linked to cyberstalking. Each emotional term was flagged in correspondence with one of the fifteen unigrams in tweets containing cyberstalking-indicative content, indicating that one tends to accompany the other. K-means clustering results showed that the terms “bad” and “sad” appeared in 100 percent of the clustering results, while the term “hate” appeared in only 60 percent. Results also revealed that the accuracy of the KNN algorithm was up to 40% when predicting key-term-based cyberstalking content in a real Twitter dataset consisting of 1m data points. This study emphasises the continuous relationship between the fifteen unigrams, the emotional terms, and the tweets across the datasets examined in this research, and reveals that cyberstalking-indicative content does in fact occur on Twitter at a substantial rate, together with the corresponding links or relationships relevant to the detection of cyberstalking.
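
    The study's clustering step was implemented in R; the hedged Python sketch below reproduces the general idea with scikit-learn: tweets are converted to counts of a subset of the study's unigrams and emotional terms and then grouped with k-means, so that tweets carrying the flagged vocabulary fall into the same cluster. The example tweets, term subset and cluster count are illustrative placeholders, not data from the study.

# Hedged sketch (Python rather than the R used in the study): count
# occurrences of flagged terms per tweet, then cluster with k-means.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans

unigrams = ["abuse", "annoying", "creepy", "fear", "follow", "harassment",
            "stalker", "unwanted", "victim", "violent"]
emotional_terms = ["bad", "sad", "hate"]

tweets = [
    "he keeps trying to follow me everywhere, it feels so creepy and bad",
    "great day at the park with friends",
    "I hate this stalker, I feel unwanted and sad",
]

vectoriser = CountVectorizer(vocabulary=unigrams + emotional_terms)
X = vectoriser.fit_transform(tweets)                  # tweet x term counts

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for text, label in zip(tweets, kmeans.labels_):
    print(label, text[:50])
# Tweets containing the flagged vocabulary should land in the same
# cluster, mirroring how indicative content was grouped in the study.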