
    Failure-awareness and dynamic adaptation in data scheduling

    Over the years, scientific applications have become more complex and more data-intensive. Large-scale simulations and scientific experiments in areas such as physics, biology, astronomy, and earth sciences in particular demand highly distributed resources to satisfy their excessive computational requirements. Increasing data requirements and the distributed nature of the resources have made I/O the major bottleneck for end-to-end application performance. Existing systems fail to address issues such as reliability, scalability, and efficiency in wide-area data access, retrieval, and processing. In this study, we explore data-intensive distributed computing and the challenges of data placement in distributed environments. After analyzing different application scenarios, we develop new data scheduling methodologies and identify the key attributes for reliability, adaptability, and performance optimization of distributed data placement tasks. Inspired by techniques used in microprocessor and operating system architectures, we extend and adapt known low-level data handling and optimization techniques to distributed computing. The two major contributions of this work are (i) a failure-aware data placement paradigm for increased fault tolerance, and (ii) adaptive scheduling of data placement tasks for improved end-to-end performance. Failure-aware data placement includes early error detection, error classification, and use of this information in scheduling decisions to prevent and recover from possible future errors. The adaptive scheduling approach dynamically tunes data transfer parameters over wide-area networks for efficient utilization of available network capacity and optimized end-to-end data transfer performance.
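    A minimal sketch of what such a failure-aware policy could look like follows below. It is an illustrative assumption, not the scheduler developed in the study: the transfer callable, the replica list, and the error-to-class mapping are all hypothetical.

```python
import errno
import time
from enum import Enum


class ErrorClass(Enum):
    TRANSIENT = 1        # timeout, reset: retry the same source with backoff
    SOURCE_SPECIFIC = 2  # e.g. permission denied at one replica: switch replicas
    PERMANENT = 3        # anything else: recovery is pointless, fail fast


def classify(err: OSError) -> ErrorClass:
    """Early error classification: map a low-level transfer error to a
    class the scheduler can act on before the whole job fails."""
    if err.errno in (errno.ETIMEDOUT, errno.ECONNRESET, errno.EAGAIN):
        return ErrorClass.TRANSIENT
    if err.errno in (errno.EACCES, errno.EPERM):
        return ErrorClass.SOURCE_SPECIFIC
    return ErrorClass.PERMANENT


def schedule_transfer(transfer, replicas, max_retries=3):
    """Failure-aware placement: the error class decides between
    backoff-and-retry, moving to an alternate replica, and giving up."""
    for source in replicas:
        for attempt in range(max_retries):
            try:
                return transfer(source)      # attempt the data placement task
            except OSError as err:
                cls = classify(err)
                if cls is ErrorClass.TRANSIENT:
                    time.sleep(2 ** attempt)  # exponential backoff, then retry
                elif cls is ErrorClass.SOURCE_SPECIFIC:
                    break                     # try the next replica instead
                else:
                    raise                     # permanent: fail fast
    raise RuntimeError("all replicas exhausted")
```

    The same decision point is where the adaptive half of the work would plug in: each retry could also adjust parameters such as the number of parallel streams or buffer sizes based on the throughput observed so far.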

    Wind Power in China: Has China Greenwashed the Global Energy Sector?

    Due to years of serving as the world’s manufacturing hub, and deservedly developing a global reputation for being environmentally unfriendly, China has since sought to rejuvenate its image by becoming an international leader in the realm of wind power. However, if one were to pull back the curtain on China’s wind energy program, one would find that the Chinese Communist Party may be intentionally putting on a facade. By reporting the number of wind turbines constructed nationally, but not the number of turbines actually generating electricity, China has hoodwinked the worldwide energy sector and the general public. The idle wind turbines cost China not only valuable energy but also roughly $2 billion annually in forgone revenue. To achieve wind success, China must first deal with its most significant bottleneck: an outdated and inefficient energy grid infrastructure. By choosing to build wind turbines before constructing a Smart Grid to support their energy transmission, China has shot itself in the foot environmentally, geopolitically, and economically. This thesis exposes an important defect in China’s energy system and will be of use to all with interests in business ethics, the environment, engineering, and economics alike.

    Development and application of distributed computing tools for virtual screening of large compound libraries

    In the current drug discovery process, the identification of new target proteins and potential ligands is tedious, expensive, and time-consuming. The use of in silico techniques is therefore of utmost importance and has proved to be a valuable strategy for detecting complex structural and bioactivity relationships. The increasing demand for computational power in scientific fields and the timely analysis of the piles of data generated require innovative strategies for the efficient utilization of distributed computing resources in the form of computational grids. Such grids add a new aspect to the emerging information technology paradigm by providing and coordinating heterogeneous resources: organizations, people, computing, storage and networking facilities, as well as data, knowledge, software, and workflows. The aim of this study was to develop a university-wide applicable grid infrastructure, UVieCo (University of Vienna Condor pool), which can be used to implement standard structure- and ligand-based drug discovery applications using freely available academic software. Firewall and security issues were resolved with a virtual private network setup, whereas virtualization of the computer hardware was done using the CoLinux concept so that Linux-executable jobs can run inside Windows machines. The effectiveness of the grid was assessed by performance measurement experiments using sequential and parallel tasks. Subsequently, the association of expression/sensitivity profiles of ABC transporters with activity profiles of anticancer compounds was analyzed by mining data from the NCI (National Cancer Institute). The datasets generated in this analysis were used with ligand-based computational methods such as shape similarity and classification algorithms to identify P-glycoprotein (P-gp) substrates and separate them from non-substrates. While developing predictive classification models, the problem of the extremely imbalanced class distribution was addressed using the cost-sensitive bagging approach. Applicability domain experiments revealed that our model not only predicts NCI compounds well but can also be applied to drug-like molecules. The developed models were relatively simple yet precise enough to be applicable to virtual screening of large chemical libraries for the early identification of P-gp substrates, which can potentially be useful for removing compounds with poor ADMET properties in an early phase of drug discovery. Additionally, shape-similarity and self-organizing map techniques were used to screen an in-house as well as a large vendor database to identify novel selective serotonin reuptake inhibitor (SSRI)-like compounds that can induce apoptosis. The retrieved hits possess novel chemical scaffolds and can be considered starting points for lead-optimization studies. The work described in this thesis will be useful for creating a distributed computing environment from the available resources within an organization and can be applied to various problems, such as the efficient handling of imbalanced data classification or multistep virtual screening.
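    To make the cost-sensitive bagging step concrete, the sketch below trains bagged decision trees whose class weights make errors on the rare substrate class more expensive. This is one simple realization of the technique under stated assumptions, not the thesis code; the synthetic data is a hypothetical stand-in for the NCI-derived molecular descriptors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for the descriptor matrix: ~5% minority "substrates".
X, y = make_classification(n_samples=2000, n_features=50,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)


def cost_sensitive_bagging(X, y, n_estimators=25):
    """Bagging where every base tree is cost-sensitive: class_weight
    'balanced' makes mistakes on the minority class more expensive."""
    models, n = [], len(y)
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)  # bootstrap resample
        tree = DecisionTreeClassifier(class_weight="balanced", random_state=0)
        models.append(tree.fit(X[idx], y[idx]))
    return models


def predict(models, X):
    votes = np.mean([m.predict(X) for m in models], axis=0)
    return (votes >= 0.5).astype(int)  # majority vote across the ensemble


ensemble = cost_sensitive_bagging(X_tr, y_tr)
print("balanced accuracy:", balanced_accuracy_score(y_te, predict(ensemble, X_te)))
```

    Cost-sensitive bagging variants differ in where the cost enters (per-bag resampling ratios versus per-tree class weights); the weighting shown here is just one straightforward choice.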

    Developing resilient cyber-physical systems: A review of state-of-the-art malware detection approaches, gaps, and future directions

    Cyber-physical systems (CPSes) are rapidly evolving in critical infrastructure (CI) domains such as the smart grid, healthcare, the military, and telecommunications. These systems are continually threatened by malicious software (malware) attacks from adversaries employing improvised tactics and attack methods. A minor configuration change in a CPS introduced through malware can have devastating effects, as the world has seen with Stuxnet, BlackEnergy, Industroyer, and Triton. This paper is a comprehensive review of the malware analysis practices currently in use, and of their limitations and efficacy in securing CPSes. Using well-known real-world incidents, we cover the significant impacts of a compromised CPS. In particular, we present exhaustive hypothetical scenarios to discuss the implications of false positives on CPSes. To improve the security of critical systems, we believe that nature-inspired metaheuristic algorithms can effectively counter the overwhelming malware threats geared toward CPSes. However, our detailed review shows that these algorithms have not been adapted to their full potential to counter malicious software. Finally, the gaps identified through this research lead us to propose future research directions using nature-inspired algorithms that would help bring optimization by reducing false positives, thereby increasing the security of such systems.
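    As one concrete reading of that proposal, the toy sketch below uses a genetic algorithm to tune linear detector weights under a fitness function that penalizes false positives five times as heavily as missed detections. Every detail here is an illustrative assumption, not a method from the reviewed literature, and the data is a hypothetical stand-in for labeled CPS telemetry.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical labeled CPS telemetry features (1 = malicious sample).
X = rng.normal(size=(500, 8))
y = (X @ rng.normal(size=8) > 0).astype(int)


def fitness(w):
    """Higher is better: reward detections, but make each false positive
    cost five misses, reflecting the price of spurious alarms in a CPS."""
    pred = (X @ w > 0).astype(int)
    tp = np.sum((pred == 1) & (y == 1))
    fp = np.sum((pred == 1) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    return tp - 5 * fp - fn


pop = rng.normal(size=(40, 8))                        # initial population
for _ in range(100):                                  # generations
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-20:]]           # selection: keep best half
    children = (parents[rng.integers(0, 20, size=20)]
                + 0.1 * rng.normal(size=(20, 8)))     # mutation-only offspring
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("best fitness:", fitness(best))
```

    Swapping the fitness weights, the mutation scheme, or the metaheuristic itself (particle swarm, ant colony, and so on) shifts the trade-off between missed malware and spurious alarms, which is exactly the tuning knob the review argues remains underexploited.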

    Artificial Intelligence and Climate Change

    As artificial intelligence (AI) continues to embed itself in our daily lives, many focus on the threats it poses to privacy, security, due process, and democracy itself. But beyond these legitimate concerns, AI promises to optimize activities, increase efficiency, and enhance the accuracy and efficacy of the many aspects of society relying on predictions and likelihoods. In short, its most promising applications may come not from uses affecting civil liberties and the social fabric of our society, but from those particularly complex technical problems lying beyond our ready human capacity. Climate change is one such complex problem, requiring fundamental changes to our transportation, agricultural, building, and energy sectors. This Article argues for the enhanced use of AI to address climate change, using the energy sector to exemplify its potential promise and pitfalls. The Article then analyzes critical policy tradeoffs that may be associated with an increased use of AI and argues for its disciplined use in a way that minimizes its limitations while harnessing its benefits to reduce greenhouse-gas emissions.
