
1,561 research outputs found

Using GES DISC Data to Study Kilauea Volcano of 2018

Author: Johnson James E.
KC Binita
Ostrenga Dana M.
Shen Suhung
Su Jay
Zeng Jian

The Kilauea volcano in Hawaii, which erupted in early May 2018, injected massive amounts of SO2 and ash into the atmosphere. Lava flows during the eruption destroyed many homes and neighborhoods. The SO2 plume from the eruption is analyzed from May to August 2018 using multiple satellite products, such as Level 2 TROPOspheric Monitoring Instrument (TROPOMI) and Level 3 Ozone Monitoring Instrument (OMI) data from the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). GES DISC hosts multi-disciplinary Earth science data sets that can be used to analyze natural disasters, such as the Kilauea eruption. Additionally, GES DISC's Giovanni tool can be used to visualize these data. We acquired OMI data through the subsetting function, which is processed by the GES DISC in-house backend software Level 3/4 Regridder and Subsetter (L34RS), and TROPOMI data using OPeNDAP. Data from the OMI OMSO2e product showed elevated SO2 amounts during the eruption between May and August 2018. Similarly, ground-based stations at Hawaii Volcanoes National Park recorded higher SO2 concentrations during the same period. This study uses wind direction from the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) to analyze the transport and dispersion of the SO2 plume, and maps lava flows from the volcano using thermal images from the Visible Infrared Imaging Radiometer Suite (VIIRS). Furthermore, satellite observations combined with socioeconomic and public health data are used to analyze the eruption's impact on public health.

NASA Technical Reports Server

Predicting Anatomical Therapeutic Chemical (ATC) Classification of Drugs by Integrating Chemical-Chemical Interactions and Similarities

Author: DN Georgiou
GA Watson
GP Zhou
GP Zhou
GP Zhou
H Gurulingappa
H Mohabatkar
H Mohabatkar
IW Althaus
J Andraos
J Lin
Kai-Yan Feng
KC Chou
Kuo-Chen Chou
L Hu
Lei Chen
M Dunkel
M Esmaeili
M Hattori
M Kanehisa
M Kanehisa
M Kuhn
Ozlem Keskin
P Jaccard
P Wang
Q Gu
R Sharan
T Huang
U Karaoz
Wei-Ming Zeng
WZ Lin
X Xiao
YD Cai
YD Cai
Yu-Dong Cai
ZC Wu
ZC Wu
Publication venue: Public Library of Science
Publication date: 13/04/2012

The Anatomical Therapeutic Chemical (ATC) classification system, recommended by the World Health Organization, categorizes drugs into different classes according to their therapeutic and chemical characteristics. For a set of query compounds, how can we identify which ATC class (or classes) they belong to? It is an important and challenging problem because the information thus obtained would be quite useful for drug development and utilization. By hybridizing information on chemical-chemical interactions and chemical-chemical similarities, a novel method was developed for this purpose. A jackknife test on a benchmark dataset of 3,883 drug compounds showed that the overall success rate achieved by the prediction method was about 73% in identifying the drugs among the following 14 main ATC classes: (1) alimentary tract and metabolism; (2) blood and blood forming organs; (3) cardiovascular system; (4) dermatologicals; (5) genitourinary system and sex hormones; (6) systemic hormonal preparations, excluding sex hormones and insulins; (7) anti-infectives for systemic use; (8) antineoplastic and immunomodulating agents; (9) musculoskeletal system; (10) nervous system; (11) antiparasitic products, insecticides and repellents; (12) respiratory system; (13) sensory organs; (14) various. Such a success rate is substantially higher than the roughly 7% expected from a random guess among 14 classes. It has not escaped our notice that the current method can be straightforwardly extended to identify the drugs for their 2nd-level, 3rd-level, 4th-level, and 5th-level ATC classifications once statistically significant benchmark data are available for these lower levels.
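
The jackknife (leave-one-out) test and the random-guess baseline mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' predictor: the nearest-neighbour rule, the function name `jackknife_success_rate`, and the toy labels and similarity scores are all assumptions made for illustration, and the sketch treats each item as having a single class although real drugs may belong to several.

```python
def jackknife_success_rate(labels, similarity):
    """Leave-one-out test: predict each item from its most similar OTHER item.

    labels      -- list of class labels, one per item (toy, single-label)
    similarity  -- square matrix; similarity[i][j] scores items i and j
    """
    n = len(labels)
    hits = 0
    for i in range(n):
        # Hold out item i, then pick the most similar remaining item.
        nearest = max((j for j in range(n) if j != i),
                      key=lambda j: similarity[i][j])
        hits += labels[nearest] == labels[i]
    return hits / n


# Random guessing among 14 equally likely classes succeeds 1/14 of the time.
random_baseline = 1 / 14

# Hypothetical toy data: two items of class "A" and two of class "B".
toy_labels = ["A", "A", "B", "B"]
toy_similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

toy_rate = jackknife_success_rate(toy_labels, toy_similarity)
print(f"toy jackknife success rate: {toy_rate:.2f}")
print(f"random baseline for 14 classes: {random_baseline:.1%}")
```

The point of the held-out loop is that every compound is predicted by a model that has never seen it, which is why a jackknife rate can be compared fairly against the 1/14 baseline.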

Public Library of Science (PLOS)

Crossref

PubMed Central

FigShare

Imbalanced Multi-Modal Multi-Label Learning for Subcellular Localization Prediction of Human Proteins with Both Single and Multiple Sites

Author: A Hoglund
B Liao
CE Rasmussen
DN Georgiou
FM Li
Franca Fraternali
G Tsoumakas
GP Zhou
H Mohabatkar
H Mohabatkar
H Nakashima
HB Shen
HN Lin
Hong Gu
J Ma
J Ma
J Tian
J Yin
Jianjun He
JY Shi
K Imai
KC Chou
KY Lee
L Chen
L Chen
L Hu
LJ Foster
LL Hu
M Esmaeili
MS Scott
O Emanuelsson
P Wang
P Wang
RE Schapire
S Briesemeister
S Hua
S Mei
S Mei
S Zhang
T Huang
T Huang
T Huang
T Liu
Wenqi Liu
WZ Lin
X Jiang
X Xiao
X Xiao
X Xiao
YH Zeng
YL Chen
YL Chen
Z He
Z Lu
ZC Wu
ZC Wu
Publication venue: Public Library of Science
Publication date: 08/06/2012

It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them focus on proteins with only one location. In recent years, researchers have begun to pay attention to subcellular localization prediction for proteins with multiple sites. However, almost all existing approaches fail to take into account the correlations among locations that arise from multi-site proteins, which may be important information for improving prediction accuracy for such proteins. In this paper, a new algorithm that can effectively exploit the correlations among the locations is proposed using a Gaussian process model. In addition, the algorithm can realize an optimal linear combination of various feature extraction technologies and is robust to imbalanced data sets. Experimental results on a human protein data set show that the proposed algorithm is valid and can achieve better performance than the existing approaches.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Insights from Modeling the 3D Structure of New Delhi Metallo-β-Lactamase and Its Binding Interactions with Antibiotic Drugs

Author: A Tamilselvi
AT Laurie
B Tang
D Xu
D Yong
DM Livermore
GM Morris
H Giamarellou
H Gui
H Wei
HB Shen
HC Maltezou
I Garcia-Saez
JF Wang
Jing-Fang Wang
K Gong
KC Chou
Keertan Dheda
KK Kumarasamy
Kuo-Chen Chou
L Li
L Poirel
NO Concha
P Benkert
P Ricchiuto
QK Zeng
QS Du
QS Du
RA Laskowski
RB Huang
RX Gu
S Wu
T Magdziarz
V Colotta
X Guo
Y Wang
Y Zhang
Y Zhang
Z Wang
Publication venue: Public Library of Science
Publication date: 01/01/2010

New Delhi metallo-beta-lactamase (NDM-1) is an enzyme that makes bacteria resistant to a broad range of beta-lactam antibiotic drugs, because it can inactivate most of them by hydrolysis. For an in-depth understanding of the hydrolysis mechanism, the three-dimensional structure of NDM-1 was developed. With such a structural frame, two enzyme-ligand complexes were derived by docking Imipenem and Meropenem (two typical beta-lactam antibiotic drugs), respectively, to the NDM-1 receptor. The NDM-1/Imipenem complex revealed that the antibiotic drug was hydrolyzed while sitting in a binding pocket of NDM-1 formed by nine residues. In the NDM-1/Meropenem complex, the antibiotic drug was hydrolyzed in a binding pocket formed by twelve residues. All constituent residues of the two binding pockets were explicitly defined and graphically labeled. It is anticipated that the findings reported here may provide useful insights for developing new antibiotic drugs to overcome the resistance problem.

CiteSeerX

Public Library of Science (PLOS)

Crossref

PubMed Central

Gene ontology based transfer learning for protein subcellular localization

Author: A Bateman
A Dijk
A Hoglund
A Hoglund
A Pierleoni
C Chen
C Leslie
C Leslie
DH Haft
E Marcotte
EM Zdobnov
F Corpet
FM Li
G Lanckriet
G Schneider
H Ding
H Lin
H Lin
H Liu
H Rangwala
H Shen
HB Shen
J Cedano
J Schultz
J Shen
JD Qiu
JD Qiu
K Chou
K Chou
K Chou
K Hofmann
K Lee
KC Chou
L Nanni
M Ashburner
M Esmaeili
M Mak
M Wang
Q Gu
Q Yang
R Apweiler
R Kuang
R Kuang
S Mei
S Pan
Shuigeng Zhou
Suyu Mei
T Blum
T Tung
TK Attwood
W Dai
W Dai
W Huang
W Huang
Wang Fei
X Jiang
X Xiao
XB Zhou
YH Zeng
YS Ding
YS Ding
Z Lei
Z Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology.
Results: In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potentially false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for protein subcellular localization. We evaluate GO-TLM performance against three baseline models, MultiLoc, MultiLoc-GO and Euk-mPLoc, on the benchmark datasets the baseline models adopted. 5-fold cross validation experiments show that GO-TLM achieves substantial accuracy improvement against the baseline models: 80.38% against model Euk-mPLoc 67.40%, an increase of 12.98%; 96.65% and 96.27% against model MultiLoc-GO 89.60% and 89.60%, with 7.05% and 6.67% accuracy increase on dataset MultiLoc plant and dataset MultiLoc animal, respectively; 97.14%, 95.90% and 96.85% against model MultiLoc-GO 83.70%, 90.10% and 85.70%, with accuracy increase 13.44%, 5.8% and 11.15% on dataset BaCelLoc plant, dataset BaCelLoc fungi and dataset BaCelLoc animal, respectively. For BaCelLoc independent sets, GO-TLM achieves 81.25%, 80.45% and 79.46% on dataset BaCelLoc plant holdout, dataset BaCelLoc plant holdout and dataset BaCelLoc animal holdout, respectively, as compared against baseline model MultiLoc-GO 76%, 60.00% and 73.00%, with accuracy increase 5.25%, 20.45% and 6.46%, respectively.
Conclusions: Since direct homology-based GO term transfer may be prone to introducing noise and outliers to the target protein, we design an explicitly weighted kernel learning system (called Gene Ontology Based Transfer Learning Model, GO-TLM) to transfer to the target protein the known knowledge about related homologous proteins, which can reduce the risk of outliers and share knowledge between homologous proteins, and thus achieve better predictive performance for protein subcellular localization. Cross validation and independent test experimental results show that the homology-based GO term transfer and explicitly weighing the GO kernels substantially improve the prediction performance.
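
The step in which several kernels are weighted and linearly merged into a single kernel can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the GO-TLM implementation: the function name `combine_kernels`, the toy 2x2 kernel matrices, and the weights (imagined here as coming from per-kernel cross-validation scores) are all hypothetical.

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of kernel matrices, with weights normalised to 1.

    kernels -- list of square matrices (lists of lists) of identical size
    weights -- one non-negative weight per kernel, e.g. a validation score
    """
    total = sum(weights)
    normalised = [w / total for w in weights]
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(normalised, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy kernels over two proteins (diagonal = self-similarity).
go_kernel = [[1.0, 0.5], [0.5, 1.0]]
spectrum_kernel = [[1.0, 0.1], [0.1, 1.0]]

# Hypothetical weights: the first kernel is trusted three times as much.
merged = combine_kernels([go_kernel, spectrum_kernel], weights=[3.0, 1.0])
print(merged)
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so the merged matrix can be passed to any kernel-based classifier in place of a single kernel.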

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Multi-Label Predictor for Identifying the Subcellular Locations of Singleplex and Multiplex Eukaryotic Proteins

Author: A Garg
A Khan
A Pierleoni
A Reinhardt
AA Schffer
AH Millar
B Niu
C Chen
C Cortes
C Smith
CS Yu
D Georgiou
D Zou
E Camon
E Glory
FM Li
G Tsoumakas
Guo-Zheng Li
GY Zhang
GY Zhang
H Ding
H Ding
H Lin
H Lin
H Mohabatkar
H Mohabatkar
HB Shen
J Guo
J Lin
J Lin
J Read
J Wang
JD Qiu
JD Qiu
JD Qiu
K Nakai
KC Chou
KJ Park
KK Kandaswamy
L Hao
L Hu
L Nanni
L Nanni
L Yu
Lukasz Kurgan
M Ashburner
M Bhasin
M Esmaeili
M Gerstein
P Wang
Q Gu
R Fan
S Hua
S Zhang
SS Sahu
SW Zhang
SW Zhang
WZ Lin
X Jian
X Jiang
X Xiao
X Xiao
X Xiao
XB Zhou
Xiao Wang
Y Fang
Y Huang
Y Loewenstein
YC Wang
Yh Zeng
YS Ding
Z Lu
ZC Li
Publication venue: Public Library of Science
Publication date: 22/05/2012

Subcellular locations of proteins are important functional attributes. An effective and efficient subcellular localization predictor is necessary for rapidly and reliably annotating subcellular locations of proteins. Most existing subcellular localization methods deal only with single-location proteins. In fact, proteins may simultaneously exist at, or move between, two or more different subcellular locations. To better reflect the characteristics of multiplex proteins, it is highly desirable to develop new methods for dealing with them. In this paper, a new predictor called Euk-ECC-mPLoc has been developed that can deal with systems containing both singleplex and multiplex eukaryotic proteins. It introduces a powerful multi-label learning approach that exploits correlations between subcellular locations, and it hybridizes gene ontology with dipeptide composition information. It can be used to identify eukaryotic proteins among the following 22 locations: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centrosome, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole. Experimental results on a stringent benchmark dataset of eukaryotic proteins by jackknife cross validation test show that the average success rate and overall success rate obtained by Euk-ECC-mPLoc were 69.70% and 81.54%, respectively, indicating that our approach is quite promising. In particular, the success rates achieved by Euk-ECC-mPLoc for small subsets were remarkably improved, indicating that it holds a high potential for stimulating the development of the area. As a user-friendly web-server, Euk-ECC-mPLoc is freely accessible to the public at the website http://levis.tongji.edu.cn:8080/bioinfo/Euk-ECC-mPLoc/. We believe that Euk-ECC-mPLoc may become a useful high-throughput tool, or at least play a complementary role to the existing predictors in identifying subcellular locations of eukaryotic proteins.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property

Author: A Bairoch
A Barabasi
C Chen
C Chen
C Klukas
C Krieger
Cathal Seoighe
CF Gao
D Chakrabarti
D Frishman
DN Georgiou
E Camon
F Chiti
G Pollastri
GF Cooper
GP Zhou
GP Zhou
GY Zhang
H Ding
H Lin
H Mohabatkar
H Mohabatkar
H Ogata
H Peng
I Althaus
I Althaus
I Althaus
I Dubchak
I Dubchak
I Schomburg
I Schomburg
IH Witten
J Andraos
J Cheng
J Cheng
JD Qiu
JM Dale
K Chou
KC Chou
Kuo-Chen Chou
L Chen
L Lu
L Lu
L Yu
Lei Chen
M Chang
M Esmaeili
M Kanehisa
M Kanehisa
M Kanehisa
M Kanehisa
N Chazal
N Friedman
P Carmona-Saez
P Pharkya
Q Gu
R Caspi
R Caspi
RR Bouckaert
S Salzberg
SS Keerthi
T Denoeux
T Huang
Tao Huang
U Stelzl
W Buntine
X Xiao
XB Zhou
Y Cai
Y Cai
Y Cai
Y Qi
YH Zeng
YS Lobanova
Yu-Dong Cai
Z He
ZC Wu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. This is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each of the components in the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its biological function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model

Author: A Bairoch
A Dehzangi
A Neumann
AA Schaffer
AK Patel
AK Patel
B Molparia
C Chen
DN Georgiou
E Nordhoff
EW Stawiski
G Nimrod
G Nimrod
G Wang
G Wang
H Mohabatkar
H Mohabatkar
HP Shanahan
J Rogers
JB Brown
JD Qiu
Jian-An Fang
JL Deng
JS Wu
K-C Chou
KC Chou
KK Kandaswamy
KK Kumar
Kuo-Chen Chou
L Breiman
L Breiman
L Nanni
L Nanni
L Yu
M Esmaeili
M Keil
M Kumar
N Bhardwaj
N Bhardwaj
Q Gu
RE Langlois
S Ahmad
S Ahmad
Vladimir N. Uversky
Wei-Zhong Lin
WR Atchley
X Shao
X Xiao
X Xiao
X Yu
XB Zhou
Xuan Xiao
Y Cai
Y Fang
YD Cai
YH Zeng
ZP Liu
Publication venue: Public Library of Science
Publication date: 15/09/2011

DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Entanglement of single-photons and chiral phonons in atomically thin WSe$_2$

Author: A Branny
A Castellanos-Gomez
A Srivastava
A Srivastava
Ajit Srivastava
AK Geim
C Cao
C Chakraborty
C Palacios-Berraquero
C Palacios-Berraquero
D Gammon
D Xiao
DL Moehring
F Dolde
G Aivazian
H Zeng
H Zhu
J Maldacena
K Greve De
KC Lee
KF Mak
L DiCarlo
L Zhang
Lifa Zhang
M Koperski
M Sidler
M Steffen
N Yoshikawa
P Tonndorf
Qiang Yao
Qihua Xiong
R Heitz
S Kim
SG Drapcho
Sheng Liu
Sudipta Dubey
SY Chen
T Cao
WB Gao
X Luo
X Xu
Xiaotong Chen
Xin Lu
Xingzhi Wang
Y Cai
YM He
YM He
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Quantum entanglement is a fundamental phenomenon which, on the one hand, reveals deep connections between quantum mechanics, gravity and the space-time; on the other hand, has practical applications as a key resource in quantum information processing. While it is routinely achieved in photon-atom ensembles, entanglement involving the solid-state or macroscopic objects remains challenging albeit promising for both fundamental physics and technological applications. Here, we report entanglement between collective, chiral vibrations in two-dimensional (2D) WSe$_2$ host --- chiral phonons (CPs) --- and single-photons emitted from quantum dots (QDs) present in it. CPs which carry angular momentum were recently observed in WSe$_2$ and are a distinguishing feature of the underlying honeycomb lattice. The entanglement results from a "which-way" scattering process, involving an optical excitation in a QD and doubly-degenerate CPs, which takes place via two indistinguishable paths. Our unveiling of entanglement involving a macroscopic, collective excitation together with strong interaction between CPs and QDs in 2D materials opens up ways for phonon-driven entanglement of QDs and engineering chiral or non-reciprocal interactions at the single-photon level

arXiv.org e-Print Archive

Crossref

DR-NTU (Digital Repository of NTU)

Cost Efficient Scheduling of MapReduce Applications on Public Clouds

Author: Assunção
Chen
Dean
Geng
Gong
Herodotou
Kambatla
Kc
Lee
Lee
Martello
Matsunaga
Mattess
Murty
Nguyen
Polo
Shvachko
Vecchiola
Verma
Verma
Wang
Yang
Yin
Zeng
Zeng
Zhenyu
Zhenyu
Publication venue: Elsevier BV
Publication date: 01/01/2017

The MapReduce framework has been one of the most prominent ways to efficiently process large amounts of data requiring huge computational capacity. On-demand computing resources of public Clouds have become a natural host for these MapReduce applications. However, deciding what type and what amount of computing and storage resources to rent is still the user's responsibility. This is not a trivial task, particularly when users have performance constraints such as deadlines and must choose among several Cloud product types while keeping spending low. Although several scheduling systems exist, most of them were not developed to manage the scheduling of MapReduce applications. That is, they do not consider factors such as the number of map and reduce tasks that need to be scheduled and the heterogeneity of the Virtual Machines (VMs) available. This paper proposes a novel greedy-based MapReduce application scheduling algorithm (MASA) that minimizes the cost of renting Cloud resources while respecting Service Level Agreements (SLA) expressed as user-given budget and deadline constraints. The simulation results show that MASA can achieve 25-50% cost reduction in comparison to current SLA-agnostic methods, and that there is only a 10% performance disparity between MASA and an exhaustive search algorithm.
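
The general idea of greedy, cost-aware scheduling under budget and deadline constraints can be sketched in Python as below. This is not MASA, whose details are not given in the abstract: the `VmType` fields, the `greedy_plan` function, the prices and throughputs, and the simplifications (one homogeneous VM type per plan, each VM billed for the whole deadline window, no distinction between map and reduce tasks) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    name: str
    price_per_hour: float   # hypothetical rental price
    tasks_per_hour: int     # hypothetical throughput of one VM


def greedy_plan(vm_types, n_tasks, deadline_hours, budget):
    """Pick the first feasible VM type when ordered by cost per task.

    Returns (vm name, number of VMs, total cost), or None if no single
    VM type meets both the deadline and the budget.
    """
    # Greedy order: cheapest cost per completed task first.
    ordered = sorted(vm_types, key=lambda v: v.price_per_hour / v.tasks_per_hour)
    for vm in ordered:
        # Tasks one VM can finish before the deadline.
        capacity = vm.tasks_per_hour * deadline_hours
        if capacity <= 0:
            continue
        count = math.ceil(n_tasks / capacity)
        # Simplification: every VM is rented for the full deadline window.
        cost = count * vm.price_per_hour * deadline_hours
        if cost <= budget:
            return vm.name, count, cost
    return None


# Hypothetical catalogue of VM types.
catalogue = [
    VmType("small", price_per_hour=0.10, tasks_per_hour=10),
    VmType("large", price_per_hour=0.50, tasks_per_hour=40),
]

plan = greedy_plan(catalogue, n_tasks=100, deadline_hours=2, budget=5.0)
print(plan)
```

Because the greedy rule commits to the first feasible option, it runs in time proportional to sorting the catalogue, whereas an exhaustive search would enumerate mixed VM combinations; that trade-off is the kind of gap the abstract quantifies between MASA and exhaustive search.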

Crossref

Edinburgh Research Explorer

University of Tasmania Open Access Repository