    Feature extraction is an essential task in graph analytics. These feature vectors, called graph descriptors, are used in downstream vector-space-based graph analysis models. This idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy. However, known algorithms to compute meaningful descriptors do not scale to large graphs since: (1) they require storing the entire graph in memory, and (2) the end-user has no control over the algorithm's runtime. In this paper, we present streaming algorithms to approximately compute three different graph descriptors capturing the essential structure of graphs. Operating on edge streams allows us to avoid storing the entire graph in memory, and controlling the sample size enables us to keep the runtime of our algorithms within desired bounds. We demonstrate the efficacy of the proposed descriptors by analyzing the approximation error and classification accuracy. Our scalable algorithms compute descriptors of graphs with millions of edges within minutes. Moreover, these descriptors yield predictive accuracy comparable to the state-of-the-art methods but can be computed using only 25% as much memory. Comment: Extension of work accepted to PAKDD 202
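The abstract does not say which sampling scheme the streaming algorithms use. As a minimal sketch of how a fixed sample size can bound memory on an edge stream, the block below assumes classic reservoir sampling; the function name `reservoir_sample_edges` and the toy `edge_stream` generator are hypothetical and are not taken from the paper.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def reservoir_sample_edges(stream: Iterable[Edge], k: int, seed: int = 0) -> List[Edge]:
    """Keep a uniform random sample of at most k edges from a stream.

    Assumption: reservoir sampling (Algorithm R) stands in for whatever
    sampling scheme the paper actually uses. Memory is O(k) regardless
    of how many edges the stream contains.
    """
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for i, edge in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k edges.
            reservoir.append(edge)
        else:
            # Replace a stored edge with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = edge
    return reservoir


def edge_stream(n: int) -> Iterator[Edge]:
    """Hypothetical stream: the edges of a path graph on n + 1 nodes."""
    for u in range(n):
        yield (u, u + 1)


sample = reservoir_sample_edges(edge_stream(10_000), k=100)
print(len(sample))
```

A descriptor estimated from `sample` would then be rescaled to account for the sampling rate; that step depends on the specific descriptor and is not shown here.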

    We have been working on this thesis since August 2011. For a long time our goal was to produce a learning tool with multilingual speech. We chose this research question because non-native speakers are a high-risk group in traffic and because language is a barrier in instruction. This is shown in several independent research reports, which use the following definition of a high-risk group in traffic: "A group of road users that both has a higher injury risk than average and accounts for a relatively large share of the total number of injuries in road traffic. High-risk groups can be defined by road-user category, social and demographic background variables, and/or risk-related behaviour" (Sagberg, 2007). The research reports recommend measures, and in general a tailored learning tool is regarded as a good measure in driver training for this high-risk group. Our conclusion is that the industry has not addressed the challenges of driver training for non-native speakers in Norway that the research reports describe. We set out to create a product, but the research question shifted toward how a good learning tool for this target group should be designed. This happened because no party came forward with financial support. Language is a barrier in instruction, and this challenge must be solved. Our report takes on this challenge. In designing a good learning tool, we have emphasized simple visual representations of traffic situations, with pedagogically sound speech in several languages. Regarding language, we conclude that one should use as few words as possible, that the product should consist of elementary keywords in different languages, and that the words should not give rise to misunderstandings. We also emphasize the learning benefit of visual presentation in instruction.
We have also described cultural differences and attitudes to risk, which mean that our target group needs a tailored learning tool. The group's experience suggests that the threshold for accepted risk is higher for many non-native speakers, which may be due to the infrastructure and funds invested in road safety in their home countries. In designing the learning tool, it is therefore important to provide a motivation for each exercise, in which the risk factors are made visible.

    The influence of climate change on wildland fire has received considerable attention, but few studies have examined the potential effects of climate variability on grassland area burned within the extensive steppe land of Eurasia. We used a novel statistical approach borrowed from the social science literature, dynamic simulations of autoregressive distributed lag (ARDL) models, to explore how temperature, relative humidity, precipitation, wind speed, sunlight, and carbon emissions relate to grassland area burned in Xilingol, a large grassland-dominated landscape of Inner Mongolia in northern China. We used an ARDL model to describe the influence of these variables on observed area burned between 2001 and 2018 and used dynamic simulations of the model to project the influence of climate on area burned over the next twenty years. Our analysis demonstrates that area burned was most sensitive to wind speed and temperature. A 1% increase in wind speed was associated with a 20.8% and 22.8% increase in observed and predicted area burned, respectively, while a 1% increase in maximum temperature was associated with an 8.7% and 9.7% increase in observed and predicted future area burned, respectively. Dynamic simulations of ARDL models provide insights into the variability of area burned across Inner Mongolia grasslands in the context of anthropogenic climate change.
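For readers unfamiliar with ARDL models, the textbook general form is given below for orientation. The abstract does not state the specification actually estimated, so the lag orders $p$ and $q_j$, the number of regressors $k$, and any variable transformations (such as logarithms) are assumptions here: $y_t$ would be area burned and $x_{j,t}$ the $j$-th driver (for example, wind speed) at time $t$.

```latex
y_t = \alpha
    + \sum_{i=1}^{p} \phi_i \, y_{t-i}
    + \sum_{j=1}^{k} \sum_{l=0}^{q_j} \beta_{j,l} \, x_{j,t-l}
    + \varepsilon_t
```

The autoregressive terms $\phi_i$ carry past area burned forward, while the distributed-lag terms $\beta_{j,l}$ let each driver act both immediately and with delay. Dynamic simulation of such a model typically traces how a change in one $x_j$ propagates through $y_t$ over time.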

    © 2020, The Author(s). Grassland fire dynamics are subject to myriad climatic, biological, and anthropogenic drivers, thresholds, and feedbacks and therefore do not conform to assumptions of statistical stationarity. The presence of non-stationarity in time series data leads to ambiguous results that can misinform regional-level fire management strategies. This study accounts for non-stationarity in time series data across multiple variables and multiple intensities, using dynamic simulations of autoregressive distributed lag models to elucidate how key drivers of climate and ecological change affect burned grasslands in Xilingol, China. We used unit root methods to select appropriate estimation methods for further analysis. Using the model estimations, we developed scenarios emulating the effects of instantaneous changes (i.e., shocks) in some significant variables on climate and ecological change. Changes in mean monthly wind speed and maximum temperature produce complex responses in area burned, both directly and through feedback relationships. Our framework addresses interactions among multiple drivers to explain fire and ecosystem responses in grasslands, and how these may be understood and prioritized in the different empirical contexts needed to formulate effective fire management policies.

    This is an accepted manuscript of an article published by Springer in Scientometrics on 18/05/2020, available online: https://doi.org/10.1007/s11192-020-03499-1 The accepted version of the publication may differ from the final published version. © 2020, Akadémiai Kiadó, Budapest, Hungary. We argue that classic citation-based scientific document clustering approaches, like co-citation or Bibliographic Coupling, fail to leverage the social usage of scientific literature that originates on online information dissemination platforms, such as Twitter. In this paper, we present the Tweet Coupling methodology, which measures the similarity between two or more scientific documents when one or more Twitter users mention them in their tweet(s). We evaluate our proposal on an altmetric dataset, which consists of 3081 scientific documents and 8299 unique Twitter users. By employing the clustering approaches of Bibliographic Coupling and Tweet Coupling, we identify the relationship between the bibliographic and tweet coupled scientific documents. Further, using VOSviewer, we empirically show that Tweet Coupling appears to be a better clustering methodology for generating cohesive clusters, since it groups similar documents from the subfields of the selected field, in contrast to the Bibliographic Coupling approach, which groups cross-disciplinary documents in the same cluster. The authors (Saeed-Ul Hassan & Mudassir Shabbir) were funded by the CIPL (National Center in Big Data and Cloud Computing (NCBC)) grant, received from the Planning Commission of Pakistan, through the Higher Education Commission (HEC) of Pakistan. This work was partially supported by the Spanish Ministry of Science and Technology under the projects TIN2017-89517-P and TIN2017-83445-P. Eugenio Martínez Cámara was supported by the Spanish Government Programme Juan de la Cierva Incorporación (IJC2018-036092-I).
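The abstract describes Tweet Coupling only at a high level. One plausible reading, by analogy with Bibliographic Coupling (where strength is the number of shared references), is that the coupling strength of two documents is the number of Twitter users who mentioned both. The sketch below implements that assumed definition; the function name `tweet_coupling` and the toy `mentions` data are hypothetical and are not the paper's dataset.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def tweet_coupling(mentions: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Count, for each pair of documents, the users who mentioned both.

    `mentions` holds (user_id, doc_id) pairs. The returned keys are
    (doc_a, doc_b) with doc_a < doc_b. Assumption: strength equals the
    number of shared users; the paper may weight or normalise differently.
    """
    docs_by_user: Dict[str, Set[str]] = defaultdict(set)
    for user, doc in mentions:
        docs_by_user[user].add(doc)

    strength: Dict[Tuple[str, str], int] = defaultdict(int)
    for docs in docs_by_user.values():
        # Every pair of documents tweeted by the same user gains one unit.
        for a, b in combinations(sorted(docs), 2):
            strength[(a, b)] += 1
    return dict(strength)


# Hypothetical toy data: three users, three documents.
mentions = [
    ("u1", "d1"), ("u1", "d2"),
    ("u2", "d1"), ("u2", "d2"), ("u2", "d3"),
    ("u3", "d3"),
]
coupling = tweet_coupling(mentions)
print(coupling)
```

The resulting pairwise strengths form a weighted document network of the kind that a tool such as VOSviewer can cluster.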

    Ticks are ectoparasites that act as vectors for transmission of various pathogens to wild and domesticated animals and pose a serious threat to human health. Because of the hot and humid conditions in different agro-ecological zones of Pakistan, ticks are abundant and parasitize a variety of animals. The aim of this study was to identify tick species and their distribution on different hosts, especially livestock (sheep, goat, cattle, buffalo, and camel) and livestock-associated equines and canines (horse, donkey, and dog), across different agro-ecological zones of Pakistan. Tick samples were collected and identified to genus and species level using morphological keys under a stereomicroscope. A total of 2,846 animals were examined for tick infestation, and 408 animals were tick-infested. Eleven tick species belonging to 4 genera were identified: Hyalomma anatolicum, Hyalomma scupense, Hyalomma dromedarii, Hyalomma isaaci, Rhipicephalus microplus, Rhipicephalus haemaphysaloides, Rhipicephalus turanicus, Haemaphysalis cornupunctata, Haemaphysalis montgomeryi, Haemaphysalis bispinosa, and Ixodes kashmiricus. The overall tick prevalence was 14.3%; the host-wise infestation rate was 12.2% in sheep; 12.6%, goat; 11.7%, buffalo; 11.7%, cattle; 19.6%, camel; 27.4%, donkey; 23.5%, horse; and 24.3%, dog. Tick infestation of different animals differed between zones. Camels showed the highest tick infestation rate in zones 1 and 2 (21.4 and 26.7%, respectively), whereas donkeys showed the highest infestation rate in zones 3, 4, 6, and 7 (25, 39.3, 3.3, and 21.4%, respectively). The infestation rates of Hyalomma and Rhipicephalus were the highest in zone 2 (71.4 and 52.9%, respectively). 
The infestation rate of Hyalomma was the highest (47.4%) in sheep; Haemaphysalis (46.9%), goat; Rhipicephalus (69.7%), buffalo; Rhipicephalus (62.3%), cattle; Hyalomma (70%), camel; Ixodes (60.9%), donkey; Ixodes (75%), horse; and Rhipicephalus (61.1%), dog. This study showed the diversity and infestation rate of different ticks with respect to their hosts and the agro-ecological zones of Pakistan. High tick burdens and infestation rates are responsible for the spread of different tick-borne infections, resulting in loss of animal productivity and posing a threat to animal and human health. Understanding different tick species and their distribution across different zones will be helpful for developing efficient control strategies against different tick-borne infections.
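The overall prevalence reported above follows directly from the two counts in the abstract (408 infested animals out of 2,846 examined). The short check below reproduces that arithmetic; the helper name `prevalence_percent` is hypothetical.

```python
def prevalence_percent(infested: int, examined: int) -> float:
    """Return the share of examined animals that were infested, in percent."""
    if examined <= 0:
        raise ValueError("examined must be positive")
    return 100.0 * infested / examined


# Counts taken from the abstract: 408 infested of 2,846 examined.
overall = prevalence_percent(408, 2846)
print(f"{overall:.1f}%")
```

The host-wise and zone-wise rates in the abstract would be computed the same way from the per-host and per-zone counts, which the abstract does not report.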

    Background: The parasitic disease cystic echinococcosis (CE) is a serious health problem in Pakistan. The risk of disease transmission is increased by economic and political instability, poor living conditions, and limited awareness of hygienic practices. The current study aimed to investigate community perception and awareness of the risk factors of CE in Pakistan from a One Health perspective. Methods: We conducted a community-based survey involving 454 participants in the major cities of Pakistan. Quantitative data were collected from the general population using questions on knowledge, attitude, and practices (KAP), the One Health concept, risk factors, and community perception of CE. The Chi-squared test was applied to determine the associations of KAPs with socio-demographic parameters. Results: KAPs had no significant associations with socio-demographic aspects such as age, sex, religion, ethnicity, education, marital status, occupation, or financial status of the participants. The findings indicated a lack of awareness about CE among the participants. Respondents were unaware of the risk factors and the One Health concept of CE. However, the community attitude and perception were positive toward the control of CE. Conclusion: Illiteracy, deficient sanitation systems, and lack of awareness are the contributing factors to CE in Pakistan. It is necessary to raise community awareness of CE and its importance. Increasing this awareness represents an important step toward the eradication and control of CE.
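As an illustration of the Chi-squared test of association named in the Methods, the sketch below computes the Pearson statistic for a contingency table in plain Python. The counts are invented for demonstration only and are not the survey's data; the function names are hypothetical, and the p-value shortcut is valid only for one degree of freedom (a 2x2 table).

```python
import math
from typing import Sequence, Tuple


def chi_squared_independence(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pearson chi-squared statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_one_dof(stat: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with exactly 1 dof."""
    return math.erfc(math.sqrt(stat / 2.0))


# HYPOTHETICAL counts: rows = two demographic groups,
# columns = (aware of CE, unaware of CE). Not the survey's data.
table = [[30, 70],
         [45, 55]]
stat, dof = chi_squared_independence(table)
p = p_value_one_dof(stat)
print(f"chi2={stat:.2f}, dof={dof}, p={p:.4f}")
```

For tables larger than 2x2, a statistics library would be needed to obtain the p-value from the chi-squared distribution with the appropriate degrees of freedom.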