
    A Simple Deterministic Distributed MST Algorithm, with Near-Optimal Time and Message Complexities

    Distributed minimum spanning tree (MST) problem is one of the most central and fundamental problems in distributed graph algorithms. Garay et al. \cite{GKP98,KP98} devised an algorithm with running time $O(D + \sqrt{n} \cdot \log^* n)$, where $D$ is the hop-diameter of the input $n$-vertex $m$-edge graph, and with message complexity $O(m + n^{3/2})$. Peleg and Rubinovich \cite{PR99} showed that the running time of the algorithm of \cite{KP98} is essentially tight, and asked if one can achieve near-optimal running time **together with near-optimal message complexity**. In a recent breakthrough, Pandurangan et al. \cite{PRS16} answered this question in the affirmative, and devised a **randomized** algorithm with time $\tilde{O}(D + \sqrt{n})$ and message complexity $\tilde{O}(m)$. They asked if such a simultaneous time- and message-optimality can be achieved by a **deterministic** algorithm. In this paper, building upon the work of \cite{PRS16}, we answer this question in the affirmative, and devise a **deterministic** algorithm that computes MST in time $O((D + \sqrt{n}) \cdot \log n)$, using $O(m \cdot \log n + n \log n \cdot \log^* n)$ messages. The polylogarithmic factors in the time and message complexities of our algorithm are significantly smaller than the respective factors in the result of \cite{PRS16}. Also, our algorithm and its analysis are very **simple** and self-contained, as opposed to rather complicated previous sublinear-time algorithms \cite{GKP98,KP98,E04b,PRS16}.

    An auto-scaling framework for analyzing big data in the cloud environment

    Processing big data on traditional computing infrastructure is a challenge because the volume of data is large and the computational complexity is correspondingly high. Recently, Apache Hadoop has emerged as a distributed computing infrastructure to deal with big data. Adapting Hadoop to dynamically adjust its computing resources based on real-time workload is itself a demanding task, so conventionally a cluster is pre-configured with adequate resources to handle the peak data load. However, this may waste considerable computing resources when usage levels are much lower than the preset load. In consideration of this, this paper investigates an auto-scaling framework in a cloud environment that aims to minimise the cost of resource use by automatically adjusting the number of virtual nodes depending on the real-time data load. A cost-effective auto-scaling (CEAS) framework is first proposed for an Amazon Web Services (AWS) Cloud environment. The proposed CEAS framework allows us to scale the computing resources of a Hadoop cluster so as to either reduce computing resource use when the workload is low or scale up the computing resources to speed up data processing and analysis within an adequate time. To validate the effectiveness of the proposed framework, a case study with real-time sentiment analysis on the universities’ tweets is provided to analyse the reviews/tweets that people post on social media. Such a dynamic scaling method offers a reference for improving Twitter data analysis in a more cost-effective and flexible way.

    Opinion Mining for Software Development: A Systematic Literature Review

    Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ criticism of mobile apps. Given the large number of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references for choosing suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain.

    Security in 5G-Enabled Internet of Things Communication: Issues, Challenges, and Future Research Roadmap

    5G mobile communication systems enable the mobile network not only to interconnect people, but also to interconnect and control machines and other devices. The 5G-enabled Internet of Things (IoT) communication environment supports a wide variety of applications, such as remote surgery, self-driving cars, virtual reality, flying IoT drones, security and surveillance, and many more. These applications assist the routine work of the community. In such a communication environment, all the devices and users communicate through the Internet. Therefore, this communication suffers from different types of security and privacy issues. It is also vulnerable to different types of possible attacks (for example, replay, impersonation, password guessing, physical device stealing, session key computation, privileged-insider, malware, man-in-the-middle, malicious routing, and so on). It is therefore crucial to protect the infrastructure of the 5G-enabled IoT communication environment against these attacks. This has led researchers working in this domain to propose various types of security protocols under different categories, such as key management, user authentication/device authentication, access control/user access control, and intrusion detection. In this survey paper, the details of the various system models (i.e., network model and threat model) required for the 5G-enabled IoT communication environment are provided. The security requirements and the attacks possible in this communication environment are then described. The different types of security protocols are also presented. The existing security protocols in the 5G-enabled IoT communication environment are analysed and compared. Some future research challenges and directions in the security of the 5G-enabled IoT environment are outlined.
The motivation of this work is to bring the details of different types of security protocols in 5G-enabled IoT under one roof so that future researchers can benefit from it.

    Analysis of the Arrangement of Geographical Conditions with the Aim of Reducing Air Pollution: A Case Study of Tehran

    Abstract: According to the statistics of the Organization of the Environment, air pollution exceeds the admissible threshold (AQI more than 150) on a total of 48 days during three months of the year. These days coincide with the time when Tehran's inversion reaches its maximum stability. The purpose of the present study was first to determine the height of air pollution in Tehran on the days when pollution exceeds the permissible limit. It also aims to study the pressure and temperature masses of such days, considering the geographical and topographic conditions, and finally to identify the best of these cells for theoretically possible air turbulence. The results of this study, based on Tehran temperature and pressure data over a 15-year period (2003-2017), show that the highest elevation of Tehran inversion does not exceed 1800 meters on polluted days. On only 6 of all the days beyond the admissible threshold are temperature and pressure cells with the highest Newtonian mass formed. The center of such cells shows a pressure difference of 32 milligrams in November, 7 milligrams in January, 100 milligrams in December, as well as a temperature difference of 1.1 degrees in November, 4.4 degrees in January, and 1.9 degrees in December. Based on the results and topographic conditions as well as the cell adaptation to such conditions, it seems that theoretically, it is possible to artificially create air turbulence in Tehran to mitigate the contamination amount.
    Extended Abstract
    Introduction: Tehran is one of the largest and most crowded cities that suffer from air pollution. On some days of the year, the amount of pollutants increases so much that breathing is very difficult for inhabitants. The Air Quality Index (AQI) varies over the course of a year in Tehran. During autumn and winter, Tehran becomes more polluted. Atmospheric temperature inversion worsens air pollution during that period.
The two factors of climate and topography affect air pollution in Tehran. These two factors are emphasized in this research to look for a way to eliminate or at least decrease the pollution of Tehran's air. This research focuses on vertical and horizontal exchanges via atmospheric mixing by defining the good conditions for instability during the inversion periods in Tehran. If there are suitable mixing conditions (identified with cells of pressure and/or temperature), we could define the best status for instability. There is a need to know the differences between temperature and pressure that give rise to air turbulence.
    Methodology: Firstly, the pressure and temperature maps were drawn at different levels of the atmosphere. Further, based on these maps, the levels that had the largest number of pressure and temperature cells with the steepest gradient were selected. This revealed the degree of differences in temperature and pressure that cells should have to create instability. We used the synoptic stations and the air pollution testing stations as well as Google Earth, ArcGIS, Surfer, and Voxler software.
    Discussion: In the first step, we identified the days from 2003 to 2017 on which the AQI was greater than 150 as dangerous pollution days. In order to calculate the average inversion level, radiosonde data were used. The height of the inversion phenomenon in Tehran is not the same in the target months (January, November, and December). The highest inversion height in the target months is 1800 m and the lowest is 1300 m. Exceedance of the AQI pollution crisis threshold does not cover all areas of Tehran in the target months. That is, while some districts of Tehran experience higher pollution than the thresholds, others do not.
During December, the expanse of pollution in Tehran is wider than in the other target months. Next, based on the determined inversion levels, the zoning maps of pressure and temperature on critical days of pollution were drawn in the target months. From among them, maps containing temperature and pressure cells were selected, then a matrix was prepared for all cells in the selected maps and their Newtonian mass was calculated. This matrix represents the cells that have the gradient because the two factors of cell difference and distances play a major role in their triggering. Finally, for each month, two temperature and pressure cells with the highest Newtonian mass were selected. In order to investigate the effect of topographic terrains on temperature and pressure cells, and to further understand the location of these cells, the temperature and pressure cells were overlain on the topography of the area. For this purpose, a 3D map of the area’s heights was plotted, and the synoptic stations and the pressure and temperature cells were overlaid for analysis and investigation.
    Conclusion: The following results were obtained by drawing and examining the pressure and temperature maps:
    1) There are two cells in the November temperature map, at Imam Khomeini Airport Station and Mehrabad Station. The Imam Khomeini cell is located at an altitude of 990.2 meters near the low elevation range of southern Shahriar. The Mehrabad cell is located at an elevation of 1190 meters, in the easterly part of the southern Alborz Mountains.
    2) The January temperature map contains two cells, Geophysics and Shemiran, located at altitudes of 1423.8 meters and 1548.2 meters, respectively. The two formed cells are located in the recesses of the southern slope of the Alborz Mountains, and it may be noted that the confinement of cell formation zones may influence the formation of these temperature cells.
    3) The December temperature map contains two cells, Geophysics and Shemiran, which are located at altitudes of 1423.8 meters and 1548.2 meters, respectively. These two cells are also located in the indentation of the southern slope of the Alborz Mountains.
    4) The November pressure difference map contains two cells: Chitgar at a height of 1305.2 meters and Imam Khomeini Airport at a height of 990.2 meters. The Chitgar cell lies on the southern slope of Alborz, where the heights have advanced, and the Imam Khomeini Airport cell is near the low-lying slopes south of Shahriar. The formation of these pressure cells at the sites mentioned may be affected by the air currents in the area. Due to the advance of the southern slopes of the eastern highlands, these currents divert the surface winds to the southern plains and increase the relative wind velocity at these points.
    5) The January pressure difference map shows two cells: Mehrabad at a height of 1190.2 meters and Chitgar at a height of 1305.2 meters. The two cells are located on the eaves of the southern Alborz Mountains.
    6) The December pressure maps showed two cells, Geophysics and Mehrabad, located at 1423.8 meters and 1190.2 meters, respectively. These two cells are located on the northern elevation of Tehran. In fact, this part of the southern slope of Alborz is indented, and this retreat can affect the winds and existing cells.
    According to the obtained results, among all days on which the AQI passes the threshold, closed temperature and pressure cells with the highest Newtonian mass are formed on only 6 days.
The center of these cells shows a pressure difference of 32 milligrams in November, 7 milligrams in January, 100 milligrams in December, and a temperature difference of 1.1 degrees in November, 4.4 degrees in January, and 1.9 degrees in December. Generally, considering the cells formed by the temperature and pressure difference and the gradient between them, as well as the difference in height between the cells and their location, and noting that the local winds cause the difference of temperature and pressure, it seems that, theoretically, it is possible to create artificial air turbulence in Tehran within the study area to control the contamination amount. Knowledge of the conditions in the study area is natural in this study and there is no uniformity pattern for all areas in the subject area. This study was conducted only for a limited period of 15 years (from 2003 to 2017) in the study area of Tehran province, and all analyses were performed on the basis of statistics measured at synoptic stations in this area. It should be emphasized that all reviews and results are based on this range and these data and cannot be generalized.
    Keywords: Inversion, Air Pollution, Thermal Cells, Pressure Cells, Tehran.

    IR2Vec: LLVM IR based Scalable Program Embeddings

    We propose IR2Vec, a concise and scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the input programs. As our infrastructure is based on the Intermediate Representation (IR) of the source code, the obtained embeddings are both language and machine independent. The entities of the IR are modeled as relationships, and their representations are learned to form a seed embedding vocabulary. Using this infrastructure, we propose two incremental encodings: Symbolic and Flow-Aware. Symbolic encodings are obtained from the seed embedding vocabulary, and Flow-Aware encodings are obtained by augmenting the Symbolic encodings with the flow information. We show the effectiveness of our methodology on two optimization tasks (heterogeneous device mapping and thread coarsening). Our way of representing the programs enables us to use non-sequential models, resulting in orders-of-magnitude faster training. Both encodings generated by IR2Vec outperform the existing methods in both tasks, even while using simple machine learning models. In particular, our results improve or match the state-of-the-art speedup in 11/14 benchmark suites in the device mapping task across two platforms and in 53/68 benchmarks in the thread coarsening task across four different platforms. When compared to the other methods, our embeddings are more scalable, are non-data-hungry, and have better Out-Of-Vocabulary (OOV) characteristics. Comment: Accepted in ACM TAC