37 research outputs found
A common framework of partition-based clustering for large scale dataset using sampling and its MapReduce implementation
Clustering is one of the significant tasks in data mining, and partition-based clustering algorithms such as k-means are among the popular solutions. However, with the growth of cloud computing and big data, large-scale datasets have become a major challenge for clustering: execution of the clustering algorithm is too time-consuming, parameter optimization is difficult, and cluster quality is poor. To this end, this paper proposes a common framework for partition-based clustering algorithms such as k-means and designs its MapReduce implementation. Specifically, to deal with the representation of large-scale datasets, we propose employing a sampling technique. Then, inspired by the k-means algorithm, we propose a common clustering procedure and provide a k-means-based implementation. Furthermore, we implement the proposed framework using the MapReduce programming model. Experiments show that our method is efficient for large-scale datasets
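As a rough illustration of the sampling idea in this abstract, the sketch below runs plain k-means on a random sample and then assigns every point to its nearest centroid, a step that would map naturally onto a MapReduce map phase. It is not the authors' framework: the function names (`nearest`, `kmeans`, `sample_then_cluster`), the uniform sampling scheme, and the fixed iteration count are all assumptions made for this example.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means on a small in-memory list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k random points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid when a group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def sample_then_cluster(points, k, sample_size, seed=0):
    """Cluster a uniform random sample, then label the full dataset."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    centroids = kmeans(sample, k, seed=seed)
    # Per-point assignment is independent, so it could be run as a map task.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels
```

Only the sample is clustered iteratively; the full dataset is scanned once for the final assignment, which is where the time saving would come from under these assumptions.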
How to promote the hierarchical diagnosis and treatment system: A tripartite evolutionary game theory perspective
Due to the disorderly access to medical care and inefficient use of health resources, the advancement of the hierarchical diagnosis and treatment is more valued in promoting health system reform. Hence, this article integrates prospect theory into an evolutionary game model of the local government health departments, the medical institutions, and the patients in the system promotion of the hierarchical diagnosis and treatment. The simulation shows the specific influencing mechanism of the psychological perceived value of game subjects. Then by introducing the stochastic evolutionary game model, the system promotion under different medical cultures is also discussed in detail. The results indicate that for local government health departments, the amount and duration of financial subsidies are the key factors influencing the game system's evolution. For medical institutions, participating in the hierarchical diagnosis and treatment system is relatively beneficial. For patients, the recovery rate in primary hospitals matters more than the cost of treatment. Changes in the risk sensitivity coefficient will cause the equilibrium of the game system to change. However, changes in the loss avoidance factor do not change the equilibrium and only have an impact on the speed of convergence. With the health departments' intervention, patients in rural medical culture are more inclined to support the hierarchical diagnosis and treatment system than those in urban or town medical culture. Therefore, in order to promote the hierarchical diagnosis and treatment system, this article recommends that more attention should be paid to the regulatory role of health departments and the participation improvement of medical institutions and patients
Research on the Efficiency of Local Government Health Expenditure in China and Its Spatial Spillover Effect
The efficiency of local government health expenditure (GHE) in China determines the level of public health services. However, local governments pay little attention to that efficiency, even though the scale of local GHE is increasing. In this paper, we first use the data envelopment analysis (DEA) method to measure the static overall efficiency of local GHE in each region of China from 2007 to 2016. Then, based on spatial statistical theory, global and local Moran’s I values are used to investigate its spatial correlation and spatial agglomeration. Finally, the spatial spillover effect (SSE) of the static overall efficiency of local GHE in each region is measured by constructing a spatial Durbin model (SDM). The results demonstrate significant differences in the efficiency of local GHE between regions of China. In addition, Moran’s I for the static overall efficiency of local GHE from 2007 to 2016 is positive and passes the test at the 5% significance level, indicating a positive spatial correlation in the efficiency of local GHE and a spatial spillover effect. The decomposition of the SDM reveals that the proportion of GHE in financial expenditure, gross domestic product (GDP) per capita, and population density have a positive effect on the efficiency of local GHE; their growth will improve GHE efficiency in the local region and neighboring regions. In contrast, the proportion of urban population, illiteracy, and fiscal decentralization have a negative effect; their growth will decrease GHE efficiency in the local region and neighboring regions. The results are discussed and suggestions are given based on the analysis in this paper. The main contribution of this work is to consider the spatial spillover effect in a way that has realistic meaning. The results obtained can be used as a reference for optimizing the structure and improving the efficiency of government health inputs. The work departs from the GDP-only assessment system for governments and helps to improve it by assessing GHE efficiency
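For readers unfamiliar with the global Moran’s I statistic mentioned in this abstract, a minimal sketch follows. It uses the standard definition I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The function name `morans_i`, the toy adjacency matrix, and the example values are assumptions for illustration, not data or code from the paper, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)  # W: total of all weights
    # Cross-product of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Hypothetical example: four regions in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], adjacency)       # similar neighbours: positive I
alternating = morans_i([1, 4, 1, 4], adjacency)  # dissimilar neighbours: negative I
```

A positive value, as reported in the abstract, means that regions with similar efficiency tend to be neighbours.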
Multiperiod Transfer Synchronization for Cross-Platform Transfer in an Urban Rail Transit System
Transfer synchronization is an important issue in timetable scheduling for an urban rail transit system, especially for cross-platform transfers. In this paper, we aim to optimize transfer performance throughout the daily operation of an urban rail transit system. The daily operation is divided into multiple time periods, and each time period has a specific headway to meet time-varying passenger demand. At the same time, the turn-back process of trains must also be considered in real operation. Therefore, our work extends the basic transfer synchronization model by taking into account time-dependent passenger demand and train utilization. A mixed integer programming model is developed to obtain an optimal timetable that provides smooth cross-platform transfers and minimizes the transfer waiting time for all transfer passengers from different directions, with consideration of timetable symmetry. The transfer optimization model adjusts the departure times of trains relative to a predetermined timetable and is solved with a genetic algorithm. The proposed model and algorithm are applied to a real transfer problem in Beijing, and the results demonstrate a significant reduction in transfer waiting time
The Depth-First Optimal Strategy Path Generation Algorithm for Passengers in a Metro Network
Passenger behavior analysis is a key issue in passenger assignment research, in which path choice is a fundamental component. A highly complex transit network offers multiple paths for each origin–destination (OD) pair, resulting in more flexible choices for each passenger. To reflect a passenger’s flexible choice in the transit network, the optimal strategy was proposed by other researchers to determine passenger choice behavior. However, the optimal strategy algorithm searches only for strategy links, and these links alone do not form complete paths. To determine the paths for each OD pair, this study proposes the depth-first path generation algorithm, in which a strategy node concept is newly defined. The proposed algorithm was applied to the Beijing metro network. The results show that, in comparison to the shortest path and the K-shortest path analysis, the proposed depth-first optimal strategy path generation algorithm represents passenger behavior more reliably and flexibly
Train Distance Estimation for Virtual Coupling Based on Monocular Vision
By precisely controlling the distance between two train sets, virtual coupling (VC) enables flexible coupling and decoupling in urban rail transit. However, relying on train-to-train communication for obtaining the train distance can pose a safety risk in case of communication malfunctions. In this paper, a distance-estimation framework based on monocular vision is proposed. First, key structure features of the target train are extracted by an object-detection neural network, whose strategies include an additional detection head in the feature pyramid, labeling of object neighbor areas, and semantic filtering, which are utilized to improve the detection performance for small objects. Then, an optimization process based on multiple key structure features is implemented to estimate the distance between the two train sets in VC. For the validation and evaluation of the proposed framework, experiments were implemented on Beijing Subway Line 11. The results show that for train sets with distances between 20 m and 100 m, the proposed framework can achieve a distance estimation with an absolute error that is lower than 1 m and a relative error that is lower than 1.5%, which can be a reliable backup for communication-based VC operations
A Review of Sustainable Maintenance Strategies for Single Component and Multicomponent Equipment
Contemporary industrial equipment is becoming increasingly complex. To ensure the high reliability and sustainability of industrial equipment, more flexible maintenance strategies have attracted extensive attention. In view of this, this paper summarizes the current state of existing maintenance strategies, so as to enable practitioners in the industry to choose or formulate more efficient maintenance strategies. First, the characteristics, application potential, and limitations of single component maintenance strategies, such as corrective maintenance, preventive maintenance, and predictive maintenance, are described in detail from the perspective of maintenance time. Then, on the basis of single component maintenance and the dependency between multiple components, the advantages and disadvantages of multicomponent maintenance strategies, such as batch maintenance, opportunity maintenance, and group maintenance, are summarized, and suggestions for the future maintenance of industrial equipment are proposed. Based on this, industries can select the appropriate maintenance strategy according to their equipment characteristics, or improve their existing maintenance strategies based on actual needs
Long Short-Term Memory Neural Network Applied to Train Dynamic Model and Speed Prediction
The automatic train operation system is a significant component of intelligent railway transportation. As a fundamental problem, the construction of the train dynamic model has been extensively researched using parametric approaches. Parametric models may perform poorly due to unrealistic assumptions and changeable environments. In this paper, a long short-term memory network is developed to build the train dynamic model in a nonparametric way. By optimizing the hyperparameters of the proposed model, more accurate outputs can be obtained from the same inputs used by the parametric approaches. The proposed model was compared with two parametric methods using actual data. Experimental results suggest that the model performs better than the traditional models, owing to its strong learning ability. Through a detailed feature engineering process, the proposed long short-term memory network based algorithm was extended to predict train speed for multiple steps ahead
Does Scale and Efficiency of Government Health Expenditure Promote Development of the Health Industry?
Macro-economic development of China’s health industry is essential to the sustainable development and growth momentum of the national economy. Strategies to promote the development and rebalancing of the industrial structure need to be improved in order to transform China’s health industry and drive development. Based on panel data of 25 regions in China from 2004 to 2016, this paper analyzes the linear and non-linear relationship between Chinese government health expenditure (GHE), GHE efficiency, and the macro-economic development of the health industry. It uses a novel index of industrial structure to measure the transformation of industrial sectors in China, based on a semi-parametric generalized additive model. The model shows that per capita GHE and its efficiency have a significant positive linear and comprehensive non-linear effect on the development of health industry structure. By analyzing the interaction of GHE and its efficiency, we show that high expenditure with low-efficiency regimes and high expenditure with high-efficiency regimes have a positive impact on the development of industrial structure. Following the empirical results, the paper puts forward corresponding policy suggestions for the role of fiscal policy in promoting the development of the health industry in China