
    Data-Driven Dynamic Robust Resource Allocation: Application to Efficient Transportation

    The transformation to smarter cities brings an array of emerging urbanization challenges. With the development of technologies such as sensor networks, storage devices, and cloud computing, we are able to collect, store, and analyze large amounts of data in real time. Modern cities thus present unprecedented opportunities and challenges for allocating limited resources in a data-driven way. Intelligent transportation systems are one emerging research area in which sensing data gives us opportunities to understand spatial-temporal patterns of human demand and mobility. However, greedy or matching algorithms that only deal with known requests are far from efficient in the long run if they ignore demand information predicted from data. In this dissertation, we develop a data-driven robust resource allocation framework that accounts for spatial-temporally correlated demand and demand uncertainties, motivated by the problem of efficiently dispatching taxis or autonomous vehicles. We first present a receding horizon control (RHC) framework that dispatches taxis toward predicted demand; this framework incorporates both historical record data and real-time GPS location and occupancy status data. It also allows us to allocate resources from a globally optimal perspective over a longer time horizon, beyond the local-level greedy or matching algorithms that assign a pick-up location to each vacant vehicle. The objectives include reducing both current and anticipated future total idle driving distance and matching the spatial-temporal demand-supply ratio for service quality. We then present a robust optimization method that handles spatial-temporally correlated demand model uncertainties expressible as closed convex sets.
Uncertainty sets of demand vectors are constructed from data using hypothesis-testing theory, and the sets provide a desired probabilistic guarantee level for the performance of dispatch solutions. To minimize the average resource allocation cost under demand uncertainties, we develop a general data-driven dynamic distributionally robust resource allocation model. We design an efficient algorithm for building demand uncertainty sets that is compatible with various demand prediction methods. Using strong duality, we prove equivalent, computationally tractable forms of the robust and distributionally robust resource allocation problems. The resource allocation problem aims to balance the demand-supply ratio at different nodes of the network at minimum balancing and re-balancing cost, with decision variables in the denominator, a setting not covered by previous work. Trace-driven analysis with real taxi operational record data from San Francisco shows that the RHC framework reduces the average total idle distance of taxis by 52%, and evaluations with over 100 GB of New York City taxi trip data show that the robust and distributionally robust dispatch methods reduce the average total idle distance by a further 10% compared with non-robust solutions. Besides increasing service efficiency by reducing total idle driving distance, the resource allocation methods in this dissertation also reduce the demand-supply ratio mismatch error across the city.
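The core rebalancing idea in the abstract above can be sketched in a few lines: given per-zone supply and predicted demand, move vacant vehicles from surplus zones to deficit zones at low travel cost. This is a minimal greedy sketch, not the dissertation's optimization-based RHC formulation; the zones, counts, and distance function are invented for illustration.

```python
# Minimal sketch of one receding-horizon rebalancing step: move vacant
# taxis so that supply tracks predicted demand per zone. The greedy
# nearest-pair rule stands in for the dissertation's convex program.

def rebalance(supply, demand, dist):
    """Greedily move surplus vehicles to deficit zones, nearest pairs first.

    supply, demand: dicts zone -> vehicle / predicted-request counts
    dist: dict (zone_a, zone_b) -> travel distance
    Returns a list of (from_zone, to_zone, count) moves.
    """
    surplus = {z: supply[z] - demand[z] for z in supply if supply[z] > demand[z]}
    deficit = {z: demand[z] - supply[z] for z in supply if demand[z] > supply[z]}
    moves = []
    # Consider surplus/deficit zone pairs in order of increasing distance.
    pairs = sorted((dist[(s, d)], s, d) for s in surplus for d in deficit)
    for _, s, d in pairs:
        if surplus[s] > 0 and deficit[d] > 0:
            n = min(surplus[s], deficit[d])
            moves.append((s, d, n))
            surplus[s] -= n
            deficit[d] -= n
    return moves

supply = {"A": 5, "B": 1, "C": 2}
demand = {"A": 2, "B": 3, "C": 3}
dist = {(a, b): abs(ord(a) - ord(b)) for a in supply for b in supply}
print(rebalance(supply, demand, dist))  # [('A', 'B', 2), ('A', 'C', 1)]
```

In an RHC loop this step would be re-solved every period as new GPS and occupancy data arrive, with only the first period's moves executed.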

    Designing an On-Demand Dynamic Crowdshipping Model and Evaluating its Ability to Serve Local Retail Delivery in New York City

    Nowadays city mobility is challenging, especially in populated metropolitan areas. Growing commute demand, an increasing number of for-hire vehicles, an enormous escalation in intra-city deliveries, and limited infrastructure (road capacity) all contribute to mobility challenges. These challenges typically have significant impacts on residents' quality of life, particularly from an economic and environmental perspective. Decision-makers have to optimize transportation resources to minimize system externalities (especially in large-scale metropolitan areas). This thesis focuses on the intra-city mobility problems experienced by travelers (in the form of congestion and imbalanced taxi resources) and businesses (in the form of last-mile delivery), while taking into consideration a measurement of potential adoption by citizens (in the form of a survey). To address this mobility problem, this dissertation proposes three distinct and complementary methodological studies. First, taxi demand is predicted using a deep learning approach that leverages Long Short-Term Memory (LSTM) neural networks, trained on publicly available New York City taxi trip data. Taxi pickup data are binned based on geospatial and temporal tags and then clustered using a technique inspired by Principal Component Analysis. The spatiotemporal distribution of taxi pickup demand is studied over short-term horizons (the next hour) as well as long-term horizons (the next 48 hours) within each data cluster. The performance and robustness of the LSTM model are evaluated through a comparison with Adaptive Boosting Regression and Decision Tree Regression models fitted to the same datasets. In the second study, an On-Demand Dynamic Crowdshipping system is designed to utilize excess transport capacity to serve parcel delivery tasks and passengers collectively.
This method is general and could be extended to all types of public transportation modes, depending on data availability. The system is evaluated in a case study of New York City to assess the impacts of crowdshipping (using taxis as carriers) on trip cost, vehicle miles traveled, and people's travel behavior. Finally, a Stated Preference (SP) survey is presented, designed to collect information about people's willingness to participate in a crowdshipping system. The survey is analyzed to determine the essential attributes and to evaluate the likelihood of individuals participating in the service either as requesters or as carriers. It collects information on the preferences and important attributes of New York citizens, describing which segments of the population are willing to participate in a crowdshipping system. While transportation problems are complex and approximations had to be made within the studies, this dissertation provides a comprehensive way to model and understand the potential impact of efficiently utilizing existing resources on transportation systems. Generally, this study offers insights to decision-makers and academics about potential areas of opportunity and methodologies to optimize the transportation systems of densely populated areas. This dissertation offers methods that can optimize taxi distribution based on demand and optimize costs for retail delivery, while providing additional income for individuals. It also provides valuable insights for decision-makers by collecting the population's opinions about the service and analyzing the likelihood of participation. The analysis provides an initial foundation for future modeling and assessment of crowdshipping.
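The geospatial-temporal binning step described above can be sketched simply: each pickup record is assigned to a (grid cell, hour) bin, and the bin counts form the demand series a forecaster such as an LSTM would consume. This is an illustrative simplification; the cell size, the toy pickup records, and the flat dictionary layout are assumptions, not the thesis's exact pipeline.

```python
# Minimal sketch of binning taxi pickups by grid cell and hour.
# math.floor (not int()) is used so that negative longitudes grid
# consistently instead of truncating toward zero.

import math
from collections import Counter

def bin_pickups(records, cell_deg=0.01):
    """records: iterable of (lat, lon, hour).
    Returns a Counter keyed by (lat_cell, lon_cell, hour)."""
    bins = Counter()
    for lat, lon, hour in records:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg), hour)
        bins[key] += 1
    return bins

pickups = [
    (40.7581, -73.9855, 8),   # Times Square area, 8 am
    (40.7585, -73.9851, 8),   # same cell, same hour
    (40.7306, -73.9866, 9),   # Union Square area, 9 am
]
counts = bin_pickups(pickups)
print(counts[(4075, -7399, 8)])  # 2
```

Stacking these counts per cell over consecutive hours yields the supervised sequences on which an LSTM (or the boosting and decision-tree baselines) can be trained.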

    Regional variations in automation job risk and labour market thickness to agricultural employment

    Automation has the potential to transform entire agricultural value chains and the nature of agricultural business. Recent studies have emphasised barriers to adoption, as well as issues related to the labour market and cultural outcomes of automation. However, thus far, very little attention has been afforded to regional variations in the potential for automation adoption or in threats to agricultural employment. Specifically, research to date does not take into account the local availability of similar occupations, including those in different sectors, to which displaced workers may transition. Threats to employment and lower numbers of similar jobs locally are particularly salient in rural contexts, given thin and specialised local labour markets. The aims of this paper are to show the regional distribution of automation risk for the agricultural sector specifically, and to link these patterns to indicators of occupation-specific labour market thickness in Ireland. Using detailed occupational skills data, we construct indices of local labour market thickness conditioned on occupational skills and knowledge requirements. We show that there is substantial regional heterogeneity in the potential threat of automation to the employment prospects of workers currently active in the agricultural sector. This regional heterogeneity highlights the importance of the regional context for designing effective labour market policy in the face of job automation.
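A skills-conditioned thickness index of the kind described above can be sketched as: for a focal occupation, count local jobs in occupations whose skill profiles are sufficiently similar. The skill vectors, occupation names, job counts, and the 0.9 similarity threshold below are all invented for illustration; the paper builds its indices from detailed occupational skills data for Ireland.

```python
# Illustrative sketch of a skills-conditioned labour market thickness
# index: jobs reachable from a focal occupation, where "reachable"
# means cosine similarity of skill vectors above a threshold.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def thickness(focal, skills, local_jobs, threshold=0.9):
    """Count local jobs in occupations skill-similar to `focal`."""
    return sum(
        n for occ, n in local_jobs.items()
        if occ != focal and cosine(skills[focal], skills[occ]) >= threshold
    )

# Toy skill vectors: [manual work, equipment operation, admin].
skills = {
    "farm_worker":  [0.9, 0.8, 0.1],
    "forestry":     [0.8, 0.9, 0.1],
    "office_clerk": [0.1, 0.1, 0.9],
}
local_jobs = {"farm_worker": 120, "forestry": 40, "office_clerk": 200}
print(thickness("farm_worker", skills, local_jobs))  # 40
```

A region where this count is low for agricultural occupations is "thin": displaced workers have few skill-compatible local alternatives, which is the regional heterogeneity the paper maps.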

    Distributed Partitioning and Processing of Large Spatial Datasets

    Data collection is one of the most common practices in today’s world. The data collection rate has rapidly increased over the past decade and shows no signs of decline. Data sources are many: Internet of Things devices, mobile gadgets, social media posts, connected cars, and web servers constantly report on their users’ interactions and habits. Much of the collected data is spatial data, which contains attributes denoting the physical origin of the data. As a result of the tremendous growth in data collection, there is higher demand for new techniques to efficiently process the data and extract valuable insights within an acceptable time frame. The current standard approach to large-scale data analysis uses distributed parallel processing systems like Apache Hadoop and Apache Spark. However, these systems are designed for general-purpose parallel processing and require an additional layer to recognize and efficiently process spatial datasets. Motivated by its many applications, we examine several challenges facing spatial data partitioning and processing and propose solutions customized for each task. We detail our techniques for building spatial partitioners over large datasets for use with spatial queries like map-matching and kNN spatial join. Additionally, we present an accuracy benchmarking framework for comparing and classifying the results of two input files based on specific criteria. Our proposed work targets batch processing of large spatial datasets, including structured, unstructured, and semi-structured datasets.
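The spatial partitioning idea above can be sketched with a uniform grid: each point is assigned to a cell, and each cell's points can then be handled by a separate worker in a distributed join. This is a minimal sketch under assumed inputs; the fixed 2x2 grid and toy points are illustrative, and real partitioners (e.g. over Spark) typically sample the data to choose adaptive, load-balanced boundaries.

```python
# Minimal sketch of a uniform-grid spatial partitioner: assign (x, y)
# points to nx * ny cells over the dataset's bounding box, so each
# cell can be processed independently (e.g. for a kNN spatial join).

from collections import defaultdict

def grid_partition(points, bounds, nx=2, ny=2):
    """bounds: (min_x, min_y, max_x, max_y).
    Returns dict (i, j) -> list of points in that grid cell."""
    min_x, min_y, max_x, max_y = bounds
    w, h = (max_x - min_x) / nx, (max_y - min_y) / ny
    parts = defaultdict(list)
    for x, y in points:
        i = min(int((x - min_x) / w), nx - 1)  # clamp boundary points
        j = min(int((y - min_y) / h), ny - 1)  # into the last cell
        parts[(i, j)].append((x, y))
    return parts

pts = [(0.1, 0.2), (0.9, 0.8), (0.4, 0.9), (0.6, 0.1)]
parts = grid_partition(pts, (0.0, 0.0, 1.0, 1.0))
print(sorted(parts))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

For join-style queries, neighboring cells must also exchange points near shared boundaries, which is one of the challenges customized partitioners address.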

    The semicircular flow of the data economy

    This paper revisits the traditional ‘circular flow’ of the macroeconomy (Samuelson, 1948) and reworks it to capture the use of big data and artificial intelligence in the economy. The characterisation builds on the multifaceted role of data to conceptualise markets and differentiate them depending on whether data is an output, a means of payment, or an input in knowledge extraction processes. After this, the main differences between the circular flow economy and the data economy are described, identifying the new flows and agents and the circular flow assumptions that do not seem to be as relevant to the workings of the data economy. The result is a ‘semicircular’ flow diagram: unprocessed data flow from individuals, families, and firms to data holders, but only data processed in the form of digital services flows back to families and firms. The new model is used to explore the potential for market failures. Knowledge extraction to generate digital services occurs within a ‘black box’ that displays natural monopoly characteristics. Data holders operate simultaneously in the markets for data generation and knowledge extraction, and they generate the amount of knowledge that maximises their profit. This creates data underutilisation and asymmetries between data holders and other agents in the economy such as anti-trust authorities, central banks, scientific communities, consumers, and firms. Public intervention should facilitate additional generation of knowledge by developing additional merit and non-rival uses of data in such a way that knowledge generation maximises the social gain from digitalisation. The semicircular model can incorporate data leakages and knowledge injections activated by data taxation. Data taxes should be paid with data, respecting existing legislation and privacy concerns, and preserving the incentives of the data holder to innovate in competitive data generation markets.
A centralised data authority, as initially proposed by Martens (2016) and more recently by Scott Morton et al. (2019), would be responsible for knowledge generation and would aim to achieve better regulation, standards, and transparency, and to maximise the common good. Our conclusions are in line with an extensive user-centric approach to data portability (De Hert et al., 2018). This paper contributes to the digital economy discussion by developing a simple theoretical motivation for increased access to data for the public good, which will stimulate further theoretical and empirical exercises and lead to policy actions.