    Some Contribution of Statistical Techniques in Big Data: A Review

    Big data is a popular research topic. It is widely discussed, and science, business, industry, government and society are all expected to undergo thorough change under its impact. The term refers to very large data sets that are complex, contain hidden patterns, mix structured and unstructured data, and are difficult to collect, store and analyse. Advanced techniques are therefore needed to extract knowledge from big data. Major challenges in big data research arise in storage, processing, search, sharing, transfer, analysis and visualization. This paper discusses the introduction of big data, its issues, its management and the techniques used to handle it. It also reviews various advanced statistical techniques for handling key big data applications with large data sets. These techniques handle both structured and unstructured big data in different areas.

    Benchmarking real and ideal cities: a multicriteria analysis of city performance based on urban form

    The debate on the ideal urban layout, or form, has long been an active topic of research. As cities expand and population demands rise, the quest for efficient and sustainable urban designs gains greater significance, necessitating objective and quantitative evaluation of their performance. This article adds to the debate by presenting a multicriteria analysis of city performance, based on quantitative indicators obtainable from geographic information system calculations, which focus on sustainability and physical pleasantness. Indicator values were derived for a real city, its infill version, and five redrafts as classic city models from the literature. The city layouts were then compared using the TOPSIS multicriteria ranking method. The results show a preference for the more compact urban layouts, owing to the multiple advantages of shorter distances between supply and demand points. The methodology provides quantitative insights on city performance and efficiency and can be used to compare options for city expansions or major urban regeneration projects.
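
The TOPSIS ranking step mentioned above can be illustrated with a minimal sketch. It is not the article's implementation: the layout names, the two indicators, their values and the equal weights are all hypothetical, and the function assumes no indicator column is all zeros and that the alternatives are not all identical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores in [0, 1] (higher is better).

    matrix  : one row per alternative, one column per criterion
    weights : criterion weights (rescaled here to sum to 1)
    benefit : True where larger values are better, False where smaller are better
    """
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]

    # Vector-normalise each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]

    # Ideal and anti-ideal points, respecting the direction of each criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical indicators: mean trip distance in km (lower is better)
# and green area per inhabitant in m2 (higher is better).
names = ["compact", "linear", "dispersed"]
matrix = [
    [2.0, 30.0],
    [3.5, 25.0],
    [5.0, 20.0],
]
scores = topsis(matrix, weights=[1.0, 1.0], benefit=[False, True])
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

In this made-up data the compact layout is best on both indicators, so it coincides with the ideal point and ranks first; with real indicators the ranking depends on the chosen weights.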

    Developing an optimised activity type annotation method based on classification accuracy and entropy indices

    The generation of substantial amounts of travel and mobility related data has ushered in the era of big data. However, this data generally lacks activity-travel information such as trip purpose. This deficiency led to the development of trip purpose inference (activity type imputation / annotation) techniques, whose performance depends on the available input data and on the number of activity type classes to infer. Aggregating activity types strongly increases the inference accuracy, but how to aggregate is usually left to the discretion of the researcher. Because this choice is open to interpretation, it undermines the reported inference accuracy. This study developed an optimised classification methodology by identifying classes of activity types with an optimal balance between improving model accuracy and preserving activity information from the original data set. A sensitivity analysis was performed, and several machine learning algorithms were tested. The proposed method may be applied to any study area.
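
The trade-off described above, where merging activity classes raises accuracy but discards information, can be made concrete with Shannon entropy. The sketch below is only one plausible reading of an "entropy index": the abstract does not define the index used, and the activity labels, the merge mapping and the `retained` ratio are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def aggregate(labels, mapping):
    """Merge activity types; labels absent from the mapping are kept as they are."""
    return [mapping.get(label, label) for label in labels]

# Hypothetical annotated trips: 4 work, 2 shopping, 2 leisure.
labels = ["work"] * 4 + ["shopping"] * 2 + ["leisure"] * 2

# Hypothetical aggregation: shopping and leisure collapse into "other".
mapping = {"shopping": "other", "leisure": "other"}
merged = aggregate(labels, mapping)

h_full = shannon_entropy(labels)    # three classes
h_merged = shannon_entropy(merged)  # two classes
retained = h_merged / h_full        # share of label information kept
print(h_full, h_merged, retained)
```

A candidate aggregation could then be judged on two numbers together: the classifier's accuracy on the merged labels and the share of entropy retained.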

    Sanitization of Transportation Data: Policy Implications and Gaps

    UC-ITS-2020-04. Data about mobility provides information to improve city planning, identify traffic patterns, detect traffic jams, and route vehicles around them. This data often contains proprietary and personal information that companies and individuals do not wish others to know, for competitive and personal reasons. This sets up a paradox: the data needs to be analyzed, but it cannot be analyzed without revealing information that must be kept secret. A solution is to sanitize the data, i.e., remove or suppress the sensitive information. The goal of sanitization is to protect sensitive information while enabling analyses of the data that will produce the same results as analyses of the unsanitized data. However, protecting information requires that sanitized data cannot be linked to data from other sources in a manner that leads to desanitization. This project reviews typical strategies used to sanitize datasets, the research showing how some of these strategies fail, and the questions that must be addressed to better understand the risks of desanitization.

    Identification of land uses and analysis of residents' trips in Beijing using Mobike big data and points of interest

    Màster universitari en Estudis Avançats en Arquitectura: Gestió i Valoració Urbana i Arquitectònica. In recent years, large-scale, high-quality individual space-time data have become readily accessible thanks to the rapid development of LBS (Location Based Service) technology. The use of such micro-data and of data mining for fine-grained study of the city has become the main trend in urban studies. Analysing point-of-interest data can reflect urban activities and help identify land uses, for example through data describing the city's landmark buildings. Open data from shared bicycles, in turn, reflect the range of user activities and the dynamics of the urban spatio-temporal structure. Combining dynamic and static data will help urban planners and the public understand the complex urban spatial structure and will contribute to urban geography and urban planning. In this article, we first use Mobike open data to reveal visually the characteristics and patterns of residents' trips, in terms of access points and of space and time, after data processing, map matching and cluster analysis. The document also presents a method for identifying and zoning land uses based on geographic data from public sources. The study area is selected using OSM road network data. Then, building on a previously defined concept of land use, a method for dividing land by use is developed based on points of interest. Comparison with the general urban planning map and the land-use map shows that the recognition is accurate; the identified land-use details are even more precise than those of the urban planning map.

    Structural characterization of public transit trips based on origin-destination survey data and smart card data

    The passive collection of information is a promising avenue for improving the knowledge and operation of transport systems, compared with manually collected data. Such high-volume, low-cost data are increasingly widespread, notably in the form of smart card data held by public transport authorities. They are an excellent opportunity to better characterize the demand for public transport. Processing and integrating these data is an important challenge.

    Behavioral Approach to Estimation of Smart Card Holders Socio-Demographic Characteristics in a Public Transportation System

    RÉSUMÉ Automated fare collection systems are used in many cities; the fare is most often stored on a smart card. They generate large volumes of data on individuals' mobility every day. Methods for using these data are of great interest because the data have three advantages: they are exhaustive, longitudinal and directly linked to the public transit network. Indeed, all passengers must validate their boarding (except fare evaders, who exempt themselves from this obligation), the data are collected every day of the year across the entire network, and they are tied to a vehicle and a bus line. Many applications have been developed to date: reconstruction of trip chains, study of trip typologies on the network, study of user loyalty, validation of travel surveys, identification of maximum loads on each line, study of how well supply matches demand, etc. ---------- ABSTRACT Automated Fare Collection (AFC) systems such as smart cards are used in many cities and countries. AFC systems generate large volumes of data on individuals' mobility, and it is of great interest to develop methods that use these data to complement other data sources. The data present four main advantages: they are longitudinal; they cover every public transit user (within the limits of the smart card's penetration rate and the policy on shared smart card ownership); they are passively collected; and they are directly related to the public transit structure. There are already many applications, such as reconstructing trip chains, studying public transit users' loyalty and behaviour, and validating travel surveys.

    Model for analysing the potential to promote centrality based on land use, transport network and configuration

    Master's dissertation, Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Civil e Ambiental, 2018. The research seeks to understand how land use, transport network and configuration (urban form) interact in the formation of urban centralities. The main goal is to develop a study to assess the potential for promoting new centralities in areas served by mass transit stations. Treating urban space as a system shaped by interacting phenomena at the local and global scales, the concentrations responsible for the urban structure were investigated within the concepts proposed in studies of economics, travel behavior and the built environment, and in configurational analysis based on the tools of Space Syntax. The applied methodology made it possible to calculate mathematically, through the regularities manifested in the association of variables representing the three dimensions proposed by the research, the trip generation potential of a given zone, in order to identify areas of the urban fabric where uses could be intensified and new activity centres developed.