2,153 research outputs found

    Hidden properties identification and text diversity translation of people’s names

    Advances in the analysis of online data about people’s names have enabled breakthroughs in social research; however, several challenges remain. This thesis proposes several novel approaches for identifying hidden properties of people’s names and for text diversity translation of those names. We start by studying the hidden properties of people’s names and find that few existing methods can identify more than one hidden property at a time. We therefore propose a ‘Hidden Property Bayes’ model that identifies more than one hidden property of people’s names in Kanji and Hanzi at one time. In addition, our model performs better than an existing system on name origin identification. We then move on to text diversity translation and find that translating romanised names back to the original language is a challenge. We therefore propose two novel models to translate Pinyin names to Hanzi names. Both models perform better than Google Translate on Mandarin name translation. We next investigate gender prediction from people’s names and find that few existing tools can both predict and analyse the data in one process. We therefore propose a ‘Name-Gender’ tool that predicts the gender of people’s names and also provides a statistical graph directly. In addition, our tool performs better than an existing system at predicting the gender of people’s names in Latin and Hanzi characters. We also provide novel findings from a gender analysis in computer science using our ‘Name-Gender’ tool. Overall, our contributions provide effective novel approaches to support social researchers in analysing online data sources of people’s names, helping them to understand real-world events.

    Worldwide AI Ethics: a review of 200 guidelines and recommendations for AI governance

    In the last decade, several organizations have produced documents intended to standardize, in the normative sense, and provide guidance for our recent and rapid AI development. However, the full spectrum of ideas presented in these documents has not yet been analyzed, except in a few meta-analyses and critical reviews of the field. In this work, we seek to expand on the work done by past researchers and create a tool for better data visualization of the contents and nature of these documents, to understand whether there is consensus or similarity between the principles espoused by various institutions, which may inspire debates on future regulations. We also provide some preliminary thoughts and questions that could guide the continuation of this research, based on a critical analysis of the results obtained by applying our methodology to a sample of 200 documents.

    Equity in Transportation: Data Driven Analysis of Transportation Services and Infrastructures

    Achieving equity in transportation is an ongoing challenge, as transportation options still vary tremendously when it comes to marginalized populations. This dissertation addresses this challenge by conducting a comprehensive review of existing transportation equity literature and identifying two critical gaps: the lack of data-driven approaches to studying spatial mismatch between transportation supply and demand, and limited information on women's perceptions and expectations towards emerging transportation services. Chapter two introduces the concept of transportation deserts, specifically transit deserts and walking deserts, and develops data-driven frameworks to identify and investigate neighborhoods with limited transportation service supply but high demand. The frameworks compare mobility demand and supply for active transportation modes and utilize statistical modeling techniques to reveal the inequitable distribution of transportation services. The identification of transportation deserts provides valuable insights for investment and redevelopment, highlighting areas of underinvestment. Chapter three focuses on gender equity and the lack of understanding about transportation user preferences, particularly for women. Through a gender-sensitive analysis of online reviews using text-mining techniques, the chapter presents an empirical analysis of rider satisfaction with scooter services. The study utilizes online data from app store reviews and employs machine learning techniques to uncover factors that influence overall satisfaction across genders. The findings enhance our understanding of gendered differences in micromobility rider sentiment and satisfaction. In conclusion, this dissertation offers a comprehensive examination of transportation equity from multiple perspectives. It identifies critical gaps in existing literature and employs innovative analytical methodologies to address these gaps. The research findings have important policy implications for city planners, transportation managers, urban authorities, and decision-makers striving to create inclusive and vibrant urban spaces that benefit all members of society. By addressing these gaps, policymakers can promote equitable transportation services and ensure access to safe, reliable, and affordable transportation options for all individuals.

    A case study of survival and presentation of gastroesophageal cancer in local neighbourhoods

    This thesis presents a quantitative case study on the incidence, survival and presentation of patients diagnosed with gastroesophageal cancer, to evaluate whether where people live affects how they present and survive with a gastroesophageal cancer diagnosis. The research focus evolved from studies on gastroesophageal cancer’s ‘geographic affiliation’ and a desire to review whether patient and population attributes could be harnessed to reveal potential ‘hotspots’ to inform targeted health intervention strategies. As the most crucial stage for intervention was associated with patients detecting symptoms early enough for intervention, the focus of this case study was narrowed to survival and presentation. This research analysed data from 2785 patients who presented to a regional referral specialist cancer treatment centre between the years 2000 and 2013. Cohort analysis revealed common attributes and survival, and data were merged with demographic information in a geographic information system to present findings in mapped format. Descriptive analysis revealed an association between later stage presentation and reduced survival outcome. Emergency presentations tended to have worse outcomes. Survival deteriorated with advancing age. Gastroesophageal cancer diagnoses in the under 54 age group were more common in lower socioeconomic groups, and survival outcomes were marginally lower than in those patients from the least deprived areas. Spatial analysis revealed variation in incidence, presentation and survival across the region. Though this case study revealed several new findings on gastroesophageal cancer presentation and survival, there remains no single solution to informing and encouraging earlier diagnosis interventions. Though presenting data at finer scales of resolution is more clinically relevant, it threatens patient confidentiality.

    Real Time Crime Prediction Using Social Media

    There is no doubt that crime is on the increase and has a detrimental influence on a nation's economy, despite several studies on crime prediction that attempt to minimise crime rates. Historically, data mining techniques for crime prediction have relied on historical information and have been mostly country specific. In fact, only a few of the earlier studies on crime prediction follow a standard data mining procedure. Hence, considering the current worldwide crime trend in which criminals routinely publish their criminal intent on social media and ask others to see and/or engage in different crimes, an alternative and more dynamic strategy is needed. The goal of this research is to improve the performance of crime prediction models. Thus, this thesis explores the potential of using information from social media (Twitter) for crime prediction in combination with historical crime data. It also determines, using data mining techniques, the most relevant feature engineering needed for a United Kingdom dataset that could improve crime prediction model performance. Additionally, this study presents a function that could be used by every state in the United Kingdom for data cleansing, pre-processing and feature engineering. A Shiny app was also used to display tweet sentiment trends to help prevent crime in near-real time. Exploratory analysis is essential for revealing the data pre-processing and feature engineering needed prior to feeding the data into the machine learning model for efficient results. Based on the earlier documented studies available, this is the first research to carry out a full exploratory analysis of historical British crime statistics using a stop-and-search historical dataset. Also, based on the findings from the exploratory study, an algorithm was created to clean the data and prepare it for further analysis and model creation. This is a notable success because it provides a ready-made dataset for future research, particularly for non-experts to use in constructing models to forecast crime or conducting investigations in around 32 police districts of the United Kingdom. Moreover, this is the first study to present a complete collection of geo-spatial parameters for training a crime prediction model by combining demographic data from the same source in the United Kingdom with hourly sentiment polarity that was not restricted to a Twitter keyword search. Six unique base models that were frequently mentioned in the previous literature were selected, trained on the stop-and-search historical crime dataset, evaluated on test data and finally validated with London and Kent crime datasets. Two different datasets were created from Twitter and historical data (historical crime data with Twitter sentiment score and historical data without Twitter sentiment score). Six of the most prevalent machine learning classifiers (Random Forest, Decision Tree, K-nearest neighbours, support vector machine, neural network and naïve Bayes) were trained and tested on these datasets. Additionally, the hyperparameters of each of the six models were tuned using random grid search. Voting classifiers and a logistic regression stacked ensemble of different models were also trained and tested on the same datasets to improve on individual model performance. In addition, two combinations of stacked ensembles of multiple models were constructed to compare and choose the most suitable models for crime prediction, with the appropriate prediction model for the UK dataset selected based on performance. In terms of interpretability, this research differs from most earlier studies that employed Twitter data in that several methodologies were used to show how each attribute contributed to the construction of the model, and the findings were discussed and interpreted in the context of the study. Further, a Shiny app visualisation tool was designed to display the tweets’ sentiment score, the text, the users’ screen name, and the tweets’ vicinity, which allows the investigation of any criminal actions in near-real time. The evaluation of the models revealed that Random Forest, Decision Tree, and K-nearest neighbours outperformed the other models. However, Decision Tree and Random Forest performed consistently better when evaluated on test data.
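The abstract above mentions voting classifiers built on top of several base models. As a rough illustration only, the sketch below shows hard (majority) voting over per-model predictions in plain Python. It is not the thesis's implementation: the function name `hard_vote`, the tie-breaking rule, and the toy labels are all assumptions made for this example.

```python
from collections import Counter


def hard_vote(predictions):
    """Combine class labels from several models by majority vote.

    predictions: list of per-model label lists, all of equal length
    (hypothetical input format chosen for this sketch).
    Ties go to the tied label predicted by the earliest model in the list.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*predictions) yields, for each sample, the labels from every model.
    for sample_labels in zip(*predictions):
        counts = Counter(sample_labels)
        top = max(counts.values())
        # Scan in model order so that ties are resolved deterministically.
        voted.append(next(lab for lab in sample_labels if counts[lab] == top))
    return voted


# Toy labels standing in for three base models' outputs on three samples.
model_a = [1, 0, 1]
model_b = [1, 1, 0]
model_c = [0, 1, 1]
ensemble = hard_vote([model_a, model_b, model_c])
```

A stacked ensemble, also mentioned in the abstract, would instead feed the base models' outputs into a second-level model (logistic regression in the thesis) rather than counting votes.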

    Engineering Education for the Future

    Congress UPV Proceedings of the 21ST International Conference on Science and Technology Indicators

    This is the book of proceedings of the 21st Science and Technology Indicators Conference that took place in València (Spain) from 14th to 16th of September 2016. The conference theme for this year, ‘Peripheries, frontiers and beyond’, aimed to study the development and use of Science, Technology and Innovation indicators in spaces that have not been the focus of current indicator development, for example, in the Global South, or the Social Sciences and Humanities. The exploration to the margins and beyond proposed by the theme has brought to the STI Conference an interesting array of new contributors from a variety of fields and geographies. This year’s conference had a record 382 registered participants from 40 different countries, including 23 European, 9 American, 4 Asia-Pacific, and 4 from Africa and the Near East. About 26% of participants came from outside of Europe. There were also many participants (17%) from organisations outside academia, including governments (8%), businesses (5%), foundations (2%) and international organisations (2%). This is particularly important in a field that is practice-oriented. The chapters of the proceedings attest to the breadth of issues discussed: infrastructure, benchmarking and use of innovation indicators, societal impact and mission-oriented research, mobility and careers, social sciences and the humanities, participation and culture, gender, and altmetrics, among others. We hope that the diversity of this Conference has fostered productive dialogues and synergistic ideas and made a contribution, small as it may be, to the development and use of indicators that, being more inclusive, will foster a more inclusive and fair world.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Open science is currently a prominent topic at all levels and is one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.

    E-Learning In Higher Education: The Gender Perspective
