15 research outputs found

    Coastal Resilience Decision Making with Machine Learning

    Our research aims to understand how social data can be integrated with climate data using machine learning for coastal resilience decisions. Although data analytics techniques have been adapted for decision models, incorporating unstructured data remains a challenge. We adapt a design science research approach to develop a coastal resilience decision model that can accommodate various sets of climate criteria and social attributes to help us understand coastal risks in communities vulnerable to coastal hazards. We collected social data from environmental groups and individuals and conducted an exploratory social media data analysis on coastal resilience in the greater Boston, U.S., area. We employ non-negative matrix factorization (NMF), a topic modeling technique, to extract human-interpretable topics from a preliminary dataset of 131 documents from 50 different accounts. The outcomes of this research can help community members and policy makers understand and develop robust sustainability- and climate-focused decisions.
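    As a rough illustration of the NMF step named in this abstract, the sketch below factorizes a small document-term count matrix into document-topic weights W and topic-term weights H using the standard multiplicative updates for the squared error. It is not the authors' pipeline: the four toy documents, the function names (`term_doc_counts`, `nmf`, `top_terms`), the choice of two topics, and the iteration count are all assumptions made for illustration, and a real study would more likely use a library implementation with TF-IDF weighting.

```python
import random

# Hypothetical toy corpus; the study's 131 documents are not reproduced here.
DOCS = [
    "flooding seawall storm surge",
    "storm surge flooding harbor",
    "housing equity community funding",
    "community housing funding equity",
]

def term_doc_counts(docs):
    """Build a documents x terms count matrix and the sorted vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            V[i][index[w]] += 1.0
    return V, vocab

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def top_terms(H, vocab, n=3):
    """Highest-weighted terms for each topic (row of H)."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]

if __name__ == "__main__":
    V, vocab = term_doc_counts(DOCS)
    W, H = nmf(V, k=2)
    for t, terms in enumerate(top_terms(H, vocab)):
        print(f"topic {t}: {terms}")
```

    Each row of H can be read as a topic by listing its highest-weighted terms, which is what makes NMF output human-interpretable; each row of W gives the topic mix of one document.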

    Coastal Resilience with Social Data Analytics: A Design Science Approach

    We adapt a design science research (DSR) approach for coastal resilience and climate justice using big data analytics. Our artifact, based on big data and machine learning, can accommodate various sets of social attributes to understand coastal risks for vulnerable communities. We analyzed social data from communities vulnerable to coastal hazards, incorporating machine learning (ML) to assess coastal community needs and demands. In addition, we developed a user interface that provides data selection and weighting functionalities. We extend the IS literature on design science research and ML techniques to further our understanding of coastal resilience in vulnerable communities. The outcomes of this research can help community members and policy makers understand and develop robust sustainability- and climate-focused decisions using a coastal resilience decision approach.

    Profiling Essential Professional Skills of Chief Data Officers Through Topical Modeling Algorithms

    Today, enterprises are increasingly dependent on data to keep their business competitive and successful. To better harness the value of data, more and more organizations are establishing a Chief Data Officer (CDO) position. The professional skills of CDOs are rather diverse because CDOs are expected to undertake a variety of roles in their companies, including enterprise data architect, data quality and governance manager, business strategy leader, and business regulation compliance officer. The CDO role is an emerging research field, and few studies have examined it. This paper profiles the key professional skills and educational backgrounds of current CDOs by studying their resumes on LinkedIn using a topic modeling technique. This work is a step towards understanding the roles of CDOs in organizations and the professional skills and experience they may need in order to manage data and realize its true value for their organizations.

    Identificação de Temas em Redes Sociais por meio de técnicas de agrupamento (Identifying Topics in Social Networks Using Clustering Techniques)

    Recent years have been marked by the emergence of many social media platforms, from Orkut to Facebook, including Twitter, YouTube, Google+, and many others, each offering new features to attract more users. These platforms generate a large amount of data that, if processed correctly, can be used to identify trends, patterns, and changes. The goal of this work is to discover key topics in a social network discussion, characterized as groups of relevant terms restricted to a context, and to study their evolution over time. To this end, we use procedures based on data mining and text processing. First, text processing techniques are used to identify the most relevant terms that appear in the social network's text messages. Next, these terms are clustered using the classic k-means and k-medoids algorithms, as well as the more recent NMF (Non-negative Matrix Factorization) algorithm. Finally, we associate the most relevant terms of the document clusters to characterize the main topics of the messages considered. The proposal was evaluated on Twitter, using tweet datasets from several initial contexts. The results show that the proposal is viable for identifying the relevant topics of this social network within the given initial context.
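    The clustering stage described in this abstract can be sketched with a plain k-means over bag-of-words vectors. This is a minimal illustration under stated assumptions, not the authors' implementation: the four toy messages, the helper names (`vectorize`, `kmeans`, `top_terms`), raw term counts as features, and the deterministic farthest-first initialization are all choices made here for a reproducible example. It clusters messages and then reads each cluster's most heavily weighted terms off its centroid as a candidate topic.

```python
# Hypothetical toy messages; the study's tweet datasets are not reproduced here.
TWEETS = [
    "goal match team win",
    "team match score goal",
    "election vote party poll",
    "vote party election debate",
]

def vectorize(docs):
    """Turn each message into a vector of term counts over a sorted vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    points = [[float(d.lower().split().count(w)) for w in vocab] for d in docs]
    return points, vocab

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for each point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def top_terms(center, vocab, n=3):
    """Terms with the largest centroid weight, used to characterize a cluster."""
    order = sorted(range(len(vocab)), key=lambda j: -center[j])
    return [vocab[j] for j in order[:n]]

if __name__ == "__main__":
    points, vocab = vectorize(TWEETS)
    labels, centers = kmeans(points, k=2)
    for c, center in enumerate(centers):
        print(f"cluster {c}: {top_terms(center, vocab)}")
```

    Replacing the mean in the update step with the member that minimizes total distance to the others would give a k-medoids variant, the second algorithm the abstract mentions.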

    IT skills in vocational training curricula and labour market outcomes

    We use vocational training curricula to investigate how IT skills are trained within broader skills packages and how these relate to labour market outcomes. Skills packages are the typical combinations of IT skills (e.g., CNC) and technical or nontechnical skills (e.g., material sciences or work safety) that are jointly required in the real world and occur in training curricula. This broadened perspective of teaching IT skills offers new insights into how digital skills can be successfully integrated into future education and training programmes. We use legally binding vocational education and training (VET) curricula of dual apprenticeship training in Switzerland. We apply natural language processing methods to analyse the extensive curriculum texts, which meticulously define the skills that have to be taught. We identify four typical skills packages, each of which is centred around one of four different types of IT skill (CNC/CAD, control technologies, system technologies, IT applications). Our empirical analyses show that VET graduates trained in these skills packages achieve better labour market outcomes than VET graduates without these skills packages. Moreover, we find that the positive outcomes are not just driven by differences in the cognitive skill requirements of the respective occupations.

    Innovative Heuristics to Improve the Latent Dirichlet Allocation Methodology for Textual Analysis and a New Modernized Topic Modeling Approach

    Natural Language Processing is a complex method of data mining the vast trove of documents created and made available every day. Topic modeling seeks to identify the topics within textual corpora with limited human input in order to speed analysis. Current topic modeling techniques used in Natural Language Processing have limitations in their pre-processing steps. This dissertation studies topic modeling techniques and those pre-processing limitations, and introduces new algorithms that improve on existing topic modeling techniques while remaining competitive in computational complexity. This research makes four contributions to the field of Natural Language Processing and topic modeling. First, this research identifies a requirement for a more robust “stopwords” list and proposes a heuristic for creating one. Second, a new dimensionality-reduction technique is introduced that exploits the number of words within a document to infer importance to word choice. Third, an algorithm is developed to determine the number of topics within a corpus and is demonstrated using a standard topic modeling data set. These techniques produce a higher quality result from the Latent Dirichlet Allocation topic modeling technique. Fourth, a novel heuristic utilizing Principal Component Analysis is introduced that is capable of determining the number of topics within a corpus that produces stable sets of topic words.