
    Building the European Social Innovation Database with natural language processing and machine learning

    Social innovation is widely defined as technological and non-technological new products, services or models that simultaneously meet social needs and create new social relationships or collaborations. Despite significant interest in the concept, the lack of reliable and comprehensive data is a barrier to social science research. We created the European Social Innovation Database (ESID) to address this gap. ESID is built on the large-scale collection of unstructured website text, which is used to classify and characterise social innovation projects from around the world. We use advanced machine learning techniques to extract features such as social innovation dimensions, project locations, summaries, and topics, among others. Our models achieve F1 scores as high as 0.90. ESID currently includes 11,468 projects from 159 countries. ESID data are freely available and are also presented in a web-based app. Our future workplan includes expansion (i.e., increasing the number of projects), extension (i.e., adding new variables) and dynamic retrieval (i.e., retrieving and extracting information at regular intervals).

    Inside Dropbox: Understanding Personal Cloud Storage Services

    Personal cloud storage services are gaining popularity. With a rush of providers entering the market and a growing offer of cheap storage space, it is to be expected that cloud storage will soon generate a large amount of Internet traffic. Very little is known about the architecture and the performance of such systems, and the workload they have to face. This understanding is essential for designing efficient cloud storage systems and predicting their impact on the network. This paper presents a characterization of Dropbox, the leading solution in personal cloud storage in our datasets. By means of passive measurements, we analyze data from four vantage points in Europe, collected during 42 consecutive days. Our contributions are threefold: Firstly, we are the first to study Dropbox, which we show to be the most widely used cloud storage system, already accounting for a volume equivalent to around one third of the YouTube traffic at campus networks on some days. Secondly, we characterize the workload that typical users in different environments generate on the system, highlighting how this is reflected in network traffic. Lastly, our results show possible performance bottlenecks caused by both the current system architecture and the storage protocol. This is exacerbated for users connected far from control and storage data centers.

    GeoIntelligence: Data Mining Locational Social Media Content for Profiling and Information Gathering

    The current social media landscape encourages people to share more information about their day-to-day lives than ever before. In this environment a large amount of personal data is disclosed in a public forum with little to no regard for the potential privacy impacts. This paper focuses on the presence of geographic data within images, metadata and individual postings. The GeoIntelligence project aims to aggregate this information to educate users on the possible implications of using these services, as well as to provide a service to law enforcement and business. This paper demonstrates the ability to profile users on an individual and group basis from data posted openly to social networking services.