653 research outputs found

    IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements

    Human trafficking (HT) is a pervasive global issue affecting vulnerable individuals, violating their fundamental human rights. Investigations reveal that a significant number of HT cases are associated with online advertisements (ads), particularly in escort markets. Consequently, identifying and connecting HT vendors has become increasingly challenging for Law Enforcement Agencies (LEAs). To address this issue, we introduce IDTraffickers, an extensive dataset consisting of 87,595 text ads and 5,244 vendor labels to enable the verification and identification of potential HT vendors on online escort markets. To establish a benchmark for authorship identification, we train a DeCLUTR-small model, achieving a macro-F1 score of 0.8656 in a closed-set classification environment. Next, we leverage the style representations extracted from the trained classifier to conduct authorship verification, resulting in a mean r-precision score of 0.8852 in an open-set ranking environment. Finally, to encourage further research and ensure responsible data sharing, we plan to release IDTraffickers for the authorship attribution task to researchers under specific conditions, considering the sensitive nature of the data. We believe that the availability of our dataset and benchmarks will empower future researchers to utilize our findings, thereby facilitating the effective linkage of escort ads and the development of more robust approaches for identifying HT indicators.
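
    The two metrics named in this abstract can be illustrated with a minimal sketch. The code below is not taken from the IDTraffickers work; the function names, toy labels, and ranking are hypothetical and only show how macro-F1 (closed-set classification) and R-precision (open-set ranking) are commonly defined.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def r_precision(ranked_labels, query_label, num_relevant):
    """Fraction of the top-R ranked items that share the query's label,
    where R is the number of relevant items for that query."""
    if num_relevant == 0:
        return 0.0
    top_r = ranked_labels[:num_relevant]
    return sum(1 for label in top_r if label == query_label) / num_relevant


# Hypothetical toy data: vendor labels for four ads and a classifier's guesses.
toy_true = ["a", "a", "b", "b"]
toy_pred = ["a", "b", "b", "b"]
toy_macro_f1 = macro_f1(toy_true, toy_pred)

# Hypothetical ranking of candidate ads for a query written by vendor "v1",
# where two ads by "v1" exist in the candidate pool.
toy_r_precision = r_precision(["v1", "v2", "v1", "v3"], "v1", 2)
```

    A mean R-precision, as reported in the abstract, would be the average of `r_precision` over all queries.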

    Good Tech, Bad Tech: Policing Sex Trafficking with Big Data

    Technology is often highlighted in popular discourse as a causal factor in significantly increasing sex trafficking. However, there is a paucity of robust empirical evidence on sex trafficking and the extent to which technology facilitates it. This has not prevented the proliferation of beliefs that technology is essential for disrupting or even ending sex trafficking. Big data analytics and anti-trafficking software are used in this context to produce knowledge and intelligence on sex trafficking. This paper explores the challenges and limitations of understanding exploitation through algorithms and online data. It also highlights the key dimensions of exploitation ignored in big data-oriented research on sex trafficking. By doing so, the paper seeks to advance our theoretical understanding of the trafficking–technology nexus, and it is argued that sex trafficking must be reframed along a continuum of exploitation that is sensitive to the social context of exploitation within the sex market.

    Volume 25, Full Contents


    Machine Learning para deteção de padrões e previsão de ocorrências criminais (Machine Learning for Pattern Detection and Prediction of Criminal Occurrences)

    The increase of the world population, especially in large urban centers, has resulted in new challenges such as the management of natural resources and infrastructures as well as the optimization of services to promote citizens’ quality of life. One of the biggest and most important challenges is the management of public safety, since, in addition to being a factor of interest to both the general population and the authorities, it is also an area that influences other essential indicators in a city such as tourism and employment. Public safety has an impact on the economic growth and social development of a community. This dissertation proposes a solution for the prediction of criminal occurrences in a city based on historical data of incidents and demographic data. The entire life cycle of the model’s learning process will be presented to provide an organization with predictive capability: from the collection of data from its original source, through the treatment and transformations applied to it and the choice, evaluation and implementation of the Machine Learning model, up to the application layer. Classification models will be implemented to predict criminal risk for a given time interval and location, as well as regression models to predict the number of crimes. Machine Learning algorithms such as Random Forest, Neural Networks, K-Nearest Neighbors and Logistic Regression will be used to predict occurrences, and their performance will be compared according to the data processing and transformation used. The results of the chosen model show that the use of Machine Learning techniques helps to anticipate criminal occurrences, which contributes to the reinforcement of public security. Finally, the models will be implemented on a platform that provides an API to enable other entities to request predictions in real time. An application will also be presented where it is possible to show criminal occurrence predictions visually. Master's in Informatics Engineering.

    Range expansion in the invasive Round goby (Neogobius melanostomus): behavioural and gene transcriptional components of a successful invader

    Range expansion of an invasive species can be influenced by intrinsic mechanisms such as behaviours described as being highly flexible and/or of specific behavioural types that are associated with dispersal ability. In addition, related gene transcription can also be influential in invasion success, promoting acclimation to novel environments. My study species, the round goby (Neogobius melanostomus), is an invasive fish continuously expanding its range in the Laurentian Great Lakes and its tributaries. This thesis aims to examine: 1) the behavioural repertoire of the round goby; 2) differential gene transcription for gobies “natural” and environmental captive “treatment” using brain candidate genes associated with behavioural traits specific to aggression, boldness, stress response, learning, and activity; and 3) how behaviour and gene transcription vary between residents and dispersers and with detection time since North American invasion. I found that round gobies possess an “invasion behavioural phenotype” consisting of boldness, exploration, sociality and predator habituation. In addition, I found juveniles were bolder, explored more, were more social and habituated to predation more than adults, but more so at established sites than recently invaded ones, contrary to predictions. Adults did not show any overall invasion stage differences, possibly due to conspecific densities, habitat-feature differences, and/or time since first detection. I showed evidence that there could be a genetic mechanism driving these behaviours, genes expressed for the “natural” group (aggression, stress-response, learning). My natural gene transcription results support that detection time can result in differences most likely driven by density, but round gobies are most likely able to produce “alternative ontogenies” due to plasticity, where individuals acclimatize to novel stressors over time, resulting in shifts in phenotypes. By examining all the facets that could drive range expansion, one can gain a deeper insight into what underlies “invasiveness”.

    Analysis of Family-Health-Related Topics on Wikipedia

    New concepts, terms, and topics always emerge; and meanings of existing terms and topics keep changing all the time. These phenomena occur more frequently on social media than on conventional media because social media allows a huge number of users to generate information online. Retrieving relevant results in different time periods of a fast-changing topic becomes one of the most difficult challenges in the information retrieval field. Among numerous topics discussed on social media, health-related topics are a major category which attracts increasing attention from the general public. This study investigated and explored the evolution patterns of family-health-related topics on Wikipedia. Three family-health-related topics (Child Maltreatment, Family Planning, and Women’s Health) were selected from the World Health Organization Website and their associated entries were retrieved on Wikipedia. Historical numeric and text data of the entries from 2010 to 2017 were collected from a Wikipedia data dump and the Wikipedia Web pages. Four periods were defined: 2010 to 2011, 2012 to 2013, 2014 to 2015, and 2016 to 2017. Coding, subject analysis, descriptive statistical analysis, inferential statistical analysis, SOM approach, and n-gram approach were employed to explore the internal characteristics and external popularity evolutions of the topics. The findings illustrate that the external popularities of the family-health-related topics declined from 2010 to 2017, although their content on Wikipedia kept increasing. The emerged entries had three features: specialization, summarization, and internationalization. The subjects derived from the entries became increasingly diverse during the investigated periods. Meanwhile, the developing trajectories of the subjects varied from one to another. According to the developing trajectories, the subjects were grouped into three categories: growing subject, diminishing subject, and fluctuating subject. 
The popularities of the topics among the Wikipedia viewers were consistent, while those among the editors were not. For each topic, its popularity trend among the editors and the viewers was inconsistent. Child Maltreatment was the most popular among the three topics, Women’s Health was the second most popular, while Family Planning was the least popular among the three. The implications of this study include: (1) helping health professionals and general users get a more comprehensive understanding of the investigated topics; (2) contributing to the developments of health ontologies and consumer health vocabularies; (3) assisting Website designers in organizing online health information and helping them identify popular family-health-related topics; (4) providing a new approach for query recommendation in information retrieval systems; (5) supporting temporal information retrieval by presenting the temporal changes of family-health-related topics; and (6) providing a new combination of data collection and analysis methods for researchers.

    Practical application of a Bayesian network approach to poultry epigenetics and stress

    This work was supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 812777. We also greatly appreciate funding from the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (FORMAS) grants #2018-01074 and #2017-00946 to CG-B. FP appreciates funding from São Paulo Research Foundation (FAPESP, Brazil) projects #2016/20440-3 and #2018/13600-0.
    Background: Relationships among genetic or epigenetic features can be explored by learning probabilistic networks and unravelling the dependencies among a set of given genetic/epigenetic features. Bayesian networks (BNs) consist of nodes that represent the variables and arcs that represent the probabilistic relationships between the variables. However, practical guidance on how to make choices among the wide array of possibilities in Bayesian network analysis is limited. Our study aimed to apply a BN approach, while clearly laying out our analysis choices as an example for future researchers, in order to provide further insights into the relationships among epigenetic features and a stressful condition in chickens (Gallus gallus).
    Results: Chickens raised under control conditions (n = 22) and chickens exposed to a social isolation protocol (n = 24) were used to identify differentially methylated regions (DMRs). A total of 60 DMRs were selected by a threshold, after bioinformatic pre-processing and analysis. The treatment was included as a binary variable (control = 0; stress = 1). Thereafter, a BN approach was applied: initially, a pre-filtering test was used for identifying pairs of features that must not be included in the process of learning the structure of the network; then, the average probability values for each arc of being part of the network were calculated; and finally, the arcs that were part of the consensus network were selected. The structure of the BN consisted of 47 out of 61 features (60 DMRs and the stressful condition), displaying 43 functional relationships. The stress condition was connected to two DMRs, one of them playing a role in tight and adhesive intracellular junctions in organs such as ovary, intestine, and brain.
    Conclusions: We clearly explain our steps in making each analysis choice, from discrete BN models to final generation of a consensus network from multiple model averaging searches. The epigenetic BN unravelled functional relationships among the DMRs, as well as epigenetic features in close association with the stressful condition the chickens were exposed to. The DMRs interacting with the stress condition could be further explored in future studies as possible biomarkers of stress in poultry species.
    Publisher PDF. Peer reviewed.
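
    The arc-averaging step described in this abstract can be sketched roughly as follows. This is an illustrative assumption, not the authors' pipeline: the abstract does not name the software or the cut-off used, so the function names, the 0.5 threshold, and the toy arcs below are all hypothetical.

```python
from collections import Counter


def arc_strengths(networks):
    """Fraction of learned networks in which each directed arc appears."""
    counts = Counter(arc for net in networks for arc in set(net))
    return {arc: count / len(networks) for arc, count in counts.items()}


def consensus_arcs(networks, threshold=0.5):
    """Keep arcs whose average presence exceeds the threshold.
    The 0.5 default is an assumed cut-off, not a value from the abstract."""
    strengths = arc_strengths(networks)
    return sorted(arc for arc, s in strengths.items() if s > threshold)


# Hypothetical structures from three separate structure-learning runs;
# each network is a list of (parent, child) arcs.
toy_networks = [
    [("stress", "DMR1"), ("DMR1", "DMR2")],
    [("stress", "DMR1")],
    [("stress", "DMR1"), ("DMR2", "DMR3")],
]
toy_strengths = arc_strengths(toy_networks)
toy_consensus = consensus_arcs(toy_networks)
```

    In this toy case only the arc from `stress` to `DMR1` appears in more than half of the runs, so it is the only arc kept in the consensus network.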

    Understanding sexual concurrency and HIV/AIDS: implicit and explicit attitudes in a South African student population

    There are more people infected with HIV in South Africa than in any other country in the world. Studies indicate a plausible relationship between concurrently organised sexual partnerships and the spread of STIs, with concurrency being accountable for as much as 74% of HIV infections in South Africa. Understanding sexual concurrency is therefore of vital importance, especially in the South African perspective. It has, however, become increasingly unreliable to rely solely on explicit self-measures to study sexual concurrency, and research has suggested that implicit cognition is a reliable alternative for understanding sexual behaviour and attitudes towards sexuality, which cannot be directly measured by explicit means. The purpose of this study was to understand sexual concurrency among a population of university students by researching their implicit and explicit attitudes towards sexual concurrency, and thereby to aid in understanding sexual concurrency in relation to the spread of HIV. A quantitative research methodology was used to analyse results from explicit measures of sexual concurrency in the form of a questionnaire, and implicit measures of sexual concurrency in the form of the Implicit Association Test (IAT). Although no correlation existed between implicit and explicit measures of attitudes towards sexual concurrency, it was observed that sexual concurrency has been and is being broadly practiced, and that age is a key determinant of sexual concurrency.