15 research outputs found

    In search of knowledge: text mining dedicated to technical translation

    Article published on CD and sold directly by ASLIB (http://shop.emeraldinsight.com/product_info.htm/cPath/56_59/products_id/431). Conference programme at http://aslib.co.uk/conferences/tc_2011/programme.htm

    Categorizing and measuring social ties

    The analysis of social networks has boomed recently, mainly because online social networking systems such as Twitter give researchers access to these data. However, research focuses less and less on the fundamental question of the validity of the data and the interpretation of the results. For example, Golder et al. (2007) put the word 'friend' in quotes when describing their results. To advance the discussion of the validity of results, our work contributes a categorization of social network data. We also discuss the differences between data sources, highlighting in particular that different data sources disclose different kinds of networks. Our approach is to examine social networks using several sources of data and thus acquire a richer data set. With this extended data set, we are better equipped to understand the social relations represented by links between nodes. After reviewing the existing literature, we make two observations about social relationships in online services. First, friendship data may be shared publicly or with a specific group of users of the service; this may affect how people perceive and use these relationships, especially when compared with private displays of relations (e.g., Donath & boyd, 2004). Second, people interact with only part of their social relations (e.g., Golder et al., 2007), and research has started to shift its focus from static networks to more dynamic, activity-based networks (e.g., Huberman et al., 2009). Based on the existing literature, briefly discussed above, a 2x2 matrix can be developed: relations may be public or private, and active or passive. For instance, relations maintained through Instant Messaging can be considered private and active, whereas Facebook friends are passive and public. Because these kinds of relations differ in nature, the conclusions drawn from their analysis should differ as well.
    After confirming that the data measure the desired phenomenon, one should use several kinds of data sources to really understand the social structures behind the group under study. We claim that multiple data sets should be used when measuring social relations. McPherson et al. (2001) have also concluded that the priority for future social network researchers should be to gather dynamic data on multiple social relations. By studying existing research and our own empirical data (e.g., Karikoski & Nelimarkka, 2011), we discuss the opportunities and challenges of using multiple data sets to cover the same group. Peer reviewed
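
The 2x2 categorization described in this abstract can be written down as a small data structure. The following is only an illustrative sketch: the class names, the dictionary keys, and the `same_cell` helper are hypothetical, and only the two example placements (Instant Messaging, Facebook friends) come from the abstract itself.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"
    PRIVATE = "private"


class Activity(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"


@dataclass(frozen=True)
class TieCategory:
    """One cell of the 2x2 matrix: visibility x activity."""
    visibility: Visibility
    activity: Activity


# The two placements named in the abstract; key names are hypothetical.
SOURCES = {
    "instant_messaging": TieCategory(Visibility.PRIVATE, Activity.ACTIVE),
    "facebook_friends": TieCategory(Visibility.PUBLIC, Activity.PASSIVE),
}


def same_cell(source_a: str, source_b: str) -> bool:
    """True if two data sources fall in the same cell of the matrix."""
    return SOURCES[source_a] == SOURCES[source_b]
```

Under this sketch, `same_cell("instant_messaging", "facebook_friends")` is false, which mirrors the abstract's point that conclusions drawn from the two sources should not be treated as interchangeable.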

    Text Mining for Industrial Machine Predictive Maintenance with Multiple Data Sources

    This paper presents an innovative methodology, from which an efficient system prototype is derived, for the algorithmic prediction of malfunctions of a generic industrial machine tool. It integrates physical devices and machinery with Text Mining technologies and allows the identification of anomalous behaviors in a machine tool, even minor ones that other strategies rarely detect. The system works without waiting for the end of the shift or the planned stop of the machine. Operationally, the system analyzes the log messages emitted by multiple data sources associated with a machine tool (such as different types of sensors and log files produced by part programs running on CNC or PLC) and determines whether future machine malfunctions can be inferred from them. In a preliminary offline phase, the system associates an alert level with each message and stores it in a data structure. At runtime, three algorithms guide the system: preprocessing, matching, and analysis. Preprocessing, performed only once, builds the data structure; matching issues the alert level associated with each message; analysis identifies possible future criticalities. The system can also analyze an entire historical series of stored messages. The algorithms have a linear execution time and are independent of the size of the data structure, which does not need to be sorted and can therefore be updated without any computational effort.
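
The three steps named in this abstract (preprocessing, matching, analysis) can be sketched roughly as follows. This is not the paper's implementation: the abstract does not say which data structure is used, so a plain Python `dict` (a hash table) is assumed here because it needs no sorting and offers lookups that do not grow with its size on average. The alert scale, the sample messages, and the sliding-window rule in `analyze` are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple

# Hypothetical alert levels; the abstract does not define a scale.
OK, WARNING, CRITICAL = 0, 1, 2


def preprocess(labelled: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Performed once: build the message -> alert-level structure."""
    return {text: level for text, level in labelled}


def match(table: Dict[str, int], message: str, default: int = OK) -> int:
    """Return the alert level stored for a message (unknown -> default)."""
    return table.get(message, default)


def analyze(levels: Iterable[int], window: int = 3, threshold: int = 2) -> List[int]:
    """Flag positions where at least `threshold` of the last `window`
    messages were WARNING or worse. This rule is an assumption made
    for illustration, not the criterion used in the paper."""
    recent: Deque[int] = deque(maxlen=window)
    flagged: List[int] = []
    for i, level in enumerate(levels):
        recent.append(level)
        if sum(1 for lv in recent if lv >= WARNING) >= threshold:
            flagged.append(i)
    return flagged


# Made-up messages standing in for sensor and CNC/PLC log lines.
table = preprocess([
    ("cycle start", OK),
    ("spindle temp high", WARNING),
    ("axis drive fault", CRITICAL),
])
log = ["cycle start", "spindle temp high", "cycle start",
       "axis drive fault", "unknown message"]
levels = [match(table, m) for m in log]
flagged = analyze(levels)

# Adding a new message is a single insertion; nothing is re-sorted.
table["coolant low"] = WARNING
```

With these assumptions, one pass over the log costs time proportional to the number of messages, which is consistent with the linear execution time the abstract claims, although the paper's actual structure may differ.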