27 research outputs found

    Finding usage patterns from generalized weblog data

    Buried in the enormous, heterogeneous and distributed information contained in web server access logs is knowledge with great potential value. As websites continue to grow in number and complexity, web usage mining systems face two significant challenges: scalability and accuracy. This thesis develops a web data generalization technique and incorporates it into the web usage mining framework in an attempt to exploit this information-rich source of data for effective and efficient pattern discovery. Given a concept hierarchy on the web pages, generalization replaces actual page-clicks with their general concepts. Existing methods do this by taking a level-based cut through the concept hierarchy. This adversely affects the quality of mined patterns since, depending on the depth of the chosen level, either significant pages of user interest get coalesced, or many insignificant concepts are retained. We present a usage-driven concept ascension algorithm, which preserves only significant items, possibly at different levels in the hierarchy. Concept usage is estimated using a small stratified sample of the large weblog data. A usage threshold is then used to define the nodes to be pruned in the hierarchy for generalization. Our experiments on large real weblog data demonstrate improved performance in terms of quality and computation time of the pattern discovery process. Our algorithm yields an effective and scalable tool for web usage mining.
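    The idea of usage-driven ascension described in this abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the hierarchy, the page names, the plain (non-stratified) sample, the threshold value and the function names `subtree_usage` and `generalize` are all assumptions made for illustration. Each sampled click is credited to its page and to every ancestor concept, and a page whose usage falls below the threshold is replaced by its nearest ancestor that meets it.

```python
from collections import Counter

# Hypothetical concept hierarchy: child -> parent (None marks the root).
PARENT = {
    "site": None,
    "products": "site",
    "support": "site",
    "laptops": "products",
    "phones": "products",
    "faq": "support",
    "contact": "support",
}


def subtree_usage(sample_clicks, parent):
    """Count sampled clicks per concept, crediting each click to all ancestors."""
    usage = Counter()
    for page in sample_clicks:
        node = page
        while node is not None:
            usage[node] += 1
            node = parent[node]
    return usage


def generalize(page, usage, parent, threshold):
    """Ascend from `page` until a concept meets the threshold (or the root is hit)."""
    node = page
    while usage[node] < threshold and parent[node] is not None:
        node = parent[node]
    return node


# Assumed sample of the weblog (a real system would draw a stratified sample).
sample = ["laptops"] * 6 + ["phones"] + ["faq"] + ["contact"]
usage = subtree_usage(sample, PARENT)

session = ["laptops", "phones", "faq", "contact"]
generalized = [generalize(p, usage, PARENT, threshold=2) for p in session]
print(generalized)
```

    In this toy run the heavily used page `laptops` survives at leaf level, while the rarely used pages are lifted to `products` and `support`, so the retained items sit at different depths rather than on a single level-based cut.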

    Modeling usage of an online research community

    Although online communities have been thought of as a new way to collaborate across geographic boundaries in the scientific world, they have a problem attracting people to keep visiting. The main purpose of this study is to understand how people behave in such communities, and to build and evaluate tools to stimulate engagement in a research community. These tools were designed based on a research framework of factors that influence online participation and relationship development. People join an online community for two main reasons: information sharing and the development of interpersonal relationships, such as with friends or colleagues. The tools designed in this study serve both needs. The awareness tool is designed to increase the sense of community and the degree of social presence of members in the community. The recommender system is designed to help provide higher quality and personalized information to community members. It also helps to match community members into subgroups based on their interests. The designed tools were implemented in a field site, the Asynchronous Learning Networks (ALN) Research community. A longitudinal field study was used to evaluate the effectiveness of the designed tools. This research explored people's behavior inside a research community by analyzing web server logs. The results show that although there are not many interactions in the community space, the WebCenter has been visited extensively by its members. There were over 2,000 hits per day on average and over 5,000 article accesses during the observation period. This research also provided a framework to identify factors that affect people's engagement in an online community. The research framework was tested using the PLS modeling method with online survey responses.
The results show that perceived usefulness plays a very significant role in members' intention to continue using the system and their perceived preliminary networking. The results also show that the quality of the content of the system is a strong indicator for both perceived usefulness of the community space and perceived ease of use of the community system. Perceived ease of use did not show a strong correlation with intention to continue use, which is consistent with other studies of the Technology Acceptance Model (TAM). For the ALN research community, this online community helps its members to broaden their contacts, improve the quality and quantity of their research, and increase the dissemination of knowledge among community members.

    WSN based sensing model for smart crowd movement with identification: a conceptual model

    With the advancement of IT and the increase in world population, Crowd Management (CM) has become a subject of intense study among researchers. Technology provides fast and easily available means of transport and up-to-date information access, which leads to crowding at public places. This poses a big challenge for crowd safety and security at public places such as airports, railway stations and checkpoints; one example is the crowd of pilgrims crossing the borders of Makkah, Kingdom of Saudi Arabia, during Hajj and Ummrah. To minimize the risk to crowd safety and security, identification and verification of people is necessary, which causes an unwanted increase in processing time. Managing crowds with identification and verification during specific time periods (Hajj and Ummrah) is therefore a challenge. At present, many advanced technologies such as the Internet of Things (IoT) are being used to solve the crowd management problem with minimal processing time. In this paper, we present a Wireless Sensor Network (WSN) based conceptual model for smart crowd movement with minimal processing time for people identification. The model handles the crowd by forming groups and provides proactive support to handle them in an organized manner. As a result, the crowd can be managed to move safely from one place to another with group identification. Group identification minimizes the processing time and moves the crowd in a smart way.

    Education All A'Twitter: Twitter's Role in Educational Technology

    The purpose of this mixed methods study was to examine whether current uses of Twitter by educators correlate with the literature on the uses and advantages of using Twitter in education through an examination of United States educators and West Virginia educators. Data was obtained from responses to the online survey, Education All A’Twitter: Twitter’s Role in Educational Technology, content analysis of public Twitter feeds, and semi-structured interviews that were sorted, coded, organized, and analyzed to identify emergent themes. The study had a population that included 97 survey responses, 78 Twitter feeds, and 8 semi-structured interviews. There were survey respondents from West Virginia and 26 other states in the United States, as well as international respondents. The study determined to what extent West Virginia and United States educators used Twitter for instructional strategies, professional development, and personal learning networks, as well as identified barriers and challenges educators face when attempting to employ the use of Twitter educationally. In addition, there were four ancillary findings that emerged through the study. As triangulation of the data supported the current literature, this study has several implications for current educators, policymakers, and researchers

    Usability in digitalen Kooperationsnetzwerken. Nutzertests und Logfile-Analyse als kombinierte Methode

    Usability is a key factor when developing new applications. The interaction between the users and the application should be efficient, effective and engaging. Furthermore, good usability includes high error tolerance and good learnability. Different methods allow the measurement of usability throughout the development process. All methods have in common that the steps involved, such as planning, conducting and evaluating, are rather time-consuming. When end-users are included as subjects, usability tests are employed. Due to the time required, usually ten or fewer tests are conducted. The thesis addresses this limitation by combining usability tests and logfile analysis. The empirical work is twofold. First, usability tests within a learning management system (LMS) are logged in the background. These logfiles are assigned to severe usability problems. Second, the paths of the severe usability problems are combined with logfile data from a real-world LMS that runs the same application. The real-world logfiles cover a period of about 300 days with 133 active users. Prior to the combination, both data sets were converted into a similar format. Because the procedure is new, the definite similarity value had to be specified by descriptive statistics and visual inspection. The final combination makes it possible to determine the severity of usability problems on the basis of real-world usage data. The proposed method offers a more precise overview of the occurrence of the usability problems found, independent of the test situation. This thesis provides additional value to the fields of (Web) Data Mining, Usability and Human-Computer Interaction (HCI). It also offers additional knowledge to the fields of software development, quantitative and qualitative research, computer-supported cooperative work (CSCW) and learning management systems (LMS).

    Stylistics versus Statistics: A corpus linguistic approach to combining techniques in forensic authorship analysis using Enron emails

    This thesis empirically investigates how a corpus linguistic approach can address the main theoretical and methodological challenges facing the field of forensic authorship analysis. Linguists approach the problem of questioned authorship from the theoretical position that each person has their own distinctive idiolect (Coulthard 2004: 431). However, the notion of idiolect has come under scrutiny in forensic linguistics over recent years for being too abstract to be of practical use (Grant 2010; Turell 2010). At the same time, two competing methodologies have developed in authorship analysis. On the one hand, there are qualitative stylistic approaches, and on the other there are statistical ‘stylometric’ techniques. This study uses a corpus of over 60,000 emails and 2.5 million words written by 176 employees of the former American company Enron to tackle these issues in the contexts of both authorship attribution (identifying authors using linguistic evidence) and author profiling (predicting authors’ social characteristics using linguistic evidence). Analyses reveal that even in shared communicative contexts, and when using very common lexical items, individual Enron employees produce distinctive collocation patterns and lexical co-selections. In turn, these idiolectal elements of linguistic output can be captured and quantified by word n-grams (strings of n words). An attribution experiment is performed using word n-grams to identify the authors of anonymised email samples. Results of the experiment are encouraging, and it is argued that the approach developed here offers a means by which stylistic and statistical techniques can complement each other. Finally, quantitative and qualitative analyses are combined in the sociolinguistic profiling of Enron employees by gender and occupation. Current author profiling research is exclusively statistical in nature. 
However, the findings here demonstrate that when statistical results are augmented by qualitative evidence, the complex relationship between language use and author identity can be more accurately observed
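    As a rough illustration of how word n-grams might be used to attribute an anonymised sample, the sketch below compares the set of word bigrams in a sample against those of each candidate author. The Jaccard overlap measure, the toy texts, the author labels and the function names are assumptions made for illustration; the abstract does not specify the thesis's actual similarity measure or experimental setup.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return a Counter of word n-grams (tuples of n consecutive lowercased words)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def jaccard(a, b):
    """Share of distinct n-grams that two texts have in common."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def attribute(sample, known_texts, n=2):
    """Score each candidate author and return the best match with all scores."""
    sample_grams = word_ngrams(sample, n)
    scores = {
        author: jaccard(sample_grams, word_ngrams(text, n))
        for author, text in known_texts.items()
    }
    return max(scores, key=scores.get), scores


# Invented toy data, not drawn from the Enron corpus.
known = {
    "author_a": "please let me know if you have any questions thanks",
    "author_b": "give me a call when you get a chance regards",
}
best, scores = attribute("let me know if you have questions", known)
print(best, scores)
```

    Even with very common words, the sample shares five bigrams with `author_a` and none with `author_b`, which is the kind of recurring co-selection the abstract describes; a realistic experiment would need far larger samples and a held-out evaluation.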