3 research outputs found

    Text segmentation techniques: A critical review

    Text segmentation is widely used for processing text. It is a method of splitting a document into smaller parts, which are usually called segments. Each segment carries its own meaning. Segments are categorized as word, sentence, topic, phrase or any other information unit, depending on the task of the text analysis. This study presents the various reasons for using text segmentation in different analysis approaches. We categorized the types of documents and languages used. The main contribution of this study is a summary of 50 research papers and an illustration of a decade of research (January 2007 to January 2017) that applied text segmentation as the main approach for analysing text. Results revealed the popularity of text segmentation across different languages. In addition, the “word” appears to be the most practical and usable segment, as it is a smaller unit than the phrase, sentence or line.
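    As a rough illustration of sentence-level and word-level segments, the sketch below splits a short English string using simple regular-expression rules. It is not taken from the reviewed papers: the function names `split_sentences` and `split_words` are hypothetical, and the punctuation-and-whitespace rules are an assumption that would not carry over to languages written without spaces between words.

```python
import re


def split_sentences(text: str) -> list[str]:
    # Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(sentence: str) -> list[str]:
    # Assumed rule: a word is a run of ASCII letters, digits, or apostrophes.
    return re.findall(r"[A-Za-z0-9']+", sentence)


sample = "Text segmentation is widely used. Each segment has its relevant meaning."
sentences = split_sentences(sample)            # sentence-level segments
words = [split_words(s) for s in sentences]    # word-level segments per sentence
```

    Here the same text yields two sentence segments and eleven word segments, which shows why the word is the finer-grained unit.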

    Compilation of Malay criminological terms from online news

    A Malay language corpus has been established by the Institute of Language and Literature (Dewan Bahasa dan Pustaka, DBP) in Malaysia. Most past research on the Malay language corpus has focused on the description, lexicography and translation of the Malay language. However, the existing literature contains no list of Malay words that categorizes crime terminology. This study aims to fill that linguistic gap. First, we aggregated the most frequently used crime terminology from Malaysian online news sources. Five hundred crime-related words were compiled. No automated tools were used in the initial process, but they were subsequently used to verify the data. Four human coders validated the data and ensured the originality of the semantic understanding of the Malay text. Finally, major crime terminologies were outlined from a set of keywords to serve as taggers in our solution. The ultimate goal of this study is to provide a corpus for forensic linguistics, police investigations, and general crime research. This study has established the first corpus of criminological text in the Malay language.

    Does the way museum staff define inspiration help them work with information from visitors' Social Media?

    Since the early 2000s, Social Media has become part of the everyday activity of billions of people. Museums and galleries are part of this major cultural change - the largest museums attract millions of Social Media 'friends' and 'followers', and museums now use Social Media channels for marketing and audience engagement activities. Social Media has also become a more heavily used source of data with which to investigate human behaviour. Therefore, this research investigated the potential uses of Social Media information to aid activities such as exhibition planning and development, or fundraising, in museums. Potential opportunities provided by the new Social Media platforms include the ability to capture data at high volume and then analyse them computationally. For instance, the links between entities on a Social Media platform can be analysed. Who follows whom? Who created the content related to a specific event, and when? How did communication flow between people and organisations? The computerised analysis techniques used to answer such questions can generate statistics for measuring concepts such as the 'reach' of a message across a network (often equated simply with the potential size of a message's audience) or the degree of 'engagement' with content (often a simple count of the number of responses, or the number of instances of communication between correspondents). Other computational analysis opportunities related to Social Media rely upon various Natural Language Processing (NLP) techniques; for example, indexing content and counting term frequency, or using lexicons or online knowledge bases to relate content to concepts. Museums, galleries and other cultural organisations have known for some time, however, that simple quantifications of their audiences (the number of tickets sold for an exhibition, for example), while certainly providing indications of an event's success, do not tell the whole story.
While it is important to know that thousands of people have visited an exhibition, it is also part of a museum's remit to inspire the audience. A budding world-class artist or ground-breaking engineer could have been one of the thousands in attendance, and the exhibition in question could have been key to the development of their artistic or technical ideas. It is potentially helpful to museums and galleries to know when they have inspired members of their audience, and to be able to tell convincing stories about instances of inspiration, if their full value to society is to be judged. This research, undertaken in collaboration with two museums, investigated the feasibility of using new data sources from Social Media to capture potential expressions of inspiration made by visitors. With a background in IT systems development, the researcher developed three prototype systems during three cycles of Action Research, and used them to collect and analyse data from the Twitter Social Media platform. This work had two outcomes. Firstly, prototyping enabled investigation of the technical constraints of extracting data from a Social Media platform (Twitter), and of the computing processes used to analyse that data. Secondly, and more importantly, the prototypes were used to assess potential changes to the work of museum staff: information about events visited and experienced by visitors was synthesised, then investigated, discussed and evaluated with the collaborative partners, in order to assess the meaning and value of such information for them. Could the museums use the information in their event and exhibition planning? How might it fit in with event evaluation? Was it clear to the museum what the information meant? What were the risks of misinterpretation? The research made several contributions. Firstly, the research developed a definition of inspiration that resonated with museum staff.
While this definition was similar to the definition of 'engagement' from the marketing literature, one difference was an emphasis upon creativity. The second set of contributions related to a deeper understanding of Social Media from museums' perspective, and included findings about how Social Media information could be used to segment current and potential audiences by 'special interest', and to find potential expressions of creativity and innovation in the audience's responses to museum activities. These findings also considered some of the pitfalls of working with data from Social Media, in particular the tendency of museum staff to use the information to confirm positive biases, and the often hidden biases caused by the mediating effects of the platforms from which the data came. The final major contribution was a holistic analysis of the ways in which Social Media information could be integrated into the work of a museum, by helping to plan and evaluate audience development and engagement. This aspect of the research also highlighted some of the dangers of an over-dependency upon individual Social Media platforms, a topic that was previously absent from the museums literature.