612 research outputs found

    Metadata enrichment for digital heritage: users as co-creators

    This paper espouses the concept of metadata enrichment through an expert and user-focused approach to metadata creation and management. To this end, it is argued that the Web 2.0 paradigm enables users to be proactive metadata creators. As Shirky (2008, p. 47) argues, Web 2.0’s social tools enable “action by loosely structured groups, operating without managerial direction and outside the profit motive”. Lagoze (2010, p. 37) advises, “the participatory nature of Web 2.0 should not be dismissed as just a popular phenomenon [or fad]”. Carletti (2016) proposes a participatory digital cultural heritage approach where Web 2.0 approaches such as crowdsourcing can be used to enrich digital cultural objects. It is argued that “heritage crowdsourcing, community-centred projects or other forms of public participation”. On the other hand, the new collaborative approaches of Web 2.0 neither negate nor replace contemporary standards-based metadata approaches. Hence, this paper proposes a mixed metadata approach where user-created metadata augments expert-created metadata and vice versa. The metadata creation process is no longer the sole prerogative of the metadata expert. The Web 2.0 collaborative environment now allows users to participate in both adding and re-using metadata. The expert-created (standards-based, top-down) and user-generated (socially-constructed, bottom-up) approaches to metadata are complementary rather than mutually exclusive. The two approaches are often mistakenly considered a dichotomy (Gruber, 2007; Wright, 2007). This paper espouses the importance of enriching digital information objects with descriptions pertaining to the about-ness of information objects. Such richness and diversity of description, it is argued, could chiefly be achieved by involving users in the metadata creation process.
This paper presents the importance of the paradigm of metadata enriching and metadata filtering for the cultural heritage domain. Metadata enriching states that a priori metadata that is instantiated and granularly structured by metadata experts is continually enriched through socially-constructed (post-hoc) metadata, whereby users are proactively engaged in co-creating metadata. The principle also states that enriched metadata is contextually and semantically linked and openly accessible. In addition, metadata filtering states that metadata resulting from implementing the principle of enriching should be displayed for users in line with their needs and convenience. In both enriching and filtering, users should be considered as prosumers, resulting in what is called collective metadata intelligence.

    Time management : how to better manage your workload and time

    Meeting proceedings of a seminar by the same name, held September 17, 2020

    Time Management_ How To Better Manage Your Workload & Time

    Meeting proceedings of a seminar by the same name, held September 16, 202

    Automated classification of receipts and invoices along with document extraction

    Companies might receive dozens or even hundreds of receipts and invoices per day. Keeping them all organized consumes a lot of working hours – invoices must be paid on time and receipts must be archived properly. This research aims to reduce the manual labor that organizing requires by means of automated classification. I’m writing this thesis in collaboration with my workplace, a company called Eneroc Ltd. They had a problem with document classification consuming too many working hours. Therefore, they created a system to automate this process. The existing system uses a text-based approach that searches for specific keywords in the documents. The system works rather well, but the company wanted to find out if a more modern approach could outperform the existing system and add more features to the process. The goal of this research is to find out if a machine learning based approach could be used to classify documents into invoices and receipts. In addition to the classification, the approach should also be able to collect key information from the documents. This thesis describes the workflow of creating a machine learning based solution to tackle the given challenge. The research resulted in an application that takes in invoices and receipts in PDF format. The system trains a k-nearest neighbors model with training data that was created in the course of the research. The model is then used to classify different parts of new PDF files into predefined categories. The key information is extracted from these categories. The k-NN model was validated with k-fold cross-validation. The validation showed that the model is performing correctly. Some preprocessing was also introduced in the process, which further improved the results. Good results with the k-NN model imply that using a proper machine learning solution would be profitable.
The final classification between receipts and invoices, as well as the key information extraction, is done based on the classified document parts. This works rather well for the classification and for simple key information extraction. But more complex key information extraction – like the product list extraction – still requires more work. The research proved that a machine learning solution could be used to classify documents into invoices and receipts, and also to collect key information from the documents. The created application isn’t yet ready for deployment, but it gives a good foundation for future development. The research also shows which steps to take next and where to focus when improving the system.
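
The two techniques named in this abstract, k-nearest neighbors classification and k-fold cross-validation, can be illustrated with a minimal sketch. Everything in it is an assumption for illustration only: the two-dimensional toy features, the labels "header" and "total", and the function names are hypothetical and are not taken from the thesis, whose actual features and categories are not described here.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(data, folds=5, k=3):
    """Average k-NN accuracy over `folds` held-out splits.

    Samples are assigned to folds round-robin by index, so each sample
    is used for testing exactly once.
    """
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds


# Hypothetical toy data: each document part is a 2-D feature vector
# (for example, share of digits and relative vertical position) plus a label.
data = (
    [((0.10 + 0.01 * i, 0.20 + 0.01 * i), "header") for i in range(10)]
    + [((0.80 + 0.01 * i, 0.70 + 0.01 * i), "total") for i in range(10)]
)

print(knn_predict(data, (0.85, 0.75)))  # classify one new document part
print(k_fold_accuracy(data, folds=5, k=3))
```

The round-robin fold assignment keeps the sketch deterministic; a real evaluation would more likely shuffle or stratify the samples before splitting.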

    Analysis and Comparison of various Methods for Text Detection from Images using MSER Algorithm

    In this paper, an analysis and comparison of various methods for text detection is carried out using the Canny edge detection algorithm and an MSER-based method, along with image enhancement, which results in improved text detection performance. In addition, we improve current MSERs by developing a contrast enhancement mechanism that enhances the region stability of text patterns. To remove the blurring caused during image capture, the Lucy-Richardson deblurring algorithm is used.

    Recognition and classification: the use of computer vision in the retail industry

    Project work presented as a partial requisite to obtain the Master Degree in Information Management with specialization in Knowledge Management and Business Intelligence. Automatic recognition and classification of text, using image processing techniques such as optical character recognition and machine learning, are indicating new ways of capturing information on fast-moving consumer goods. Such systems can play an important role in making market research processes and operations more efficient and agile. The need is for a system that is able to extract all text available on the packaging and quickly arrange it into attributes. The goal of this investigation is to use a combination of optical character recognition and machine learning to achieve a satisfactory level of efficiency and quality. In order for such a system to be introduced to the organization, it needs to be faster and more effective than the current process. One of the advantages of using such a system is its independence from the human factor, which carries a higher probability of error.

    Time Management_ How to Better Manage Your Workload & Time

    Meeting proceedings of a seminar by the same name, held September 16, 202