
    Desktop Search Engine for Linux

    Desktop search engines have become more popular for personal and enterprise use as difficulties arise in handling huge numbers of files and in multi-user environments. Desktop Search Engine for Linux is a desktop search tool that integrates Namazu with a web-based interface and searches text-format files on the hard drive of a personal computer. Demand for effective and efficient desktop search tools is growing, especially on Linux, where only a few tools have been developed compared with Windows even though usage of the Linux operating system increases day by day. The system targets the Linux (Debian) operating system and searches text-format files only. It indexes every word of the files on the hard drive and creates a single index file containing all details about the files on the disk; the search process consults only this index file, yielding fast and effective results. The studies and analysis carried out during development produced a set of benchmark criteria for desktop search tools that can be used as a reference, and surveyed the many indexers available for indexing files; only the best indexer was integrated into the system. The system can still be improved with further support, effort, deeper knowledge of desktop search tools, and technical skill.
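
    The single-index design described above is essentially an inverted index. Below is a minimal Python sketch of that idea; the INDEX_FILE name, the JSON layout, and the scanned path are illustrative assumptions, and the real system integrates the Namazu indexer, whose on-disk format differs.

```python
import json
import re
from pathlib import Path

# Illustrative only: file name and JSON layout are assumptions,
# not Namazu's actual index format.
INDEX_FILE = "index.json"

def build_index(root: str) -> dict:
    """Walk a directory tree and map every word to the files containing it."""
    index: dict[str, list[str]] = {}
    for path in Path(root).rglob("*.txt"):
        try:
            words = set(re.findall(r"[a-z0-9]+",
                                   path.read_text(errors="ignore").lower()))
        except OSError:
            continue  # skip unreadable files
        for word in words:
            index.setdefault(word, []).append(str(path))
    return index

def search(index: dict, query: str) -> set:
    """Return files containing every word of the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], []))
    for term in terms[1:]:
        results &= set(index.get(term, []))
    return results

if __name__ == "__main__":
    index = build_index("/home/user/documents")     # hypothetical path
    Path(INDEX_FILE).write_text(json.dumps(index))  # one index file on disk
    print(search(index, "linux search"))
```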

    Meta-modeling design expertise

    The general problem that this research addresses is that, despite the efforts of cognitive studies to describe and document the behavior of designers in action, and despite the evolution of computer-aided design from conceptual design to fabrication, efforts to provide computational support for the high-level actions that designers execute during the creation of their work have made minimal progress. This study therefore seeks answers to the following questions: What is the nature of design expertise? How do we capture the knowledge that expert designers embed in their patterns of organization for creating a coherent arrangement of parts? And how do we use this knowledge to develop computational methods and techniques that capture and reuse such expertise to augment designers' capability to explore alternatives? The challenge is that such expertise is largely based on experience, assumptions, and heuristics, and requires a process of elucidation and interpretation before any implementation in computational environments. This research adopts the meta-modeling process from the field of model-based systems engineering (MBSE), understood as the creation of models of attributes and relationships among the objects of a domain. Meta-modeling can contribute to elucidating, structuring, capturing, representing, and creatively manipulating the knowledge embedded in design patterns. The meta-modeling process relies on abstractions that allow the integration of myriad physical and abstract entities independent of the complexity of the geometric models; on mapping mechanisms that facilitate interfacing with a repository of parts, functions, and even other systems; and on computer-interpretable, human-readable meta-models that enable the generation and assessment of both configuration specifications and geometric representations. For validation purposes, three case studies from the domain of custom façade systems were studied in depth using techniques of verbal analysis, complemented with digital documentation, to distill the design knowledge that was captured in the meta-models for reuse in the generation of design alternatives. The results of this research include a framework for capturing and reusing design expertise, parametric modeling guidelines for reuse, methods for producing multiple external geometric representations, and an augmented design space for exploration. The framework results from generalizing the verbal analyses of the three case studies, which allowed identification of the mechanics behind the application of a pattern of organization over physical components. The guidelines for reuse are the outcome of the iterative process of automatically generating well-formed parametric models out of existing parts. The capability of producing multiple geometric representations is the product of identifying a generic operation for interpreting abstract configuration specifications. The amplification of the design space derives from the flexibility of the process in specifying and representing alternatives. In summary, the adoption of the meta-modeling process fosters the integration of abstract constructs developed in the design cognition field that facilitate the manipulation of knowledge embedded in the underlying patterns of design organization. Meta-modeling is a mental and computational process based on abstraction and generalization that enables reuse.
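
    As a rough illustration of a meta-model in the sense used above (attributes and relationships among domain objects, decoupled from geometric complexity), here is a minimal Python sketch. All part and relation names are invented for the example and do not come from the case studies.

```python
from dataclasses import dataclass, field

@dataclass
class Part:
    name: str
    attributes: dict = field(default_factory=dict)  # e.g. material, width

@dataclass
class Relation:
    kind: str    # e.g. "repeats-along", "supports" (invented names)
    source: str  # part name
    target: str  # part name

@dataclass
class MetaModel:
    parts: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def configuration_spec(self) -> dict:
        """Emit a computer-interpretable (and human-readable) specification
        that a downstream generator could turn into geometry."""
        return {
            "parts": {p.name: p.attributes for p in self.parts},
            "relations": [(r.kind, r.source, r.target) for r in self.relations],
        }

# A toy façade pattern: a mullion repeated along a panel grid.
mm = MetaModel(
    parts=[Part("panel", {"width_mm": 1500}), Part("mullion", {"depth_mm": 120})],
    relations=[Relation("repeats-along", "mullion", "panel")],
)
print(mm.configuration_spec())
```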

    On the Effect of Semantically Enriched Context Models on Software Modularization

    Many existing approaches to program comprehension rely on the linguistic information found in source code, such as identifier names and comments. Semantic clustering is one such technique for system modularization; it relies on the informal semantics of the program, encoded in the vocabulary used in the source code. Treating the source code as a mere collection of tokens, however, loses the semantic information embedded within the identifiers. We try to overcome this problem by introducing context models for source code identifiers, from which we obtain a semantic kernel that can be used both for deriving the topics that run through the system and for clustering. In the first model, we abstract an identifier to its type representation and build on this notion of context to construct contextual vector representations of the source code. The second notion of context is based on the flow of data between identifiers: a module is represented as a dependency graph whose nodes correspond to identifiers and whose edges represent data dependencies between pairs of identifiers. We have applied our approach to 10 medium-sized open source Java projects and show that introducing contexts for identifiers improves the quality of the modularization of the software systems. Both context models give results superior to the plain vector representation of documents; in some cases, the authoritativeness of decompositions improves by 67%. Furthermore, a more detailed evaluation of our approach on JEdit, an open source editor, demonstrates that topics inferred by performing topic analysis on the contextual representations are more meaningful than those from the plain representation of the documents. The proposed approach of introducing a context model for source code identifiers paves the way for building tools that support developers in program comprehension tasks such as application and domain concept location, software modularization, and topic analysis.
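
    A minimal Python sketch of the two context notions follows; the identifiers, types, and data-flow edges are toy inputs invented for illustration, not parsed from a real Java project.

```python
from collections import Counter

# Model 1: abstract each identifier to its type and use that as context,
# giving a contextual vector representation instead of bare tokens.
identifiers = [("buffer", "StringBuilder"), ("path", "File"), ("name", "String")]
type_context_vector = Counter(f"{name}:{typ}" for name, typ in identifiers)

# Model 2: represent a module as a dependency graph whose nodes are
# identifiers and whose edges are data-flow dependencies between them.
data_flow_edges = [("path", "buffer"), ("name", "buffer")]  # values flow into buffer
dependency_graph: dict[str, set[str]] = {}
for src, dst in data_flow_edges:
    dependency_graph.setdefault(src, set()).add(dst)

print(type_context_vector)  # Counter({'buffer:StringBuilder': 1, ...})
print(dependency_graph)     # {'path': {'buffer'}, 'name': {'buffer'}}
```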

    Information Extraction from Text for Improving Research on Small Molecules and Histone Modifications

    The cumulative number of publications, particularly in the life sciences, requires efficient methods for automated information extraction and semantic information retrieval. The recognition and identification of information-carrying units in text (concept denominations and named entities) relevant to a certain domain is a fundamental step. The focus of this thesis lies on the recognition of chemical entities and of a new biological named entity type, histone modifications, both of which are important in the field of drug discovery. Because the emergence of new research fields and the discovery and generation of novel entities go along with the coinage of new terms, the continual adaptation of named entity recognition approaches to new domains is an important step for information extraction. Two methodologies were investigated in this regard: a state-of-the-art machine learning method, Conditional Random Fields (CRF), and an approximate string search method based on dictionaries. Recognition methods that rely on dictionaries depend strongly on the availability of entity terminology collections and on their quality. In the case of chemical entities, the terminology is distributed over more than seven publicly available data sources. Joining the entries and accompanying terminology of selected resources enabled the generation of a new dictionary of chemical named entities. Combined with automatic processing of that terminology (dictionary curation), recognition performance reached an F1 measure of 0.54, an improvement of 29% over the raw dictionary. The highest recall, 0.79, was achieved for the class of TRIVIAL names. The recognition and identification of chemical named entities is a prerequisite for extracting related, pharmacologically relevant information from literature data. Therefore, lexico-syntactic patterns were defined that support the automated extraction of hypernymic phrases comprising pharmacological function terminology related to chemical compounds. It was shown that 29-50% of the automatically extracted terms can be proposed as novel functional annotations for chemical entities in the reference database DrugBank. Furthermore, they provide a basis for building concept hierarchies and ontologies, or for extending existing ones. Subsequently, the pharmacological function and biological activity concepts obtained from text were incorporated into a novel descriptor for chemical compounds, and its successful application to predicting the pharmacological function of molecules and to extending chemical classification schemes, such as the Anatomical Therapeutic Chemical (ATC) classification, is demonstrated. In contrast to chemical entities, no comprehensive terminology resource was available for histone modifications. Histone modification concept terminology was therefore first recognized in text via CRFs, with an F1 measure of 0.86. Subsequently, linguistic variants of the extracted histone modification terms were mapped to standard representations, which were organized into a newly assembled histone modification hierarchy. The mapping was accomplished by a newly developed term mapping approach described in the thesis. The combination of term recognition and term variant resolution constitutes a new procedure for assembling terminology collections and supports the generation of a term list applicable in dictionary-based methods.
    For the recognition of histone modifications in text, the dictionary-based named entity recognition method proved superior to the machine learning approach. In conclusion, this thesis provides techniques that enable enhanced utilization of textual data, thereby supporting research in epigenomics and drug discovery.
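
    The hypernym-extraction step can be illustrated with a Hearst-style lexico-syntactic pattern. This is a minimal sketch: the regular expression and the example sentence are invented for illustration and do not reproduce the thesis's actual pattern set.

```python
import re

# One Hearst-style pattern: "<entity> is a(n) <hypernym> that/which/used ..."
HYPERNYM_PATTERN = re.compile(
    r"(?P<entity>\w[\w-]*)\s+is\s+an?\s+(?P<hypernym>[\w -]+?)\s+(?:that|which|used)",
    re.IGNORECASE,
)

sentence = "Ibuprofen is a nonsteroidal anti-inflammatory drug that reduces fever."
match = HYPERNYM_PATTERN.search(sentence)
if match:
    # prints: Ibuprofen -> nonsteroidal anti-inflammatory drug
    print(match.group("entity"), "->", match.group("hypernym"))
```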