Sanskrit Lexical Sources: Digital Synthesis and Revision
The proposed project aims to synthesize, extend, revise, and improve the principal lexical reference works of Sanskrit, one of the world's richest culture-bearing languages, and to provide wide public access to them in the digital Sanskrit library.
Linguistically-Informed Neural Architectures for Lexical, Syntactic and Semantic Tasks in Sanskrit
The primary focus of this thesis is to make Sanskrit manuscripts more
accessible to the end-users through natural language technologies. The
morphological richness, compounding, free word order, and the low-resource
nature of Sanskrit pose significant challenges for developing deep learning
solutions. We identify four fundamental tasks, which are crucial for developing
a robust NLP technology for Sanskrit: word segmentation, dependency parsing,
compound type identification, and poetry analysis. The first task, Sanskrit
Word Segmentation (SWS), is a fundamental text-processing step for any
downstream application. However, it is challenging due to the sandhi
phenomenon that modifies characters at word boundaries. Similarly, the existing
dependency parsing approaches struggle with morphologically rich and
low-resource languages like Sanskrit. Compound type identification is also
challenging for Sanskrit due to the context-sensitive semantic relation between
components. All these challenges result in sub-optimal performance in NLP
applications like question answering and machine translation. Finally, Sanskrit
poetry has not been extensively studied in computational linguistics.
While addressing these challenges, this thesis makes various contributions:
(1) The thesis proposes linguistically-informed neural architectures for these
tasks. (2) We showcase the interpretability and multilingual extension of the
proposed systems. (3) Our proposed systems report state-of-the-art performance.
(4) Finally, we present a neural toolkit named SanskritShala, a web-based
application that provides real-time analysis of input for various NLP tasks.
Overall, this thesis contributes to making Sanskrit manuscripts more accessible
by developing robust NLP technology and releasing various resources, datasets,
and a web-based toolkit.
Comment: Ph.D. dissertation
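The sandhi phenomenon that makes word segmentation hard can be made concrete with a toy sketch: at a word boundary, "a" + "a" fuses into the long vowel "ā", so the surface string no longer contains the boundary. The vocabulary and the single rule below are illustrative assumptions, not part of the thesis, which learns such splits with neural models.

```python
# Toy illustration of the sandhi problem in Sanskrit Word Segmentation:
# one rule (ā at a junction may come from a + a), brute-forced against
# a tiny hypothetical vocabulary.
VOCAB = {"tava", "aham", "ca", "eva"}

def split_candidates(surface: str):
    """Yield (left, right) pairs consistent with the a + a -> ā rule."""
    for i, ch in enumerate(surface):
        if ch == "ā":
            left = surface[:i] + "a"
            right = "a" + surface[i + 1:]
            if left in VOCAB and right in VOCAB:
                yield (left, right)

# tavāham -> tava ("your") + aham ("I")
print(list(split_candidates("tavāham")))  # [('tava', 'aham')]
```

A real SWS system must handle many such rules, and the ambiguity grows quickly: a surface string can admit several valid splits, which is why context-sensitive models are needed.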
Proposing a Multi-lingual Translation Scheme Utilizing the Extensible Markup Language XML
The paper proposes a new multilingual translation scheme utilizing the evolving Internet tag language, Extensible Markup Language (XML). The data-description property of XML can be used to create an effective system for translating documents. First, an XML-tagged document of the source language is prepared manually. The tags mark not only the words but also the grammatical structure, for example the phrase-structure grammar, so that the analytical work of translation is reduced to a level suitable for many Internet applications. Next, XML document type definitions (DTDs) are created for the grammatical structures of the different languages. In the translation process, the source sentences are broken into pieces and categorized under the DTDs to which they belong. The sentence structure of each language serves as a main structure tree; the broken-down elements are mapped to the elements of the DTD, which in turn maps them to the corresponding elements of the target language's structure tree, and the appropriate transformations are made. Among the possible transformation patterns, the most appropriate one is selected as the output.
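The mapping idea can be sketched in miniature: parse a sentence whose phrase structure is marked with tags, then reorder the constituents according to the target language's structure tree. The tag names, the toy dictionary, and the SVO-to-SOV reordering below are illustrative assumptions, not the paper's actual DTDs.

```python
# Minimal sketch: phrase-structure tags drive reordering into a
# target-language word order, with a toy lexicon for the words.
import xml.etree.ElementTree as ET

tagged = "<S><NP>the cat</NP><V>drinks</V><OBJ>milk</OBJ></S>"

# Hypothetical English -> Japanese lexicon (illustration only).
DICT_JA = {"the cat": "neko wa", "drinks": "nomu", "milk": "miruku o"}

# Japanese is verb-final (SOV), so the target structure tree
# places the object before the verb.
TARGET_ORDER = ["NP", "OBJ", "V"]

root = ET.fromstring(tagged)
by_tag = {child.tag: child.text for child in root}
translation = " ".join(DICT_JA[by_tag[tag]] for tag in TARGET_ORDER)
print(translation)  # neko wa miruku o nomu
```

The point of the scheme is that the expensive analysis (the tags) is done once at authoring time, so the transformation step reduces to structure-tree mapping like the above.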
Enabling High-Level Application Development for the Internet of Things
Application development in the Internet of Things (IoT) is challenging
because it involves dealing with a wide range of related issues, such as a lack
of separation of concerns and a lack of high-level abstractions to address both
the large scale and the heterogeneity. Moreover, stakeholders involved in
application development have to address issues that can be attributed to
different life-cycle phases when developing applications. First, the
application logic has to be analyzed and then separated into a set of
distributed tasks for an underlying network. Then, the tasks have to be
implemented for the specific hardware. Apart from handling these issues, they
have to deal with other life-cycle aspects, such as changes in application
requirements and deployed devices. Several approaches have been proposed in the
closely related fields of wireless sensor networks, ubiquitous and pervasive
computing, and software engineering in general to address the above challenges.
However, existing approaches cover only limited subsets of the above-mentioned
challenges when applied to the IoT. This paper proposes an integrated approach
for addressing the above-mentioned challenges. The main contributions of this
paper are: (1) a development methodology that separates IoT application
development into different concerns and provides a conceptual framework to
develop an application, (2) a development framework that implements the
development methodology to support actions of stakeholders. The development
framework provides a set of modeling languages to specify each development
concern and abstracts the scale and heterogeneity related complexity. It
integrates code generation, task-mapping, and linking techniques to provide
automation. Code generation supports the application development phase by
producing a programming framework that allows stakeholders to focus on the
application logic, while our mapping and linking techniques together support
the deployment phase by producing device-specific code to result in a
distributed system collaboratively hosted by individual devices. Our evaluation
based on two realistic scenarios shows that the use of our approach improves
the productivity of the stakeholders involved in application development.
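The code-generation step described above can be sketched as follows: from a high-level specification of tasks and a task-to-device mapping, emit device-specific stubs so the distributed system is collaboratively hosted by individual devices. The spec format, the `schedule_every` call in the generated stub, and the device names are assumptions for illustration, not the paper's modeling languages.

```python
# Toy code generator: a high-level task spec plus a task-mapping
# produce a per-device programming framework (here, Python stubs),
# leaving only the application logic to the stakeholder.
TASKS = [
    {"name": "sense_temperature", "period_s": 60},
    {"name": "aggregate_readings", "period_s": 300},
]
# Task-mapping: which device hosts which task (hypothetical devices).
DEVICES = {"node-1": ["sense_temperature"], "gateway": ["aggregate_readings"]}

STUB = """def {name}():
    # TODO: application logic supplied by the developer
    pass

schedule_every({period}, {name})
"""

def generate(device: str) -> str:
    """Emit device-specific code for the tasks mapped to `device`."""
    assigned = DEVICES[device]
    return "\n".join(
        STUB.format(name=t["name"], period=t["period_s"])
        for t in TASKS
        if t["name"] in assigned
    )

print(generate("node-1"))
```

The division of labor is the point: the generator handles scale and heterogeneity (one stub per device), while the developer fills in only the task bodies.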
Development of Multilingual Resource Management Mechanisms for Libraries
Multilingual support is an important concern for any library. This study builds on global recommendations and the local requirements of individual libraries. It selects the multilingual components for setting up a multilingual cluster in different libraries for each user, and develops a multilingual environment for accessing and retrieving library resources for users as well as library professionals. The methodology for integrating Google Indic Transliteration into libraries follows six steps: (i) selection of transliteration tools, (ii) comparison of the tools, (iii) integration methods in Koha, (iv) development of Google Indic Transliteration in Koha for users, (v) testing, and (vi) results. Developing a multilingual framework is likewise an important task in an integrated library system, and follows these steps: (i) Bengali language installation in Koha, (ii) setting multilingual system preferences in Koha, (iii) translating the modules, and (iv) the Bengali interface in Koha. The study also demonstrates the Bengali data-entry process in Koha, both through Ibus Avro Phonetics and through a virtual keyboard. Multilingual digital resource management is developed using DSpace and Greenstone. Multilingual management is addressed in further areas such as federated searching (the VuFind multilingual discovery tool, multilingual retrieval via OAI-PMH, and multilingual data import through a Z39.50 server), and multilingual bibliographic data are edited through MarcEdit for better management of the integrated library system.
Content is also created and edited using a content management system tool for efficient and effective retrieval of multilingual digital content resources among users.
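The phonetic data-entry idea (typing Bengali in Latin letters, as Avro Phonetics does) can be illustrated with a toy transliterator. The mapping tables below cover only a handful of letters and ignore many real rules (conjuncts, the inherent vowel); they are assumptions for illustration, not the behavior of Ibus Avro Phonetics.

```python
# Toy Latin-to-Bengali phonetic transliterator. Bengali distinguishes
# independent vowels (word-initial) from vowel signs (after a consonant),
# which is the one rule this sketch models.
CONSONANTS = {"k": "ক", "m": "ম", "b": "ব", "l": "ল", "r": "র"}
IND_VOWELS = {"a": "আ", "i": "ই", "u": "উ"}   # standalone / word-initial
VOWEL_SIGNS = {"a": "া", "i": "ি", "u": "ু"}  # attached to a consonant

def transliterate(word: str) -> str:
    out = []
    prev_was_consonant = False
    for ch in word.lower():
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            prev_was_consonant = True
        elif ch in IND_VOWELS:
            out.append(VOWEL_SIGNS[ch] if prev_was_consonant else IND_VOWELS[ch])
            prev_was_consonant = False
        else:
            out.append(ch)  # pass through anything unmapped
            prev_was_consonant = False
    return "".join(out)

print(transliterate("ami"))  # আমি
```

A production tool layers many more rules on top of this (longest-match digraphs like "kh", conjunct consonants, punctuation), but the consonant/vowel-sign distinction shown here is the core of phonetic Bengali input.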