9 research outputs found

    Indica, an Indic preprocessor for TeX. A Sinhalese TeX System

    In this paper a two-fold project is described: the first part is a generalized preprocessor for Indic scripts (scripts of languages currently spoken in India, except Urdu, as well as Sanskrit and Tibetan), with several kinds of input (LaTeX commands, 7-bit ASCII, CSX, Unicode) and TeX output. This utility is written in standard Flex (the GNU version of Lex), and hence can be painlessly compiled on any platform. The same input methods are used for all Indic languages, so that the user does not need to memorize different conventions and commands for each one of them. Moreover, the switch from one language to another can be done by use of user-definable preprocessor directives. The second part is a complete TeX typesetting system for Sinhalese. The design of the fonts is described, and METAFONT-related features, such as metaness and optical correction, are discussed. At the end of the paper, the reader can find tables showing the different input methods for the four Indic scripts currently implemented in Indica: Devanagari, Tamil, Malayalam, Sinhalese

    Test: Internet Indexing Systems vs List of Known URLs: Revisited

    This is a compilation of the tests done in Sept./Oct. 1997 by the author on the then-existing search engines. It was published on the web on the author's home page. As the web pages changed, this was pushed off into the old site and forgotten. The original HTML pages were converted into PDF using LibreOffice in Aug 2022 and placed in the Spectrum repository for the record

    The Units of Time in Ancient and Medieval India

    Roots of Wisdom, Branches of Devotion: Plant Life in South Asian Traditions.

    This book collects a series of contributions by academics from around the world, dedicated to investigating different aspects of plant life in the religious traditions of Southern Asia

    Scanning the Science-Society Horizon

    Science communication approaches have evolved over time, gradually placing more importance on understanding the context of the communication and audience. The increase in people participating in social media on the Internet offers a new resource for monitoring what people are discussing. People self-publish their views on social media, which provides a rich source of everyday, every-person thinking. This introduces the possibility of using passive monitoring of this public discussion to find information useful to science communicators, to allow them to better target their communications about different topics. This research study is focussed on understanding what open source intelligence, in the form of public tweets on Twitter, reveals about the contexts in which the word 'science' is used by the English speaking public. By conducting a series of studies based on simpler questions, I gradually build up a view of who is contributing on Twitter, how often, and what topics are being discussed that include the keyword 'science'. An open source data gathering tool for Twitter data was developed and used to collect a dataset from Twitter with the keyword 'science' during 2011. After collection was completed, data was prepared for analysis by removing unwanted tweets. The size of the dataset (12.2 million tweets by 3.6 million users (authors)) required the use of mainly quantitative approaches, even though this only represents a very small proportion, about 0.02%, of the total tweets per day on Twitter. Fourier analysis was used to create a model of the underlying temporal pattern of tweets per day and revealed a weekly pattern. The number of users per day followed a similar pattern, and most of these users did not use the word 'science' often on Twitter. An investigation of types of tweets suggests that people using the word 'science' were engaged in more sharing of both links and other people's tweets than is usual on Twitter.
Consideration of word frequency and bigrams in the text of the tweets found that while word frequencies were not particularly effective when trying to understand such a large dataset, bigrams were able to give insight into the contexts in which 'science' is being used in up to 19.19% of the tweets. The final study used Latent Dirichlet Allocation (LDA) topic modelling to identify the contexts in which 'science' was being used and gave a much richer view of the whole corpus than the bigram analysis. Although the thesis has focused on the single keyword 'science' the techniques developed should be applicable to other keywords and so be able to provide science communicators with a near real time source of information about what issues the public is concerned about, what they are saying about those issues and how that is changing over time
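
The weekly pattern mentioned in this abstract is the kind of result a discrete Fourier transform can surface from a tweets-per-day series. The following is only a minimal sketch of that idea, not the thesis's actual method or data: the function name `dominant_period` and the synthetic four-week series are assumptions made for illustration.

```python
import cmath
import math


def dominant_period(daily_counts):
    """Return the period (in days) of the strongest non-constant DFT component."""
    n = len(daily_counts)
    mean = sum(daily_counts) / n
    centered = [c - mean for c in daily_counts]  # remove the constant (DC) term
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # frequencies up to the Nyquist limit
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    if best_k == 0:
        raise ValueError("series is constant; no periodic component found")
    return n / best_k


# Hypothetical data: four weeks of daily counts with a built-in 7-day cycle.
synthetic = [1000 + 200 * math.cos(2 * math.pi * day / 7) for day in range(28)]
period = dominant_period(synthetic)
```

On this synthetic series the strongest component sits at four cycles per 28 days, that is, a seven-day period. Real daily counts would be noisier and would usually call for a library FFT rather than this direct summation.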

    Intertextual Readings of the Nyāyabhūṣaṇa on Buddhist Anti-Realism

    This two-part dissertation has two goals: 1) a close philological reading of a 50-page section of a 10th-century Sanskrit philosophical work (Bhāsarvajña's Nyāyabhūṣaṇa), and 2) the creation and assessment of a novel intertextuality research system (Vātāyana) centered on the same work. The first half of the dissertation encompasses the philology project in four chapters: 1) background on the author, work, and key philosophical ideas in the passage; 2) descriptions of all known manuscript witnesses of this work and a new critical edition that substantially improves upon the editio princeps; 3) a word-for-word English translation richly annotated with both traditional explanatory material and novel digital links to not one but two interactive online research systems; and 4) a discussion of the Sanskrit author's dialectical strategy in the studied passage. The second half of the dissertation details the intertextuality research system in a further four chapters: 5) why it is needed and what can be learned from existing projects; 6) the creation of the system consisting of curated textual corpus, composite algorithm in natural language processing and information retrieval, and live web-app interface; 7) an evaluation of system performance measured against a small gold-standard dataset derived from traditional philological research; and 8) a discussion of the impact such new technology could have on humanistic research more broadly. System performance was assessed to be quite good, with a 'recall@5' of 80%, meaning that most previously known cases of mid-length quotation and even paraphrase could be automatically found and returned within the system's top five hits. Moreover, the system was also found to return a 34% surplus of additional significant parallels not found in the small benchmark. 
This assessment confirms that Vātāyana can be useful to researchers by aiding them in their collection and organization of intertextual observations, leaving them more time to focus on interpretation. Seventeen appendices illustrate both these efforts and a number of side projects, the latter of which span translation alignment, network visualization of an important database of South Asian prosopography (PANDiT), and a multi-functional Sanskrit text-processing web application (Skrutable).

    Contents:
    Preface
    Table of Contents
    Abbreviations
    Terms and Symbols
    Nyāyabhūṣaṇa Witnesses
    Main Sanskrit Editions
    Introduction
    A Multi-Disciplinary Project in Intertextual Reading
    Main Object of Study: Nyāyabhūṣaṇa 104–154
    Project Outline
    Part I: Close Reading
    1 Background
    1.1 Bhāsarvajña
    1.2 The Nyāyabhūṣaṇa
    1.2.1 As One of Several Commentaries on Bhāsarvajña's Nyāyasāra
    1.2.2 In Modern Scholarship, with Focus on NBhū 104–154
    1.3 Philosophical Context
    1.3.1 Key Philosophical Concepts
    1.3.2 Intra-Textual Context within the Nyāyabhūṣaṇa
    1.3.3 Inter-Textual Context
    2 Edition of NBhū 104–154
    2.1 Source Materials
    2.1.1 Edition of Yogīndrānanda 1968 (E)
    2.1.2 Manuscripts (P1, P2, V)
    2.1.3 Diplomatic Transcripts
    2.2 Notes on Using the Edition
    2.3 Critical Edition of NBhū 104–154 with Apparatuses
    3 Translation of NBhū 104–154
    3.1 Notes on Translation Method
    3.2 Notes on Outline Headings
    3.3 Annotated Translation of NBhū 104–154
    4 Discussion
    4.1 Internal Structure of NBhū 104–154
    4.2 Critical Assessment of Bhāsarvajña's Argumentation
    Part II: Distant Reading with Digital Humanities
    5 Background in Intertextuality Detection
    5.1 Sanskrit Projects
    5.2 Non-Sanskrit Projects
    5.3 Operationalizing Intertextuality
    6 Building an Intertextuality Machine
    6.1 Corpus (Pramāṇa NLP)
    6.2 Algorithm (Vātāyana)
    6.3 User Interface (Vātāyana)
    7 Evaluating System Performance
    7.1 Previous Scholarship on NBhū 104–154 as Philological Benchmark
    7.2 System Performance Relative to Benchmark
    8 Discussion
    Conclusion
    Works Cited
    Main Sanskrit Editions
    Works Cited in Part I
    Works Cited in Part II
    Appendices
    Appendix 1: Correspondence of Joshi 1986 to Yogīndrānanda 1968
    Appendix 1D: Full-Text Alignment of Joshi 1986 to Yogīndrānanda 1968
    Appendix 2: Prosopographical Relations Important for NBhū 104–154
    Appendix 2D: Command-Line Tool “Pandit Grapher”
    Appendix 3: Previous Suggestions to Improve Text of NBhū 104–154
    Appendix 4D: Transcript and Collation Data for NBhū 104–154
    Appendix 5D: Command-Line Tool “cte2cex” for Transcript Data Conversion
    Appendix 6D: Deployment of Brucheion for Interactive Transcript Data
    Appendix 7: Highlighted Improvements to Text of NBhū 104–154
    Appendix 7D: Alternate Version of Edition With Highlighted Improvements
    Appendix 8D: Digital Forms of Translation of NBhū 104–154
    Appendix 9: Analytic Outline of NBhū 104–154 by Shodo Yamakami
    Appendix 10.1: New Analytic Outline of NBhū 104–154 (Overall)
    Appendix 10.2: New Analytic Outline of NBhū 104–154 (Detailed)
    Appendix 11D: Skrutable Text Processing Library and Web Application
    Appendix 12D: Pramāṇa NLP Corpus, Metadata, and LDA Modeling Info
    Appendix 13D: Vātāyana Intertextuality Research Web Application
    Appendix 14: Sample of Yamakami Citation Benchmark for NBhū 104–154
    Appendix 14D: Full Yamakami Citation Benchmark for NBhū 104–154
    Appendix 15: Vātāyana Recall@5 Scores for NBhū 104–154
    Appendix 16: PVA, PVin, and PVSV Vātāyana Search Hits for Entire NBhū
    Appendix 17: Sample Listing of Vātāyana Search Hits for Entire NBhū
    Appendix 17D: Full Listing of Vātāyana Search Hits for Entire NBhū
    Overview of Digital Appendices
    Zusammenfassung (Thesen Zur Dissertation)
    Summary of Results
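
The 'recall@5' figure quoted in this abstract counts a known parallel as found when it appears among the system's top five hits. The following is a minimal sketch of such a metric under simplifying assumptions, not the dissertation's evaluation code: the function name `recall_at_k`, the one-known-parallel-per-passage setup, and all identifiers in the toy data are hypothetical.

```python
def recall_at_k(ranked_hits, gold, k=5):
    """Share of benchmark queries whose known parallel is among the top-k hits."""
    if not gold:
        raise ValueError("benchmark is empty")
    found = sum(
        1
        for query, target in gold.items()
        if target in ranked_hits.get(query, [])[:k]
    )
    return found / len(gold)


# Hypothetical benchmark: passage id -> id of its one known parallel.
gold = {"q1": "a", "q2": "b", "q3": "c", "q4": "d", "q5": "e"}

# Hypothetical system output: passage id -> ranked candidate ids.
ranked_hits = {
    "q1": ["a", "x", "y"],
    "q2": ["x", "b"],
    "q3": ["x", "y", "z", "w", "c"],       # correct hit at rank 5: counts
    "q4": ["x", "y", "z", "w", "v", "d"],  # correct hit at rank 6: a miss at k=5
    "q5": ["e"],
}

score = recall_at_k(ranked_hits, gold)
```

In this toy benchmark four of the five known parallels fall within the top five hits, so the score is 0.8. A benchmark with several known parallels per passage would need a per-passage fraction instead of a single membership check.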

    The literal/non-literal divide synchronically and diachronically: The lexical semantics of an English posture verb

    This thesis's main research goal is to provide an account of the English posture verb sit, from a synchronic and a diachronic perspective. My proposed account of sit comprises various components, including a characterisation of the different possible meanings of sit and a comparison with stand and lie. The two relevant meanings are a literal one and a non-literal one (The girl is sitting on the chair vs. The wine bottle is sitting on the chair; in the former the subject is described as being in a sitting position, while in the latter the subject is not in a sitting position). I analyse each meaning/use separately, noting which semantic patterns occur with one type only and which occur with both. I argue that the non-literal use is diachronically connected to the literal one, and I motivate this claim based on the shared components identified in the thesis and on data from corpus studies reported in the thesis. A consequence of acknowledging a divide between the literal and non-literal uses (a perspective not usually taken in theoretical linguistics) is that I am able to account for important semantic details which might otherwise be overlooked. The cognitive and typological literature includes accounts of posture verbs cross-linguistically, but in the theoretical literature these verbs have not received much attention. In this thesis, I review existing proposals and highlight the uncertainties surrounding the posture verbs. In order to fill these gaps in the literature and to better understand the phenomena, I analyse data from synchronic and diachronic corpus studies, and incorporate these insights into my account of sit

    Head-Driven Phrase Structure Grammar

    Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)
