    ISOcat: Remodeling metadata for language resources

    The Max Planck Institute for Psycholinguistics in Nijmegen, The Netherlands, is creating a state-of-the-art web environment for the ISO TC 37 (terminology and other language and content resources) metadata registry. This Data Category Registry (DCR) is called ISOcat and encompasses data categories for a broad range of language resources. Under the governance of the DCR Board, ISOcat provides an open work space for creating data category specifications, defining Data Category Selections (DCSs), which are domain-specific groups of data categories, and standardising selected data categories and DCSs. Designers envision future interactivity among the DCR, reference registries and ontological knowledge spaces.

    The caCORE Software Development Kit: Streamlining construction of interoperable biomedical information services

    BACKGROUND: Robust, programmatically accessible biomedical information services that syntactically and semantically interoperate with other resources are challenging to construct. Such systems require the adoption of common information models, data representations and terminology standards as well as documented application programming interfaces (APIs). The National Cancer Institute (NCI) developed the cancer common ontologic representation environment (caCORE) to provide the infrastructure necessary to achieve interoperability across the systems it develops or sponsors. The caCORE Software Development Kit (SDK) was designed to provide developers both within and outside the NCI with the tools needed to construct such interoperable software systems. RESULTS: The caCORE SDK requires a Unified Modeling Language (UML) tool to begin the development workflow with the construction of a domain information model in the form of a UML Class Diagram. Models are annotated with concepts and definitions from a description logic terminology source using the Semantic Connector component. The annotated model is registered in the Cancer Data Standards Repository (caDSR) using the UML Loader component. System software is automatically generated using the Codegen component, which produces middleware that runs on an application server. The caCORE SDK was initially tested and validated using a seven-class UML model, and has been used to generate the caCORE production system, which includes models with dozens of classes. The deployed system supports access through object-oriented APIs with consistent syntax for retrieval of any type of data object across all classes in the original UML model. The caCORE SDK is currently being used by several development teams, including by participants in the cancer biomedical informatics grid (caBIG) program, to create compatible data services. 
caBIG compatibility standards are based upon caCORE resources, and thus the caCORE SDK has emerged as a key enabling technology for caBIG. CONCLUSION: The caCORE SDK substantially lowers the barrier to implementing systems that are syntactically and semantically interoperable by providing workflow and automation tools that standardize and expedite modeling, development, and deployment. It has gained acceptance among developers in the caBIG program, and is expected to provide a common mechanism for creating data service nodes on the data grid that is under development.

    Metainformation scenarios in Digital Humanities: Characterization and conceptual modelling strategies

    Requirements for the analysis, interpretation and reuse of information are becoming more and more ambitious as we generate larger and more complex datasets. This is leading to the development and widespread use of information about information, often called metainformation (or metadata) in most disciplines. The Digital Humanities are not an exception. We often assume that metainformation helps us in documenting information for future reference by recording who has created it, when and how, among other aspects. We also assume that recording metainformation will facilitate the tasks of interpreting information at later stages. However, some works have identified some issues with existing metadata approaches, related to 1) the proliferation of too many “standards” and difficulties to choose between them; 2) the generalized assumption that metadata and data (or metainformation and information) are essentially different, and the subsequent development of separate sets of languages and tools for each (introducing redundant models); and 3) the combination of conceptual and implementation concerns within most approaches, violating basic engineering principles of modularity and separation of concerns. Some of these problems are especially relevant in Digital Humanities. In addition, we argue here that the lack of characterization of the scenarios in which metainformation plays a relevant role in humanistic projects often results in metainformation being recorded and managed without a specific purpose in mind. In turn, this hinders the process of decision making on issues such as what metainformation must be recorded in a specific project, and how it must be conceptualized, stored and managed. This paper presents a review of the most used metadata approaches in Digital Humanities and, taking a conceptual modelling perspective, analyses their major issues as outlined above. 
It also describes the most common scenarios for the use of metainformation in Digital Humanities, presenting a characterization that can assist in setting goals for metainformation recording and management in each case. Based on these two aspects, a new approach is proposed for the conceptualization, recording and management of metainformation in the Digital Humanities, using the ConML conceptual modelling language and adopting the overall view that metainformation is not essentially different from information. The proposal is validated in Digital Humanities scenarios through case studies employing real-world datasets. This work was partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under its Competitive Juan de la Cierva Postdoctoral Research Programme (FJCI-2016-28032).

    Designing a Metamodel Registry for the Credit Score Assessment Process for Lecturer Academic Rank/Position Promotion Based on the ISO/IEC 11179 Standard (Case Study: ITS Personnel Data)

    The quality of an institution, including a university, depends on sound management of data and information. Good data management means that data are organized in a structured, systematic and integrated way so that they yield fast, precise, accurate and relevant information. An organization therefore needs to put effort into organizing its data so that they are well structured. In this work the data are integrated into a Metamodel Registry to facilitate personnel processes. A Metamodel Registry is designed to solve the problem of coordinating the organizational perspective with individual perspectives [1]. Metamodels are needed to coordinate representations among the people and/or systems that store, manipulate and exchange data, and they also help maintain consistency between different registries [2]. Decisions in personnel matters are based on data, especially the individual data of each employee at ITS. In practice, however, personnel data still have several problems that can affect decision making: 1) data in the ITS personnel information system (SIM Kepegawaian ITS) are not up to date, so lecturers' degrees are often wrong when reports are prepared [3]; 2) some personnel processes still do not use the SIM; 3) there are no valid data definitions, so differing perspectives can affect decision making. The output of this research is the design of a metamodel registry based on parts 4 and 5 of the ISO/IEC 11179 standard, compiled as a data dictionary for the business process of Credit Score Assessment for Lecturer Academic Rank/Position Promotion (Penilaian Angka Kredit Kenaikan Pangkat/Jabatan Akademik Dosen). It is hoped that the results can be applied in future SIM development, which can use the terms and data groups in the data dictionary, compiled and validated with the data owners, as a benchmark.

    Designing a Metamodel Registry Based on the ISO/IEC 11179 Standard: A Case Study of Leave and Study-Period Evaluation at ITS

    ITS (Institut Teknologi Sepuluh Nopember), a state university in Indonesia, uses academic data as a source for reporting, as an indicator of the achievement of organizational goals, and as a determining factor in institutional assessment. In practice, however, the data for the leave and study-period evaluation processes at ITS have weaknesses: there are no guidelines for metadata administration, and definitions are inconsistent. This can lead to low-quality reporting, invalid performance assessments, biased status decisions, and employers not knowing the quality of ITS graduates accurately. ITS therefore needs data elements to be defined and named in a data dictionary that serves as a reference for data descriptions, so that the accuracy of ITS academic data can be accounted for. This study aims to produce a data dictionary document based on the data related to the leave and study-period evaluation processes. Building the data dictionary begins with identifying existing conditions through document study and interviews, then identifying the data required by both processes. The data are then grouped into entities and depicted in a relationship diagram. The two processes use 13 entities and 78 data elements. These data are then used in designing the metamodel, which consists of identifying the object class, property, qualifier and representation used to construct new data element names; in addition, each data element is defined. The names and definitions are then verified against ISO/IEC 11179 and validated by BAPKM as the data owner. The final product of this research is a data dictionary document for the data elements involved in the leave and study-period evaluation processes at ITS. Each data dictionary entry consists of the data element name, technical name, alias, definition, format, data type, maximum character length, permitted values, usage guidance and attribute relationships. The metamodel registry design, together with the data dictionary, is expected to become the main reference for information about system data, connecting users, analysts and developers.
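
The naming parts and data dictionary fields listed in this abstract can be illustrated with a minimal Python sketch. It is not the authors' schema: the class names, the ordering of name parts (qualifier, object class, property, representation), the `validate` helper and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataElementName:
    """Name parts mentioned in the abstract; the ordering used here is an assumption."""
    object_class: str
    property_term: str
    representation: str
    qualifier: Optional[str] = None

    def compose(self) -> str:
        # Join the non-empty parts in a fixed (assumed) order.
        parts = [self.qualifier, self.object_class,
                 self.property_term, self.representation]
        return " ".join(p for p in parts if p)


@dataclass
class DataDictionaryEntry:
    """One dictionary row with the fields listed in the abstract."""
    name: DataElementName
    technical_name: str
    definition: str
    data_type: str
    max_length: int
    format: str = ""
    aliases: List[str] = field(default_factory=list)
    permitted_values: List[str] = field(default_factory=list)
    usage_guidance: str = ""
    related_attributes: List[str] = field(default_factory=list)

    def validate(self, value: str) -> bool:
        # Hypothetical check: enforce maximum length and permitted values.
        if len(value) > self.max_length:
            return False
        if self.permitted_values and value not in self.permitted_values:
            return False
        return True


# Hypothetical entry; none of these values come from the ITS data dictionary.
entry = DataDictionaryEntry(
    name=DataElementName("Student", "Leave Status", "Code", qualifier="Current"),
    technical_name="current_student_leave_status_code",
    definition="A code indicating whether a student is currently on leave.",
    data_type="string",
    max_length=8,
    permitted_values=["ACTIVE", "ON_LEAVE"],
)

print(entry.name.compose())
print(entry.validate("ON_LEAVE"), entry.validate("GRADUATED"))
```

Keeping the name parts separate from the composed name makes it possible to check each part against a naming convention before the full name is registered.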

    Data DNA: The Next Generation of Statistical Metadata

    Describes the components of a complete statistical metadata system and suggests ways to create and structure metadata so that diverse users can better access and understand data sets.

    An ontologically founded architecture for information systems in clinical and epidemiological research

    This paper presents an ontologically founded basic architecture for information systems that are intended to capture, represent, and maintain metadata for various domains of clinical and epidemiological research. Clinical trials form an important basis for clinical research. The accurate specification of metadata, and their documentation and application in clinical and epidemiological study projects, represent a significant expense in project preparation and have a relevant impact on the value and quality of these studies.