4,528 research outputs found

    XWeB: the XML Warehouse Benchmark

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure the feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse, which is based on a unified reference model for XML warehouses and features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

    Planning effort as an effective risk management tool

    In project management, high levels of risk are considered to be a significant obstacle to project success. This paper investigates whether improving the project plan can lead to improved success for high-risk projects. A quality of planning index was designed to explore how the presence of high risk affects the quality of planning and project success. The index includes managerial aspects such as costs, human resources, procurement and quality, as well as organizational support aspects based on organization maturity models. In a field study based on data collected from 202 project managers regarding their most recent projects, it was found that the level of risk at the beginning of a project has no effect on its final success. Drilling down to find an explanation for this surprising phenomenon, we found that in the presence of high risk, project managers significantly improve their project plans. Hence, in high-risk projects, better project plans improve all four dimensions of project success: schedule overrun, cost overrun, technical performance and customer satisfaction. However, in low-risk projects, better project plans did not contribute to reducing schedule or cost overruns. In other words, while ever more risk management tools are being developed, we found that improving the project plan is a more effective managerial tool in dealing with high-risk projects. Finally, the paper presents the most common planning tools currently being used in high-risk projects.

    Student Engagement in Law School: Preparing 21st Century Lawyers

    Presents findings from an annual survey, with a focus on teaching and learning methods that help students develop as ethical professionals and improve their legal writing and problem-solving skills.

    Bringing Human Factor to Business Intelligence

    Paper presented at the 11th INEKA Conference, 11-13 June 2019, Verona, Italy. Starting from Business Intelligence (BI) reference models, this work proposes to extend the multi-dimensional data modeling approach to integrate Human Factors (HF) related dimensions. The overall goal is to promote a fine-grained understanding of derived Key Performance Indicators (KPIs) through an enhanced characterization of the operational level of the work context. HF research has traditionally approached critical domains and complex socio-technical systems with a chief consideration of human situated action. Grounded in a review of the body of knowledge of the HF field, this work proposes the Business Intelligence for Human Factors (BI4HF) framework. It intends to provide guidance on pertinent data identification, collection methods, modeling and integration within a BI project endeavor. The BI4HF foundations are introduced and a use case on a manufacturing industry organization is presented. The outcome of the enacted BI project referred to in the use case enabled new analytical capabilities regarding newly derived and existing KPIs related to operational performance.

    Software Engineering Methods for the Internet of Things: A Comparative Review

    Accessing different physical objects at any time from anywhere through wireless networks heavily impacts the lifestyle of societies worldwide nowadays. Thus, the Internet of Things has become a hot emerging paradigm in computing environments. Issues like interoperability, software reusability, and platform independence of those physical objects are considered the main current challenges. This raises the need for appropriate software engineering approaches to develop effective and efficient IoT application software. This paper studies the state of the art of design and development methodologies for IoT software. The aim is to study how the proposed approaches have addressed the issues of interoperability, reusability, and platform independence. A comparative study is presented for the different software engineering methods used for the Internet of Things. Finally, the key research gaps and open issues are highlighted as future directions.

    ์ž๋™์ฐจ ์‚ฌ์–‘ ๋ณ€๊ฒฝ์„ ์‹ค์‹œ๊ฐ„ ๋ฐ˜์˜ํ•˜๋Š” ๋ฐ์ดํ„ฐ ๊ธฐ๋ฐ˜ ๋””์ž์ธ ์ ‘๊ทผ ๋ฐฉ๋ฒ•

    Thesis (Ph.D.) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems Major), 2020. 8. Nojun Kwak.
    The automotive industry is entering a new phase in response to changes in the external environment through the expansion of eco-friendly electric/hydrogen vehicles and the simplification of modules during the manufacturing process. However, in the existing automotive industry, conflicts between structured production guidelines and various stakeholders, who are aligned with periodic production plans, can be problematic. For example, if there is a sudden need to change either production parts or situation-specific designs, it is often difficult for designers to reflect those requirements within the preexisting guidelines. Automotive design includes comprehensive processes that represent the philosophy and ideology of a vehicle, and seeks to derive maximum value from the vehicle specifications. In this study, a system that displays, in real time, information on the parts/module components necessary for design was proposed. Designers will be able to use this system in automotive design processes, based on data from various sources. By applying the system, three channels of information provision were established. These channels will aid in the replacement of specific component parts if an unexpected external problem occurs during the design process, and will help in understanding and using the components in advance. The first approach is to visualize real-time data aggregation in automobile factories using Google Analytics, and to reflect these data in self-growing characters that are provided to designers. Through this, it is possible to check production and quality status data in real time without the use of complicated labor resources such as command centers. The second approach is to configure the data flow so that the surrounding situation can be recognized and analyzed. This is done by applying the vehicle's camera to the CCTV in the inventory and distribution centers, as well as to the vehicle itself. Therefore, it is possible to identify and record the parts resources and real-time delivery status from the internal camera function without resistance from existing stakeholders. The final approach is to supply real-time databases of vehicle parts at the site of an accident for on-site repair, using a public API and sensor-based IoT. This allows the designer to obtain information on how parts are replaced after accidents involving light contact, so that it can be reflected in the design of the vehicle. The advantage of using these three information channels is that designers can accurately understand and reflect the modules and components that are brought in during the automotive design process. In order to easily compose the interface for the purpose of providing information, the information coming from the three channels is displayed in a case-specific color in the CAD software that designers use in the automobile development process. An eye-tracking usability evaluation was used to make the interface easier for practicing designers to use, and this improvement process is also described in the study. The impact of the research is that the dashboard application, the CAD system, and the data systems from the case studies are currently reflected in the design ecosystem of the motors group.
    Contents:
    1 Introduction: 1.1 Research Background; 1.2 Objective and Scope; 1.3 Environmental Changes; 1.4 Research Method (1.4.1 Causal Inference with Graphical Model; 1.4.2 Design Thinking Methodology with Co-Evolution; 1.4.3 Required Resources); 1.5 Research Flow
    2 Data-driven Design: 2.1 Big Data and Data Management (2.1.1 Artificial Intelligence and Data Economy; 2.1.2 API (Application Programming Interface); 2.1.3 AI driven Data Management for Designer); 2.2 Datatype from Automotive Industry (2.2.1 Data-driven Management in Automotive Industry; 2.2.2 Automotive Parts Case Studies; 2.2.3 Parameter for Generative Design); 2.3 Examples of Data-driven Design (2.3.1 Responsive-reactive; 2.3.2 Dynamic Document Design; 2.3.3 Insights from Data-driven Design)
    3 Benchmark of Data-driven Automotive Design: 3.1 Method of Global Benchmarking; 3.2 Automotive Design (3.2.1 HMI Design and UI/UX; 3.2.2 Hardware Design; 3.2.3 Software Design; 3.2.4 Convergence Design Process Model); 3.3 Component Design Management
    4 Vehicle Specification Design in Mobility Industry: 4.1 Definition of Vehicle Specification; 4.2 Field Study; 4.3 Hypothesis
    5 Three Preliminary Practical Case Studies for Vehicle Specification to Data-driven: 5.1 Production Level (5.1.1 Background and Input; 5.1.2 Data Process from Inventory to Designer; 5.1.3 Output to Designer); 5.2 Delivery Level (5.2.1 Background and Input; 5.2.2 Data Process from Inventory to Designer; 5.2.3 Output to Designer); 5.3 Consumer Level (5.3.1 Background and Input; 5.3.2 Data Process from Inventory to Designer; 5.3.3 Output to Designer)
    6 Two Applications for Vehicle Designer: 6.1 Real-time Dashboard DB for Decision Making (6.1.1 Searchable Infographic as a Designer's Tool; 6.1.2 Scope and Method; 6.1.3 Implementation; 6.1.4 Result; 6.1.5 Evaluation; 6.1.6 Summary); 6.2 Application to CAD for vehicle designer (6.2.1 CAD as a Designer's Tool; 6.2.2 Scope and Method; 6.2.3 Implementation and the Display of the CAD Software; 6.2.4 Result; 6.2.5 Evaluation: Usability Test with Eyetracking; 6.2.6 Summary)
    7 Conclusion: 7.1 Summary of Case Studies and Application Release; 7.2 Impact of the Research; 7.3 Further Study

    Evaluating the quality of project planning: a model and field results

    Faulty planning will result in project failure, whereas high-quality project planning increases the project's chances of success. The paper reports on the successful development and implementation of a model aimed at evaluating the quality of project planning. The model is based on both the abilities required of the project manager and the organizational support required for a proper project management infrastructure. The model was validated and applied by 282 project managers in nine organizations, where strong and weak planning processes were identified and analysed.

    Employing Data Warehousing for Contract Administration: e-Dispute Resolution Prototype

    Although data warehousing is very practical for decision making, its application in contract administration is rather limited because of the complicated legal issues and the voluminous data involved. This research attempts to bridge this gap in two ways. First, conceptual models of a data warehouse are developed to explain the contents and overall features of the system; these were verified by 12 experts in Malaysia. Second, an electronic dispute resolution template, known as e-Dispute Resolution (e-DR), is prototyped by using a database tool based on the guidelines of contractual variations agreed by the experts. Subsequently, the prototype is evaluated by 16 professional quantity surveyors from an established consulting firm. The prototype was organized based on a systematic breakdown of issues and incorporated a Boolean keyword search feature. The results show that the concept of data warehousing is applicable to contract administration and is well received by practitioners. Overall, this article makes significant theoretical and practical contributions: the resulting e-DR not only leads toward more informed decision making but can also mitigate or prevent contractual disputes in the construction industry, where such a phenomenon seems to be inevitable.

    BioWarehouse: a bioinformatics database warehouse toolkit

    BACKGROUND: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. RESULTS: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. 
    CONCLUSION: BioWarehouse embodies significant progress on the database integration problem for bioinformatics.