Data Science and Prediction
The world's data is growing more than 40% annually. Coupled with
exponentially growing computing horsepower, this provides an
unprecedented basis for 'learning' useful things from the data through
statistical induction, without material human intervention, and for acting on
them. Philosophers have long debated the merits and demerits of
induction as a scientific method, the demerits being that conclusions are
not guaranteed to be certain and that many different models can
be conjured to explain the observed data. I propose that 'big data'
brings a new and important perspective to these problems in that it
greatly ameliorates historical concerns about induction, especially if
our primary objective is prediction as opposed to causal model
identification. Equally significantly, it propels us into an era of
automated decision making, where computers will make the bulk of
decisions because it is infeasible or more costly for humans to do so.
In this paper, I describe how scale, integration, and, most importantly,
prediction will be distinguishing hallmarks in this coming era of Data
Science. In this brief monograph, I define this newly emerging field
from business and research perspectives.
NYU Stern School of Business, NYU Stern Center for Digital Economy Research
A VALUE-CHAIN BASED MODEL FOR SUPPORTING INFORMATION TECHNOLOGY INVESTMENTS
Business organizations are thinking increasingly in terms of information
technology solutions to business problems, as opposed to data
processing for supporting the business. Information technology is now
viewed as an important means for achieving competitive advantage.
For firms in the hardware/software business, it is therefore becoming increasingly
important to provide clients with the means to analyze their
business needs and strategies, and to think in terms of providing
global IT solutions that address these needs.
The value-chain model articulated by Porter (1985) attempts to
link IT solutions to business strategy. It is based on a simple economic
theory: a firm remains competitive by virtue of being a low
cost producer or differentiating its products/services; accordingly its
strategies must be based on countering forces (such as new entrants,
substitute products, bargaining power of buyers and suppliers) that
erode these advantages. Information technology is considered a key
factor in being able to deal with these forces. Accordingly, how much
and where to spend on information technology is determined by
how well it enables the firm to deal with its dominant forces (threats).
Porter's model has found widespread acceptance among practitioners
(notably information systems executives) because of its simplicity and intuitive
appeal. Several methodologies have been designed around this
model that encourage executives to "think through" this model in order
to identify technologies that could provide competitive advantage.
However, there are no existing formalizations of the value-chain model
by industry, market structure, or organizational structure. We
have been developing such a model for a specific industry (insurance)
with the objective of building an executive support tool that can show,
interactively, how a proposed technology or organizational change can
impact specific metrics/values of interest for business processes defined
at various levels of abstraction, and thereby the bottom line. By using
such a model, an executive can also analyze the technology and resources
required to transform one set of business processes into
another, more desirable state.
Information Systems Working Papers Series
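A minimal sketch of the kind of interactive analysis such an executive support tool would enable, assuming business processes are modeled as a small tree with cost metrics that roll up across levels of abstraction. The insurance process names, cost figures, and the assumed effect of the IT investment are illustrative, not taken from the working paper.

```python
# Sketch (not the working paper's tool): business processes form a tree, each
# carrying a direct cost metric; a proposed technology change adjusts the cost
# of one low-level process, and the effect rolls up to the bottom line.
# All process names and figures below are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Process:
    name: str
    cost: float = 0.0                       # direct annual cost of this process
    children: list["Process"] = field(default_factory=list)

    def total_cost(self) -> float:
        """Cost at this level of abstraction, including all sub-processes."""
        return self.cost + sum(c.total_cost() for c in self.children)

# Hypothetical fragment of an insurance value chain.
claims = Process("Claims handling", children=[
    Process("Claim intake", cost=1.2e6),
    Process("Claim assessment", cost=2.5e6),
    Process("Payment processing", cost=0.8e6),
])

baseline = claims.total_cost()
claims.children[0].cost *= 0.6              # assumed 40% intake saving from new IT
print(f"Bottom-line impact: {baseline - claims.total_cost():,.0f} per year")
```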
ON THE PLAUSIBILITY AND SCOPE OF EXPERT SYSTEMS IN MANAGEMENT
Over the last decade there have been several efforts at building knowledge-based "expert
systems", mostly in the scientific and medical arenas. Although almost all such
systems are in their experimental stages, designers are optimistic about their eventual success.
In the last few years, there have been many references to the possibility of expert systems in
the management literature. However, what is lacking is a clear theoretical perspective on how
various management problems differ in nature from problems in other domains, and the
implications of these differences for knowledge-based decision support systems for
management. In this paper, I examine some of these differences, what they suggest in terms of
the functionality that a computer based system must have in order to support organizational
decision making, and the scope of such a system as a decision aid. The discussion is grounded
in the context of a computer based system called PLANET that exhibits some of the desired
functionality.
Information Systems Working Papers Series
Prediction in Financial Markets: The Case for Small Disjuncts
Predictive modeling in regression and classification problems typically
produces a single model that covers most, if not all, cases in the data. At
the opposite end of the spectrum is a collection of models each of which
covers a very small subset of the decision space. These are referred to
as “small disjuncts.” The tradeoffs between the two types of
models have been well documented. Single models, especially linear ones,
are easy to interpret and explain. In contrast, small disjuncts do not
provide as clean or as simple an interpretation of the data, and have
been shown by several researchers to be responsible for a
disproportionately large number of errors when applied to out-of-sample
data. This research provides a counterpoint, demonstrating that
“simple” small disjuncts provide a credible model for
financial market prediction, a problem with a high degree of noise. A
related novel contribution of this paper is a simple method for
measuring the “yield” of a learning system, which is the
percentage of in-sample performance that the learned model can be
expected to realize on out-of-sample data. Curiously, such a measure is
missing from the literature on regression learning algorithms.
NYU Stern School of Business
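As a rough illustration of the yield measure described above, the sketch below trains a simple classifier on noisy synthetic data and reports the fraction of in-sample accuracy retained out of sample. The dataset, the choice of a scikit-learn decision tree, and the use of accuracy as the performance measure are assumptions for illustration, not the paper's setup.

```python
# Sketch of the "yield" idea: the fraction of in-sample performance a learned
# model retains on out-of-sample data. Synthetic data and model are illustrative.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
# Noisy target, loosely analogous to a financial-market prediction problem.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=2000) > 0).astype(int)

X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_in, y_in)

in_sample = model.score(X_in, y_in)        # accuracy on the training sample
out_sample = model.score(X_out, y_out)     # accuracy on unseen data

yield_ = out_sample / in_sample            # share of in-sample performance realized
print(f"in-sample={in_sample:.3f}  out-of-sample={out_sample:.3f}  yield={yield_:.2%}")
```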
Data Science and Prediction
The use of the term 'Data Science' is becoming increasingly common along
with 'Big Data.' What does Data Science mean? Is there something unique
about it? What skills should a 'data scientist' possess to be productive
in the emerging digital age characterized by a deluge of data? What are
the implications for business and for scientific inquiry? In this brief
monograph, I address these questions from a predictive modeling perspective.
NYU Stern, IOMS Department, Center for Business Analytics
A VALUE-CHAIN BASED PROCESS MODEL FOR SUPPORTING BUSINESS PROCESS REENGINEERING
Constantly envisioning how rapid developments in information
technology offer new opportunities, and engineering business processes
accordingly, will continue to be a difficult problem for senior management.
An important observation by Keen (1991) is that over the last
three decades, effective use of rapidly changing technology has lagged
its availability. A central problem is that of justifying the technology, that is,
measuring its business value. The value-chain model articulated by
Porter (1985) is a natural candidate in providing a basis for this evaluation.
It is based on the simple economic theory that a firm remains
competitive by virtue of being a low cost producer or differentiating
its products/services to the customer, that is, by providing customer
satisfaction. It is intuitive to think of "the customer" as the end
user of a product or service. However, projecting this definition into
the organization, so that every piece of work within it has a customer
that needs to be satisfied, provides a good basis for work design and
its implementation. As technology evolves, forcing the organization to
reassess its customers, the work must be redesigned. This is increasingly
becoming known as "process reengineering".
Porter's model has found widespread acceptance among practitioners
at the strategic level because of its theoretical simplicity and commonsense
appeal. Several methodologies have been designed around this
model that encourage executives to "think through" and identify technologies
that could provide competitive advantage. However, these
methods have some serious limitations due to the lack of a sound
conceptual underpinning and their inability to link technology explicitly
to business value metrics. Based on an analysis of one specific
industry (insurance) we have found that simple process oriented models
such as BSP, when extended to deal with value (in terms of cost
or product/service differentiation to the customer), provide a sound
basis for exploring process reengineering. An implementation of this
methodology should enable management to simulate how a system
would "react" to various types of inputs in terms of specific metrics
of interest.
Information Systems Working Papers Series
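A rough sketch of what simulating a process model's "reaction" to inputs could look like, in the spirit of the abstract rather than its methodology: a few insurance process steps with per-claim cost and time, evaluated at different claim volumes before and after a hypothetical reengineering change. All step names, figures, and the assumed effect of the change are illustrative.

```python
# Sketch of letting a process model "react" to inputs in terms of metrics of
# interest. Steps, unit costs/times, volumes, and the reengineering effect are
# hypothetical, not taken from the working paper.

STEPS = [
    # (step, cost per claim, hours per claim)
    ("Claim intake",       12.0, 0.25),
    ("Claim assessment",   45.0, 1.50),
    ("Payment processing",  8.0, 0.10),
]

def react(claims_per_month: int, reengineered: bool = False) -> dict:
    """Metrics of interest for a given input volume, before/after a process change."""
    steps = STEPS
    if reengineered:
        # Assumed effect of automating intake: half the cost, much less elapsed time.
        steps = [("Claim intake", 6.0, 0.10)] + STEPS[1:]
    monthly_cost = sum(cost for _, cost, _ in steps) * claims_per_month
    cycle_hours = sum(hours for _, _, hours in steps)
    return {"monthly_cost": monthly_cost, "cycle_hours_per_claim": cycle_hours}

for volume in (1_000, 5_000):
    print(volume, react(volume), react(volume, reengineered=True))
```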
ANALOGICAL AND DEPENDENCY DIRECTED REASONING STRATEGIES FOR LARGE SYSTEMS EVOLUTION
The maintenance of large information systems involves continuous modifications to designs in response to evolving business conditions or changing user requirements. Because of the complexity barrier associated with engineering such systems, changes can be ad hoc and prone to errors. Based on our observations of such a process in the oil industry, we believe that the systems maintenance activity would benefit greatly if the process knowledge reflecting the teleology of a design could be captured and used to reason about changing requirements, and to design parts of systems that might be "similar" to existing ones. In this paper, we describe a partially implemented formalism called REMAP (REpresentation and MAintenance of Process knowledge) that accumulates design process knowledge to manage systems evolution. To accomplish this, REMAP acquires and maintains dependencies among the design decisions made during a prototyping process, as well as the general domain-specific design rules on which such dependencies are based. This knowledge can then be applied to prototype refinement, systems maintenance, and the re-use of existing designs to construct "similar" design fragments.
Information Systems Working Papers Series
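As a small illustration of dependency-directed reasoning over design decisions, the sketch below records which decisions rest on which premises and propagates a changed requirement to every decision that should be revisited. The decision names and dependencies are hypothetical, and the sketch is far simpler than the REMAP formalism itself.

```python
# Sketch of propagating a requirement change through design-decision
# dependencies. Names and dependencies below are hypothetical.

from collections import defaultdict

# decision -> premises (decisions or requirements) it directly depends on
DEPENDS_ON = {
    "use_batch_settlement":   ["requirement_overnight_reporting"],
    "nightly_etl_job":        ["use_batch_settlement"],
    "monthly_summary_schema": ["nightly_etl_job"],
}

# Invert the dependency relation so changes can be propagated forward.
supports = defaultdict(list)
for decision, premises in DEPENDS_ON.items():
    for premise in premises:
        supports[premise].append(decision)

def affected_by(changed: str) -> list[str]:
    """All decisions whose justification rests, directly or indirectly, on `changed`."""
    out, stack = [], [changed]
    while stack:
        for decision in supports[stack.pop()]:
            if decision not in out:
                out.append(decision)
                stack.append(decision)
    return out

# A business requirement changes; the decisions to revisit follow from the dependencies.
print(affected_by("requirement_overnight_reporting"))
# ['use_batch_settlement', 'nightly_etl_job', 'monthly_summary_schema']
```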
A PROBLEM-SOLVER/TMS ARCHITECTURE FOR GENERAL CONSTRAINT SATISFACTION PROBLEMS
Constraints, in various forms, are ubiquitous to design problems. In this paper, we provide a formal
characterization of a generalized constraint satisfaction problem (CSP) that can be used to model many
types of design/planning problems, and the architecture of an implemented reasoning system for solving this
problem. The architecture includes a truth maintenance system (TMS) which is specifically designed to
reason about the relationships expressed in the constraints as a problem solution evolves. The CSP
consists of two types of data. The first type of datum corresponds to assignments that are handled by the
problem solver, and the second type corresponds to constraint terms handled by the TMS. The
dependency network, representing the relationships among constraint terms, is static and generally quite
small, depending on the number of constraint terms. Also, justifications are never manipulated (only
evaluated). This results in an architecture that makes efficient use of both space and time. The need for
efficient TMSs, even though these might deal only with certain classes of problems, is underscored by the
fact that general purpose TMSs have often been found to be highly inefficient for solving large problems.
We also show how certain instances of the generalized CSP can be formulated as an integer programming
problem, special cases of which can be solved efficiently using mathematical (integer) programming
techniques.
Information Systems Working Papers Series
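A toy illustration of the two kinds of data the abstract distinguishes: assignments produced by a problem solver and constraint terms that are only evaluated, never manipulated. The design problem (placing modules on servers) is hypothetical, and the search here is brute-force enumeration rather than the paper's problem-solver/TMS architecture.

```python
# Minimal CSP sketch: the problem solver proposes assignments; constraint terms
# are evaluated (never manipulated) against each candidate assignment.
# The toy design problem below is hypothetical.

from itertools import product

VARIABLES = {"ui": [1, 2], "db": [1, 2], "batch": [1, 2]}   # module -> candidate servers

# Constraint terms over assignments; each is evaluated on a complete assignment.
CONSTRAINTS = [
    lambda a: a["ui"] != a["db"],          # keep UI and database on different servers
    lambda a: a["db"] == a["batch"],       # batch jobs colocated with the database
]

def solutions():
    names = list(VARIABLES)
    for values in product(*VARIABLES.values()):
        assignment = dict(zip(names, values))
        if all(term(assignment) for term in CONSTRAINTS):
            yield assignment

print(list(solutions()))
# [{'ui': 1, 'db': 2, 'batch': 2}, {'ui': 2, 'db': 1, 'batch': 1}]
```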
Abstract-Driven Pattern Discovery In Databases
In this paper, we study the problem of discovering interesting patterns in large volumes of
data. Patterns can be expressed not only in terms of the database schema but also in user-defined
terms, such as relational views and classification hierarchies. The user-defined terminology is
stored in a data dictionary that maps it into the language of the database schema. We define
a pattern as a deductive rule expressed in user-defined terms that has a degree of certainty
associated with it. We present methods of discovering interesting patterns based on abstracts,
which are summaries of the data expressed in the language of the user.
Information Systems Working Papers Series
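A small sketch of mining rules from an abstract rather than from raw records, assuming the abstract is a grouped count over two user-defined terms from the data dictionary. The insurance categories, counts, and the simple frequency-based certainty measure are illustrative assumptions, not the paper's method.

```python
# Sketch: a pattern is a rule in user-defined terms with a degree of certainty,
# and it is derived from an "abstract" (a summary of the data), not raw records.
# The categories and counts below are hypothetical.

# Abstract: policy counts grouped by two user-defined terms from the dictionary.
ABSTRACT = {
    # (young_driver, high_risk): number of policies
    (True,  True):  420,
    (True,  False): 580,
    (False, True):  150,
    (False, False): 850,
}

def rule_certainty(antecedent_value: bool) -> float:
    """Certainty of the rule: young_driver == antecedent_value -> high_risk."""
    matching = {hr: n for (yd, hr), n in ABSTRACT.items() if yd == antecedent_value}
    return matching[True] / (matching[True] + matching[False])

print(f"young_driver -> high_risk      certainty = {rule_certainty(True):.2f}")   # 0.42
print(f"not young_driver -> high_risk  certainty = {rule_certainty(False):.2f}")  # 0.15
```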
DEPENDENCY BASED COORDINATION FOR CONSISTENT SOLUTIONS IN DISTRIBUTED WORK
Many organizational problems can be decomposed into
nearly independent subproblems, the solution of which
is the responsibility of independent agents. In this kind
of work, which we call distributed work, the problems
are only nearly independent since dependencies exist
between the commitments required from each agent.
As a consequence of these dependencies, the coordination
problem becomes one of maintaining a consistent
global solution in the face of the possibly conflicting
activities of each agent. We define a normative model
for coordination protocols that indicates the formal requirements
for maintaining a globally consistent solution.
The model identifies several properties that the
protocol must enforce, namely serializability, atomicity,
completeness, and soundness. We show that these
properties are desirable in coordination protocols for
distributed work problems.
Information Systems Working Papers Series
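A very small sketch of the coordination problem described above, under the assumption that agents commit to values for shared items and a commitment is accepted only if it keeps the global solution consistent with earlier commitments. It illustrates the consistency requirement only, not the serializability, atomicity, completeness, or soundness properties of the normative model.

```python
# Sketch: agents propose commitments on shared items; a commitment is accepted
# only if it does not conflict with the globally held solution. Agents, items,
# and the conflict rule are hypothetical.

global_solution: dict[str, tuple[str, str]] = {}   # item -> (agent, committed value)

def propose(agent: str, item: str, value: str) -> bool:
    """Accept the commitment only if it leaves the global solution consistent."""
    if item in global_solution and global_solution[item][1] != value:
        return False                                 # conflicting commitment rejected
    global_solution[item] = (agent, value)
    return True

print(propose("underwriting", "policy-42/limit", "1M"))   # True
print(propose("claims",       "policy-42/limit", "2M"))   # False: conflicts
print(global_solution)
```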