Today's computer networks provide access to a wide variety of DATA SOURCES. There are company directories, product catalogs, inventories, airline schedules, governmental databases, scientific databases, weather reports, patient records, drug studies, and many other sources of "structured" data.
The data in some of these sources is independent of the data in other sources. However, this is not always the case. Frequently, there are CONSTRAINTS that relate the data in different sources. (1) In some cases, the data in one source may replicate the data in other sources. (2) In some cases, the data in different sources may be similar but expressed in different schemas or with different vocabularies (e.g., values in euros rather than dollars, properties encoded in French rather than English, data structured as a single table rather than multiple tables). (3) In yet other cases, the data in different sources may be linked by real-world constraints (such as physical laws and business rules).
When data sources are independent, their data can be managed independently. However, in the presence of constraints, updating the data in one source can and should affect the data in other sources; in such cases, the individuals performing these updates must collaborate to ensure that all affected sources are updated correctly. COLLABORATIVE DATA MANAGEMENT (CDM) must replace independent data update.
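As a rough illustration of the three kinds of constraints listed above, and of how an update to one source can leave other sources inconsistent, consider the following minimal sketch. Everything in it is an assumption made for illustration and not part of the proposed system: the source names (catalog_us, catalog_eu, mirror, inventory), the fixed exchange rate, and the check_constraints function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical fixed exchange rate, used for illustration only.
USD_PER_EUR = 1.10


@dataclass
class Violation:
    kind: str    # which kind of constraint was violated
    detail: str  # human-readable description


def check_constraints(catalog_us, catalog_eu, mirror, inventory):
    """Return the violations of three illustrative cross-source constraints.

    catalog_us: dict mapping SKU -> price in dollars
    catalog_eu: dict mapping SKU -> price in euros
    mirror:     dict intended to replicate catalog_us
    inventory:  dict mapping SKU -> quantity on hand
    """
    violations = []

    # (1) Replication: the mirror must hold exactly the data in catalog_us.
    if mirror != catalog_us:
        violations.append(Violation("replication", "mirror differs from catalog_us"))

    # (2) Different vocabularies: a euro price must agree with the dollar price.
    for sku, usd in catalog_us.items():
        eur = catalog_eu.get(sku)
        if eur is not None and abs(eur * USD_PER_EUR - usd) > 0.01:
            violations.append(Violation("vocabulary", f"{sku}: euro and dollar prices disagree"))

    # (3) Real-world rule: the quantity on hand cannot be negative.
    for sku, qty in inventory.items():
        if qty < 0:
            violations.append(Violation("real_world", f"{sku}: negative quantity"))

    return violations


# Four small sources that start out mutually consistent.
catalog_us = {"A": 11.0}
catalog_eu = {"A": 10.0}
mirror = {"A": 11.0}
inventory = {"A": 3}
before = check_constraints(catalog_us, catalog_eu, mirror, inventory)

# An independent update to one source breaks constraints involving the others.
catalog_us["A"] = 12.0
after = check_constraints(catalog_us, catalog_eu, mirror, inventory)
```

In this sketch, changing a single price in catalog_us leaves both the mirror and the euro catalog out of date, which is the situation in which the owners of those sources would need to coordinate their updates.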
Support for collaborative data management is common in closed information systems, such as those run by individual organizations. Based on our previous work in data integration, we believe that it is also possible to provide support for OPEN INFORMATION SYSTEMS, such as virtual enterprises and the World Wide Web.
To explore the possibilities here, the Stanford Center for Enterprise Management proposes to conduct research on relevant technologies, including data integration and dissemination, collaborative spreadsheets and websheets, and open workflow management. We propose to create and deploy appropriate technological components, such as Internet standards, Internet-based data brokers, and Internet-based workflow managers. Finally, we propose to demonstrate the power of this technology by building prototypes of collaborative data management in three separate areas, one each in science, industry, and government. Currently, we are planning to focus on the following problems: (1) the management of PATIENT RECORDS and/or DRUG TRIALS in the health care arena, (2) PRODUCT DATA MANAGEMENT in support of electronic commerce, and (3) ORGANIZATIONAL DATA MANAGEMENT in support of corporate governance and reporting.