Metadata Architecture and Standards

Over the last ten years, a number of competing and sometimes overlapping data representation standards have attempted to address aspects of metadata integration, interoperation, and management. While there is some evidence of convergence in metadata standards for data warehousing in particular, new technologies that require increasingly sophisticated metadata creation, exchange, and management capabilities continue to emerge, and de facto standards proliferate wherever voids exist. Some of the more relevant standards include:
  • Database Management. The ISO/IEC 11179 Specification and Standardization of Data Elements [1] standard specifies basic aspects of data element composition, including metadata. In the United States, it is supported by L8 - Metadata, a technical committee of the National Committee for Information Technology Standards (NCITS), a standards development organization accredited by the American National Standards Institute (ANSI). This committee is also responsible for several ANSI standards, including X3.285 (metamodel for the management of shareable data), as well as new proposed standards for knowledge representation.
  • Data Warehousing. The Common Warehouse Metamodel (CWM) [2] describes metadata interchange among data warehousing, business intelligence, knowledge management, and portal technologies. CWM combines standards originally developed by the Meta Data Coalition (MDC), including the Metadata Interchange Specification (MDIS) and Microsoft's Open Information Model (OIM), with the Object Management Group (OMG)'s earlier version of the CWM. The MDC MDIS was designed by data warehousing vendors as a technology-neutral, vendor-independent metadata standard. It was a flat file definition intended for use by warehouse loading tools in batch mode through a public API. It described multiple object types, including databases, schemas, files, and relationships, and supported extensions for exchanging tool-specific or proprietary metadata. The Common Warehouse Metamodel leverages various OMG standards, including UML (Unified Modeling Language) [3], XMI (XML Metadata Interchange), and MOF (Meta Object Facility), as well as the Coalition's OIM, and is an adopted OMG standard.
  • Computer-Aided Software Engineering (CASE). The CASE Data Interchange Format (CDIF), originally sponsored by the Electronic Industries Association and now also maintained by the OMG, is a collection of standards intended to support the exchange of information among CASE tools. CDIF provides a published set of vendor-independent, method-independent definitions for metadata concepts in general and for modeling data and related concepts in particular, including the CDIF Integrated Meta-model, a multifaceted, multidisciplinary information model for modeling concepts. It supports the Integration Definition for Function Modeling (IDEF0) [4], the Integration Definition for Information Modeling (IDEF1x) [5], and UML, among other modeling paradigms. It also defines standard ways of moving this information between tools without the need for customized interfaces, including the CDIF Transfer Format, a file format for representing models. Formal standardization of CDIF at the international level is underway (ISO/IEC JTC1/SC7/WG11). This standards body also coordinates with the Object Management Group and ISO JTC1/SC32 (metadata standards, including the 11179 standard listed above).
  • Web-based Document Exchange. The Resource Description Framework (RDF) Model and Syntax Specification [W3C, 1999], sponsored by the World Wide Web Consortium (W3C), is a foundation for processing metadata; it provides interoperability between applications that exchange machine-understandable information on the Web. RDF emphasizes facilities that enable automated processing of Web resources. It can be used in a variety of application areas: in resource discovery, to provide better search engine capabilities; in cataloging, to describe the content and content relationships available at a particular Web site, page, or digital library; by intelligent software agents, to facilitate knowledge sharing and exchange; in content rating; in describing collections of pages that represent a single logical "document"; in describing the intellectual property rights of Web pages; and in expressing the privacy preferences of a user as well as the privacy policies of a Web site. Related activities include those of the Dublin Core Metadata Initiative, the Semantic Web's Web Ontology working group, and the DARPA Agent Markup Language Program.
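
The RDF idea of describing a Web resource through property/value statements can be illustrated with a minimal sketch. It assumes plain Python tuples in place of a real RDF toolkit; the resource URI, the property values, and the helper names `describe` and `find_by` are hypothetical and serve only as illustration. The Dublin Core element namespace is the published one.

```python
from typing import NamedTuple

# Dublin Core element namespace (published by the Dublin Core Metadata Initiative).
DC = "http://purl.org/dc/elements/1.1/"


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property, named by a URI
    obj: str        # the property value (a literal or another resource URI)


# Hypothetical resource and values, for illustration only.
PAGE = "http://example.org/reports/q1"
statements = [
    Triple(PAGE, DC + "title", "First Quarter Report"),
    Triple(PAGE, DC + "creator", "Finance Team"),
    Triple(PAGE, DC + "rights", "Internal use only"),
]


def describe(resource: str, triples: list[Triple]) -> dict[str, str]:
    """Collect every property/value pair asserted about one resource."""
    return {t.predicate: t.obj for t in triples if t.subject == resource}


def find_by(predicate: str, value: str, triples: list[Triple]) -> list[str]:
    """Simple resource discovery: subjects whose property equals the value."""
    return [t.subject for t in triples
            if t.predicate == predicate and t.obj == value]
```

Because every statement has the same three-part shape, cataloging, rights description, and discovery all reduce to filtering the same collection, which is roughly why one model can serve the varied application areas listed above.
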
These standards overlap in some areas, diverge in others, and leave some issues open, such as a standard approach to semantic content and context. The ISO/IEC 11179 standard comes closest to addressing the semantic properties of documents, databases, and other resources, but it does not establish a framework for representing rules relevant to terminology usage and conflict resolution, among other issues. Additionally, because the standards were developed from the perspectives of distinct computing domains, no higher-level architecture ties these representation schemes together or suggests one approach over another in an environment that includes multiple technologies, diverse systems, or complex interrelationships.

Another complicating factor is that the concept of metadata itself is ambiguous. Metadata can be used to describe a variety of distinct classes of information that play different roles in a cross-organizational enterprise architecture or in a federation of enterprises, such as the clients and suppliers of a third-party manufacturer. These may include:

  • Descriptive metadata, sometimes called semantic or navigational metadata, or "data about data," which gives information consumers enough information to access, browse, query, retrieve, and understand the data contained in the resources available to them.
  • Operational metadata, which, from a database or data warehousing perspective, facilitates data extraction, transformation, movement, and load operations, including mechanisms such as directory services and data translation.
  • Interface-specific or administrative metadata, which database administrators use to manage and maintain internal tables and other structures in a database, or which describes an application programming interface, for example.
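
One way to keep these classes distinct in practice is to tag each metadata item with the class it belongs to. The following is a minimal sketch under that assumption; the type names (`MetadataClass`, `MetadataItem`, `ResourceMetadata`) and the sample resource and values are hypothetical, not drawn from any of the standards discussed above.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataClass(Enum):
    DESCRIPTIVE = "descriptive"        # helps consumers find and understand data
    OPERATIONAL = "operational"        # drives extract/transform/move/load steps
    ADMINISTRATIVE = "administrative"  # supports DBAs or describes an interface


@dataclass
class MetadataItem:
    name: str
    value: str
    kind: MetadataClass


@dataclass
class ResourceMetadata:
    resource: str
    items: list[MetadataItem] = field(default_factory=list)

    def of_kind(self, kind: MetadataClass) -> dict[str, str]:
        """Return only the items that play the given role."""
        return {i.name: i.value for i in self.items if i.kind is kind}


# Hypothetical metadata for a hypothetical "orders" table.
orders = ResourceMetadata("warehouse.orders", [
    MetadataItem("definition", "Customer purchase orders", MetadataClass.DESCRIPTIVE),
    MetadataItem("load_schedule", "nightly", MetadataClass.OPERATIONAL),
    MetadataItem("source_system", "order_entry", MetadataClass.OPERATIONAL),
    MetadataItem("index", "orders_by_date", MetadataClass.ADMINISTRATIVE),
])
```

A broker could then hand `orders.of_kind(MetadataClass.DESCRIPTIVE)` to an end user browsing the catalog while a loading tool reads only the operational items.
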

No single representation standard addresses all of these classes of metadata. Yet, creating a metadata architecture that can be leveraged by a broker (or federation of brokers) to facilitate knowledge sharing across such diverse teams and resources requires an understanding not only of the kinds of metadata listed above, but, where possible, of:

  • The target business processes.
  • Any special access mechanisms, security, or process-related requirements, including scripting languages, application programming interfaces, message sequencing requirements, or control and data flow requirements for the systems and repositories participating in the community.
  • The relevant taxonomies; domain-, organization-, or vendor-specific nomenclature and jargon; and mechanisms for expressing quantities and other scientific or business concepts relevant to the environment.
  • Ownership and configuration management rules for the information exchanged.
  • How various users will interact with the broker and the resulting integrated environment, including systems administration and management requirements.
  • The rules for determining whether the information shared is consistent, accurate, complete, and correct when brokered across applications or repositories.
  • Most importantly, the terminology and business rules relevant to the people involved in the processes themselves.
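
Two of the items above, vendor-specific nomenclature and consistency rules, lend themselves to a small sketch of what a broker might do with such knowledge. Everything here is an assumption made for illustration: the supplier names, the field names, the shared vocabulary, the two rules, and the helper names `to_shared_terms` and `violations`.

```python
from typing import Callable

Record = dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical vendor-specific field names mapped to a shared vocabulary.
TERM_MAP: dict[str, dict[str, str]] = {
    "supplier_a": {"part_no": "part_number", "qty": "quantity"},
    "supplier_b": {"PartID": "part_number", "Count": "quantity"},
}


def _has_part_number(record: Record) -> bool:
    return bool(record.get("part_number"))


def _quantity_is_valid(record: Record) -> bool:
    quantity = record.get("quantity")
    return isinstance(quantity, int) and quantity >= 0


# Hypothetical consistency rules, checked after terms are normalized.
RULES: dict[str, Rule] = {
    "part_number present": _has_part_number,
    "quantity is a non-negative integer": _quantity_is_valid,
}


def to_shared_terms(source: str, record: Record) -> Record:
    """Rename a source's local field names to the shared vocabulary."""
    mapping = TERM_MAP[source]
    return {mapping.get(key, key): value for key, value in record.items()}


def violations(record: Record) -> list[str]:
    """Names of the rules that a normalized record fails."""
    return [name for name, rule in RULES.items() if not rule(record)]
```

For example, `to_shared_terms("supplier_a", {"part_no": "X-100", "qty": 5})` yields a record keyed by `part_number` and `quantity`, and `violations` on that record returns an empty list. Keeping the term map and the rules as data rather than code reflects the point that they must come from the people and processes involved, not from the broker itself.
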


Footnotes:
1. Gilliam, Daniel W. ISO/IEC 11179-1 Final Committee Draft, "Information technology -- Specification and standardization of data elements: Part 1: Framework for the specification and standardization of data elements", June 1998.
2. Common Warehouse Metamodel Specification, Version 1.0, The Object Management Group, February 2001. See also www.omg.org/cwm/.
3. Booch, Grady, Rumbaugh, James, and Jacobson, Ivar, The Unified Modeling Language User Guide, Addison Wesley Longman, Inc., Reading, MA, 1999.
OMG Unified Modeling Language Specification, Version 1.4, Object Management Group, Inc., Needham, MA, February 2001. See also http://www.omg.org/uml/.
4. Draft Federal Information Processing Standards Publication 183, "Integration Definition For Function Modeling (IDEF0)", December 1993.
5. Draft Federal Information Processing Standards Publication 184, "Integration Definition For Information Modeling (IDEF1x)", December 1993.



  Copyright © 1999-2007, Sandpiper Software™, Inc. All rights reserved.
The Sandpiper running bird logo are registered trademarks of Sandpiper Software.