SCOPE
This document serves as a guide for individuals interested in utilizing and developing standard information architectural elements for Space Data Systems. It encompasses essential software components like registries and repositories, as well as standard data components and descriptors, including profiles and resources. While these elements are particularly beneficial in complex environments such as space, their applicability extends beyond that realm, offering a valuable roadmap for Information Architecture across various types of Data Systems.
The foundation of any standards document lies in its capacity to establish a consistent functional protocol, method, or model that ensures all adherents receive guaranteed properties and benefits. In this context, the Information Object serves as the essential model, forming the basis of Information Architecture, which facilitates the standardized description, transfer, and retrieval of data.
This section, illustrated by Figure 2, outlines the composition of an information object, which includes a data object (a sequence of bits that physically represents data) and a metadata object, which provides classification and descriptive information about the data. It presents a brief taxonomy of commonly used information object types in Information Architecture, along with a simple example from the space data systems domain to enhance understanding. Additionally, the section defines meta models, domain models, and data dictionaries, highlighting their significance in describing information objects. Finally, it concludes with a discussion of related work in information representations, such as Data Grids and CCSDS.
INTRODUCTION
DATA OBJECTS
Data objects constitute data as it is physically represented using a sequence of bits.
Data objects may exist without metadata objects, but the utility of such a data object decreases significantly, since not even its structure is known.
Figure 2 Information Architecture - the big picture
METADATA OBJECTS
Metadata objects are a special class of data objects (or bits) that describe the structure, syntactic validity, interrelationships, and semantic rules governing the structural elements of data objects. Additionally, metadata objects often include classification information that specifies the type of data or resource contained within the data object. A notable example of this classification is the Dublin Core model, which can be used to describe a wide range of electronic resources.
Metadata objects, being data objects themselves, require their own metadata for proper identification, though this is rarely strictly enforced. Various solutions have been proposed to tackle this challenge, including the SFDU concept, organizing metadata into registries, employing file extensions to signal metadata files, or analyzing data objects to ascertain their potential as metadata. The following taxonomy of information objects aims to further clarify these issues.
Understanding the relationship between data objects and metadata objects is crucial, because metadata provides context and meaning to otherwise incomprehensible data. Without metadata, a data object such as a zipped file appears as a random sequence of bits, making it difficult for users and systems to access its information. Metadata can also describe resources other than electronic data objects; for example, it can provide details about a spacecraft without returning the spacecraft itself. When both data and metadata objects coexist, they enable numerous capabilities, such as identifying the format of an image or locating specific information within the data. For instance, metadata can indicate where a pixel value lies within an image file, and can link related information, such as a digitized image of a spacecraft, to enhance scientific understanding across different instruments and data types. This interplay forms the foundation of the information architecture outlined in this document.
INFORMATION OBJECTS
Cardinality
An information object is composed of a data object, which is a sequence of bits, and a metadata object that describes this data. In practical applications, such as NASA's Planetary Data System, each data object, like a raw raster image or histogram, is required to have one corresponding metadata object.
Figure 3 The Model of a Simple Information Object
Figure 4 NASA Planetary Data System cardinality restriction on information objects
In practical applications, an information object can be represented by multiple data objects along with a single metadata object. According to the NASA Planetary Data System, the existence of several data objects rather than just one is an implementation-specific matter. Since these data objects are essentially bits, they can be treated as a single data object. Therefore, the principle of one metadata object corresponding to one data object remains valid.
It is crucial to recognize that these two scenarios, while appearing distinct, are fundamentally identical, differing only in their practical application versus conceptual interpretation. Although they represent information objects with varying cardinalities and relationships between data and metadata objects, ultimately, an information object can be understood as a 1-to-1 relationship between a metadata object and a data object.
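The one-to-one view can be sketched in code. The following is a minimal illustration and not part of the standard; the names `InformationObject` and `from_parts` are hypothetical, and concatenation is only one assumed way of treating several physical data objects as a single sequence of bits.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class InformationObject:
    """One metadata object paired with one data object (hypothetical names)."""
    data: bytes               # the data object: a sequence of bits
    metadata: Dict[str, str]  # the metadata object describing `data`

    @classmethod
    def from_parts(cls, parts: Iterable[bytes],
                   metadata: Dict[str, str]) -> "InformationObject":
        # Several physical data objects are still just bits, so they are
        # treated here as one data object described by one metadata object.
        return cls(data=b"".join(parts), metadata=metadata)


single = InformationObject(b"\x00\x01", {"format": "raw raster image"})
multi = InformationObject.from_parts([b"\x00", b"\x01"],
                                     {"format": "raw raster image"})
```

In this sketch `single` and `multi` compare equal, which mirrors the argument that both cardinalities reduce to the same 1-to-1 relationship.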
Figure 5 Other Information Object cardinality relationships
Compositionality
Data objects can serve as information objects, as both are essentially sequences of bits. Furthermore, information objects can be constructed from other information objects, highlighting the interconnected nature of data.
The key distinction between this scenario and the information package outlined in Section 2.2.3.3 lies in the presence of the metadata object. While a data object can qualify as an information object, the reverse is not true; an information object cannot be classified as a data object.
In a compositional information object, each metadata object exclusively describes its corresponding data object, maintaining a one-to-one relationship. Similarly, in an information package, this principle applies, but it also includes an additional metadata object that provides a description of the entire collection of data objects within the package.
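The difference between a compositional information object and an information package can be illustrated with a small sketch. The class names, the `depth` helper, and the use of dictionaries for metadata are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class InformationObject:
    # The data object is either raw bits or, compositionally,
    # another information object with its own metadata.
    data: Union[bytes, "InformationObject"]
    metadata: Dict[str, str]


@dataclass
class InformationPackage:
    # Each member keeps its own metadata; the package adds one more
    # metadata object that describes the collection as a whole.
    members: List[InformationObject]
    package_metadata: Dict[str, str] = field(default_factory=dict)


def depth(obj: InformationObject) -> int:
    """Count how many information objects are nested inside one another."""
    return 1 if isinstance(obj.data, bytes) else 1 + depth(obj.data)
```

Every level of nesting still pairs exactly one metadata object with one data object; only the package carries the extra, collection-level metadata.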
Taxonomy of Information Objects
The discussion on Information Objects has encompassed a wide range of topics, leading to the conclusion that this standard is relevant to various classes of Information Objects. We categorize these classes based on three distinct dimensions.
Compositionality of an Information Object
metadata type – defining the type of metadata (e.g., accounting, structure, interrelationships) in the metadata object
composition – defining whether the Information Object’s data object is actually an Information Object in itself (which this standard does not preclude from happening)
class – defining the type of data this information object contains and describes
In this section, we present a preliminary classification of three different Information Objects along these three dimensions.
A primitive information object is an Information Object whose metadata type is minimal, whose composition is prohibited, and whose class may be any type of data. This minimal metadata typically includes basic elements like size and format, without detailing other characteristics of the data object.
A primitive information object, such as a data file stored on a solid-state recorder, typically has minimal metadata associated with it, which may include the file's name and the address space it occupies.
Space data systems have typically focused on the management of primitive information objects, and have not made metadata objects first-class citizens.
A Simple Information Object (SIO) is an Information Object whose metadata type is drawn from a domain model, which may be compositional, and whose class may be any type of data. Numerous data systems within space agencies incorporate SIOs in their design, primarily within archive and science data systems. The metadata for these objects is typically stored in an online registry or database, facilitating effective search and browsing of data products. As there is a growing focus on developing end-to-end mission information system architectures, SIOs will be essential at multiple stages, including observation planning, execution, processing, and distribution throughout the mission pipeline. Their versatility makes SIOs crucial for ensuring interoperability between systems, provided that the information objects and their models are well-planned.
Information Packages are collections of one or more Information Objects, coupled with a metadata object containing Descriptive Information, Packaging Information, and Supporting Information regarding the package itself. As defined in the OAIS reference model [8], descriptive information is the set of information, consisting primarily of Package Descriptions, that is provided to Data Management to help Consumers locate, order, and retrieve OAIS information holdings. According to OAIS, packaging information binds and identifies the components of an Information Package; an example is the ISO 9660 volume and directory information on a CD-ROM that contains multiple files with Content Information and Preservation Description Information. Packaging information may also detail the algorithms and formats used in the package structure, including whether the package was archived or compressed and the specific format applied, such as TAR or ZIP. Additionally, supporting information provides the representational context needed to understand the data.
In the taxonomy of information objects, an Information Package is an Information Object whose metadata types are supporting, descriptive, and packaging, and whose composition is required.
Each Information Object within the package contains its own metadata, which may not align with the metadata of other Information Objects. This inconsistency complicates the interpretation and comparison of these objects, even if they originate from the same repository. Adhering to a standard meta model is essential for effective comparison and understanding (as detailed in Section 2.3).
The Information Package is designed to consolidate relevant data for users, who are expected to be familiar with the usage of each Information Object.
Packaging Information, Descriptive Information, Supporting Information
The term "Has A" refers to the relationship within a set, allowing users to understand how to correlate information. For those unfamiliar with this correlation, descriptive details about the package, including index information for individual Information Objects, can help infer the properties of the package. Recent advancements in packaging have led to the establishment of a CCSDS Working Group focused on this area, which has contributed to the creation of the XFDU packaging standard.
DISCUSSION
This section illustrates a Space Data Systems Information Object through the example of a telemetry uplink packet dispatched from a ground station to a spacecraft. The telemetry Information Object comprises a sequence of bits that encodes the command for the spacecraft, structured as a single field, Command Sequence, of type long integer. The telemetry uplink packet includes three Data Elements: ground station name (identifying the origin of the command), instrument name (specifying the onboard instrument targeted by the command), and packet sent-time (a timestamp indicating when the packet was transmitted). The semantic information for these Data Elements defines valid values for the instrument name (such as spectrometer or high-resolution imager) and establishes that the packet sent-time must be equal to or later than the current time on the sending system. This example is summarized in Table 1.
Table 1 Information Object View of a Telemetry Uplink Packet
Time Timestamp ≥ Current System Time
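The telemetry uplink example can be expressed as a short sketch. This is an illustration under stated assumptions rather than a defined packet format: the class and field names are hypothetical, and the set of valid instrument names contains only the two values the text gives as examples.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

# Assumed valid values; the text names these two only as examples.
VALID_INSTRUMENTS = {"spectrometer", "high-resolution imager"}


@dataclass
class UplinkPacket:
    command_sequence: int      # data object: single long-integer field
    ground_station_name: str   # data element: origin of the command
    instrument_name: str       # data element: targeted onboard instrument
    sent_time: datetime        # data element: transmission timestamp


def validate(packet: UplinkPacket, now: datetime) -> List[str]:
    """Return the semantic-rule violations (empty if the packet is valid)."""
    errors = []
    if packet.instrument_name not in VALID_INSTRUMENTS:
        errors.append("unknown instrument: " + packet.instrument_name)
    if packet.sent_time < now:
        errors.append("sent-time earlier than current system time")
    return errors
```

The data object is the command sequence alone; the three data elements and the two rules checked by `validate` play the role of the metadata object.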
MODELING CONCEPTS
Domain Models
A domain model outlines a specific area of focus, exemplified by NASA’s planetary science domain model, which identifies key objects like instruments and data sets. It details attributes of these objects, including names and spacecraft IDs, and illustrates the relationships between them, such as the interaction where "instruments produce data sets."
Besides describing a domain, domain models also help to facilitate correlative use and exchange of data between domains. The planetary science domain model was defined in PVL/ODL.
A standard Domain Model - Dublin Core
The Dublin Core Metadata Element Set is a globally recognized online standard for metadata that facilitates the description of electronic resources on the web. This interoperable framework supports cross-domain information resource description, where an information resource is defined as "anything that has identity." The elements within the set are structured according to the ISO/IEC 11179 framework, ensuring consistency and clarity in metadata usage.
Figure 10 Example Planetary Domain Model (Simplified)
Data Dictionary
A Data Dictionary captures the superset of de facto data elements for a particular domain.
In space sciences, data dictionaries play a vital role in ensuring consistent data correlation across various scientific disciplines. For instance, the definition of mass as "the property of a body that causes it to have weight in a gravitational field" is universally accepted in both planetary and earth sciences. Ideally, this definition and its constraints should be stored once to avoid unnecessary replication; however, such reuse is rarely practiced, leading to duplicated definitions and data elements across scientific domains. Achieving consensus on data definitions enables software programs to accurately interpret data, which is essential for embedded systems in space data management. This semantic reasoning reduces reliance on human intervention and facilitates the sharing and correlation of diverse information objects.
In Figure 9, a data dictionary can be defined by an instance of either DEDSL or the ISO/IEC 11179 specification for data elements.
Practical examples of Data Dictionaries include the NASA Planetary Data System Data Dictionary [12] and the ESA BEAT Data Dictionary [13].
Modeling As Applied to Space Data Systems
The heterogeneous nature of data exchanged between space data system components necessitates the ability to examine, compare, and modify modeling information, which directly supports essential services like object accountability, packaging, and delivery. These services enable scientists to accurately locate and format information objects within the space-to-ground pipeline, leading to earlier discoveries and fault detection, while enhancing overall understanding of data within the system. Furthermore, shared models are crucial for data system interoperability and correlative science, as they allow for effective comparison between multiple systems. Without a common model, the reuse and exchange of information, whether from space data systems or other engineering and software domains, becomes impossible.
Standard Meta Models in Space Data Systems
A meta model is a model that defines the structure and rules for creating another model. For instance, in the example shown in Figure 13, UML at level M2 serves as a meta model that guides the development of a domain model at level M1. Similarly, DEDSL acts as a meta model for the creation of specific models within its framework.
Meta models play a crucial role in information architecture by establishing standards for comparing and analyzing elements across various domains. Without adherence to a specific meta model, comparing elements, even within the same domain, becomes unfeasible. This capability is essential for ensuring data interoperability between systems, highlighting the need for common and compatible meta models to effectively describe domain elements both within and across different domains.
In the following, several standard meta models are briefly described. These include the ISO/IEC 11179 standard for the specification and standardization of data elements [15], the CCSDS Data Entity Dictionary Specification Language (DEDSL) [16], the Dublin Core [6] Initiative’s data elements for describing electronic resources on the Web, and the XFDU [10] model for describing information packages.
The ISO/IEC 11179 standard framework serves as a foundational guideline for the specification and standardization of data elements within meta models and metadata registries. It outlines essential registry functions, including definition, identification, naming, administration, and classification, while offering a universally accepted set of attributes for describing data elements. As an international standard, it facilitates global data element definition and classification, enhancing data dictionary interoperability. The specification categorizes these attributes into four key groups: identifying, definitional, representational, and administrative.
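As a rough illustration of the four attribute groups, the sketch below stores a data element once in a data dictionary and rejects a duplicate definition. The group names come from the text; the individual attribute keys (`name`, `definition`, `datatype`, `unit`, `status`) and the class names are assumptions for illustration and are not taken from ISO/IEC 11179 itself.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DataElement:
    identifying: Dict[str, str]
    definitional: Dict[str, str]
    representational: Dict[str, str]
    administrative: Dict[str, str]


class DataDictionary:
    """Holds each data element definition exactly once (hypothetical API)."""

    def __init__(self) -> None:
        self._elements: Dict[str, DataElement] = {}

    def register(self, element: DataElement) -> None:
        key = element.identifying["name"]
        if key in self._elements:
            raise ValueError(f"data element already defined: {key}")
        self._elements[key] = element

    def lookup(self, name: str) -> DataElement:
        return self._elements[name]


mass = DataElement(
    identifying={"name": "mass"},
    definitional={"definition": "the property of a body that causes it to "
                                "have weight in a gravitational field"},
    representational={"datatype": "real", "unit": "kg"},   # assumed values
    administrative={"status": "draft"},                    # assumed value
)
```

Because every dictionary built this way uses the same four groups, two dictionaries from different domains can be compared attribute by attribute.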
2.3.5.2 DATA ENTITY DICTIONARY SPECIFICATION LANGUAGE (DEDSL)
The Consultative Committee for Space Data Systems (CCSDS) has developed the Data Entity Dictionary Specification Language (DEDSL), which outlines the framework for creating and exchanging data entity dictionaries utilizing XML. This specification also adheres to the standards set by ISO/IEC 11179, as documented in reference [16].
The Parameter Value Language (PVL) is a recommendation by the Consultative Committee for Space Data Systems (CCSDS) that establishes a standardized keyword-value type language for naming and expressing data values. Designed to be both human-readable and machine-processable, PVL facilitates the documentation of domain models in a manner akin to RDF. Additionally, the Object Description Language (ODL) serves as a subset of PVL.
EAST is a comprehensive data description language created by the French Space Agency, CNES, that provides clear and unambiguous information regarding data formats. Recognized as a standard by CCSDS and ISO, EAST is supported by an expanding array of tools designed to utilize its capabilities.
The XML Formatted Data Unit (XFDU) Structure and Construction Rules, established by CCSDS, outline guidelines for packaging data, metadata, and software into a unified format to enhance information transfer and archiving. This framework offers a comprehensive specification of essential packaging structures and mechanisms that align with the current requirements of CCSDS agencies.
Interoperability
Data Dictionary interoperability is essential for facilitating the exchange and comparison of information across diverse data systems. Since domain models consist of data elements specific to a particular domain, and these elements are derived from the corresponding data dictionary, the data dictionary is crucial for enabling effective communication between data systems.
A unified meta model for a data dictionary is essential for the consistent capture and exchange of data elements, facilitating the development of metadata registries for sharing information across various projects. Moreover, the creation of data dictionaries relies heavily on a well-defined domain model, which shows how closely these components depend on one another.
Figure 11 Data Models, Meta Models and Domains
3 SOFTWARE COMPONENTS FOR INFORMATION ARCHITECTURE
This document outlines data standards and specifies standard information management objects (IMOs) used for accessing, distributing, capturing, and managing information objects. It identifies two types of IMOs: Primitive and Advanced.
Primitive Information Management Objects (pIMOs) and Advanced Information Management Objects (aIMOs) play distinct roles in information architecture. pIMOs are active entities that facilitate the retrieval, storage, and discovery of information from data stores. In contrast, aIMOs are constructs made up of multiple pIMOs, enabling essential functions such as data ingestion, processing, distribution, and querying of various data, metadata, and information objects.
In Sections 3.1 and 3.2 we describe pIMO and aIMO components, respectively. We then offer a small survey of related work in Section 3.3.
Figure 12 The Internal Structure of a Physical Data Storage
PRIMITIVE INFORMATION MANAGEMENT OBJECTS
QUERY OBJECT
The Query Object facilitates the retrieval of data as Information Objects through the find operation, which uses a specified expression parameter to define search criteria for the physical data storage. This operation can return zero or more matching Information Objects to the caller. An illustrative example of the find operation and the data flow between the Query Object and the associated physical data stores is depicted in Figure 15.
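The find operation can be sketched as follows. This is a minimal illustration, not the interface defined by the standard: the expression is assumed to be a predicate over metadata, and an in-memory list stands in for the physical data storage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InformationObject:
    data: bytes
    metadata: Dict[str, str]


class QueryObject:
    def __init__(self, store: List[InformationObject]) -> None:
        self._store = store  # stand-in for the physical data storage

    def find(self, expression: Callable[[Dict[str, str]], bool]
             ) -> List[InformationObject]:
        # Returns zero or more Information Objects matching the criteria.
        return [obj for obj in self._store if expression(obj.metadata)]
```

A caller might write `query.find(lambda md: md.get("instrument") == "imager")`; an expression that matches nothing simply yields an empty list.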
ADVANCED INFORMATION MANAGEMENT OBJECTS
REPOSITORY SERVICE OBJECT
The Repository Service Object component, as illustrated in Figure 18, plays a crucial role in managing an underlying data store object or physical data store. Unlike Data Store Objects (DSOs), repository service objects possess various non-functional properties, such as scalability, dependability, and uniformity, which are essential for quality assurance. They provide the same get and put methods as data store objects; however, repository service objects ensure enhanced performance by reliably scaling across multiple physical data stores, maintaining 24/7 dependability, and offering a consistent software interface.
Figure 16 A Data Store Object and its Corresponding UML View (attribute: m_BulkDataStore; operations: get(handle) : Data Object; put(handle, do : Data Object) : OperationResult)
Figure 17 A Query Object and its Corresponding UML View (attribute: dso : Data Store Object; operation: query(expression) : Set of Data Objects)
The primary interface of the Repository Service Object facilitates the management of information objects through a repository request. Users can retrieve these objects via the request interface, which generates a response from the repository. Additionally, the Repository Service Object allows for basic put capabilities of information objects, leveraging the functionalities of its associated Data Store Object (DSO).
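A Repository Service Object that offers the same get and put interface as a Data Store Object while spreading objects over several stores might look like the sketch below. The in-memory store, the class names, and the CRC-based placement scheme are assumptions for illustration; the text does not prescribe how objects are distributed.

```python
import zlib
from typing import Dict, List, Optional


class DataStoreObject:
    """In-memory stand-in for a physical data store."""

    def __init__(self) -> None:
        self._bulk: Dict[str, bytes] = {}

    def get(self, handle: str) -> Optional[bytes]:
        return self._bulk.get(handle)

    def put(self, handle: str, data: bytes) -> bool:
        self._bulk[handle] = data
        return True


class RepositoryServiceObject:
    """Same get/put interface, spread over several data stores."""

    def __init__(self, stores: List[DataStoreObject]) -> None:
        if not stores:
            raise ValueError("at least one data store is required")
        self._stores = stores

    def _select(self, handle: str) -> DataStoreObject:
        # Deterministic placement; the partitioning scheme is an assumption.
        index = zlib.crc32(handle.encode("utf-8")) % len(self._stores)
        return self._stores[index]

    def get(self, handle: str) -> Optional[bytes]:
        return self._select(handle).get(handle)

    def put(self, handle: str, data: bytes) -> bool:
        return self._select(handle).put(handle, data)
```

Callers see one uniform interface regardless of how many physical stores sit behind it, which is the uniformity property described above; dependability features such as replication are left out of this sketch.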
3.2.1.1 A Taxonomy of Repository Service Objects
Information Architecture categorizes various types of Repository Service Objects across three key dimensions: repository object type, object properties, and object description. Each of these dimensions is elaborated upon in this section, providing a comprehensive understanding of the taxonomy.
This document categorizes repository objects based on their type, which allows for the grouping of repositories with similar functional and non-functional characteristics. We identify four main repository types: Data Store, Product Repository, Short-term Archive, and Long-term Archive. Additionally, we introduce a properties dimension that encompasses various attributes of repositories. Currently, this dimension addresses the full range of properties for each repository, with plans to refine it further for better comparison and classification of repository service objects. Key dimensions may include compositionality, which examines the organization of a repository's sub-components, and supported data objects.
Figure 18 Repository Service Object and its corresponding UML diagram
A Data Store Object (DSO) within a repository plays a crucial role in managing data, while interface richness highlights the repository's capability to perform basic get/put operations as well as more complex tasks that may involve querying and processing data. Additionally, the object description dimension outlines the essential services and responsibilities of the repository when integrated with other software components. Our taxonomy and classification of repositories is summarized in Table 2.
REGISTRY SERVICE OBJECT
The Registry Service Object component offers an interface for retrieving Metadata Objects, which are Information Objects that exclusively contain representational information. Through the metadataQuery interface, users can obtain metadata objects that meet specific query criteria. Figure 19 illustrates the Registry Service Object in action.
Similar to the Repository Service Object, there also exist different classes of Registry Service Objects.
Table 2 A Taxonomy of Repository Service Objects
Repository Object Type Object Properties Object Description
Basic Data Store component described in Section 3.1 sits behind Data Store Object and supports Repository Interface to get and put data (lower level data such as streams and bits)
Product Repository Component that stores data products and higher level products, possibly including metadata. Supports retrieval of data products through possibly complex methods, and processing.
Advanced Component supporting retrieval of possibly complex data products, including their metadata.
Short-term Archive No support for permanence.
Stores products for short term (e.g., less than 10 years), and allows retrieval of products.
Archive for short-term preservation of data products, get, put, and query retrieval methods.
Long-term Archive Stores products for long term archiving, and supports basic archive functionality.
Archive for long-term preservation of data products, and data permanence. Supports basic archive functional interfaces (e.g., get, put).
Figure 19 A Registry Service Object and its Corresponding UML View (attributes: queryObj : Query Object; dso : Data Store Object; operations: metadataQuery(data elements) : metadata; resourceQuery(resource description) : resourceMetadata)
3.2.2.1 A Taxonomy of Registry Service Objects
This section categorizes registries into three primary classes based on specific dimensions, including the type of registry, the types of return objects, and the parameters of the query interface.
The three main types of registries are Metadata Registry, Service Registry, and Resource Registry.
A metadata registry provides structural information about metadata, often termed a meta-meta model. The data objects retrieved from this registry are known as meta-metadata objects. To query the metadata registry, users can specify constraints and values associated with various data elements, either implicitly by examining the properties of the data elements or explicitly by using the data element's ID.
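The two query styles, explicit by data element ID and implicit by data element properties, can be sketched as below. The class name, the method name `metadata_query`, and the example IDs and properties are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


class MetadataRegistry:
    def __init__(self) -> None:
        self._elements: Dict[str, Dict[str, str]] = {}

    def register(self, element_id: str, properties: Dict[str, str]) -> None:
        self._elements[element_id] = dict(properties)

    def metadata_query(self, element_id: Optional[str] = None,
                       **constraints: str) -> List[Dict[str, str]]:
        if element_id is not None:
            # Explicit query: look the data element up by its ID.
            found = self._elements.get(element_id)
            return [found] if found is not None else []
        # Implicit query: match on the properties of the data elements.
        return [props for props in self._elements.values()
                if all(props.get(k) == v for k, v in constraints.items())]
```

For example, `registry.metadata_query(element_id="DE-001")` is an explicit query, while `registry.metadata_query(unit="m")` constrains a property instead.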
A service registry offers a user-friendly interface to search for functional services that fulfill specific user-defined actions. It manages detailed service descriptions, which encompass the locations, methods, and parameters of these services. Additionally, modern standards like Web Services Description Language (WSDL) enhance the implementation of these service descriptions, ensuring efficient service management.
An additional implementation of a service description and its respective service registry exists in the form of the Profile Server and Resource Profile components specified in [18-
Service descriptions play a crucial role in outlining software methods, systems, and web resources through metadata. This metadata allows for querying to obtain service endpoints, which act as pointers to the service's location, along with details on how to invoke the specific service. Consequently, this dynamic approach enhances the utilization and consumption of services through software, minimizing the need for explicit requests and invocations.
The resource registry is designed to describe various resources and objects, but it is particularly focused on detailing information objects, including scientific data products and datasets. This description is structured using the concept of a profile, as outlined in the previous section.
This enables description of an information object using the representational information defined with a profile. Science catalogs such as the Simbad Astrophysics
Table 3 A Taxonomy of Registry Service Objects
Registry Type Return Object Types Query Interface Parameters
Metadata Registry Data Dictionaries, Data
Elements Query for Data Element properties, or Data Element IDs, or Data Dictionary IDs
Service Metadata (interface properties, interface type, return schema)
Resource Registry Data Products, Resource
Data Resource properties products Resource registries can also point to other resource registries to enable discovery of information objects across distributed registries
The introduced classification dimensions successfully categorize the functional properties of various registry types, while the non-functional aspects remain unspecified for now. Recognizing the significance of non-functional registry service properties, we highlight this contribution as part of our ongoing efforts within this document and the IA-WG. A summary of the taxonomy of registry service objects can be found in Table 3.
PRODUCT SERVICE OBJECT
The Product Service Object (PSO) integrates a repository service object with a query object and a domain processing element, facilitating efficient data management. This domain processing component transforms data objects from their original format in the data store to a user-specified format. Serving as a standard interface for diverse data sources, the PSO enables users to query information objects through a query expression. This expression is then processed by the internal query object, which evaluates it and converts it into a series of get calls to the repository service object.
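The flow described above (evaluate the expression, issue get calls, apply domain processing) can be sketched as follows. The constructor parameters, the metadata index, and the predicate form of the query expression are assumptions for illustration only.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, str]


class ProductServiceObject:
    def __init__(self, index: Dict[str, Metadata],
                 repository_get: Callable[[str], bytes],
                 process: Callable[[bytes], bytes]) -> None:
        self._index = index          # handle -> metadata, used by the query step
        self._get = repository_get   # get call of the repository service object
        self._process = process      # domain processing (format conversion)

    def query(self, expression: Callable[[Metadata], bool]) -> List[bytes]:
        # Query object step: turn the expression into a list of handles.
        handles = [h for h, md in self._index.items() if expression(md)]
        # One get call per handle, followed by domain processing.
        return [self._process(self._get(h)) for h in handles]
```

Keeping the processing step as an injected function lets the same PSO front different data sources while returning objects in the format the user asked for.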
ARCHIVE SERVICE OBJECT
Archive Service Objects provide a standardized architectural component responsible for ingestion of data and metadata objects, similar to the ingestion interface described in the OAIS model [8]. An Archive Service Object is shown in Figure 21.
Data ingestion into a repository involves both data objects and accompanying metadata objects, which can be efficiently managed through a task processing approach. Users can define specific tasks to guide the ingestion process for both types of objects. These tasks are governed by a rule-based policy that evaluates criteria such as time, task type, and ingestion type to determine the execution of tasks. This organized method of managing tasks is commonly known as workflow. Additionally, archive service objects are equipped to handle transaction-based operations.
Figure 20 A Product Service Object and its Corresponding UML View
Figure 21 An Archive Service Object and its Corresponding UML View
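The rule-based ingestion workflow can be illustrated with a small sketch. The task fields mirror the criteria named in the text (time, task type, ingestion type); the class name, the example values, and the choice to require every rule to pass are assumptions rather than behavior defined by the standard.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class IngestTask:
    name: str
    task_type: str        # e.g. "validate" or "store" (hypothetical values)
    ingestion_type: str   # "data" or "metadata"
    hour: int             # scheduled hour of day, 0-23


Rule = Callable[[IngestTask], bool]


def run_workflow(tasks: List[IngestTask], policy: List[Rule]) -> List[str]:
    """Run, in order, the tasks that every rule in the policy allows."""
    executed = []
    for task in tasks:
        if all(rule(task) for rule in policy):
            # A real archive service would perform the ingestion here.
            executed.append(task.name)
    return executed
```

A policy such as `[lambda t: t.task_type == "store", lambda t: t.hour < 6]` would then admit only store tasks scheduled before 06:00.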
QUERY SERVICE OBJECT
The Query Service Object, defined in this document, is responsible for managing the routing of queries to identify and locate the relevant product and repository service objects. It achieves this by querying registry service objects to find the locations of the necessary repository and product service information. Once the appropriate Information Objects are retrieved, they are aggregated and sent back to the Query Service Object, which can then perform further processing, including packaging, translations, and advanced operations. This aggregation typically includes a set of metadata objects, forming what we refer to as an information package.
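The routing and aggregation steps can be sketched as follows. The registry is reduced to a mapping from a resource type to service names, and each product service is a plain callable; these simplifications and all names are assumptions for illustration.

```python
from typing import Callable, Dict, List

InformationObject = Dict[str, object]   # {"metadata": ..., "data": ...}
ProductService = Callable[[str], List[InformationObject]]


class QueryServiceObject:
    def __init__(self, registry: Dict[str, List[str]],
                 services: Dict[str, ProductService]) -> None:
        self._registry = registry   # stand-in for a registry service object
        self._services = services   # service name -> product service

    def query(self, resource_type: str, expression: str) -> Dict[str, list]:
        objects: List[InformationObject] = []
        # Ask the registry which services hold this kind of resource,
        # then route the query expression to each of them.
        for name in self._registry.get(resource_type, []):
            objects.extend(self._services[name](expression))
        # Aggregate the results together with their metadata objects.
        return {"metadata": [o["metadata"] for o in objects],
                "objects": objects}
```

The returned dictionary plays the role of a very simple information package; a fuller implementation could add packaging and translation steps at this point.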
RELATED WORK
This document builds upon existing research in data grids, CCSDS archiving standards, and architectural styles for network-based software systems. We emphasize the significance of these foundational works, as they collectively contribute to the development of architecture-based information management through software components.
Chervenak et al. [25-27] define a layered services architecture [2] of software components that can be used to federate heterogeneous data resources, which is related to the federation of space data system resources outlined in this standard. This standard differs from data grids in that it introduces a Domain-Specific Software Architecture (DSSA) and a Domain Reference Architecture for the Space Data Systems Domain. Whereas Chervenak et al. focus on super-computing systems characterized by extensive processing, memory, and network resources, space data systems are primarily embedded systems with significantly different constraints regarding memory resources, bandwidth availability, and latency.
The OAI protocol and OAIS reference model aim to outline functional components for digital libraries and archival systems; however, their broad focus is not suitable for Space Data Systems. Additionally, the OAIS model lacks detailed specifications for system decomposition and implementation. Our objective is to deliver an architecture and a comprehensive set of components that facilitate the creation of OAIS-compliant systems.
Fielding defines various software architectural styles for network-based systems that facilitate the interaction mechanisms among software components. These styles offer standardized component types and interaction mechanisms, along with guidelines for their composition and deployment. The long-term objective of this standard is to establish a recognized architectural style, encompassing a defined set of components, connectors, and valid configurations, specifically for information architecture in space data systems and beyond.
The client-server and peer-to-peer interaction styles are crucial within this standard due to their practical applications. Functional software objects defined in this standard often utilize these communication mechanisms. For example, the query service object interacts with underlying data stores in a client-server manner, where the query object serves as the client and the data stores function as servers. Additionally, the query service object can engage with other query service objects in a peer-to-peer interaction while searching for resources that meet specific queries.
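The two interaction styles can be contrasted in a short Python sketch. The names `DataStore`, `QueryNode`, `lookup`, and `search`, and the substring-matching rule, are hypothetical and chosen only for illustration; the standard does not prescribe this interface.

```python
class DataStore:
    """Server role: answers lookups from a query service (hypothetical)."""

    def __init__(self, records):
        self.records = records

    def lookup(self, keyword):
        return [r for r in self.records if keyword in r]


class QueryNode:
    """Query service object: client to its own store, peer to other nodes."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.peers = []

    def search(self, keyword, visited=None):
        visited = set() if visited is None else visited
        if self.name in visited:
            return []
        visited.add(self.name)
        # Client-server: this node is the client, its data store is the server.
        hits = self.store.lookup(keyword)
        # Peer-to-peer: forward the same query to peer query service objects.
        for peer in self.peers:
            hits.extend(peer.search(keyword, visited))
        return hits


node_a = QueryNode("node-a", DataStore(["mars-orbiter-image", "venus-radar-map"]))
node_b = QueryNode("node-b", DataStore(["mars-lander-telemetry"]))
node_a.peers.append(node_b)
node_b.peers.append(node_a)  # mutual peers; the visited set prevents a loop
print(node_a.search("mars"))  # ['mars-orbiter-image', 'mars-lander-telemetry']
```

The `visited` set is one simple way to stop a query from circulating forever among mutually connected peers; a deployed system might instead use hop limits or query identifiers.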
Information Architecture in Space Data Systems can be categorized into distinct views, similar to Software Architecture. These views facilitate the sharing of information with various stakeholders, each possessing unique perspectives on the system. The two primary views of Information Architecture relevant to Space Data Systems are the Information View, which focuses on data and its structure, and the Functional View, which emphasizes the processes of locating, searching, and retrieving data.
Section 4.1 explores the Information View of Information Architecture, building upon the concepts introduced in Section 2, while Section 4.2 examines the Functional View of Information Architecture, referencing the topics discussed in Section 3.
INFORMATION VIEW
The Information View in Space Data Systems, illustrated in Figure 23, is organized in a hierarchical stack, with the most abstract concepts related to space organizations like NASA and ESA at the top, and implementation-focused views at the bottom. While the Information View is more abstract than the Communications View, it is still more closely aligned with implementation issues compared to the Functional View.
The Information View in Information Architecture addresses concerns regarding data and metadata, including their structure, semantics, relationships, and security. Additionally, it addresses how data is represented through various forms, such as Data Objects, which are explored in detail in Section 2.
Figure 23 Information View in Perspective
FUNCTIONAL VIEW
The Functional View of Information Architecture focuses on enhancing the capture, discovery, search, and retrieval of information through specific functional components. For detailed insights into these components and their software implementations, refer to Section 3. This work expands upon the Reference Architecture for Space Data Systems (RASDS) and further elaborates on the Information View and Information Management Objects (IMOs) within the Functional View of RASDS.
OAIS
The CCSDS OAIS reference model emphasizes the importance of metadata for validating data product ingestion and understanding data formats, which are essential components of Information Architecture. It introduces the concept of an "open archive," which serves as an archive service object that interacts with three primary external entities: Producers, Consumers, and Management.
1. Producers produce submission information packages (SIPs) to send to the OAIS-compliant archive.
2. Consumers consume dissemination information packages (DIPs) that they retrieve from the OAIS-compliant archive.
3. Management comprises outside entities that are responsible for managing data within the archive but are not involved in the day-to-day operations of the component.
OAIS archives manage archival information packages (AIPs), which are generated from submission information packages (SIPs) and from which dissemination information packages (DIPs) are derived. In the context of information architecture, AIPs, SIPs, and DIPs are categorized as information objects, each adhering to its designated package format.
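The package life cycle can be illustrated with a toy archive in Python. The sketch assumes the usual OAIS flow, in which a SIP is ingested to produce an AIP and a DIP is derived from a stored AIP on request. The `InformationPackage` fields, the `Archive` class, the required `id` metadata key, and the `archived` flag are assumptions made for this example and are not taken from the OAIS specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationPackage:
    kind: str        # "SIP", "AIP", or "DIP"
    data: bytes      # the data object
    metadata: dict   # the metadata object


class Archive:
    """Toy archive: ingests SIPs as AIPs and disseminates DIPs."""

    def __init__(self):
        self._aips = {}

    def ingest(self, sip):
        # Metadata is what lets the archive validate a submission at ingest time.
        if sip.kind != "SIP":
            raise ValueError("only SIPs can be ingested")
        if "id" not in sip.metadata:
            raise ValueError("SIP metadata must carry an 'id'")
        aip = InformationPackage("AIP", sip.data, {**sip.metadata, "archived": True})
        self._aips[sip.metadata["id"]] = aip
        return aip

    def disseminate(self, package_id):
        # A DIP is derived from the stored AIP when a consumer asks for it.
        aip = self._aips[package_id]
        return InformationPackage("DIP", aip.data, dict(aip.metadata))


archive = Archive()
archive.ingest(InformationPackage("SIP", b"\x00\x01", {"id": "obs-42"}))
dip = archive.disseminate("obs-42")
print(dip.kind, dip.metadata)  # DIP {'id': 'obs-42', 'archived': True}
```

All three package kinds share one structure here, which reflects the point that SIPs, AIPs, and DIPs are all information objects that differ only in their package format and role.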
OAIS-compliant archives focus on the preservation, management, and collection of information, aligning closely with the archive software component outlined in Section 3.2.4. These archives adhere to the OAIS reference model, which establishes standard data structures for information objects, ensuring compliance with the specific requirements of the model.
Figure 24 The Open Archival Information System Reference Model
GRIDS
SpaceGRID
ESA’s SpaceGRID study, conducted from 2001 to 2003, aimed to explore the integration of grid technology into Earth observation and space missions, focusing on distributed data management, data distribution, data access, and a unified architectural approach for software development. Covering diverse fields such as Earth Observation, Space Research, Solar System Research, and Mechanical Engineering, the study identified 240 user requirements for grid systems, with 146 deemed "common" across at least three domains. This comprehensive analysis highlighted key design areas and user-desired requirements for grid technologies.
The proposed SpaceGRID infrastructure features a layered architectural model akin to the one outlined in this standard, with client applications residing at the top layer that interact via an organizational API. This API leverages grid services to utilize the grid infrastructure, facilitating access to both hardware-based ("hard") and software-based ("soft") distributed resources.
The ESA report utilizes metadata catalogs to search for data provided by grid infrastructure. These catalogs function as repositories for metadata objects that link to the data objects users seek. Ultimately, the grid infrastructure detailed in the SpaceGRID report efficiently distributes, searches for, and delivers information objects to users.
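The role of a metadata catalog, in which metadata objects link to the data objects users seek, can be sketched as follows. This is an illustrative Python model only: `MetadataCatalog`, `register`, `find`, the attribute names, and the storage locations are placeholders and do not come from the SpaceGRID report.

```python
class MetadataCatalog:
    """Hypothetical catalog: stores metadata objects that point at data objects."""

    def __init__(self):
        self._entries = []

    def register(self, location, **attributes):
        # Each entry is a metadata object carrying a link to its data object.
        self._entries.append({"location": location, **attributes})

    def find(self, **criteria):
        # Return the data object locations whose metadata matches every criterion.
        return [e["location"] for e in self._entries
                if all(e.get(k) == v for k, v in criteria.items())]


catalog = MetadataCatalog()
catalog.register("store-a/sst_2002.dat", instrument="radiometer", year=2002)
catalog.register("store-b/sst_2003.dat", instrument="radiometer", year=2003)
catalog.register("store-a/wind_2002.dat", instrument="scatterometer", year=2002)
print(catalog.find(instrument="radiometer", year=2002))  # ['store-a/sst_2002.dat']
```

Users search the metadata only; the data objects themselves stay wherever the grid stores them until a matching location is resolved.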
EOSDIS
NASA's Earth Observing System Data and Information System (EOSDIS) aimed to enhance the distribution, processing, archival, and storage of earth science data from observing missions. Established in 1996, EOSDIS highlighted the challenges of contemporary information systems, as disparate "one-off" data systems hindered access and usability of valuable data sets, often requiring time-consuming transfers via removable media. The primary objective of EOSDIS was to connect existing earth science data systems, facilitating easier access to critical data for scientists.
EOSDIS has played a pivotal role in shaping the grid paradigm, which aligns with our standard. Its significance lies in its provision of earth science-specific information objects, meta models, data dictionaries, metadata objects, and domain models. This makes EOSDIS a key example within the realm of earth science, demonstrating the integration of various data structures and frameworks essential for effective scientific research and data management.
European Data Grid
The European Data Grid (EDG) is a project funded by the EU and ESA, designed to facilitate access to geographically distributed data and computational resources. Utilizing Globus Toolkit technology, EDG establishes a foundational grid infrastructure and develops specialized services such as replica management, metadata management, and storage management. With a strong emphasis on data and metadata, EDG effectively manages, distributes, processes, and archives information objects. Metadata is organized in catalogs, while data objects are stored transparently within an underlying storage system. Users can leverage software components to query and retrieve information objects and packages provided by the EDG system.
National Virtual Observatory
The National Virtual Observatory (NVO), funded by the NSF, aims to enhance scientific research by improving access to data and computational resources. Utilizing the Globus Toolkit grid middleware, NVO facilitates the distribution, processing, retrieval, and search of astrophysical data. Key components of NVO include a defined information architecture with standard astrophysical data products, standardized metadata for describing these data objects, and grid services that enable data exchange and collaboration in scientific endeavors.
This section presents a mapping of existing CCSDS Standards to the standard data and software components and ideas discussed in this document.
Table 4 CCSDS Information Standards Mapped to Information Architecture Concept
Information Architecture Concept | CCSDS Standard
 | DEDSL (Data Entity Dictionary Specification Language), http://www.ccsds.org/documents/647x1b1.pdf
Archive Ingestion Model (Section 3) | Reference Model for an Open Archival Information System (OAIS), http://www.ccsds.org/documents/650x0b1.pdf
Specification (Section 2) | The Data Description Language EAST Specification (CCSD0010), Blue Book, Issue 2, November 2000, http://www.ccsds.org/documents/644x0b2.pdf
Format (Section 2) | Information Interchange Specification, http://www.ccsds.org/documents/642x1g1.pdf
Data Value Representation (Section 2) | Parameter Value Language Specification (CCSD0006 and CCSD0008), Blue Book, Issue 2, June 2000, http://www.ccsds.org/documents/641x0b2.pdf
Packaging Specification | XML Formatted Data Unit (XFDU) Structure and Construction Rules, White Book, Issue 2, September 2004, http://www.ccsds.org/docu/dscgi/ds.py/Get/File-1912/IPRWBv2a.doc
Data Object Format Specification | Standard Formatted Data Units — Control Authority Data Structures, Blue Book, Issue 1, November 1994, http://www.ccsds.org/documents/632x0b1.pdf
Figure 26 Classification of Primitive Information Object with Information Object Taxonomy
Figure 27 Classification of Simple Information Object with Information Object Taxonomy
Figure 28 Classification of Information Package with Information Object Taxonomy
Metadata Structured data which describes other data (not including metadata itself)
Meta Models Set of data elements describing the metadata object, used to capture metadata in an information object
Schema A set of semantic units, along with their attributes.
Data Element An OO class-like representation of metadata. The Data Element class encompasses structural information about its own framework, while the Data Element instance functions as a descriptor for a Data Value. For instance, the Data Element instance "Author" is utilized within the Dublin Core metadata standard.
Information Object A compositional object containing an internal Data Object together with structural and semantic information that details the Data Object's internal composition and characteristics. This representation is essential for understanding the Data Object's framework and the meaning behind its elements.
Information Architecture The notion of architecting information systems, with a focus on both data architecture and software architectural concerns.
Data Architecture The specification of the overall structure, logical components, and logical interrelationships of data in information systems or data-intensive systems.
Software Architecture The specification of overall structure, behavior, logical components, and logical interrelationships of a software system.
Data Product The output of an active function that produces data. It can range from simple forms, consisting solely of data values, to more complex structures that incorporate both data and accompanying metadata objects.
Data-Intensive System Any system which is IO-bound.
Metadata Catalog Service A Service providing the storage and retrieval of descriptive metadata in Grid-based systems
Grid Computing A new paradigm focusing on supporting Virtual retrieval of heterogeneous information, stored in heterogeneous data sources, possibly across many organizations.
Grid-based systems Any software system which is modeled upon Grid Computing, either the Data aspect of Grid Computing (i.e., Data Grids), or the Computational Aspects of Grid Computing (i.e.
MCAT Catalog The San Diego Supercomputing Center’s