Contributed Data Set Policy

The Keck Observatory Archive (KOA) will host contributed datasets that are derived from Keck data and that add scientific value to the archive. Examples include carefully reduced and calibrated spectra, coadded data from multiple observations, line identifications, and 2-D spectroscopy of extended objects. The contributed data must be fully documented, so that they may be used in analysis, and traceable to raw data in the archive, so that the contributed dataset may be reproduced.

This policy provides guidelines on data formats and delivery mechanisms for data providers intending to deliver scientific datasets and associated metadata to KOA. It is not an exhaustive set of requirements imposed by KOA on providers; individual contributions will be considered on a case-by-case basis in cooperation with the data provider. The archive encourages data providers to contact KOA early in the process of defining products and estimating product volumes, rather than waiting until the final product is ready for delivery.

Data Integrity

Data providers are responsible for the technical content of all datasets, ancillary information and documentation. KOA will not modify the contents of any dataset, ancillary information, or documentation without the full knowledge and consent of the data provider.

Proprietary Data Policies

Contributed datasets will be made public when all the proprietary periods of the raw data used to create them have expired, or when the PIs of any raw data that are still protected give KOA explicit authorization, via the KOA Helpdesk, to release these raw data earlier than scheduled. Contributed datasets will be made public immediately if all the raw data are public on delivery to KOA. Providers should supply KOA with an association of the contributed products to the original files; KOA will provide assistance if needed. KOA reserves the right to withhold publication of any contributed dataset that appears to violate proprietary rights until release authorization is received or all data in the release are public.

Contributor Support for Archive Users

KOA does not have the resources to become expert on the content of contributed datasets. Data providers should designate a contact person who will be available to answer technical questions from users regarding the observations, the processing techniques, or scientific quality issues. If the provided contact information changes or becomes obsolete, the data provider is responsible for updating the information.

Data Guidelines

Required Data

In order to facilitate the timely ingestion and integration of a contributed data set, we require the following:

  • A "README" file (plain ASCII text) that includes the file descriptions, defines the columns for any tables, and provides a published reference for the data. KOA expects this file to be downloaded with the contributed data.
  • Traceability of the contributed data products back to the original data. At a minimum, this should include the KOAID(s) used to create each of the reduced data products. It is also helpful to include the name of the PI, the program ID, and the date of the observations.
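
As an illustration of the traceability requirement, the sketch below writes a small plain-text table with one row per contributed product and KOAID pair. It is only an example layout, not a format prescribed by KOA: the column names, file names, KOAID values, PI name, and program ID are all made-up placeholders.

```python
import csv
import io

# Hypothetical mapping of contributed products to the raw files behind them.
# Every file name, KOAID, PI name, and program ID below is a placeholder.
PRODUCTS = [
    {
        "product": "target1_coadd.fits",
        "koaids": ["XX.20000101.00001.fits", "XX.20000101.00002.fits"],
        "pi": "Example PI",
        "program_id": "X000",
        "obs_date": "2000-01-01",
    },
]

COLUMNS = ["product", "koaid", "pi", "program_id", "obs_date"]


def traceability_rows(products):
    """Yield one row per (contributed product, KOAID) pair."""
    for entry in products:
        for koaid in entry["koaids"]:
            yield [
                entry["product"],
                koaid,
                entry["pi"],
                entry["program_id"],
                entry["obs_date"],
            ]


def write_traceability(products, stream):
    """Write a header plus all rows as CSV; return the number of data rows."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(COLUMNS)
    count = 0
    for row in traceability_rows(products):
        writer.writerow(row)
        count += 1
    return count
```

Calling write_traceability(PRODUCTS, open("traceability.csv", "w", newline="")) would produce a file that can be delivered alongside the README; the column definitions should then be documented in the README as well.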

Acceptable Data

The following data types are accepted for inclusion in KOA:

  • Papers dedicated to describing the analysis and the data products themselves.
  • Images and/or spectra from WMKO instruments based on KOA Level 0 data products. Normally such images will be in the astronomical FITS format.
  • Derived data such as catalogs of objects or spectral lines, radial velocities, equivalent widths, etc.
  • Plots relevant to the contributed dataset and quicklook/preview images in common display formats (PostScript, GIF, JPEG, PNG, and PDF).
  • Observations from other observatories and missions that form part of the data contents are allowed if they are closely related to the WMKO observing program. The data provider should provide a list of facilities used, and is responsible for ensuring that these data are public. KOA does not expect these data to be mapped to original raw data.
  • Output produced from theoretical models closely related to the KOA data and not generally available through another permanent archive.
  • Software or scripts that demonstrate the usage of the contributed data.

For more information, see Data Format Requirements.

Unacceptable Data

The following data types are not appropriate for contributing to KOA:

  • Papers and data that are several steps removed from the Keck-derived contributed data.
  • The software used to create the contributed products. This is best served from a software repository such as GitHub.
  • Ancillary files created by the data software that are not needed by users downloading the primary data products.

About Websites

KOA will point to data providers' websites (e.g., analysis software submitted to GitHub), but we require a basic presence on the KOA website. Because web pages become stale, KOA strongly prefers that the website content be served through KOA. We will work with contributors to ensure that all necessary content is housed at KOA. For highly dynamic datasets, a mutually agreeable approach to content updates will be negotiated.

Contributed Dataset Procedures

To initiate a request to provide a contributed dataset to KOA, contact the KOA Helpdesk and provide the following information:

  • Brief description of the dataset to be contributed and its interface
  • Name of an Initial Contact for the program, along with contact information including email address. The Initial Contact is the person who will work with KOA to get the dataset integrated into the archive.
  • Relationship of each part of the dataset to the original KOA files, usually using the KOAID.
  • The primary target of the observations, e.g. "Hubble Deep Field" or "NGC1068"
  • References to up to three papers closely related to the contributed data, in standard 19-character bibcode format, e.g., 2011ApJ...733...28K.
  • Name and contact information (email address and telephone number) of a Support Contact who will be responsible for ongoing Helpdesk support for users. This can be, but need not be, the same as the Initial Contact. The Support Contact may be listed on the website.
  • Sample data sets as available.

After approval and negotiation of details and timescale, the data will be delivered either electronically (for example, using FTP) or on physical media such as hard drives or DVDs. Electronic transfer is the preferred method if data volumes permit.

Data may be compressed and packaged with standard compression algorithms or utilities (zip, gzip, tar, etc.). Generally, KOA will not accept data deliveries that use customized compression or packaging algorithms.
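
As a minimal sketch of standard packaging, the example below bundles a delivery directory into a gzip-compressed tar archive using Python's standard tarfile module. The function name and the directory and archive paths are illustrative assumptions; any standard tool that produces zip, gzip, or tar output serves the same purpose.

```python
import os
import tarfile


def package_directory(src_dir, archive_path):
    """Create a gzip-compressed tar archive of src_dir.

    The archive should be written outside src_dir so it does not
    include itself. Returns the list of member names in the archive.
    """
    top_level = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store paths relative to the top-level directory name only.
        tar.add(src_dir, arcname=top_level)
    # Re-open the archive to confirm it is readable and list its contents.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

For example, package_directory("my_delivery", "my_delivery.tar.gz") would create the archive and return the names of the files and directories it contains, which can be compared against the packing list.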

Organization of Delivery Packages

Data providers may organize the delivery package in any way that makes sense to them, but the package should come with adequate documentation to explain its contents and organization. KOA recommends the following minimum information be provided as a packing list:

  • The total number of files and the data volume of each file
  • A description of the rationale for the organization of the data package (e.g. organized by night, with subdirectories for science data, raw data, calibration files, processing reports and observing logs)
  • A listing of all subdirectories
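
A packing list of this kind can be generated automatically. The sketch below, which assumes the delivery package is an ordinary directory tree on disk, walks the tree and reports the total number of files, the size of each file, and all subdirectories. The function names and the output layout are illustrative choices, not a KOA-defined format, and the rationale for the organization still has to be written by hand.

```python
import os


def build_packing_list(root):
    """Walk root and return (files, subdirs).

    files   -- sorted list of (relative_path, size_in_bytes)
    subdirs -- sorted list of relative subdirectory paths
    """
    files, subdirs = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            subdirs.append(os.path.relpath(os.path.join(dirpath, name), root))
        for name in filenames:
            path = os.path.join(dirpath, name)
            files.append((os.path.relpath(path, root), os.path.getsize(path)))
    return sorted(files), sorted(subdirs)


def format_packing_list(files, subdirs):
    """Render the packing list as plain text."""
    total = sum(size for _, size in files)
    lines = [
        f"Total files: {len(files)}",
        f"Total volume (bytes): {total}",
        "",
        "Subdirectories:",
    ]
    lines += [f"  {d}" for d in subdirs]
    lines += ["", "Files (bytes, path):"]
    lines += [f"  {size:>12d}  {path}" for path, size in files]
    return "\n".join(lines)
```

Running print(format_packing_list(*build_packing_list("my_delivery"))) on a hypothetical "my_delivery" directory prints a listing that can be saved as plain text and shipped with the package.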

What KOA does with the Contributed Data

KOA will verify that the delivery package contains all expected data items and will develop an access mechanism (i.e., a website) that meets the provider's requirements. KOA will spot-check the integrity and format of the data on a best-efforts basis. In special cases, KOA may ingest the data into the archive itself.