Identifiers and CURIEs

Status: Recommendation

One of the GA4GH conventions is to use CURIEs as (external) identifiers.

This is a maturing recommendation; see also a previous discussion on GitHub and the links from there.

Contributors

CURIEs (“Compact URIs”) are namespace-scoped identifiers which can be expanded to Internationalized Resource Identifiers (IRIs). A CURIE is composed of two components, a prefix and a reference, separated by a colon (:). CURIEs are case sensitive, although for prefixes this rule is not followed consistently in practice.
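The prefix/reference structure and the expansion to an IRI can be illustrated with a minimal Python sketch. The names parse_curie, expand_curie and PREFIX_MAP are hypothetical, and the prefix map is only an illustrative assumption, not a GA4GH-maintained registry; a real deployment would take its expansions from an agreed prefix registry.

```python
from typing import Dict, NamedTuple


class Curie(NamedTuple):
    prefix: str
    reference: str


# Illustrative prefix map (an assumption for this sketch, not an official registry).
PREFIX_MAP: Dict[str, str] = {
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "cellosaurus": "https://identifiers.org/cellosaurus:",
}


def parse_curie(curie: str) -> Curie:
    """Split a CURIE at the first colon into prefix and reference."""
    prefix, sep, reference = curie.partition(":")
    if not sep or not prefix or not reference:
        raise ValueError(f"not a CURIE: {curie!r}")
    return Curie(prefix, reference)


def expand_curie(curie: str, prefix_map: Dict[str, str] = PREFIX_MAP) -> str:
    """Expand a CURIE to an IRI; the prefix lookup is case sensitive."""
    parsed = parse_curie(curie)
    if parsed.prefix not in prefix_map:
        raise ValueError(f"unknown prefix: {parsed.prefix!r}")
    return prefix_map[parsed.prefix] + parsed.reference
```

Because the lookup is an exact dictionary match, hp:0003621 is rejected in this sketch while HP:0003621 expands; this mirrors the case sensitivity described above, and a more lenient implementation would have to normalize prefixes explicitly.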

The GA4GH recommendations are:

ga4gh Prefix

In a “GA4GH Namespace Discussion” telecon on 2019-08-22, initiated by GKS and with the participation of different work stream and project leads, it was agreed that newly generated identifiers created and maintained in the “GA4GH ecosystem” should use a general ga4gh prefix rather than individually scoped prefixes. Details and implementation of this general concept are currently being evaluated.

GA4GH CURIE Use

In GA4GH schemas, CURIEs constitute the recommended syntax for referencing ontology classes or external references. Here, a CURIE as id is usually combined with a label for its text representation, as in the OntologyClass object prototype:

"onset": {
  "id" : "HP:0003621",
  "label" : "Juvenile onset"
},
"external_references": [
  {
    "id" : "cellosaurus:CVCL_0312",
    "label" : "HOS"
  }
]

The underscore in the Cellosaurus id cellosaurus:CVCL_0312 should usually not be problematic if it is properly prefixed; however, de novo identifier designs may avoid such a syntax.
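As a hedged illustration of the id/label pairing, the following Python sketch builds the same structure and rejects ids that are not prefixed. The class name LabeledId and the pattern CURIE_PATTERN are assumptions made for this example; the pattern is a simple approximation and not the validation rule of any GA4GH schema.

```python
import json
import re
from dataclasses import asdict, dataclass

# Assumed CURIE shape for this sketch: a prefix of letters, digits, "_", "." or "-",
# then a colon, then a non-empty reference without whitespace.
CURIE_PATTERN = re.compile(r"[A-Za-z0-9_.\-]+:\S+")


@dataclass(frozen=True)
class LabeledId:
    """Hypothetical container pairing a CURIE id with its text label."""

    id: str
    label: str

    def __post_init__(self) -> None:
        # Reject ids that do not consist of a prefix and a reference.
        if not CURIE_PATTERN.fullmatch(self.id):
            raise ValueError(f"id is not a CURIE: {self.id!r}")


record = {
    "onset": asdict(LabeledId(id="HP:0003621", label="Juvenile onset")),
    "external_references": [
        asdict(LabeledId(id="cellosaurus:CVCL_0312", label="HOS")),
    ],
}

print(json.dumps(record, indent=2))
```

The underscore in CVCL_0312 passes this check because it sits in the reference part; an unprefixed CVCL_0312 would be rejected.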

Further Information

@mbaudis  @cmungall  @reece  @jmcmurry  @mellybelly  cross GA4GH alignment  2019-08-28