One of the GA4GH conventions is to use CURIEs as (external) identifiers.
This is a maturing recommendation; please also see a previous discussion on GitHub and the links from there.
Cross-GA4GH alignment
CURIEs (“Compact URIs”) are namespace-scoped identifiers which can be expanded to Internationalized Resource Identifiers (IRIs). A CURIE consists of two components, a prefix and a reference, separated by a colon symbol (:). CURIEs are case sensitive, although this is not followed consistently for prefixes.
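As a minimal sketch of the prefix/reference structure described above, the following Python snippet splits a CURIE at its first colon and expands it against a prefix map. The function names and the `PREFIX_MAP` contents are illustrative assumptions, not part of any GA4GH specification; the Cellosaurus base in particular is a placeholder.

```python
from typing import Dict, Tuple

# Hypothetical prefix map for illustration only. The Cellosaurus entry
# uses a placeholder base IRI, not a real resolver address.
PREFIX_MAP: Dict[str, str] = {
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "cellosaurus": "https://example.org/cellosaurus/",
}


def split_curie(curie: str) -> Tuple[str, str]:
    """Split a CURIE into (prefix, reference) at the first colon."""
    prefix, sep, reference = curie.partition(":")
    if not sep or not prefix or not reference:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix, reference


def expand_curie(curie: str, prefix_map: Dict[str, str] = PREFIX_MAP) -> str:
    """Expand a CURIE to an IRI by prepending the base registered for its prefix."""
    prefix, reference = split_curie(curie)
    # The lookup is case sensitive, so "hp" and "HP" are different prefixes.
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefix_map[prefix] + reference
```

Because the lookup is case sensitive, a lowercase `hp:0003621` would not resolve with this map; a more lenient implementation could normalize prefixes first, at the cost of deviating from strict CURIE handling.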
The GA4GH recommendations are:
- In the ga4gh namespace, one should avoid the use of the underscore character (_) in the private part of an identifier.
- The colon separator (:) can be replaced by an underscore (_) in computing environments where the colon may be problematic.
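The separator replacement mentioned in the recommendations could be sketched as follows. Both helper names are hypothetical, and the reverse mapping rests on an assumption that is not stated in the recommendations themselves: the prefix contains no underscore, so the first underscore is taken to be the separator.

```python
def to_underscore_form(curie: str) -> str:
    """Replace the prefix separator ':' with '_' (first colon only)."""
    prefix, sep, reference = curie.partition(":")
    if not sep or not prefix or not reference:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{prefix}_{reference}"


def from_underscore_form(identifier: str) -> str:
    """Restore the ':' separator.

    Assumption: the prefix itself contains no underscore, so the
    first underscore marks the prefix/reference boundary.
    """
    prefix, sep, reference = identifier.partition("_")
    if not sep or not prefix or not reference:
        raise ValueError(f"no underscore separator in: {identifier!r}")
    return f"{prefix}:{reference}"
```

With identifiers whose private part is free of underscores, the single underscore in the converted form can only be the separator, which keeps the conversion reversible without relying on the assumption about the prefix.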
ga4gh Prefix
In a “GA4GH Namespace Discussion” telecon on 2019-08-22, initiated by GKS and attended by leads of different work streams and projects, it was agreed that newly generated identifiers created and maintained in the “GA4GH ecosystem” should use a general ga4gh prefix rather than scoped prefixes. Details and implementation of this general concept are currently being evaluated.
In GA4GH schemas, CURIEs are the recommended syntax for referencing ontology classes or external references. Usually, a CURIE as id is combined with a label for the text representation, such as in the OntologyClass object prototype:
"onset": {
"label" : "Juvenile onset",
"id" : "HP:0003621"
},
"external_references": [
{
"id" : "cellosaurus:CVCL_0312",
"label" : "HOS"
}
]
The underscore in the Cellosaurus id cellosaurus:CVCL_0312 should usually not be problematic if it is properly prefixed; however, de novo identifier designs may want to avoid such a syntax.
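A simple structural check for objects of this shape might look like the sketch below. The regular expression is an assumed, deliberately loose approximation of CURIE syntax, and treating both id and label as required strings is likewise an assumption; neither is taken from a GA4GH schema definition.

```python
import re

# Assumed approximation of CURIE syntax: a prefix starting with a letter
# or underscore, a colon, then a non-empty reference without whitespace.
CURIE_PATTERN = re.compile(r"[A-Za-z_][\w.-]*:\S+")


def is_ontology_class(obj) -> bool:
    """Return True if obj has a CURIE-like string 'id' and a string 'label'."""
    if not isinstance(obj, dict):
        return False
    identifier = obj.get("id")
    label = obj.get("label")
    return (
        isinstance(identifier, str)
        and isinstance(label, str)
        and CURIE_PATTERN.fullmatch(identifier) is not None
    )
```

Both example objects above pass this check, including the Cellosaurus id with its underscore in the reference part, whereas a bare accession such as 0003621 without a prefix would be rejected.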