Utils Docs

Public API of methods and functions to handle and verify the JSON schemas.

class inspire_schemas.utils.LocalRefResolver(base_uri, referrer, store=(), cache_remote=True, handlers=(), urljoin_cache=None, remote_cache=None)[source]

Bases: jsonschema.validators.RefResolver

Simple resolver to handle non-uri relative paths.

resolve_remote(uri)[source]

Resolve a uri or relative path to a schema.

inspire_schemas.utils.build_pubnote(title, volume, page_start, page_end, artid)[source]

Build pubnote string from parts (reverse of split_pubnote).

inspire_schemas.utils.classify_field(value)[source]

Normalize value to an Inspire category.

Parameters:value (str) – an Inspire category to properly case, or an arXiv category to translate to the corresponding Inspire category.
Returns:
None if value is an empty string or not a string,
otherwise the corresponding Inspire category.
Return type:str
inspire_schemas.utils.convert_new_publication_info_to_old(publication_infos)[source]

Convert back a publication_info value from the new format to the old.

Performs the inverse transformation of convert_old_publication_info_to_new(). Use it whenever records are sent back from Labs to Legacy.

Parameters:publication_infos – a publication_info in the new format.
Returns:a publication_info in the old format.
Return type:list(dict)
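
As a rough illustration of the direction of this conversion: in the old format the series letter of a journal sits at the front of the volume (for example 'D43' under 'Phys.Rev.'), while in the new format it is part of the title ('Phys.Rev.D' with volume '43'). The sketch below is not the library's implementation. The helper move_series_letter_to_volume and its regular expression are hypothetical, and it ignores the edge cases that the real function handles.

```python
import re

# Hypothetical pattern: a title ending in a dot followed by one uppercase series letter.
_TITLE_WITH_SERIES = re.compile(r'^(?P<title>.+\.)(?P<letter>[A-Z])$')


def move_series_letter_to_volume(publication_info):
    """Return a copy of one entry with the series letter moved from title to volume."""
    result = dict(publication_info)
    match = _TITLE_WITH_SERIES.match(result.get('journal_title', ''))
    if match and 'journal_volume' in result:
        result['journal_title'] = match.group('title')
        result['journal_volume'] = match.group('letter') + result['journal_volume']
    return result


new_entry = {'journal_title': 'Phys.Rev.D', 'journal_volume': '43'}
old_entry = move_series_letter_to_volume(new_entry)
print(old_entry)  # {'journal_title': 'Phys.Rev.', 'journal_volume': 'D43'}
```

The helper copies its input, so the original entry is left untouched. Titles that do not end in a dot plus a single letter are returned unchanged.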
inspire_schemas.utils.convert_old_publication_info_to_new(publication_infos)[source]

Convert a publication_info value from the old format to the new.

On Legacy, different series of the same journal were modeled by prepending the letter part of the journal name to the journal volume. For example, a paper published in Physical Review D contained:

{
    'publication_info': [
        {
            'journal_title': 'Phys.Rev.',
            'journal_volume': 'D43',
        },
    ],
}

On Labs, each series is instead represented by a separate journal record. As a consequence, the above example becomes:

{
    'publication_info': [
        {
            'journal_title': 'Phys.Rev.D',
            'journal_volume': '43',
        },
    ],
}

This function handles the translation from the old format to the new. See the tests for the various edge cases that it also handles.

Parameters:publication_infos – a publication_info in the old format.
Returns:a publication_info in the new format.
Return type:list(dict)
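
The core of the translation described above can be sketched in a few lines. This is a simplified illustration and not the library's implementation. The helper move_series_letter_to_title and its regular expression are hypothetical, and the sketch assumes a volume made of exactly one uppercase letter followed by digits, so it covers none of the edge cases mentioned above.

```python
import re

# Hypothetical pattern: one uppercase series letter followed by the numeric volume.
_SERIES_VOLUME = re.compile(r'^(?P<letter>[A-Z])(?P<volume>\d+)$')


def move_series_letter_to_title(publication_info):
    """Return a copy of one entry with the series letter moved from volume to title."""
    result = dict(publication_info)
    match = _SERIES_VOLUME.match(result.get('journal_volume', ''))
    if match and 'journal_title' in result:
        result['journal_title'] += match.group('letter')
        result['journal_volume'] = match.group('volume')
    return result


old = {'journal_title': 'Phys.Rev.', 'journal_volume': 'D43'}
new = move_series_letter_to_title(old)
print(new)  # {'journal_title': 'Phys.Rev.D', 'journal_volume': '43'}
```

Entries whose volume carries no leading letter are returned unchanged, and the input dictionary is never modified in place.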
inspire_schemas.utils.get_license_from_url(url)[source]

Get the license abbreviation from a URL.

Parameters:url (str) – canonical URL of the license.
Returns:the corresponding license abbreviation.
Return type:str
Raises:ValueError – when the url is not recognized
inspire_schemas.utils.get_schema_path(schema, resolved=False)[source]

Retrieve the installed path for the given schema.

Parameters:
  • schema (str) – relative or absolute URL of the schema to locate, for example ‘records/authors.json’ or ‘jobs.json’, or just the name of the schema, like ‘jobs’.
  • resolved (bool) – if True, the returned path points to a fully resolved schema, that is to the schema with all $ref replaced by their targets.
Returns:path to the given schema name.
Return type:str
Raises:SchemaNotFound – if no schema could be found.
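
The schema argument accepts several spellings of the same schema. One possible way to reduce them to a single relative file name is sketched below. The function normalize_schema_name is hypothetical and is only an assumption about how such a normalization could work; it does not look anything up on disk and is not taken from the library.

```python
import posixpath
from urllib.parse import urlsplit


def normalize_schema_name(schema):
    """Reduce a schema name or URL to a relative path ending in '.json'."""
    # Keep only the path component, so absolute URLs lose scheme and host.
    path = posixpath.normpath('/' + urlsplit(schema).path).lstrip('/')
    if not path.endswith('.json'):
        path += '.json'
    return path


print(normalize_schema_name('jobs'))                  # jobs.json
print(normalize_schema_name('jobs.json'))             # jobs.json
print(normalize_schema_name('records/authors.json'))  # records/authors.json
```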

inspire_schemas.utils.load_schema(schema_name, resolved=False)[source]

Load the given schema from wherever it’s installed.

Parameters:
  • schema_name (str) – Name of the schema to load, for example ‘authors’.
  • resolved (bool) – if True, return the resolved schema, that is, the schema with all $ref replaced by their targets.
Returns:the schema with the given name.
Return type:dict
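
To make concrete what a "resolved" schema means, the sketch below inlines every $ref in a small schema using an in-memory store. It is an illustration under stated assumptions, not the library's code. The function inline_refs, the store dictionary, and the name 'elements/title.json' are all hypothetical, and the sketch does not guard against circular references.

```python
import copy


def inline_refs(node, store):
    """Recursively replace {'$ref': name} nodes with the schema stored under name."""
    if isinstance(node, dict):
        if '$ref' in node:
            # Resolve the target as well, in case it holds further references.
            return inline_refs(copy.deepcopy(store[node['$ref']]), store)
        return {key: inline_refs(value, store) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, store) for item in node]
    return node


# Hypothetical store standing in for the installed schema files.
store = {'elements/title.json': {'type': 'string', 'minLength': 1}}
unresolved = {
    'type': 'object',
    'properties': {'title': {'$ref': 'elements/title.json'}},
}
resolved = inline_refs(unresolved, store)
print(resolved)
# {'type': 'object', 'properties': {'title': {'type': 'string', 'minLength': 1}}}
```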

inspire_schemas.utils.normalize_arxiv_category(category)[source]

Normalize arXiv category to be schema compliant.

This properly capitalizes the category and replaces the dash with a dot if needed. If the category is obsolete, it is also converted to its current equivalent.

Example

>>> from inspire_schemas.utils import normalize_arxiv_category
>>> normalize_arxiv_category('funct-an')
u'math.FA'
inspire_schemas.utils.normalize_collaboration(collaboration)[source]

Normalize collaboration string.

Parameters:collaboration – a string containing collaboration(s) or None
Returns:List of extracted and normalized collaborations
Return type:list

Examples

>>> from inspire_schemas.utils import normalize_collaboration
>>> normalize_collaboration('for the CMS and ATLAS Collaborations')
['CMS', 'ATLAS']
inspire_schemas.utils.split_page_artid(page_artid)[source]

Split page_artid into page_start/end and artid.

inspire_schemas.utils.split_pubnote(pubnote_str)[source]

Split pubnote into journal information.

inspire_schemas.utils.valid_arxiv_categories()[source]

List of all arXiv categories that ever existed.

Example

>>> from inspire_schemas.utils import valid_arxiv_categories
>>> 'funct-an' in valid_arxiv_categories()
True
inspire_schemas.utils.validate(data, schema=None)[source]

Validate the given dictionary against the given schema.

Parameters:
  • data (dict) – record to validate.
  • schema (Union[dict, str]) – schema to validate against. If it is a string, it is interpreted as the name of the schema to load (e.g. authors or jobs). If it is None, the schema is taken from data['$schema']. If it is a dictionary, it is used directly.
Raises:
  • SchemaNotFound – if the given schema was not found.
  • SchemaKeyNotFound – if schema is None and no $schema key was found in data.
  • jsonschema.SchemaError – if the schema is invalid.
  • jsonschema.ValidationError – if the data is invalid.
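
The three accepted forms of the schema argument can be summarized with a small dispatch sketch. It only mirrors the rules documented above and is not the library's implementation. The function pick_schema, the load_by_name callback, and the known dictionary are hypothetical, and the local SchemaKeyNotFound class is a stand-in for the library's exception of the same name. The sketch also assumes that the value of data['$schema'] can be handed to the same loader as a schema name.

```python
class SchemaKeyNotFound(Exception):
    """Stand-in for the library exception of the same name."""


def pick_schema(data, schema=None, load_by_name=None):
    """Return the schema dict to validate against, following the documented rules."""
    if schema is None:
        if '$schema' not in data:
            raise SchemaKeyNotFound('no "$schema" key found in data')
        schema = data['$schema']
    if isinstance(schema, str):
        # A string is treated as the name of a schema to load.
        return load_by_name(schema)
    # A dictionary is used directly.
    return schema


# Hypothetical loader backed by an in-memory dictionary.
known = {'jobs': {'type': 'object'}}
loader = known.__getitem__

print(pick_schema({}, 'jobs', loader))                 # {'type': 'object'}
print(pick_schema({'$schema': 'jobs'}, None, loader))  # {'type': 'object'}
print(pick_schema({}, {'type': 'string'}, loader))     # {'type': 'string'}
```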