Code reference
Handle TDC-structured metadata.
Submodules
Generate reports about TDC-structured metadata.
Non-standard TDC Excel file format for collecting metadata.
Module data
CONCEPTS | Concepts and metadata attributes in the TDC metadata structure.
- transport_data.org.metadata.CONCEPTS = {
  'COMMENT': ('Comment', 'Any other information about the metadata values, for instance discrepancies or\nunclear or missing information.\n\nPrecede comments with initials; append to existing comments to keep\nchronological order; and include a date (for example, “2024-07-24”) if helpful.'),
  'DATAFLOW': ('Data flow ID', 'A unique identifier for the data flow (=data source, data set, etc.).\n\nWe suggest to use IDs like ‘VN001’, where ‘VN’ is the ISO 3166 alpha-2 country\ncode, and ‘001’ is a unique number. The value MUST match the name of the sheet\nin which it appears.'),
  'DATA_DESCR': ('Data description', 'Any information about the data flow that does not fit in other attributes.\n\nUntil or unless other metadata attributes are added to this metadata structure/\ntemplate, this MAY include:\n\n- Any conditions on data access, e.g. publicly available, proprietary, fee or\n subscription required, available on request, etc.\n- Frequency of data updates.\n- Any indication of quality, including third-party references that indicate data\n quality.\n'),
  'DATA_PROVIDER': ('Data provider', 'Organization or individual that provides the data and any related metadata.\n\nThis can be as general (“IEA”) or specific (organization unit/department, specific\nperson responsible, contact details, etc.) as appropriate.'),
  'DIMENSION': ('Dimensions', 'Formally, the “statistical concept used in combination with other statistical\nconcepts to identify a statistical series or individual observations.”\n\nRecord all dimensions of the data, either in a bulleted or numbered list, or\nseparated by semicolons. In parentheses, give some indication of the scope\nand/or resolution of the data along each dimension. Most data have at least time\nand space dimensions.\n\nExample:\n\n- TIME_PERIOD (annual, 5 years up to 2021)\n- REF_AREA (whole country; VN only)\n- Vehicle type (12 different types: […])\n- Emissions species (CO2 and 4 others)'),
  'MEASURE': ('Measure (‘indicator’)', 'Statistical concept for which data are provided in the data flow.\n\nIf the data flow contains data for multiple measures, give each one separated by\nsemicolons. Example: “Number of cars; passengers per vehicle”.\n\nThis SHOULD NOT duplicate the value for ‘UNIT_MEASURE’. Example: “Annual driving\ndistance per vehicle”, not “Kilometres per vehicle”.'),
  'METHOD': ('Methodology', 'Any information about methods used by the data provider to collect, process,\nor prepare the data.'),
  'UNIT_MEASURE': ('Unit of measure', 'Unit in which the data values are expressed.\n\nIf ‘MEASURE’ contains 2+ items separated by semicolons, give the respective units in the\nsame way and order. If there are no units, write ‘dimensionless’, ‘1’, or similar.'),
  'URL': ('URL or web address', 'Location on the Internet with further information about the data flow.'),
}
Concepts and metadata attributes in the TDC metadata structure.
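Each value in CONCEPTS is a 2-tuple of (name, description), keyed by the concept ID. The following is a minimal sketch of reading that structure. It uses a two-entry excerpt copied from the mapping above so that it runs without the package installed; `CONCEPTS_EXCERPT` and `summarize` are illustrative names and not part of the module.

```python
# Two entries copied from CONCEPTS above. In real use, the full mapping would be
# imported from transport_data.org.metadata instead of being retyped here.
CONCEPTS_EXCERPT: dict[str, tuple[str, str]] = {
    "METHOD": (
        "Methodology",
        "Any information about methods used by the data provider to collect, process,\n"
        "or prepare the data.",
    ),
    "URL": (
        "URL or web address",
        "Location on the Internet with further information about the data flow.",
    ),
}


def summarize(concepts: dict[str, tuple[str, str]]) -> list[str]:
    """Return one 'ID: name' line per concept, sorted by concept ID."""
    return [f"{cid}: {name}" for cid, (name, _description) in sorted(concepts.items())]


summary = summarize(CONCEPTS_EXCERPT)
print("\n".join(summary))
```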
Functions
contains_data_for(mdr, ref_area) | Return True if mdr contains data for ref_area.
dfd_id(mdr) | Return the ID of the dataflow targeted by mdr.
get_cs_common() | Create a shared concept scheme for the concepts referenced by dimensions.
get_msd() | Generate and return the TDC metadata structure definition.
groupby(mds, key) | Group metadata reports in mds according to a key function.
make_ra(mda_id, value) | Generate a ReportedAttribute for mda_id with the given value.
make_tok(dfd) | Generate a TargetObjectKey that refers to dfd.
map_dims_to_ids(mds) | Return a mapping from unique concept IDs used for dimensions to data flow IDs.
map_values_to_ids(mds, mda_id) | Return a mapping from unique reported attribute values to data flow IDs.
merge_ato(mds) | Extend mds with metadata reports for ADB ATO data flows.
Generate a unique DSD ID for mdr.
- transport_data.org.metadata.contains_data_for(mdr: MetadataReport, ref_area: str) -> bool
Return True if mdr contains data for ref_area.
True is returned if any of the following is true:
  - The referenced data flow definition has an ID that starts with ref_area.
  - The country’s ISO 3166 alpha-2 code, alpha-3 code, official name, or common name appears in the value of the DATA_DESCR metadata attribute.
- Parameters:
ref_area (str) – ISO 3166 alpha-2 code for a country. Passed to pycountry.countries.lookup().
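The two conditions above can be sketched on plain strings. This is an illustration under assumptions, not the function's implementation: `matches_ref_area` is a hypothetical stand-in, the country labels are supplied by hand rather than looked up with pycountry, and the case-sensitive substring match is a guess at the matching rule.

```python
def matches_ref_area(
    dfd_id: str, data_descr: str, ref_area: str, labels: list[str]
) -> bool:
    """Hypothetical stand-in for the two checks described above.

    `labels` holds the codes and names to search for in the description.
    """
    # Check 1: the data flow ID starts with the alpha-2 code, as in 'VN001'.
    if dfd_id.startswith(ref_area):
        return True
    # Check 2: any code or name for the country appears in the DATA_DESCR text.
    # Assumption: a plain, case-sensitive substring match.
    return any(label in data_descr for label in labels)


# Hand-written labels for Viet Nam: alpha-2, alpha-3, and two name forms.
VN_LABELS = ["VN", "VNM", "Viet Nam", "Vietnam"]

print(matches_ref_area("VN001", "", "VN", VN_LABELS))
print(matches_ref_area("XX001", "Covers Vietnam and Laos", "VN", VN_LABELS))
print(matches_ref_area("XX001", "Covers Laos only", "VN", VN_LABELS))
```

A plain substring test on a two-letter code can match unrelated text, so a real implementation may need stricter matching than this sketch shows.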
- transport_data.org.metadata.dfd_id(mdr: MetadataReport) -> str
Return the ID of the dataflow targeted by mdr.
- transport_data.org.metadata.get_cs_common() -> ConceptScheme
Create a shared concept scheme for the concepts referenced by dimensions.
Concepts in this scheme have an annotation tdc-aka, which is a list of alternate IDs recognized for the concept.
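A list of alternate IDs makes it possible to map differently spelled dimension IDs to one shared concept. The sketch below shows that idea with plain dictionaries. Everything in it is hypothetical: the alias lists are invented for illustration, and `build_alias_index` and `canonical_id` are not part of the module.

```python
# Hypothetical data: canonical concept ID -> alternate IDs, the role that the
# 'tdc-aka' annotation plays. These aliases are invented for this example.
AKA: dict[str, list[str]] = {
    "REF_AREA": ["AREA", "COUNTRY"],
    "TIME_PERIOD": ["TIME", "YEAR"],
}


def build_alias_index(aka: dict[str, list[str]]) -> dict[str, str]:
    """Map every canonical ID and every alias to its canonical ID."""
    index: dict[str, str] = {}
    for concept_id, aliases in aka.items():
        index[concept_id] = concept_id
        for alias in aliases:
            index[alias] = concept_id
    return index


ALIAS_INDEX = build_alias_index(AKA)


def canonical_id(raw_id: str) -> str:
    """Return the canonical concept ID, or the input unchanged if unrecognized."""
    return ALIAS_INDEX.get(raw_id, raw_id)


print(canonical_id("YEAR"))
print(canonical_id("Vehicle type"))
```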
- transport_data.org.metadata.get_msd() -> MetadataStructureDefinition
Generate and return the TDC metadata structure definition.
- transport_data.org.metadata.groupby(mds: MetadataSet, key=typing.Callable[[ForwardRef('v21.MetadataReport')], typing.Hashable]) -> dict[Hashable, list[MetadataReport]]
Group metadata reports in mds according to a key function.
Similar to itertools.groupby().
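The signature returns a dict of lists, whereas itertools.groupby() yields consecutive runs. A minimal sketch of grouping into a dict by a key function follows. It operates on plain strings instead of metadata reports, and `group_into_dict` is an illustrative name rather than the module's code.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, TypeVar

T = TypeVar("T")


def group_into_dict(
    items: Iterable[T], key: Callable[[T], Hashable]
) -> dict[Hashable, list[T]]:
    """Collect items into lists keyed by key(item), preserving input order."""
    groups: defaultdict[Hashable, list[T]] = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


# Hypothetical data flow IDs in the 'VN001' style suggested for DATAFLOW.
flow_ids = ["VN001", "TH001", "VN002"]
by_country = group_into_dict(flow_ids, key=lambda flow_id: flow_id[:2])
print(by_country)
```

Because every item is placed by its key, this approach does not need the input to be sorted first, which itertools.groupby() does require in order to produce one group per key.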
- transport_data.org.metadata.make_ra(mda_id: str, value: Any) -> OtherNonEnumeratedAttributeValue
Generate a ReportedAttribute for mda_id with the given value.
- transport_data.org.metadata.make_tok(dfd: BaseDataflow) -> TargetObjectKey
Generate a TargetObjectKey that refers to dfd.
- transport_data.org.metadata.map_dims_to_ids(mds: MetadataSet) -> dict[str, set[str]]
Return a mapping from unique concept IDs used for dimensions to data flow IDs.
- transport_data.org.metadata.map_values_to_ids(mds: MetadataSet, mda_id: str) -> dict[str, set[str]]
Return a mapping from unique reported attribute values to data flow IDs.
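Both mapping functions return a dict[str, set[str]] that points from a value back to the data flows that use it. The sketch below shows that inversion on a plain dict. The input data and the name `invert_to_sets` are hypothetical, and reading the values out of a MetadataSet is not shown.

```python
def invert_to_sets(value_by_flow: dict[str, str]) -> dict[str, set[str]]:
    """Invert {data flow ID: value} into {value: set of data flow IDs}."""
    result: dict[str, set[str]] = {}
    for flow_id, value in value_by_flow.items():
        result.setdefault(value, set()).add(flow_id)
    return result


# Hypothetical DATA_PROVIDER values for three hypothetical data flows.
providers = {"VN001": "IEA", "VN002": "IEA", "TH001": "ADB"}
flows_by_provider = invert_to_sets(providers)
print(flows_by_provider)
```

Using sets as values means a data flow is counted once per value even if it is encountered more than once.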
- transport_data.org.metadata.merge_ato(mds: MetadataSet) -> None
Extend mds with metadata reports for ADB ATO data flows.