Code reference

Asian Transport Observatory (ATO) provider.

Submodules

cli

CLI for ato.

Module data

CL_ECONOMY

List of all "ECONOMY" codes appearing in processed data.

CS_MEASURE

List of all measures (indicators) appearing in processed data.

FILES

Mapping from short codes for ATO data categories to file names.

POOCH

Pooch to fetch data from the ATO website.

POOCH_ZENODO

Pooch to fetch data from the Zenodo mirror at https://doi.org/10.5281/zenodo.14913729.

transport_data.ato.CL_ECONOMY = <Codelist ECONOMY (0 items): Asian Transport Outlook subject economy>

List of all “ECONOMY” codes appearing in processed data.

transport_data.ato.CS_MEASURE = <ConceptScheme MEASURE (0 items): Asian Transport Outlook measures (indicators)>

List of all measures (indicators) appearing in processed data.

Todo

Validate against the master list of indicators; or read from that file and validate IDs appearing in data files.

transport_data.ato.FILES = {'ACC': ('ATO Workbook (ACCESS & CONNECTIVITY (ACC)).xlsx', 'sha256:21c3b7e662d932f5cc61c22489acb3cf0e8a70200abc2372c7fe212602903fd7'), 'APH': ('ATO Workbook (AIR POLLUTION & HEALTH (APH)).xlsx', 'sha256:dcec4676c74566712e2771aad0afe196d1db9a3f7630eac1c3dba29d0b7c09f4'), 'CLC': ('ATO Workbook (CLIMATE CHANGE (CLC)).xlsx', 'sha256:2d582ade3dfe452fb2eedb3cbe9d06ac7167f0bb867e41016c8a1e7aa2efca15'), 'INF': ('ATO Workbook (INFRASTRUCTURE (INF)).xlsx', 'sha256:8ee1274cd2268c44b1f33d3917b72fc6626897aca28ecbee56c0ba8aa646e89a'), 'MIS': ('ATO Workbook (MISCELLANEOUS (MIS)).xlsx', 'sha256:c601e9e217e137a6071758f73cac050ea7dae4ff746e4a99d8c3297269175c03'), 'POL': ('ATO Workbook (TRANSPORT POLICY (POL)).xlsx', 'sha256:fbf23b012590b631239654d255d23ccb70fa717b466be8343a5b0f1e8b4ce720'), 'RSA': ('ATO Workbook (ROAD SAFETY (RSA)).xlsx', 'sha256:51a6658fa12fcb3ac77298f5908ab343492385ecb1b24602ba21b91dbbcedca5'), 'SEC': ('ATO Workbook (SOCIO-ECONOMIC (SEC)).xlsx', 'sha256:bc5e4a0006173a53f5b5f283c3b0174566b81842e368a432f25ba563ffcda93b'), 'TAS': ('ATO Workbook (TRANSPORT ACTIVITY & SERVICES (TAS)).xlsx', 'sha256:3e468c325ab508476d5d06e81d7d0e2c21655b4f3801abf20776812928126bb6')}

Mapping from short codes for ATO data categories to file names.
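
Each value in FILES pairs a file name with a string of the form "sha256:<hex digest>". The following is a minimal sketch of how a downloaded file could be checked against such a string. It assumes the string is used as a checksum for the file; the helper name matches_known_hash and the sample file are hypothetical and not part of transport_data.ato.

```python
import hashlib
import tempfile
from pathlib import Path


def matches_known_hash(path: Path, known_hash: str) -> bool:
    """Return True if the file at `path` matches a string like 'sha256:<hex>'."""
    algorithm, _, expected = known_hash.partition(":")
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large workbooks are not loaded into memory at once
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


# Hypothetical stand-in for one FILES entry; the real entries name .xlsx workbooks
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example.bin"
    sample.write_bytes(b"example content")
    entry = ("example.bin", "sha256:" + hashlib.sha256(b"example content").hexdigest())
    hash_ok = matches_known_hash(sample, entry[1])
    hash_mismatch = matches_known_hash(sample, "sha256:" + "0" * 64)
```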

transport_data.ato.POOCH = <transport_data.util.pooch.Pooch object>

Pooch to fetch data from the ATO website.

transport_data.ato.POOCH_ZENODO = <transport_data.util.pooch.Pooch object>

Pooch to fetch data from the Zenodo mirror at https://doi.org/10.5281/zenodo.14913729.

Functions

convert(part[, config])

Convert part of the ATO National Database to SDMX-ML and store.

convert_sheet(df, aa)

Convert df and aa from read_sheet() into SDMX data structures.

dataset_to_metadata_reports(ds, msd)

Convert the attributes of ATO ds to 1 or more MetadataReport.

expand(fname)

Callback to expand fname into a complete file name for POOCH.

expand_zenodo(fname)

Callback to expand fname into a complete file name for POOCH_ZENODO.

fetch(*parts[, config])

Fetch source data files for one or more parts of the ATO National Database.

format_data_provider(value)

Format the ATO “Source” data attribute as TDC DATA_PROVIDER metadata.

get_agencies()

prepare(aa)

Prepare an empty data set and associated structures.

provides()

read_sheet(ef, sheet_name)

Read a single sheet.

validate_economy(df)

Validate codes for the "ECONOMY" dimension of df against CL_ECONOMY.

transport_data.ato.convert(part: str, config: Config | None = None, **kwargs) → None

Convert part of the ATO National Database to SDMX-ML and store.

transport_data.ato.convert_sheet(df: DataFrame, aa: AnnotableArtefact)

Convert df and aa from read_sheet() into SDMX data structures.

transport_data.ato.dataset_to_metadata_reports(ds: DataSet, msd: MetadataStructureDefinition) → Iterable[MetadataReport]

Convert the attributes of ATO ds to 1 or more MetadataReport.

The metadata reports conform to the TDC metadata structure (org.metadata.get_msd()).

If ds contains per-series values for attributes named “Source”, “Source (2024-11)”, or similar, then additional metadata reports are generated, one for each series (=GEO, or ‘economy’) and each distinct upstream source indicated by these attribute values.
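
The grouping described above, one additional report per series and distinct upstream source, can be sketched with plain dictionaries. This is an illustration only and does not use the SDMX classes; the economy codes, attribute values, and helper names are hypothetical, and the rule used to recognise "similar" attribute names is an assumption.

```python
from collections import defaultdict

# Hypothetical per-series attribute values, keyed by economy code
series_attributes = {
    "AAA": {"Source": "Agency A", "Source (2024-11)": "Agency B"},
    "BBB": {"Source": "Agency A"},
    "CCC": {"Remarks": "Provisional"},
}


def is_source_attribute(name: str) -> bool:
    """True for 'Source', 'Source (2024-11)', and similarly named attributes."""
    # Assumption: "similar" means the name is "Source" plus a parenthesised suffix
    return name == "Source" or name.startswith("Source (")


def distinct_sources(attrs_by_series: dict) -> dict:
    """Map each series to the sorted, distinct upstream sources it names."""
    result = defaultdict(set)
    for series, attrs in attrs_by_series.items():
        for name, value in attrs.items():
            if is_source_attribute(name):
                result[series].add(value)
    return {series: sorted(values) for series, values in result.items()}


# One (series, source) pair per additional report
report_keys = [
    (series, source)
    for series, sources in distinct_sources(series_attributes).items()
    for source in sources
]
```

Here series "CCC" has no source attributes, so it contributes no additional report.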

transport_data.ato.expand(fname: str) → str

Callback to expand fname into a complete file name for POOCH.

transport_data.ato.expand_zenodo(fname: str) → str

Callback to expand fname into a complete file name for POOCH_ZENODO.

transport_data.ato.fetch(*parts: str, config: Config | None = None, **kwargs) → list[Path]

Fetch source data files for one or more parts of the ATO National Database.

Parameters:

parts – If no positional args are given, all of the keys from FILES are used.
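
The defaulting rule for parts can be illustrated with a small stand-in. This sketch only models how the parts are selected; it does not download anything, and both the reduced FILES mapping and the helper resolve_parts are hypothetical.

```python
# Stand-in for transport_data.ato.FILES; only the keys matter here
FILES = {"ACC": None, "APH": None, "CLC": None}


def resolve_parts(*parts: str) -> list[str]:
    """Return the requested parts, or every key of FILES if none are given."""
    return list(parts) if parts else list(FILES)


all_parts = resolve_parts()  # no positional args: all keys of FILES
some_parts = resolve_parts("CLC", "ACC")  # explicit parts are kept as given
```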

transport_data.ato.format_data_provider(value: str) → str

Format the ATO “Source” data attribute as TDC DATA_PROVIDER metadata.

This makes more explicit how the ATO has handled upstream data.

transport_data.ato.get_agencies()

transport_data.ato.prepare(aa: AnnotableArtefact) → tuple[DataSet, Callable]

Prepare an empty data set and associated structures.

transport_data.ato.provides()

transport_data.ato.read_sheet(ef: ExcelFile, sheet_name: str) → tuple[DataFrame, AnnotableArtefact]

Read a single sheet.

This function handles the particular layout of sheets in files like those listed in FILES. These combine data and metadata.

  • Row 1 is a title row.

  • Cell range A2:B10 contains a set of metadata fields, with the field name in column A and the value in column B.

  • Rows 11:13 contain no data or metadata; only a link back to a table of contents sheet.

  • Row 14 contains a label “Series” centre-spanned across

  • Row 15 contains column labels, described below.

  • Row 16 and onwards contain data, followed by two blank rows, and two rows with attribution/acknowledgements.

  • Columns labeled (i.e. in row 15) “Economy Code” and “Economy Name” contain codes and names, respectively, for the geographic units.

  • Columns with numeric labels describe time periods, specifically years, that are part of observation keys.

  • Some sheets have additional columns with non-numeric labels like “Remarks”, “Source (2022-04)”, etc.; these give annotations applying to the observations on the same row (i.e. for a single “Economy Code” and 1 or more time periods).

Note

Sheets in the POL category have a different format.
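
The three kinds of column described for row 15 can be told apart by their labels. The sketch below shows one way to do this on a plain list of labels; it is not the implementation of read_sheet(), and the helper classify_columns and the sample header row are hypothetical.

```python
def classify_columns(labels: list) -> dict[str, list[str]]:
    """Split row-15 column labels into key, time-period, and annotation columns."""
    groups: dict[str, list[str]] = {"key": [], "period": [], "annotation": []}
    for label in labels:
        # Labels may arrive as int (e.g. 2000) or str, depending on the reader
        text = str(label).strip()
        if text in ("Economy Code", "Economy Name"):
            groups["key"].append(text)
        elif text.isdigit():
            # Numeric labels are years that form part of observation keys
            groups["period"].append(text)
        else:
            # e.g. "Remarks" or "Source (2022-04)": annotations for the row
            groups["annotation"].append(text)
    return groups


# Hypothetical header row in the style described above
header = ["Economy Code", "Economy Name", 2000, "2010", "2020", "Remarks", "Source (2022-04)"]
columns = classify_columns(header)
```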

transport_data.ato.validate_economy(df: DataFrame) → DataFrame

Validate codes for the “ECONOMY” dimension of df against CL_ECONOMY.

  • Every unique pair of (Economy Code, Economy Name) is converted to a Code.

  • These are added to CL_ECONOMY. If a Code with the same ID already exists, it is checked for an exact match (name, description, etc.).

  • The “Economy Code” column of df is renamed “ECONOMY”, and contains only values from CL_ECONOMY. The “Economy Name” column is dropped.
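
The three steps above can be sketched without pandas or SDMX objects, using a list of row dictionaries and a plain dict as the code list. The helper register_economies, the sample codes and names, and the choice to raise ValueError on a mismatch are all assumptions made for illustration.

```python
def register_economies(rows: list[dict], codelist: dict[str, str]) -> list[dict]:
    """Add (code, name) pairs to `codelist` and return rows keyed by 'ECONOMY'."""
    # Every unique (Economy Code, Economy Name) pair becomes a code
    pairs = {(r["Economy Code"], r["Economy Name"]) for r in rows}
    for code, name in sorted(pairs):
        existing = codelist.setdefault(code, name)
        if existing != name:
            # Assumption: a mismatch with an existing code is treated as an error
            raise ValueError(f"{code!r}: {name!r} does not match {existing!r}")

    # Rename "Economy Code" to "ECONOMY" and drop "Economy Name"
    result = []
    for r in rows:
        rest = {k: v for k, v in r.items() if k not in ("Economy Code", "Economy Name")}
        result.append({"ECONOMY": r["Economy Code"], **rest})
    return result


# Hypothetical data; the plain dict stands in for CL_ECONOMY
cl_economy: dict[str, str] = {}
rows = [
    {"Economy Code": "AAA", "Economy Name": "Economy A", "2020": 1.5},
    {"Economy Code": "BBB", "Economy Name": "Economy B", "2020": 2.5},
]
converted = register_economies(rows, cl_economy)
```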

Classes

Config([from_zenodo, dry_run])

Common configuration for fetch(), convert() and others.