Data Dictionary

This page defines the metadata model of EcoData: every indicator is described by a standard set of fields so users can look it up, cite it, and reproduce their research. The dictionary applies to both international and Vietnam data.

Metadata model of an indicator

Every indicator in EcoData carries the following fields:

| Field | Meaning | Example |
| --- | --- | --- |
| short_code | EcoData standardized code (see below) | wb_mac_gdp_usd |
| code | Native code published by the source | NY.GDP.MKTP.CD |
| label_vi / label_en | Bilingual labels | "GDP (USD hiện hành)" / "GDP (current US$)" |
| unit | Unit of measurement | USD, % of GDP, index, persons |
| frequency | Observation frequency | annual, quarterly, monthly, mixed |
| source | Originating organization | World Bank, IMF, GSO, Customs |
| time_start / time_end | First and last year/period | 1990 / 2023 |
| spatial_coverage | Unit of analysis | country, province, firm, household |
| data_quality_notes | Quality / provenance notes | unit, source group, methodological caveats |

Dual lookup

Users can search by short_code (compact, stable) or by the source's native code. The interface shows short_code first; the native code is always kept in the metadata for cross-checking.
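
As a rough illustration of dual lookup, the sketch below indexes a few metadata fields by both short_code and native code. The `Indicator` and `Catalogue` classes and the `find` method are hypothetical names chosen for this example; they are not part of EcoData's interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Indicator:
    """Subset of the metadata fields listed in the model above."""
    short_code: str  # EcoData standardized code
    code: str        # native code published by the source
    label_en: str
    unit: str
    frequency: str
    source: str


class Catalogue:
    """Hypothetical in-memory index that supports lookup by either code."""

    def __init__(self, indicators):
        self._by_short = {i.short_code: i for i in indicators}
        self._by_native = {i.code: i for i in indicators}

    def find(self, query: str) -> Optional[Indicator]:
        # Try the short_code first, then fall back to the native code.
        return self._by_short.get(query) or self._by_native.get(query)


catalogue = Catalogue([
    Indicator("wb_mac_gdp_usd", "NY.GDP.MKTP.CD", "GDP (current US$)",
              "USD", "annual", "World Bank"),
])
```

Both `catalogue.find("wb_mac_gdp_usd")` and `catalogue.find("NY.GDP.MKTP.CD")` return the same record. The sketch assumes native codes do not collide across sources; if they can, the native-code index would need to be keyed by source as well.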

Standardized code scheme: ss_dom_key_mod

EcoData maps long, inconsistent native codes across sources into a single structured short code made of four segments separated by underscores:

ss _ dom _ key _ mod
│    │     │     └── modifier: unit / variant (usd, pct, idx, sa, ...)
│    │     └──────── key: indicator name (gdp, cpi, exp, pop, ...)
│    └────────────── domain: field (mac, ext, fis, lab, dem, ...)
└─────────────────── source: provider (wb, im, ad, un, fr, ...)

| Segment | Role | Example values |
| --- | --- | --- |
| ss (source) | Data source | wb (World Bank), im (IMF), ad (ADB), un (UN), fr (FRED) |
| dom (domain) | Field | mac (macro), ext (external), fis (fiscal), lab (labor), dem (demographic) |
| key | Indicator name | gdp, cpi, exp, imp, pop, une |
| mod | Unit / variant | usd, pct, idx, cap (per capita) |
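
The four-segment structure can be split mechanically. The following is a minimal sketch, assuming every short_code has exactly four non-empty segments as described; `ShortCode` and `parse_short_code` are illustrative names, not EcoData functions.

```python
from typing import NamedTuple


class ShortCode(NamedTuple):
    source: str    # ss segment
    domain: str    # dom segment
    key: str       # key segment
    modifier: str  # mod segment


def parse_short_code(short_code: str) -> ShortCode:
    """Split 'ss_dom_key_mod' into its four segments."""
    parts = short_code.split("_")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four non-empty segments: {short_code!r}")
    return ShortCode(*parts)


parsed = parse_short_code("wb_mac_gdp_usd")
print(parsed.source, parsed.domain, parsed.key, parsed.modifier)
```

A named tuple keeps the segments addressable by role, which is convenient when filtering a catalogue by source or domain.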

Example mappings:

| Native code | short_code | Interpretation |
| --- | --- | --- |
| NY.GDP.MKTP.CD | wb_mac_gdp_usd | World Bank · macro · GDP · current USD |
| FP.CPI.TOTL.ZG | wb_mac_cpi_pct | World Bank · macro · CPI · %/year |
| SL.UEM.TOTL.ZS | wb_lab_une_pct | World Bank · labor · unemployment · % |

Canonical source of the code scheme

The code scheme is generated and validated by the integrated Codebook tool (the Codebook admin page) and by internal validation scripts. Auto-generated suffixes must match the pattern [a-z0-9]{2,8}, that is, 2 to 8 lowercase letters or digits. Every indicator in the catalogue (18,000+ indicators) has a short_code value.
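
The suffix rule can be checked with a regular expression. This sketch is not the Codebook tool's validation logic; it assumes that "suffix" refers to the final underscore-separated segment of the short_code, and `has_valid_suffix` is a hypothetical helper.

```python
import re

# Pattern quoted from the text for auto-generated suffixes.
SUFFIX_RE = re.compile(r"[a-z0-9]{2,8}")


def has_valid_suffix(short_code: str) -> bool:
    """Check the last segment of a short_code against the suffix pattern.

    Assumption: "suffix" means the final underscore-separated segment.
    """
    suffix = short_code.rsplit("_", 1)[-1]
    return SUFFIX_RE.fullmatch(suffix) is not None
```

`fullmatch` is used so that the whole suffix must fit the pattern; a partial match on a longer or mixed-case suffix is rejected.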

Frequency and units

  • Frequency (frequency): annual, quarterly (Q1–Q4), monthly (M01–M12), or mixed when an indicator has multiple frequencies depending on the source.
  • Period key (period_key): for sub-annual data, each observation carries a standard period key: Q1–Q4 for quarters, M01–M12 for months. Annual data leaves the period key empty.
  • Unit (unit): keeps the source's original semantics (USD, % of GDP, index, persons, tonnes, etc.). When building a multi-source panel, read unit to avoid aggregating mismatched units.
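
The period key convention above can be sketched as a small helper that validates a key and combines it with a year. The `period_label` function and the `YYYY-key` label format are assumptions made for this example; only the key values (empty, Q1–Q4, M01–M12) come from the text.

```python
import re

# Q1..Q4 for quarters, M01..M12 for months.
_PERIOD_RE = re.compile(r"Q[1-4]|M(?:0[1-9]|1[0-2])")


def period_label(year: int, period_key: str = "") -> str:
    """Return a label such as '2023', '2023-Q2' or '2023-M07'."""
    if period_key == "":
        return str(year)  # annual data leaves period_key empty
    if _PERIOD_RE.fullmatch(period_key) is None:
        raise ValueError(f"unrecognised period_key: {period_key!r}")
    return f"{year}-{period_key}"
```

Rejecting unknown keys early avoids silently mixing observations of different frequencies when an indicator is marked as mixed.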

Definitions by data group

Each group has its own characteristic variables; see the detail page for the full list and scope:

| Group | Unit of analysis | Representative variables | Details |
| --- | --- | --- | --- |
| Global Data | Country × year | GDP, CPI, exports/imports, FDI, population | Global Data |
| Vietnam GSO | National/Province × year | GRDP, population, industry, CPI, investment | Vietnam GSO |
| Customs | Commodity × partner × period | export/import value, quantity, balance | Customs |
| Macro Survey | Province × year | PCI, PAPI, PAR, SIPAS, ICT | Macro Survey & VHLSS |
| VHLSS Micro | Household/Individual × wave | income, expenditure, education, health | Macro Survey & VHLSS |
| Stock Hub | Symbol × time | OHLCV prices, revenue, net income, EPS, events | Stock Hub |

Using metadata for reproducibility

  1. Keep the short_code + native code with your data so others can cross-check sources.
  2. Record the unit, frequency, time_start/time_end in your study's data description.
  3. When combining sources, check the unit and unit of analysis before building a panel.
  4. Export with metadata (CSV/Excel/JSON) — see Data Export.
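
The checklist above could be followed in a script along these lines. This is a sketch only: `check_same_unit` and `export_with_metadata` are hypothetical helpers, and the JSON layout is an assumption rather than EcoData's actual export format. The observations list is left empty on purpose.

```python
import json


def check_same_unit(indicators):
    """Raise if the indicator metadata records do not share one unit."""
    units = {meta["unit"] for meta in indicators}
    if len(units) > 1:
        raise ValueError(f"mismatched units: {sorted(units)}")


def export_with_metadata(meta, observations):
    """Bundle observations with their metadata in one JSON document."""
    return json.dumps({"metadata": meta, "observations": observations},
                      ensure_ascii=False, indent=2)


gdp_meta = {
    "short_code": "wb_mac_gdp_usd",  # EcoData standardized code
    "code": "NY.GDP.MKTP.CD",        # native code, kept for cross-checking
    "unit": "USD",
    "frequency": "annual",
    "source": "World Bank",
}
check_same_unit([gdp_meta])  # passes: only one unit present
payload = export_with_metadata(gdp_meta, [])
```

Storing both codes and the unit next to the observations means a reader of the exported file can trace each series back to its source without consulting the catalogue again.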

See Also