Package 'passport' reference manual

Title:	Travel Smoothly Between Country Name and Code Formats
Description:	Smooths the process of working with country names and codes via powerful parsing, standardization, and conversion utilities arranged in a simple, consistent API. Country name formats include multiple sources including the Unicode Common Locale Data Repository (CLDR, <http://cldr.unicode.org/>) common-sense standardized names in hundreds of languages.
Authors:	Edward Visel [aut, cre]
Maintainer:	Edward Visel <[email protected]>
License:	GPL-3 \| file LICENSE
Version:	0.3.0
Built:	2025-02-04 03:24:13 UTC
Source:	https://github.com/alistaire47/passport

Convert standardized country names to country codes

Description

as_country_code converts a vector of standardized country names or codes to country codes

Usage

as_country_code(x, from, to = "iso2c", factor = is.factor(x))
as_country_code(x, from, to = "iso2c", factor = is.factor(x))

Arguments

`x`	A character, factor, or numeric vector of country names or codes
`from`	Format from which to convert. See Details for more options.
`to`	Code format to which to convert. Defaults to `"iso2c"`; see `codes` for more options.
`factor`	If `TRUE`, returns factor instead of character vector.

Details

as_country_code takes a character, factor, or numeric vector of country names or codes to translate into the specified code format. The default for to is "iso2c", the ISO 3166-1 Alpha-2 character codes, but many alternatives are available.

Several non-unique codes are available as well, including "continent", "is_independent", ISO 4217 currency codes, etc. Backwards conversion will not work for such cases.

See codes for all options, or run DT::datatable(codes) for a searchable widget.

Value

A vector of country codes. Warns if new NA values are added.

Examples

# Codifies standardized names
as_country_code(c("US", "Taiwan", "Myanmar", "Kosovo", "South Korea"), from = "en")

# Translates codes; if passed a factor, returns a releveled one
as_country_code(factor(c("SAH", "PCN", "OMA", "JPN")),
                from = "fifa", to = "iso4217_3c")

# Codifies standardized names
as_country_code(c("US", "Taiwan", "Myanmar", "Kosovo", "South Korea"), from = "en")

# Translates codes; if passed a factor, returns a releveled one
as_country_code(factor(c("SAH", "PCN", "OMA", "JPN")),
                from = "fifa", to = "iso4217_3c")

Convert standardized country codes to country names

Description

as_country_name converts a vector of standardized country codes to country names.

Usage

as_country_name(
  x,
  to = "en",
  from = "iso2c",
  short = TRUE,
  variant = FALSE,
  factor = is.factor(x)
)
as_country_name(
  x,
  to = "en",
  from = "iso2c",
  short = TRUE,
  variant = FALSE,
  factor = is.factor(x)
)

Arguments

`x`	A character, factor, or numeric vector of country codes or names
`to`	Language code of country names desired. Defaults to `"en"`; see `codes` for more options.
`from`	Code format from which to convert. Defaults to `"iso2c"`; see `codes` for more options.
`short`	Whether to use short alternative name when available. Can be length 1 or the same length as `x`.
`variant`	Whether to use variant alternative name when available. Can be length 1 or the same length as `x`.
`factor`	If `TRUE`, returns factor instead of character vector. If not supplied, defaults to `is.factor(x)`

Details

as_country_name takes a character, factor, or numeric vector of country codes (or names in another standardized format) and converts them to country names in the specified format. If you are trying to standardize an existing set of names, see parse_country().

The default "en" is from Unicode Common Locale Data Repository (CLDR), which aspires to use the most customary name e.g. "Switzerland" instead of official ones, which are frequently awkward for common usage, e.g. "Swiss Confederation". CLDR also supplies names in a huge variety of languages, allowing for easy translation. Short and variant alternates are available for some countries; if not, the function will fall back to the standard form. See LICENSE file for terms of use.

Other name sets are available from

the UN Statistics Division(UNSD), which maintains standardized names in English, Chinese, Russian, French, Spanish, and Arabic, here named as "en_un" etc.
the ISO, "en_iso" and "fr_iso", and
the CIA World Factbook:
- "en_cia", which include many longer official forms and shorter practical forms,
- "en_cia_local", which includes transliterations, and
- "en_cia_abbreviation", which includes commonly-used abbreviations.

See codes for all options, or run DT::datatable(codes) for a searchable widget.

Value

A character or factor vector of country names. Warns if new NA values are added.

Examples

# Usable names for tough-to-standardize places
as_country_name(c("US", "TW", "MM", "XK", "KR"))

# If passed a factor, will return a releveled one
as_country_name(factor(c("US", "NF", "CD", "SJ")), short = FALSE, variant = TRUE)

# Speaks a lot of languages, knows a lot of codes
as_country_name(c("SAH", "PCN", "OMA", "JPN"), from = "fifa", to = "cy")    # to Welsh

# Usable names for tough-to-standardize places
as_country_name(c("US", "TW", "MM", "XK", "KR"))

# If passed a factor, will return a releveled one
as_country_name(factor(c("US", "NF", "CD", "SJ")), short = FALSE, variant = TRUE)

# Speaks a lot of languages, knows a lot of codes
as_country_name(c("SAH", "PCN", "OMA", "JPN"), from = "fifa", to = "cy")    # to Welsh

Country code and name details and documentation

Description

A codebook data.frame of codes and details for country code and name conversions available. Contains Internet Engineering Task Force (IETF) language tags (e.g. "en-nz" for New Zealand English) for Unicode Common Locale Data Repository (CLDR) names, similar approximations for institutional names (e.g. "en-iso"), and short names (e.g. "iso2c") for country codes.

Usage

codes
codes

Format

A data.frame of 427 rows and 9 variables.

Variables:

column: The column name in the internal passport:::countries data.frame. Valid for use in from and to parameters.
code: column with hyphens for underscores, which is a valid IANA language tag for Unicode CLDR country names. Valid for use in from and to parameters.
name: Full name or code name for non-CLDR options.
notes: Things to note, including deprecations, oddities, etc.
language: Full language name parsed from code.
region: Full country or region name parsed from code.
script: Full language script name parsed from code.
variant: Full variant parsed from code. Also used for organization-standardized names.
extension: Further specification of name type.

Details

All functions can accept codes separated with underscores ⁠_⁠, hyphens -, or periods ..

Examples

# A searchable widget to find a code or name
if (requireNamespace("DT", quietly = TRUE)) {
    DT::datatable(codes)
}
# A searchable widget to find a code or name
if (requireNamespace("DT", quietly = TRUE)) {
    DT::datatable(codes)
}

Construct formatter function to format country codes as country names

Description

country_format is a constructor function that returns a function to format country codes as country names suitable for passing to ggplot2's scale functions' label parameters.

Usage

country_format(
  from = "iso2c",
  to = "en",
  short = TRUE,
  variant = FALSE,
  factor
)
country_format(
  from = "iso2c",
  to = "en",
  short = TRUE,
  variant = FALSE,
  factor
)

Arguments

`from`	Code format from which to convert. Defaults to `"iso2c"`; see `codes` for more options.
`to`	Language code of country names desired. Defaults to `"en"`; see `codes` for more options.
`short`	Whether to use short alternative name when available. Can be length 1 or the same length as `x`.
`variant`	Whether to use variant alternative name when available. Can be length 1 or the same length as `x`.
`factor`	If `TRUE`, returns factor instead of character vector. If not supplied, defaults to `is.factor(x)`

Details

A frequent reason to convert country codes back to country names is to make data visualizations more readable. While both a code and name could be stored in a data frame, the computation and extra storage required can be avoided by transforming codes to names directly within the visualization via a formatter function. as_country_name() could be used without parentheses to format ISO 2-character codes as English names, but format_country allows greater flexibility, returning a formatter function with the specified parameters set.

Value

A function that accepts a vector of country codes and returns them as country names.

Examples

if (require(ggplot2, quietly = TRUE)) {
    ggplot(data.frame(country = c("KOR", "MMR", "TWN", "COG"),
                      y = 1:4),
           aes(x = country, y = y)) +
        geom_col() +
        scale_x_discrete(labels = country_format(from = "iso3c"))
}

if (require(ggplot2, quietly = TRUE)) {
    ggplot(data.frame(country = c("KOR", "MMR", "TWN", "COG"),
                      y = 1:4),
           aes(x = country, y = y)) +
        geom_col() +
        scale_x_discrete(labels = country_format(from = "iso3c"))
}

NATO Member Defense Expenditures

Description

A sample dataset of NATO/OTAN member defense expenditures.

Usage

nato
nato

Format

A data.frame of 232 rows and 14 variables.

Variables:

country_stanag: Country code in NATO STANAG format
year: Year, from 2012 to 2019. 2018-2019 numbers may be estimates.
⁠Defense expenditure (USD, current prices)⁠: Defense expenditures in US dollars, using current prices and exchange rates.
⁠Defense expenditure (USD, 2015 prices)⁠: Defense expenditures in US dollars, using 2015 prices and exchange rates.
⁠Defense expenditure (% real GDP)⁠: Defense expenditure as a percentage of real gross domestic product. Based on 2015 prices.
⁠Defense expenditure annual real change (% GDP)⁠: Annual change in defense expenditure as a percentage of real gross domestic product. Based on 2015 prices.
⁠Real GDP (2015 prices)⁠: Real gross domestic product in 2015 US dollars and at 2015 exchange rates.
⁠GDP per capita (USD)⁠: Gross domestic product per capita in 2015 US dollars and at 2015 exchange rates.
⁠Defense expenditure per capita (USD)⁠: Defense expenditure per capita in 2015 US dollars.
⁠Military personnel⁠: Number of military personnel
⁠Equipment expenditure (%)⁠: Percent of defense expenditure spent on equipment. Includes major equipment expenditure and R&D devoted to major equipment.
⁠Personnel expenditure (%)⁠: Percentage of defense expenditure spent on personnel. Includes both military and civilian expenditure and pensions.
⁠Infrastructure expenditure (%)⁠: Percentage of defense expenditure spent on infrastructure. Includes NATO common infrastructure and national military construction.
⁠Other expenditure (%)⁠: Percentage of defense expenditure spent on other categories besides equipment, personnel, and infrastructure. Includes operations and maintenance expenditure, other R&D expenditure, and other expenditure not otherwise captured.

Source

https://www.nato.int/cps/en/natohq/news_167080.htm

Examples

as_country_name(nato$country_stanag, from = 'stanag')
as_country_name(nato$country_stanag, from = 'stanag')

Order a vector of countries

Description

order_countries reorders a vector of countries, returning a result useful for passing to ggplot2's scale functions' limits parameters.

Usage

order_countries(
  x,
  by,
  ...,
  from = "iso2c",
  short = TRUE,
  variant = FALSE,
  factor = is.factor(x)
)
order_countries(
  x,
  by,
  ...,
  from = "iso2c",
  short = TRUE,
  variant = FALSE,
  factor = is.factor(x)
)

Arguments

`x`	A character, factor, or numeric vector of country codes or names
`by`	Either a length-one country code from `codes` or a vector the same length as `x` by which to order `x`
`...`	Parameters passed on to `order()`, including addition vectors by which to sort, `decreasing`, and `na.last`.
`from`	Code format from which to convert. Defaults to `"iso2c"`; see `codes` for more options.
`short`	Whether to use short alternative name when available. Can be length 1 or the same length as `x`.
`variant`	Whether to use variant alternative name when available. Can be length 1 or the same length as `x`.
`factor`	If `TRUE`, returns factor instead of character vector. If not supplied, defaults to `is.factor(x)`

Details

order_countries orders a vector of countries by

itself converted to a country code or name if by is a code from codes to which to convert
a sortable vector if by is a vector of the same length as x
x itself if neither is supplied.

Value

The original vector of countries, ordered according to the parameters passed. Note that factors are not releveled, but are reordered. To relevel, pass the results to levels<-()

Examples

countries <- c("FR", "CP", "UZ", "BH", "BR")

order_countries(countries)

order_countries(countries, "ja")

order_countries(countries, rnorm(5))

order_countries(countries, grepl("F", countries), 1:5, decreasing = TRUE)

if (require(ggplot2, quietly = TRUE)) {
    df_countries <- data.frame(country = countries,
                               y = exp(1:5))

    ggplot(df_countries, aes(country, y)) +
        geom_col() +
        scale_x_discrete(
            limits = order_countries(df_countries$country,
                                     df_countries$y)[df_countries$y > 5],
            labels = country_format(to = "en-cia-local")
        )
}

countries <- c("FR", "CP", "UZ", "BH", "BR")

order_countries(countries)

order_countries(countries, "ja")

order_countries(countries, rnorm(5))

order_countries(countries, grepl("F", countries), 1:5, decreasing = TRUE)

if (require(ggplot2, quietly = TRUE)) {
    df_countries <- data.frame(country = countries,
                               y = exp(1:5))

    ggplot(df_countries, aes(country, y)) +
        geom_col() +
        scale_x_discrete(
            limits = order_countries(df_countries$country,
                                     df_countries$y)[df_countries$y > 5],
            labels = country_format(to = "en-cia-local")
        )
}

Parse country names to standardized form

Description

parse_country parses irregular country names to the ISO 3166-1 Alpha-2 code or other standardized code or name format.

Usage

parse_country(
  x,
  to = "iso2c",
  how = c("regex", "google"),
  language = c("en", "de"),
  factor = is.factor(x)
)
parse_country(
  x,
  to = "iso2c",
  how = c("regex", "google"),
  language = c("en", "de"),
  factor = is.factor(x)
)

Arguments

`x`	A character or factor vector of country names to standardize
`to`	Format to which to convert. Defaults to `"iso2c"`; see `codes` for more options.
`how`	How to parse; defaults to `"regex"`. '"google"“ uses the Google Maps geocoding API. See "Details" for more information.
`language`	If `how = "regex"`, the language from which to parse country names. Currently accepts `"en"` (default) and `"de"`. Ignored if `how = "google"`.
`factor`	If `TRUE`, returns factor instead of character vector. If not supplied, defaults to `is.factor(x)`

Details

parse_country tries to parse a character or factor vector of country names to a standardized form: by default, ISO 3166-1 Alpha-2 codes.

When how = "regex" (default), parse_country uses regular expressions to match irregular forms.

If regular expressions are insufficient, how = "google" will use the Google Maps geocoding API instead, which permits a much broader range of input formats and languages. The API allows 2500 calls per day, and should thus be called judiciously. parse_country will make one call per unique input. For more calls, see options that allow passing an API key like ggmap::geocode() with output = "all" or googleway::google_geocode().

Note that due to their flexibility, the APIs may fail unpredictably, e.g. parse_country("foo", how = "google") returns "CH" whereas how = "regex" fails with a graceful NA and warning.

Value

A character vector or factor of ISO 2-character country codes or other specified codes or names. Warns of any parsing failure.

Examples

parse_country(c("United States", "USA", "U.S.", "us", "United States of America"))

## Not run: 
# Unicode support for parsing accented or non-Latin scripts
parse_country(c("\u65e5\u672c", "Japon", "\u0698\u0627\u067e\u0646"), how = "google")
#> [1] "JP" "JP" "JP" "JP"

# Parse distinct place names via geocoding APIs
parse_country(c("1600 Pennsylvania Ave, DC", "Eiffel Tower"), how = "google")
#> [1] "US" "FR"

## End(Not run)

parse_country(c("United States", "USA", "U.S.", "us", "United States of America"))

## Not run: 
# Unicode support for parsing accented or non-Latin scripts
parse_country(c("\u65e5\u672c", "Japon", "\u0698\u0627\u067e\u0646"), how = "google")
#> [1] "JP" "JP" "JP" "JP"

# Parse distinct place names via geocoding APIs
parse_country(c("1600 Pennsylvania Ave, DC", "Eiffel Tower"), how = "google")
#> [1] "US" "FR"

## End(Not run)

Package 'passport'

Help Index

Convert standardized country names to country codes

Description

Usage

Arguments

Details

Value

See Also

Examples

Convert standardized country codes to country names

Description

Usage

Arguments

Details

Value

See Also

Examples

Country code and name details and documentation

Description

Usage

Format

Details

Examples

Construct formatter function to format country codes as country names

Description

Usage

Arguments

Details

Value

See Also

Examples

NATO Member Defense Expenditures

Description

Usage

Format

Source

Examples

Order a vector of countries

Description

Usage

Arguments

Details

Value

See Also

Examples

Parse country names to standardized form

Description

Usage

Arguments

Details

Value

Examples