Package 'wbstats'

Title: Programmatic Access to Data and Statistics from the World Bank API
Description: Search and download data from the World Bank Data API.
Authors: Jesse Piburn [aut, cre], UT-Battelle, LLC [cph]
Maintainer: Jesse Piburn <[email protected]>
License: MIT + file LICENSE
Version: 1.0.5.9000
Built: 2025-02-25 06:02:23 UTC
Source: https://github.com/gshs-ornl/wbstats

Help Index


Download Data from the World Bank API

Description

This function downloads the requested information using the World Bank API

Usage

wb(
  country = "all",
  indicator,
  startdate,
  enddate,
  mrv,
  return_wide = FALSE,
  gapfill,
  freq,
  cache,
  lang = c("en", "es", "fr", "ar", "zh"),
  removeNA = TRUE,
  POSIXct = FALSE,
  include_dec = FALSE,
  include_unit = FALSE,
  include_obsStatus = FALSE,
  include_lastUpdated = FALSE
)

Arguments

country

Character vector of country or region codes. The default is the special code "all". Other permissible values are codes from the following fields of the wb_cachelist countries data frame: iso3c, iso2c, regionID, adminID, and incomeID. Additional special values include "aggregates", which returns only aggregates, and "countries_only", which returns all countries without aggregates.

indicator

Character vector of indicator codes. These codes correspond to the indicatorID column from the indicator data frame of wbcache or wb_cachelist, or the result of wbindicators

startdate

Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1". This also accepts a special value of "YTD", useful for more frequently updated subannual indicators.

enddate

Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1".

mrv

Numeric. The number of Most Recent Values to return. A replacement for startdate and enddate, this number represents the number of observations you wish to return, starting from the most recent date of collection. Useful in conjunction with freq.

return_wide

Logical. If TRUE, data is returned in a wide format instead of long, with a column named for each indicatorID. To accommodate this transformation, the indicator column, which provides the human-readable description, is dropped. This field is available from the indicator data frame of wbcache or wb_cachelist, or the result of wbindicators. Default is FALSE.

gapfill

Logical. Works with mrv. If TRUE, fills unavailable values by backtracking to the next available period (the maximum number of periods backtracked is limited by the mrv value).

freq

Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values. Currently works along with mrv. Useful for querying high-frequency data.

cache

List of data frames returned from wbcache. If omitted, wb_cachelist is used

lang

Language in which to return the results. If lang is unspecified, English is the default.

removeNA

If TRUE, remove any blank or NA observations that are returned. If FALSE, no blank or NA values are removed from the return.

POSIXct

If TRUE, additional columns date_ct and granularity are added. date_ct converts the default date into a POSIXct. granularity denotes the time resolution that the date represents. Useful for subannual data and mixing subannual with annual data. If FALSE, these fields are not added.

include_dec

If TRUE, the column decimal is not removed from the return. If FALSE, this column is removed.

include_unit

If TRUE, the column unit is not removed from the return. If FALSE, this column is removed.

include_obsStatus

If TRUE, the column obsStatus is not removed from the return. If FALSE, this column is removed.

include_lastUpdated

If TRUE, the column lastUpdated is not removed from the return. If FALSE, this column is removed. If TRUE and POSIXct = TRUE, the column will be of class Date.

Value

Data frame with all available requested data.

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin

The POSIXct parameter requires the use of lubridate (>= 1.5.0). All dates are rounded down to the floor. For example, a value for the year 2016 would have a POSIXct date of 2016-01-01. If lubridate is not available and the POSIXct parameter is set to TRUE, the parameter is ignored and a warning is produced.
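
The flooring rule can be illustrated with a small base R sketch. The helper wb_period_floor is hypothetical and not part of wbstats; it returns class Date rather than POSIXct, and it assumes only the three date formats described above ("2016", "2016M01", "2016Q1").

```r
# Hypothetical helper (not part of wbstats): floor a World Bank date string
# to the first day of the period it names.
wb_period_floor <- function(x) {
  x <- as.character(x)
  year <- as.integer(substr(x, 1, 4))
  month <- rep(1L, length(x))  # annual values floor to January

  is_month <- grepl("^[0-9]{4}M[0-9]{2}$", x)
  is_quarter <- grepl("^[0-9]{4}Q[1-4]$", x)

  month[is_month] <- as.integer(substr(x[is_month], 6, 7))
  # Q1 -> month 1, Q2 -> month 4, Q3 -> month 7, Q4 -> month 10
  month[is_quarter] <- (as.integer(substr(x[is_quarter], 6, 6)) - 1L) * 3L + 1L

  as.Date(sprintf("%04d-%02d-01", year, month))
}

floored <- wb_period_floor(c("2016", "2016M03", "2016Q2"))
format(floored)
# "2016-01-01" "2016-03-01" "2016-04-01"
```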

The include_dec, include_unit, and include_obsStatus parameters default to FALSE because, as of writing, all returns have values of 0, NA, and NA, respectively. These columns might be used in the future by the API, so the option to include them is available.

The include_lastUpdated is defaulted to FALSE as well to limit the

If there is no data available that matches the request parameters, an empty data frame is returned along with a warning. This design is for easy aggregation of multiple calls.
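
As a rough illustration of why an empty return aggregates easily, the base R sketch below binds three toy results, one of them empty. The data frames and their columns are invented stand-ins, not actual API output.

```r
# Toy stand-ins for the results of several calls; the second is empty,
# as it would be when a request matches no data.
res_a <- data.frame(iso3c = "BRA", date = "2010", value = 1.5)
res_b <- res_a[0, ]  # zero rows, same columns
res_c <- data.frame(iso3c = "IND", date = "2010", value = 2.5)

# The empty result needs no special handling when binding rows.
combined <- do.call(rbind, list(res_a, res_b, res_c))
nrow(combined)
# 2
```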

Examples

# GDP at market prices (current US$) for all available countries and regions
 wb(indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

 # GDP and Population in long format for the most recent 20 observations
 wb(indicator = c("SP.POP.TOTL","NY.GDP.MKTP.CD"), mrv = 20)

 # GDP and Population in wide format for the most recent 20 observations
 wb(indicator = c("SP.POP.TOTL","NY.GDP.MKTP.CD"), mrv = 20, return_wide = TRUE)

 # query using regionID or incomeID
 # High Income Countries and Sub-Saharan Africa (all income levels)
 wb(country = c("HIC", "SSF"), indicator = "NY.GDP.MKTP.CD", startdate = 1985, enddate = 1985)

 # if you do not know the latest date for which an indicator is available, mrv can help
 wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 1)

 # increase the mrv value to increase the maximum number of returns
 wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 35)

 # GDP at market prices (current US$) for only available countries
 wb(country = "countries_only", indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

 # GDP at market prices (current US$) for only available aggregate regions
 wb(country = "aggregates", indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

 # if you want to "fill-in" the values in between actual observations use gapfill = TRUE
 # this highlights a very important difference.
 # all other parameters are the same as above, except gapfill = TRUE
 # and the results are very different
 wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 35, gapfill = TRUE)

 # if you want the most recent values within a certain time frame
 wb(country = c("US"), indicator = 'SI.DST.04TH.20', startdate = 1970, enddate = 2000, mrv = 2)

 # without the freq parameter the default temporal granularity search is yearly
 # should return the 12 most recent years of data
 wb(country = c("CHN", "IND"), indicator = "DPANUSSPF", mrv = 12)

 # if another frequency is available for that indicator it can be accessed using the freq parameter
 # should return the 12 most recent months of data
 wb(country = c("CHN", "IND"), indicator = "DPANUSSPF", mrv = 12, freq = "M")

Download an updated list of country, indicator, and source information

Description

Download an updated list of information regarding countries, indicators, sources, regions, indicator topics, lending types, income levels, and supported languages from the World Bank API

Usage

wb_cache(lang)

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default.

Value

A list containing the following items:

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wb_languages().

Saving this return and using it as the cache parameter in wb_data() and wb_search() replaces the default cached version, wb_cachelist, that comes with the package itself.
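
One possible way to keep a downloaded cache between sessions is to write it to disk with base R. In the sketch below, toy_cache is an invented stand-in for the return of wb_cache(), and the file location is a temporary path chosen only for illustration.

```r
# A toy list standing in for the return of wb_cache(); the real object
# holds tibbles of countries, indicators, sources, and so on.
toy_cache <- list(
  countries  = data.frame(iso3c = c("BRA", "IND")),
  indicators = data.frame(indicator_id = "SP.POP.TOTL")
)

cache_file <- tempfile(fileext = ".rds")
saveRDS(toy_cache, cache_file)

# In a later session, read it back and pass it as the cache argument,
# e.g. wb_data(..., cache = restored_cache)
restored_cache <- readRDS(cache_file)
names(restored_cache)
```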


Cached information from the World Bank API

Description

This data is a cached result of the wb_cache function. By default functions wb_data and wb_search use this data for the cache parameter.

Usage

wb_cachelist

Format

An object of class list of length 8.


Cached information from the World Bank API

Description

This data is a cached result of the wbcache function. By default functions wb and wbsearch use this data for the cache parameter.

Usage

wb_cachelist_dep

Format

A list containing 7 data frames:

  • countries: A data frame. The result of calling wbcountries

  • indicators: A data frame. The result of calling wbindicators

  • sources: A data frame. The result of calling wbsources

  • datacatalog: A data frame. The result of calling wbdatacatalog

  • topics: A data frame. The result of calling wbtopics

  • income: A data frame. The result of calling wbincome

  • lending: A data frame. The result of calling wblending


Download Data from the World Bank API

Description

This function downloads the requested information using the World Bank API

Usage

wb_data(
  indicator,
  country = "countries_only",
  start_date,
  end_date,
  return_wide = TRUE,
  mrv,
  mrnev,
  cache,
  freq,
  gapfill = FALSE,
  date_as_class_date = FALSE,
  lang
)

Arguments

indicator

Character vector of indicator codes. These codes correspond to the indicator_id column from the indicators tibble of wb_cache(), wb_cachelist, or the result of running wb_indicators() directly

country

Character vector of country, region, or special value codes for the locations you want to return data for. Permissible values can be found in the countries tibble in wb_cachelist or by running wb_countries() directly. Specifically, these are values listed in the fields iso3c, iso2c, country, region, admin_region, and income_level, and in all of the region_*, admin_region_*, and income_level_* columns, as well as the following special values:

  • "countries_only" (Default)

  • "regions_only"

  • "admin_regions_only"

  • "income_levels_only"

  • "aggregates_only"

  • "all"
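
The idea of accepting identifiers of mixed types can be sketched in base R as follows. This is not the package's implementation: resolve_country is a hypothetical helper, the lookup table is a three-row toy version of the countries tibble, and the case-insensitive comparison is an assumption based on the mixed-case values used in the examples.

```r
# Toy lookup table; the real one is the countries tibble in wb_cachelist.
countries <- data.frame(
  iso3c   = c("ALB", "GEO", "MNG"),
  iso2c   = c("AL", "GE", "MN"),
  country = c("Albania", "Georgia", "Mongolia"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve mixed identifiers to iso3c codes,
# ignoring case, by checking each candidate column.
resolve_country <- function(x, lookup = countries) {
  x <- tolower(x)
  vapply(x, function(code) {
    hit <- tolower(lookup$iso3c) == code |
      tolower(lookup$iso2c) == code |
      tolower(lookup$country) == code
    if (any(hit)) lookup$iso3c[which(hit)[1]] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

resolve_country(c("AL", "Geo", "mongolia"))
# "ALB" "GEO" "MNG"
```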

start_date

Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1". This also accepts a special value of "YTD", useful for more frequently updated subannual indicators.

end_date

Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1".

return_wide

Logical. If TRUE, data is returned in a wide format instead of long, with a column named for each indicator_id. If the indicator argument is a named vector, the names() given to the indicator are used as the column names. To accommodate this transformation, the indicator column that provides the human-readable description is dropped, but provided as a column label. Default is TRUE.
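
The long-to-wide transformation with custom names can be approximated in base R. The sketch below is only an illustration under stated assumptions: the toy data and its column names are invented, and stats::reshape() stands in for whatever the package uses internally.

```r
# Toy long-format data; column names here are assumptions for illustration.
long <- data.frame(
  iso3c        = rep(c("BRA", "IND"), each = 2),
  date         = rep(2010, 4),
  indicator_id = rep(c("SP.POP.TOTL", "NY.GDP.MKTP.CD"), times = 2),
  value        = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

my_indicators <- c(pop = "SP.POP.TOTL", gdp = "NY.GDP.MKTP.CD")

# Swap each indicator ID for its custom name, then spread to one column each.
long$name <- names(my_indicators)[match(long$indicator_id, my_indicators)]
wide <- reshape(
  long[, c("iso3c", "date", "name", "value")],
  idvar = c("iso3c", "date"), timevar = "name", direction = "wide"
)
names(wide) <- sub("^value\\.", "", names(wide))
names(wide)
# "iso3c" "date" "pop" "gdp"
```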

mrv

Numeric. The number of Most Recent Values to return. A replacement for start_date and end_date, this number represents the number of observations you wish to return, starting from the most recent date of collection. This may include missing values. Useful in conjunction with freq.

mrnev

Numeric. The number of Most Recent Non Empty Values to return. A replacement for start_date and end_date, similar in behavior to mrv but excludes locations with missing values. Useful in conjunction with freq.
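
The difference between mrv and mrnev can be pictured locally with a toy series. The actual selection happens on the API side, so the sketch below shows only one reading of the two descriptions above, using invented values for a single location.

```r
# Toy yearly series for one location, oldest first; NA marks a missing value.
values <- c(`2018` = 10, `2019` = NA, `2020` = 12, `2021` = NA)

# mrv-like: the n most recent periods, missing or not
recent <- tail(values, 2)

# mrnev-like: the n most recent periods that actually hold a value
recent_non_empty <- tail(values[!is.na(values)], 2)

names(recent)            # "2020" "2021"
names(recent_non_empty)  # "2018" "2020"
```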

cache

List of tibbles returned from wb_cache(). If omitted, wb_cachelist is used

freq

Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values. Useful for querying high-frequency data.

gapfill

Logical. If TRUE, fills in missing values by carrying forward the last available value until the next available period (the maximum number of periods backtracked is limited by the mrv value). Default is FALSE.
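
Carrying a value forward over gaps can be sketched in base R as below. The API performs the fill itself, so this is only a local illustration. carry_forward and its max_fill argument are hypothetical, with max_fill loosely playing the role of the mrv limit.

```r
# Hypothetical helper: carry the last observed value forward over gaps,
# filling at most max_fill consecutive missing periods.
carry_forward <- function(x, max_fill = Inf) {
  last <- NA  # most recent observed value
  run <- 0    # consecutive periods filled since that observation
  for (i in seq_along(x)) {
    if (!is.na(x[i])) {
      last <- x[i]
      run <- 0
    } else if (!is.na(last) && run < max_fill) {
      x[i] <- last
      run <- run + 1
    }
  }
  x
}

carry_forward(c(NA, 5, NA, NA, 8, NA))
# NA 5 5 5 8 8
carry_forward(c(5, NA, NA, NA), max_fill = 2)
# 5 5 5 NA
```

A leading gap stays missing because there is no earlier observation to carry forward.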

date_as_class_date

Logical. If TRUE, the date field is returned as class Date, useful when working with non-annual data or data at mixed resolutions. Default is FALSE.

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default.

Details

obs_status column

Indicates the observation status for a location, indicator, and date combination. For example, "F" in the response indicates that the observation status for that data point is "forecast".

Value

A tibble of all available requested data.

Examples

# gdp for all countries for all available dates
df_gdp <- wb_data("NY.GDP.MKTP.CD")

# Brazilian gdp for all available dates
df_brazil <- wb_data("NY.GDP.MKTP.CD", country = "br")

# Brazilian gdp for 2006

df_brazil_1 <- wb_data("NY.GDP.MKTP.CD", country = "brazil", start_date = 2006)


# Brazilian gdp for 2006-2010

df_brazil_2 <- wb_data("NY.GDP.MKTP.CD", country = "BRA",
                       start_date = 2006, end_date = 2010)


# Population, GDP, Unemployment Rate, Birth Rate (per 1000 people)

my_indicators <- c("SP.POP.TOTL",
                   "NY.GDP.MKTP.CD",
                   "SL.UEM.TOTL.ZS",
                   "SP.DYN.CBRT.IN")


df <- wb_data(my_indicators)

# you can pass multiple country ids of different types
# Albania (iso2c), Georgia (iso3c), and Mongolia

my_countries <- c("AL", "Geo", "mongolia")
df <- wb_data(my_indicators, country = my_countries,
              start_date = 2005, end_date = 2007)


# same data as above, but in long format

df_long <- wb_data(my_indicators, country = my_countries,
                   start_date = 2005, end_date = 2007,
                   return_wide = FALSE)


# regional population totals
# regions correspond to the region column in wb_cachelist$countries

df_region <- wb_data("SP.POP.TOTL", country = "regions_only",
                     start_date = 2010, end_date = 2014)


# a specific region

df_world <- wb_data("SP.POP.TOTL", country = "world",
                    start_date = 2010, end_date = 2014)


# if the indicator is part of a named vector the name will be the column name
my_indicators <- c("pop" = "SP.POP.TOTL",
                   "gdp" = "NY.GDP.MKTP.CD",
                   "unemployment_rate" = "SL.UEM.TOTL.ZS",
                   "birth_rate" = "SP.DYN.CBRT.IN")

df_names <- wb_data(my_indicators, country = "world",
                    start_date = 2010, end_date = 2014)


# custom names are ignored if returning in long format

df_names_long <- wb_data(my_indicators, country = "world",
                         start_date = 2010, end_date = 2014,
                         return_wide = FALSE)


# same as above but in Bulgarian
# note that not all indicators have translations for all languages

df_names_long_bg <- wb_data(my_indicators, country = "world",
                            start_date = 2010, end_date = 2014,
                            return_wide = FALSE, lang = "bg")

World Bank Information End Points

Description

These functions are simple wrappers around the various API end points that are helpful for finding available data and filtering the data you are interested in when using wb_data().

Usage

wb_countries(lang)

wb_topics(lang)

wb_sources(lang)

wb_regions(lang)

wb_income_levels(lang)

wb_lending_types(lang)

wb_languages()

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default.

Value

A tibble of information about the end point

See Also

wb_cache()


Download Available Indicators from the World Bank

Description

This function returns a tibble of indicator IDs and related information that are available for download from the World Bank API

Usage

wb_indicators(lang, include_archive = FALSE)

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default.

include_archive

Logical. If TRUE, indicators that have been archived by the World Bank will be included in the return. Data for these additional indicators are not available through the standard API, and querying them using wb_data() will not return data. Default is FALSE.

Examples

# can get a new list of available indicators by downloading new cache
fresh_cache <- wb_cache()
fresh_indicators <- fresh_cache$indicators

# or by running the wb_indicators() function directly
fresh_indicators <- wb_indicators()

# include archived indicators
# see include_archive parameter description
indicators_with_archive <- wb_indicators(include_archive = TRUE)

Download an updated list of country, indicator, and source information

Description

Download an updated list of information regarding countries, indicators, sources, data catalog, indicator topics, lending types, and income levels from the World Bank API

Usage

wbcache(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A list containing the following items:

  • countries: A data frame. The result of calling wbcountries

  • indicators: A data frame. The result of calling wbindicators

  • sources: A data frame. The result of calling wbsources

  • datacatalog: A data frame. The result of calling wbdatacatalog

  • topics: A data frame. The result of calling wbtopics

  • income: A data frame. The result of calling wbincome

  • lending: A data frame. The result of calling wblending

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin

List item datacatalog will always return in English, as the API does not support any other languages for that information.

Saving this return and using it as the cache parameter in wb and wbsearch replaces the default cached version, wb_cachelist, that comes with the package itself.


Download updated country and region information from World Bank API

Description

Download updated information on available countries and regions from the World Bank API

Usage

wbcountries(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available countries and regions with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin


Download an updated list of the World Bank data catalog

Description

Download an updated list of the World Bank data catalog from the World Bank API

Usage

wbdatacatalog()

Value

A data frame of the World Bank data catalog with related information

Note

This function does not support any languages other than English due to the lack of support from the World Bank API.


Download updated income type information from World Bank API

Description

Download updated information on available income types from the World Bank API

Usage

wbincome(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available income types with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin


Download updated indicator information from World Bank API

Description

Download updated information on available indicators from the World Bank API

Usage

wbindicators(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available indicators with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin


Download updated lending type information from World Bank API

Description

Download updated information on available lending types from the World Bank API

Usage

wblending(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available lending types with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin


Search indicator information available through the World Bank API

Description

This function finds indicators that match a search term and returns a data frame of matching results.

Usage

wbsearch(
  pattern = "poverty",
  fields = c("indicator", "indicatorDesc"),
  extra = FALSE,
  cache
)

Arguments

pattern

Character string or regular expression to be matched

fields

Character vector of column names through which to search

extra

If FALSE, only the indicator ID and short name are returned. If TRUE, all columns of the cache parameter's indicator data frame are returned.

cache

List of data frames returned from wbcache. If omitted, wb_cachelist_dep is used

Value

Data frame with indicators that match the search pattern.

Examples

wbsearch(pattern = "education")

wbsearch(pattern = "Food and Agriculture Organization", fields = "sourceOrg")

# with regular expression operators
# 'poverty' OR 'unemployment' OR 'employment'
wbsearch(pattern = "poverty|unemployment|employment")
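
The kind of matching described above, a pattern tested against several fields, can be sketched in base R with a toy table. This is not the package's implementation: search_fields is a hypothetical helper, the three rows carry abbreviated toy labels, and case-insensitive matching is an assumption of the sketch.

```r
# Toy indicator table with abbreviated labels; the real table is the
# indicators data frame in the cache.
indicators <- data.frame(
  indicatorID   = c("SI.POV.DDAY", "SL.UEM.TOTL.ZS", "SP.POP.TOTL"),
  indicator     = c("Poverty headcount ratio", "Unemployment, total",
                    "Population, total"),
  indicatorDesc = c("Share of population in poverty",
                    "Share of labor force unemployed",
                    "Total population"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where the pattern matches any listed field.
search_fields <- function(pattern, fields, data = indicators) {
  hits <- Reduce(`|`, lapply(fields, function(f) {
    grepl(pattern, data[[f]], ignore.case = TRUE)
  }))
  data[hits, c("indicatorID", "indicator")]
}

found <- search_fields("poverty|unemployment", c("indicator", "indicatorDesc"))
found$indicatorID
# "SI.POV.DDAY" "SL.UEM.TOTL.ZS"
```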

Download updated data source information from World Bank API

Description

Download updated information on available data sources from the World Bank API

Usage

wbsources(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available data sources with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin


wbstats: An R package for searching and downloading data from the World Bank API.

Description

The wbstats package provides structured access to data available from the World Bank API, including support for multiple languages and access to annual, quarterly, and monthly data.


Download updated indicator topic information from World Bank API

Description

Download updated information on available indicator topics from the World Bank API

Usage

wbtopics(lang = c("en", "es", "fr", "ar", "zh"))

Arguments

lang

Language in which to return the results. If lang is unspecified, English is the default.

Value

A data frame of available indicator topics with related information

Note

Not all data returns support languages other than English. If a specific return does not support the requested language, it will return NA by default. For an enumeration of supported languages by data source, see wbdatacatalog. The options for lang are:

  • en: English

  • es: Spanish

  • fr: French

  • ar: Arabic

  • zh: Mandarin