Title: | Programmatic Access to Data and Statistics from the World Bank API |
---|---|
Description: | Search and download data from the World Bank Data API. |
Authors: | Jesse Piburn [aut, cre] |
Maintainer: | Jesse Piburn <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.5.9000 |
Built: | 2025-02-25 06:02:23 UTC |
Source: | https://github.com/gshs-ornl/wbstats |
This function downloads the requested information using the World Bank API
wb( country = "all", indicator, startdate, enddate, mrv, return_wide = FALSE, gapfill, freq, cache, lang = c("en", "es", "fr", "ar", "zh"), removeNA = TRUE, POSIXct = FALSE, include_dec = FALSE, include_unit = FALSE, include_obsStatus = FALSE, include_lastUpdated = FALSE )
country |
Character vector of country or region codes. Default value is special code of |
indicator |
Character vector of indicator codes. These codes correspond to the |
startdate |
Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1". This also accepts a special value of "YTD", useful for more frequently updated subannual indicators. |
enddate |
Numeric or character. If numeric it must be in %Y form (i.e. four digit year). For data at the subannual granularity the API supports a format as follows: for monthly data, "2016M01" and for quarterly data, "2016Q1". |
mrv |
Numeric. The number of Most Recent Values to return. A replacement of |
return_wide |
Logical. If |
gapfill |
Logical. Works with |
freq |
Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values.
Currently works along with |
cache |
List of data frames returned from |
lang |
Language in which to return the results. If |
removeNA |
if |
POSIXct |
if |
include_dec |
if |
include_unit |
if |
include_obsStatus |
if |
include_lastUpdated |
if |
Data frame with all available requested data.
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
The POSIXct parameter requires the use of lubridate (>= 1.5.0). All dates are rounded down to the floor. For example, a value for the year 2016 would have a POSIXct date of 2016-01-01. If this package is not available and the POSIXct parameter is set to TRUE, the parameter is ignored and a warning is produced.
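To make the "rounded down to the floor" rule concrete, the following base R sketch maps the three date formats described above (yearly, monthly, quarterly) to the first day of the period. The helper name wb_date_floor is hypothetical and is not part of wbstats, and the package itself relies on lubridate rather than this logic.

```r
# Hypothetical helper (not part of wbstats): floor a World Bank style
# date string ("2016", "2016M03", "2016Q2") to the first day of its period.
wb_date_floor <- function(x) {
  x <- as.character(x)
  year <- as.integer(substr(x, 1, 4))

  # Yearly values default to January
  month <- rep(1L, length(x))

  is_month <- grepl("^[0-9]{4}M[0-9]{2}$", x)
  is_quarter <- grepl("^[0-9]{4}Q[1-4]$", x)

  # "2016M03" -> month 3
  month[is_month] <- as.integer(substr(x[is_month], 6, 7))
  # "2016Q2" -> first month of the quarter (Q1 = 1, Q2 = 4, Q3 = 7, Q4 = 10)
  month[is_quarter] <- (as.integer(substr(x[is_quarter], 6, 6)) - 1L) * 3L + 1L

  as.Date(sprintf("%04d-%02d-01", year, month))
}

wb_date_floor(c("2016", "2016M03", "2016Q2"))
# "2016-01-01" "2016-03-01" "2016-04-01"
```

The sketch returns Date rather than POSIXct values to stay within base R; the flooring rule is the same.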
The include_dec, include_unit, and include_obsStatus parameters default to FALSE because, as of writing, all returns have a value of 0, NA, and NA, respectively. These columns might be used in the future by the API, therefore the option to include them is available.
The include_lastUpdated parameter defaults to FALSE as well to limit the
If there is no data available that matches the request parameters, an empty data frame is returned along with a warning. This design is for easy aggregation of multiple calls.
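As a rough illustration of why an empty data frame aggregates easily, the sketch below binds the results of several calls with rbind(). The three data frames are placeholders standing in for real wb() returns, and their column names and values are assumptions made for the example, not actual API output.

```r
# Placeholder results standing in for three separate wb() calls.
# Column names and values are illustrative assumptions, not real data.
res_a <- data.frame(iso2c = "AA", date = "2016", value = 1.5,
                    stringsAsFactors = FALSE)
res_b <- res_a[0, ]  # a call that matched no data: zero rows
res_c <- data.frame(iso2c = "BB", date = "2016", value = 2.5,
                    stringsAsFactors = FALSE)

# The empty result contributes no rows and does not break the bind
combined <- do.call(rbind, list(res_a, res_b, res_c))
nrow(combined)
# 2
```

Because the empty case needs no special handling, a loop over many country or indicator requests can collect every return in a list and bind once at the end.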
# GDP at market prices (current US$) for all available countries and regions
wb(indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

# GDP and Population in long format for the most recent 20 observations
wb(indicator = c("SP.POP.TOTL","NY.GDP.MKTP.CD"), mrv = 20)

# GDP and Population in wide format for the most recent 20 observations
wb(indicator = c("SP.POP.TOTL","NY.GDP.MKTP.CD"), mrv = 20, return_wide = TRUE)

# query using regionID or incomeID
# High Income Countries and Sub-Saharan Africa (all income levels)
wb(country = c("HIC", "SSF"), indicator = "NY.GDP.MKTP.CD", startdate = 1985, enddate = 1985)

# if you do not know the latest date for which an indicator is available, mrv can help
wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 1)

# increase the mrv value to increase the maximum number of returns
wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 35)

# GDP at market prices (current US$) for only available countries
wb(country = "countries_only", indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

# GDP at market prices (current US$) for only available aggregate regions
wb(country = "aggregates", indicator = "NY.GDP.MKTP.CD", startdate = 2000, enddate = 2016)

# if you want to "fill-in" the values in between actual observations use gapfill = TRUE
# this highlights a very important difference.
# all other parameters are the same as above, except gapfill = TRUE
# and the results are very different
wb(country = c("IN"), indicator = 'EG.ELC.ACCS.ZS', mrv = 35, gapfill = TRUE)

# if you want the most recent values within a certain time frame
wb(country = c("US"), indicator = 'SI.DST.04TH.20', startdate = 1970, enddate = 2000, mrv = 2)

# without the freq parameter the default temporal granularity search is yearly
# should return the 12 most recent years of data
wb(country = c("CHN", "IND"), indicator = "DPANUSSPF", mrv = 12)

# if another frequency is available for that indicator it can be accessed using the freq parameter
# should return the 12 most recent months of data
wb(country = c("CHN", "IND"), indicator = "DPANUSSPF", mrv = 12, freq = "M")
Download an updated list of information regarding countries, indicators, sources, regions, indicator topics, lending types, income levels, and supported languages from the World Bank API
wb_cache(lang)
lang |
Language in which to return the results. If |
A list containing the following items:
countries: The result of calling wb_countries()
indicators: The result of calling wb_indicators()
sources: The result of calling wb_sources()
topics: The result of calling wb_topics()
regions: The result of calling wb_regions()
income_levels: The result of calling wb_income_levels()
lending_types: The result of calling wb_lending_types()
languages: The result of calling wb_languages()
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wb_languages().
Saving this return and using it as the cache parameter in wb_data() and wb_search() replaces the default cached version wb_cachelist that comes with the package itself.
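One way to apply this is sketched below: download a fresh cache once and pass it to wb_search() through its cache argument. The wrapper name search_fresh is hypothetical. Calling it requires the wbstats package and a network connection, so only the definition is shown here.

```r
# Hypothetical wrapper (not part of wbstats). Calling it needs wbstats
# installed and network access, because wb_cache() queries the API.
search_fresh <- function(pattern, lang = "en") {
  # Download current metadata instead of relying on the bundled wb_cachelist
  fresh_cache <- wbstats::wb_cache(lang = lang)

  # Search the freshly downloaded indicator list
  wbstats::wb_search(pattern = pattern, cache = fresh_cache)
}
```

When several searches or downloads are planned, it is probably cheaper to call wb_cache() once, keep the result in a variable, and reuse it for each wb_data() and wb_search() call rather than downloading it every time as this wrapper does.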
This data is a cached result of the wb_cache function. By default, functions wb_data and wb_search use this data for the cache parameter.
wb_cachelist
An object of class list of length 8.
This data is a cached result of the wbcache function. By default, functions wb and wbsearch use this data for the cache parameter.
wb_cachelist_dep
A list containing 7 data frames:
countries: A data frame. The result of calling wbcountries
indicators: A data frame. The result of calling wbindicators
sources: A data frame. The result of calling wbsources
datacatalog: A data frame. The result of calling wbdatacatalog
topics: A data frame. The result of calling wbtopics
income: A data frame. The result of calling wbincome
lending: A data frame. The result of calling wblending
This function downloads the requested information using the World Bank API
wb_data( indicator, country = "countries_only", start_date, end_date, return_wide = TRUE, mrv, mrnev, cache, freq, gapfill = FALSE, date_as_class_date = FALSE, lang )
indicator |
Character vector of indicator codes. These codes correspond
to the |
country |
Character vector of country, region, or special value codes for the
locations you want to return data for. Permissible values can be found in the
countries tibble in wb_cachelist or by running
|
start_date |
Numeric or character. If numeric it must be in |
end_date |
Numeric or character. If numeric it must be in |
return_wide |
Logical. If |
mrv |
Numeric. The number of Most Recent Values to return. A replacement
of |
mrnev |
Numeric. The number of Most Recent Non Empty Values to return. A replacement
of |
cache |
List of tibbles returned from |
freq |
Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values. Useful for querying high frequency data. |
gapfill |
Logical. If |
date_as_class_date |
Logical. If |
lang |
Language in which to return the results. If |
The obs_status column indicates the observation status for a location, indicator, and date combination. For example, "F" in the response indicates that the observation status for that data point is "forecast".
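A short sketch of how the obs_status column might be used to drop forecast values before analysis. The data frame below is a made-up stand-in for a wb_data() return; apart from obs_status, its column names and all of its values are assumptions for illustration.

```r
# Made-up stand-in for a wb_data() return; values are placeholders.
df <- data.frame(
  iso3c = c("AAA", "AAA", "AAA"),
  date = c(2019, 2020, 2021),
  value = c(10, 11, 12),
  obs_status = c(NA, NA, "F"),
  stringsAsFactors = FALSE
)

# Keep rows that are not flagged as forecasts.
# The is.na() check matters: NA != "F" evaluates to NA, not TRUE.
observed <- df[is.na(df$obs_status) | df$obs_status != "F", ]
nrow(observed)
# 2
```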
A tibble of all available requested data.
# gdp for all countries for all available dates
df_gdp <- wb_data("NY.GDP.MKTP.CD")

# Brazilian gdp for all available dates
df_brazil <- wb_data("NY.GDP.MKTP.CD", country = "br")

# Brazilian gdp for 2006
df_brazil_1 <- wb_data("NY.GDP.MKTP.CD", country = "brazil", start_date = 2006)

# Brazilian gdp for 2006-2010
df_brazil_2 <- wb_data("NY.GDP.MKTP.CD", country = "BRA", start_date = 2006, end_date = 2010)

# Population, GDP, Unemployment Rate, Birth Rate (per 1000 people)
my_indicators <- c("SP.POP.TOTL", "NY.GDP.MKTP.CD", "SL.UEM.TOTL.ZS", "SP.DYN.CBRT.IN")
df <- wb_data(my_indicators)

# you can pass multiple country ids of different types
# Albania (iso2c), Georgia (iso3c), and Mongolia
my_countries <- c("AL", "Geo", "mongolia")
df <- wb_data(my_indicators, country = my_countries, start_date = 2005, end_date = 2007)

# same data as above, but in long format
df_long <- wb_data(my_indicators, country = my_countries, start_date = 2005, end_date = 2007, return_wide = FALSE)

# regional population totals
# regions correspond to the region column in wb_cachelist$countries
df_region <- wb_data("SP.POP.TOTL", country = "regions_only", start_date = 2010, end_date = 2014)

# a specific region
df_world <- wb_data("SP.POP.TOTL", country = "world", start_date = 2010, end_date = 2014)

# if the indicator is part of a named vector the name will be the column name
my_indicators <- c("pop" = "SP.POP.TOTL", "gdp" = "NY.GDP.MKTP.CD", "unemployment_rate" = "SL.UEM.TOTL.ZS", "birth_rate" = "SP.DYN.CBRT.IN")
df_names <- wb_data(my_indicators, country = "world", start_date = 2010, end_date = 2014)

# custom names are ignored if returning in long format
df_names_long <- wb_data(my_indicators, country = "world", start_date = 2010, end_date = 2014, return_wide = FALSE)

# same as above but in Bulgarian
# note that not all indicators have translations for all languages
df_names_long_bg <- wb_data(my_indicators, country = "world", start_date = 2010, end_date = 2014, return_wide = FALSE, lang = "bg")
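The relationship between the long and wide layouts, and the effect of a named indicator vector, can be imitated offline with base R. This is only a sketch of the idea, not how wb_data() is implemented. The long data frame is a made-up placeholder, and its column names are assumptions for illustration.

```r
# Made-up long-format data: one row per location, date, and indicator
long <- data.frame(
  iso3c = c("AAA", "AAA", "BBB", "BBB"),
  date = c(2010, 2010, 2010, 2010),
  indicator_id = c("SP.POP.TOTL", "NY.GDP.MKTP.CD",
                   "SP.POP.TOTL", "NY.GDP.MKTP.CD"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Wide format: one row per location and date, one column per indicator
wide <- reshape(long, idvar = c("iso3c", "date"),
                timevar = "indicator_id", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))

# A named vector supplies custom column names for the indicator codes
my_indicators <- c(pop = "SP.POP.TOTL", gdp = "NY.GDP.MKTP.CD")
idx <- match(my_indicators, names(wide))
names(wide)[idx] <- names(my_indicators)

names(wide)
# "iso3c" "date" "pop" "gdp"
```

In the long layout the indicator code stays in its own column, so there is no column to rename.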
These functions are simple wrappers around the various API endpoints that are helpful for finding available data and filtering the data you are interested in when using wb_data().
wb_countries(lang)
wb_topics(lang)
wb_sources(lang)
wb_regions(lang)
wb_income_levels(lang)
wb_lending_types(lang)
wb_languages()
lang |
Language in which to return the results. If |
A tibble of information about the endpoint.
This function returns a tibble of indicator IDs and related information that are available for download from the World Bank API
wb_indicators(lang, include_archive = FALSE)
lang |
Language in which to return the results. If |
include_archive |
|
# can get a new list of available indicators by downloading new cache
fresh_cache <- wb_cache()
fresh_indicators <- fresh_cache$indicators

# or by running the wb_indicators() function directly
fresh_indicators <- wb_indicators()

# include archived indicators
# see include_archive parameter description
indicators_with_archive <- wb_indicators(include_archive = TRUE)
This function finds indicators that match a search term and returns a data frame of matching results
wb_search( pattern, fields = c("indicator_id", "indicator", "indicator_desc"), extra = FALSE, cache, ignore.case = TRUE, ... )
pattern |
Character string or regular expression to be matched |
fields |
Character vector of column names through which to search |
extra |
if FALSE, only the indicator ID and short name are returned,
if |
cache |
List of data frames returned from |
ignore.case |
if |
... |
Any additional |
A tibble with indicators that match the search pattern.
d <- wb_search(pattern = "education")

d <- wb_search(pattern = "Food and Agriculture Organization", fields = "source_org")

# with regular expression operators
# 'poverty' OR 'unemployment' OR 'employment'
d <- wb_search(pattern = "poverty|unemployment|employment")

# pass any other grep argument along as well
# everything without 'education'
d <- wb_search(pattern = "education", invert = TRUE)

# contains "gdp" AND "trade"
d <- wb_search("^(?=.*gdp)(?=.*trade).*", perl = TRUE)

# contains "gdp" and NOT "trade"
d <- wb_search("^(?=.*gdp)(?!.*trade).*", perl = TRUE)
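The AND and NOT patterns in the examples rely on Perl-style lookaheads. The sketch below tries the same patterns with base R grepl() on a few illustrative strings, assuming the search matches text the way grepl() does. The strings are shortened stand-ins, not an actual indicator list.

```r
# Illustrative strings only, not a real indicator list
indicator_names <- c(
  "GDP (current US$)",
  "Trade (% of GDP)",
  "Merchandise trade",
  "Population, total"
)

# (?=.*gdp) requires "gdp" somewhere in the string; (?=.*trade) requires
# "trade" as well, so both must be present
both <- grepl("^(?=.*gdp)(?=.*trade).*", indicator_names,
              perl = TRUE, ignore.case = TRUE)

# (?!.*trade) is a negative lookahead: "trade" must not appear
gdp_only <- grepl("^(?=.*gdp)(?!.*trade).*", indicator_names,
                  perl = TRUE, ignore.case = TRUE)

indicator_names[both]
# "Trade (% of GDP)"
indicator_names[gdp_only]
# "GDP (current US$)"
```

Lookaheads are only available with perl = TRUE, which is why the examples above pass that argument.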
Download an updated list of information regarding countries, indicators, sources, data catalog, indicator topics, lending types, and income levels from the World Bank API
wbcache(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A list containing the following items:
countries: A data frame. The result of calling wbcountries
indicators: A data frame. The result of calling wbindicators
sources: A data frame. The result of calling wbsources
datacatalog: A data frame. The result of calling wbdatacatalog
topics: A data frame. The result of calling wbtopics
income: A data frame. The result of calling wbincome
lending: A data frame. The result of calling wblending
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
List item datacatalog will always return in English, as the API does not support any other languages for that information.
Saving this return and using it as the cache parameter in wb and wbsearch replaces the default cached version wb_cachelist that comes with the package itself.
Download updated information on available countries and regions from the World Bank API
wbcountries(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available countries and regions with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
Download an updated list of the World Bank data catalog from the World Bank API
wbdatacatalog()
A data frame of the World Bank data catalog with related information
This function does not support any languages other than English because the World Bank API does not provide them for this endpoint.
Download updated information on available income types from the World Bank API
wbincome(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available income types with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
Download updated information on available indicators from the World Bank API
wbindicators(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available indicators with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
Download updated information on available lending types from the World Bank API
wblending(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available lending types with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
This function finds indicators that match a search term and returns a data frame of matching results
wbsearch( pattern = "poverty", fields = c("indicator", "indicatorDesc"), extra = FALSE, cache )
pattern |
Character string or regular expression to be matched |
fields |
Character vector of column names through which to search |
extra |
if |
cache |
List of data frames returned from |
Data frame with indicators that match the search pattern.
wbsearch(pattern = "education")

wbsearch(pattern = "Food and Agriculture Organization", fields = "sourceOrg")

# with regular expression operators
# 'poverty' OR 'unemployment' OR 'employment'
wbsearch(pattern = "poverty|unemployment|employment")
Download updated information on available data sources from the World Bank API
wbsources(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available data sources with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin
The wbstats package provides structured access to data available from the World Bank API, including support for multiple languages and access to annual, quarterly, and monthly data.
Download updated information on available indicator topics from the World Bank API
wbtopics(lang = c("en", "es", "fr", "ar", "zh"))
lang |
Language in which to return the results. If |
A data frame of available indicator topics with related information
Not all data returns have support for languages other than English. If the specific return does not support your requested language, by default it will return NA. For an enumeration of supported languages by data source, please see wbdatacatalog.
The options for lang are:
en: English
es: Spanish
fr: French
ar: Arabic
zh: Mandarin