library(npi)
This vignette provides a brief introduction to the npi package.
npi is an R package that gives R users access to the U.S. National Provider Identifier (NPI) Registry API, provided by the Centers for Medicare and Medicaid Services (CMS).
The package makes it easy to obtain administrative data linked to a specific individual or organizational healthcare provider. Additionally, users can perform advanced searches based on provider name, location, type of service, credentials, and many other attributes.
To explore providers with primary locations in New York City, we can use the city argument of npi_search(). The nyc dataset here contains 10 providers with primary locations in New York City, since 10 is the default number of records returned by npi_search(). As the output shows, a search on city alone returns both individual and organizational providers. The response is a tibble that has high-cardinality data organized into list columns.
nyc <- npi_search(city = "New York City")
#> 10 records requested
#> Requesting records 0-10...
nyc
#> # A tibble: 10 × 11
#> npi enume…¹ basic other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#> * <chr> <chr> <list> <list> <list> <list> <list> <list> <list>
#> 1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 2 more variables: created_date <dttm>, last_updated_date <dttm>, and
#> # abbreviated variable names ¹enumeration_type, ²other_names, ³identifiers,
#> # ⁴taxonomies, ⁵addresses, ⁶practice_locations, ⁷endpoints
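Since list columns can be unfamiliar, here is a minimal sketch of how they behave, using a made-up two-row data frame rather than real registry data. The toy object, its npi values, and the code and primary fields are assumptions for illustration only. With a real result, the same [[ indexing would apply to a column such as nyc$taxonomies.

```r
# Toy stand-in for a search result: made-up values, base R only.
toy <- data.frame(npi = c("1000000001", "1000000002"),
                  stringsAsFactors = FALSE)

# A list column holds one nested data frame per provider.
toy$taxonomies <- list(
  data.frame(code = c("A1", "B2"), primary = c(TRUE, FALSE)),
  data.frame(code = "C3", primary = TRUE)
)

# Pull out the nested data frame for the first provider.
first_taxonomies <- toy$taxonomies[[1]]

# Count how many nested rows each provider has.
rows_per_provider <- vapply(toy$taxonomies, nrow, integer(1))
```

Here rows_per_provider differs between the two providers, which is why this kind of data is kept nested instead of being forced into one row per provider.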
Other search arguments for the function include number, enumeration_type, taxonomy_description, first_name, last_name, use_first_name_alias, organization_name, address_purpose, state, postal_code, country_code, and limit.
Additionally, more than one search argument can be used at once.
nyc_multi <- npi_search(city = "New York City", state = "NY", enumeration_type = "org")
#> 10 records requested
#> Requesting records 0-10...
nyc_multi
#> # A tibble: 10 × 11
#> npi enume…¹ basic other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#> * <chr> <chr> <list> <list> <list> <list> <list> <list> <list>
#> 1 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 2 15886… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 3 16292… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 4 15383… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 5 10637… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 6 12354… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 7 12452… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 8 12353… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 9 11849… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 16799… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 2 more variables: created_date <dttm>, last_updated_date <dttm>, and
#> # abbreviated variable names ¹enumeration_type, ²other_names, ³identifiers,
#> # ⁴taxonomies, ⁵addresses, ⁶practice_locations, ⁷endpoints
For more details, visit the function’s help page via ?npi_search after installing and loading the package.
The limit argument of npi_search() sets the maximum number of records to return, from 1 to 1200 inclusive. It defaults to 10 records if no value is specified.
nyc_25 <- npi_search(city = "New York City", limit = 25)
#> 25 records requested
#> Requesting records 0-25...
nyc_25
#> # A tibble: 25 × 11
#> npi enume…¹ basic other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#> * <chr> <chr> <list> <list> <list> <list> <list> <list> <list>
#> 1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 15 more rows, 2 more variables: created_date <dttm>,
#> # last_updated_date <dttm>, and abbreviated variable names ¹enumeration_type,
#> # ²other_names, ³identifiers, ⁴taxonomies, ⁵addresses, ⁶practice_locations,
#> # ⁷endpoints
When using npi_search(), requesting more than 200 records (for example, 300 records) may result in multiple API calls. This is because the API itself returns up to 200 records per request, but allows previously requested records to be skipped. npi_search() will automatically make additional API calls up to the API’s limit of 1200 records for a unique set of query parameter values, and will still return a single tibble. However, to save time, the function only makes additional requests if needed. For example, if you request 1200 records and only 199 are returned in the first request, then the function does not need to make a second request because there are no more records to return.
nyc_300 <- npi_search(city = "New York City", limit = 300)
#> 300 records requested
#> Requesting records 0-200...
#> Requesting records 200-300...
nyc_300
#> # A tibble: 300 × 11
#> npi enume…¹ basic other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#> * <chr> <chr> <list> <list> <list> <list> <list> <list> <list>
#> 1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 290 more rows, 2 more variables: created_date <dttm>,
#> # last_updated_date <dttm>, and abbreviated variable names ¹enumeration_type,
#> # ²other_names, ³identifiers, ⁴taxonomies, ⁵addresses, ⁶practice_locations,
#> # ⁷endpoints
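The batching behaviour described above can be sketched in base R. This is an illustration of the general skip-and-limit technique, not the package's actual implementation: plan_requests(), fetch_all(), and fake_fetch() are hypothetical names, and the registry is simulated with a plain integer vector so that no API call is made.

```r
# Hypothetical helper: split a requested record count into (skip, limit)
# pairs, assuming at most 200 records per request and 1200 in total.
plan_requests <- function(n_requested, batch_size = 200L, max_records = 1200L) {
  stopifnot(n_requested >= 1, n_requested <= max_records)
  skip <- seq(0L, n_requested - 1L, by = batch_size)
  limit <- pmin(batch_size, n_requested - skip)
  data.frame(skip = skip, limit = limit)
}

# Hypothetical driver: run the plan, stopping early when a batch comes back
# short, since that means there are no more records to return.
fetch_all <- function(n_requested, fetch_batch, batch_size = 200L) {
  plan <- plan_requests(n_requested, batch_size)
  results <- list()
  for (i in seq_len(nrow(plan))) {
    batch <- fetch_batch(skip = plan$skip[i], limit = plan$limit[i])
    results[[i]] <- batch
    if (length(batch) < plan$limit[i]) break
  }
  unlist(results)
}

# Simulated registry holding 199 matching records (made-up stand-in).
n_calls <- 0L
fake_fetch <- function(skip, limit) {
  n_calls <<- n_calls + 1L
  available <- seq_len(199L)
  available[available > skip & available <= skip + limit]
}

plan_300 <- plan_requests(300L)          # two batches: 0-200 and 200-300
records <- fetch_all(1200L, fake_fetch)  # stops after the first short batch
```

In this simulation, plan_300 mirrors the two "Requesting records" messages shown for nyc_300, and the request for 1200 records finishes after a single call because the first batch returns fewer than 200 records.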
The NPPES API documentation does not specify additional API rate limitations. However, if you need more than 1200 NPI records for a set of search terms, you will need to download the NPPES Data Dissemination File.
npi_summarize() provides a more human-readable overview of output already obtained through npi_search().
npi_summarize(nyc)
#> # A tibble: 10 × 6
#> npi name enumeration_type primary_practi…¹ phone prima…²
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1326214693 BENJAMIN BOWLING Individual <NA> 212-… Psychi…
#> 2 1356498703 MICHAEL SCHMIDT Individual 4401 BRONX BOUL… 718-… Intern…
#> 3 1497228076 VIVIAN AYALA Individual <NA> 212-… Social…
#> 4 1972840189 BEVERLY SUAREZ LLC Organization 220-18 HORACE H… 718-… Psychi…
#> 5 1407906092 MELINDA SCHROEDER Individual <NA> 212-… Social…
#> 6 1366591505 ANNE GRIFFIN Individual 205 EAST 78TH S… 212-… Social…
#> 7 1851622625 TOD GRAPES Individual 169 MANHATTAN A… 212-… Social…
#> 8 1659422525 ELLEN FEINSTEIN Individual 441 W END AVE S… 212-… Intern…
#> 9 1669524237 LEE SHECHTMAN Individual 247 3RD AVE SUI… 212-… Kinesi…
#> 10 1093868499 BENJAMIN SADOCK Individual <NA> 212-… Day Tr…
#> # … with abbreviated variable names ¹primary_practice_address,
#> # ²primary_taxonomy
Additionally, users can flatten all the list columns using npi_flatten().
npi_flatten(nyc)
#> # A tibble: 30 × 48
#> npi basic…¹ basic…² basic…³ basic…⁴ basic…⁵ basic…⁶ basic…⁷ basic…⁸ basic…⁹
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1093… BENJAM… SADOCK JAMES MD NO M 2007-0… 2007-0… A
#> 2 1093… BENJAM… SADOCK JAMES MD NO M 2007-0… 2007-0… A
#> 3 1326… BENJAM… BOWLING DOUGLAS M.D. NO M 2008-0… 2014-0… A
#> 4 1326… BENJAM… BOWLING DOUGLAS M.D. NO M 2008-0… 2014-0… A
#> 5 1356… MICHAEL SCHMIDT THOMAS MSW LC… NO M 2007-0… 2007-0… A
#> 6 1356… MICHAEL SCHMIDT THOMAS MSW LC… NO M 2007-0… 2007-0… A
#> 7 1366… ANNE GRIFFIN MCLEAN MD NO F 2007-0… 2007-0… A
#> 8 1366… ANNE GRIFFIN MCLEAN MD NO F 2007-0… 2007-0… A
#> 9 1407… MELINDA SCHROE… LUCY LCSW NO F 2007-0… 2007-0… A
#> 10 1407… MELINDA SCHROE… LUCY LCSW NO F 2007-0… 2007-0… A
#> # … with 20 more rows, 38 more variables: basic_name_prefix <chr>,
#> # basic_name_suffix <chr>, basic_organization_name <chr>,
#> # basic_organizational_subpart <chr>,
#> # basic_authorized_official_first_name <chr>,
#> # basic_authorized_official_last_name <chr>,
#> # basic_authorized_official_middle_name <chr>,
#> # basic_authorized_official_telephone_number <chr>, …
Alternatively, individual columns can be flattened for each npi by using the cols argument. Only the specified columns are flattened and returned, along with the npi column.
npi_flatten(nyc, cols = c("basic", "taxonomies"))
#> # A tibble: 10 × 26
#> npi basic…¹ basic…² basic…³ basic…⁴ basic…⁵ basic…⁶ basic…⁷ basic…⁸ basic…⁹
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1093… BENJAM… SADOCK JAMES MD NO M 2007-0… 2007-0… A
#> 2 1326… BENJAM… BOWLING DOUGLAS M.D. NO M 2008-0… 2014-0… A
#> 3 1356… MICHAEL SCHMIDT THOMAS MSW LC… NO M 2007-0… 2007-0… A
#> 4 1366… ANNE GRIFFIN MCLEAN MD NO F 2007-0… 2007-0… A
#> 5 1407… MELINDA SCHROE… LUCY LCSW NO F 2007-0… 2007-0… A
#> 6 1497… VIVIAN AYALA ROSE <NA> NO F 2019-0… 2019-0… A
#> 7 1659… ELLEN FEINST… MARCH CSW YES F 2007-0… 2007-0… A
#> 8 1669… LEE SHECHT… <NA> M.D. YES M 2007-0… 2012-0… A
#> 9 1851… TOD GRAPES T B.S. e… YES M 2010-0… 2010-0… A
#> 10 1972… <NA> <NA> <NA> <NA> <NA> <NA> 2013-0… 2013-0… A
#> # … with 16 more variables: basic_name_prefix <chr>, basic_name_suffix <chr>,
#> # basic_organization_name <chr>, basic_organizational_subpart <chr>,
#> # basic_authorized_official_first_name <chr>,
#> # basic_authorized_official_last_name <chr>,
#> # basic_authorized_official_middle_name <chr>,
#> # basic_authorized_official_telephone_number <chr>,
#> # basic_authorized_official_title_or_position <chr>, …
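Flattening every list column produced 30 rows for 10 providers above, while flattening only basic and taxonomies produced 10. One plausible explanation, assuming each list column is unnested and the pieces are then joined by npi, is that one-to-many columns multiply rows when combined. The sketch below illustrates that effect with base R merge() on made-up data. The values and the addresses_ and taxonomies_ column names are hypothetical, and this is not the package's own code.

```r
# Made-up, already-unnested pieces keyed by npi (toy values only).
basic <- data.frame(
  npi = c("1", "2"),
  basic_first_name = c("ANN", "BOB")
)
addresses <- data.frame(
  npi = c("1", "1", "2"),
  addresses_address_purpose = c("LOCATION", "MAILING", "LOCATION")
)
taxonomies <- data.frame(
  npi = c("1", "2", "2"),
  taxonomies_code = c("X", "Y", "Z")
)

# Left-join the pieces one after another on npi.
flat <- Reduce(
  function(x, y) merge(x, y, by = "npi", all.x = TRUE),
  list(basic, addresses, taxonomies)
)

# Joining fewer pieces keeps the row count lower.
flat_basic_only <- merge(basic, taxonomies[taxonomies$npi == "1", ],
                         by = "npi", all.x = TRUE)
```

In this toy data, provider "1" has two addresses and one taxonomy, and provider "2" has one address and two taxonomies, so each contributes two rows to flat. Restricting the flattening to the columns you need is one way to keep the result closer to one row per provider.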