radous
allows you to generate random user data from the
Random User Generator API which can
be useful in many situations :
You can generate up to 5000 observations in one query.
You can radous
from CRAN with:
install.packages("radous")
radous
is extremely simple to use and has one function:
get_data()
.
Suppose we want to generate 10 random user data:
library(radous)
get_data(n = 10)
# A tibble: 10 x 34
~ location_street~
gender name_title name_first name_last location_street<chr> <chr> <chr> <chr> <dbl> <chr>
1 male Mr Phoenix King 2697 West Quay
2 female Miss Venla Lehto 1288 Rotuaari
3 female Miss Ingeborg Hatlevik 549 Revierstredet
4 male Mr Salazar Duarte 3498 Rua Primeiro de~
5 female Miss Addison Johnson 7674 Hardy Street
6 male Mr Bartholom~ Bräuer 5196 Erlenweg
7 female Miss Elise Renard 7354 Rue Louis-Garra~
8 male Mr Stanley Sims 1481 Shady Ln Dr
9 female Miss Beatrice Gagné 2575 Maple Ave
10 female Miss Florence Garza 8604 New Street
# ... with 28 more variables: location_city <chr>, location_state <chr>,
# location_country <chr>, location_postcode <chr>,
# location_coordinates_latitude <dbl>, location_coordinates_longitude <dbl>,
# location_timezone_offset <chr>, location_timezone_description <chr>,
# email <chr>, login_uuid <chr>, login_username <chr>, login_password <chr>,
# login_salt <chr>, login_md5 <chr>, login_sha1 <chr>, login_sha256 <chr>,
# dob_date <dttm>, dob_age <dbl>, registered_date <dttm>,
# registered_age <dbl>, phone <chr>, cell <chr>, id_name <chr>,
# id_value <chr>, picture_large <chr>, picture_medium <chr>,
# picture_thumbnail <chr>, nat <chr>
If you want to generate always the same set of users, you can use the
seed
argument:
get_data(n = 5, seed = "1990")
# A tibble: 5 x 34
~ location_street~
gender name_title name_first name_last location_street<chr> <chr> <chr> <chr> <dbl> <chr>
1 female Mrs Bernice Duncan 9327 N Stelling Rd
2 female Mrs Kine Otnes 6127 Tjernveien
3 female Ms Keira King 7676 Mt Wellington H~
4 female Mrs Amelia Young 7477 Simcoe St
5 male Mr Júnio Santos 4240 Rua Pará
# ... with 28 more variables: location_city <chr>, location_state <chr>,
# location_country <chr>, location_postcode <chr>,
# location_coordinates_latitude <dbl>, location_coordinates_longitude <dbl>,
# location_timezone_offset <chr>, location_timezone_description <chr>,
# email <chr>, login_uuid <chr>, login_username <chr>, login_password <chr>,
# login_salt <chr>, login_md5 <chr>, login_sha1 <chr>, login_sha256 <chr>,
# dob_date <dttm>, dob_age <dbl>, registered_date <dttm>,
# registered_age <dbl>, phone <chr>, cell <chr>, id_name <chr>,
# id_value <chr>, picture_large <chr>, picture_medium <chr>,
# picture_thumbnail <chr>, nat <chr>
Let’s run the above code again to check if we get the same info:
get_data(n = 5, seed = "1990")
# A tibble: 5 x 34
~ location_street~
gender name_title name_first name_last location_street<chr> <chr> <chr> <chr> <dbl> <chr>
1 female Mrs Bernice Duncan 9327 N Stelling Rd
2 female Mrs Kine Otnes 6127 Tjernveien
3 female Ms Keira King 7676 Mt Wellington H~
4 female Mrs Amelia Young 7477 Simcoe St
5 male Mr Júnio Santos 4240 Rua Pará
# ... with 28 more variables: location_city <chr>, location_state <chr>,
# location_country <chr>, location_postcode <chr>,
# location_coordinates_latitude <dbl>, location_coordinates_longitude <dbl>,
# location_timezone_offset <chr>, location_timezone_description <chr>,
# email <chr>, login_uuid <chr>, login_username <chr>, login_password <chr>,
# login_salt <chr>, login_md5 <chr>, login_sha1 <chr>, login_sha256 <chr>,
# dob_date <dttm>, dob_age <dbl>, registered_date <dttm>,
# registered_age <dbl>, phone <chr>, cell <chr>, id_name <chr>,
# id_value <chr>, picture_large <chr>, picture_medium <chr>,
# picture_thumbnail <chr>, nat <chr>
If you need some user images, it’s easy to get:
library(dplyr)
<- get_data(n = 1) %>% select(picture_large) %>% pull()
random_image
::img(src = random_image, height = "150px", width = "150px") htmltools
Note that All randomly generated photos come from the authorized section of UI Faces.
radous
👨🏫The generated data has 34 variables (columns) with different types of information that you can play with. The data frame is particularly suited for teaching the tidyverse, here some examples:
Here we select columns that are related to users’ location:
library(tidyverse)
<- get_data(n = 500, seed = "123")
df
%>% select(contains("location"))
df # A tibble: 500 x 10
~ location_street~ location_city location_state
location_street<dbl> <chr> <chr> <chr>
1 9120 Rua Três Parnaíba Roraima
2 3420 Armagh Street Taupo West Coast
3 7871 Hämeentie Tyrnävä Åland
4 9456 Henry Street Kilkenny Wexford
5 8290 Bulevardi Tervo Tavastia Prop~
6 4794 Rue Bossuet Lamone Zug
7 1201 Richmond Road Brighton and~ Buckinghamshi~
8 2483 Cedar St Beaumont British Colum~
9 659 Vatan Cd Hakkâri Elazig
10 4841 Bagdat Cd Bursa Karaman
# ... with 490 more rows, and 6 more variables: location_country <chr>,
# location_postcode <chr>, location_coordinates_latitude <dbl>,
# location_coordinates_longitude <dbl>, location_timezone_offset <chr>,
# location_timezone_description <chr>
Getting the users that are US citizens:
%>% filter(nat == "US")
df # A tibble: 25 x 34
~ location_street~
gender name_title name_first name_last location_street<chr> <chr> <chr> <chr> <dbl> <chr>
1 female Miss Bella Palmer 6951 First Street
2 male Mr Joseph Gardner 8106 Eason Rd
3 female Mrs Marlene James 4385 Spring St
4 male Mr Raymond Day 6389 Spring Hill Rd
5 male Mr Lester Marshall 9574 White Oak Dr
6 male Mr Wyatt Stevens 3341 Ash Dr
7 female Ms Linda James 4549 Spring St
8 female Ms Darlene Lee 4457 Hunters Creek Dr
9 male Mr Nathaniel Henderson 6333 W Pecan St
10 male Mr Sean Stephens 3079 Dogwood Ave
# ... with 15 more rows, and 28 more variables: location_city <chr>,
# location_state <chr>, location_country <chr>, location_postcode <chr>,
# location_coordinates_latitude <dbl>, location_coordinates_longitude <dbl>,
# location_timezone_offset <chr>, location_timezone_description <chr>,
# email <chr>, login_uuid <chr>, login_username <chr>, login_password <chr>,
# login_salt <chr>, login_md5 <chr>, login_sha1 <chr>, login_sha256 <chr>,
# dob_date <dttm>, dob_age <dbl>, registered_date <dttm>,
# registered_age <dbl>, phone <chr>, cell <chr>, id_name <chr>,
# id_value <chr>, picture_large <chr>, picture_medium <chr>,
# picture_thumbnail <chr>, nat <chr>
Relocating the last column
nat
to the beginning:
%>% relocate(nat, before = gender)
df # A tibble: 500 x 34
~
nat gender name_title name_first name_last location_street<chr> <chr> <chr> <chr> <chr> <dbl>
1 BR male Mr Heldo Campos 9120
2 NZ female Mrs Peyton Jackson 3420
3 FI female Ms Viivi Ruona 7871
4 IE female Mrs Kaitlin Steward 9456
5 FI female Miss Linnea Pulkkinen 8290
6 CH female Madame Valentine Le Gall 4794
7 GB female Mrs Suzanna Miller 1201
8 CA male Mr Antoine Thompson 2483
9 TR female Miss Latife Kurutluo~ 659
10 TR male Mr Vedat Aydan 4841
# ... with 490 more rows, and 28 more variables: location_street_name <chr>,
# location_city <chr>, location_state <chr>, location_country <chr>,
# location_postcode <chr>, location_coordinates_latitude <dbl>,
# location_coordinates_longitude <dbl>, location_timezone_offset <chr>,
# location_timezone_description <chr>, email <chr>, login_uuid <chr>,
# login_username <chr>, login_password <chr>, login_salt <chr>,
# login_md5 <chr>, login_sha1 <chr>, login_sha256 <chr>, dob_date <dttm>,
# dob_age <dbl>, registered_date <dttm>, registered_age <dbl>, phone <chr>,
# cell <chr>, id_name <chr>, id_value <chr>, picture_large <chr>,
# picture_medium <chr>, picture_thumbnail <chr>
Calculating median age by gender:
%>% group_by(gender) %>%
df summarise(median_age = median(dob_age))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 2
gender median_age<chr> <dbl>
1 female 51.5
2 male 49
Getting the number of users per country of residence:
%>%
df count(location_country) %>%
arrange(desc(n))
# A tibble: 17 x 2
location_country n<chr> <int>
1 Ireland 36
2 Turkey 36
3 Spain 35
4 Canada 34
5 France 33
6 Australia 32
7 Finland 32
8 Iran 32
9 New Zealand 30
10 Norway 30
11 Germany 29
12 United Kingdom 26
13 United States 25
14 Netherlands 24
15 Switzerland 24
16 Brazil 22
17 Denmark 20
Filtering out the users that have a cell number that begins with 081:
%>% select(1:3, cell) %>%
df filter(str_detect(cell, "081"))
# A tibble: 36 x 4
gender name_title name_first cell <chr> <chr> <chr> <chr>
1 female Mrs Kaitlin 081-087-1612
2 male Mr Jason 081-584-4669
3 male Mr Arnold 081-470-7126
4 male Mr Brent 081-614-3193
5 female Mrs Fiona 081-779-4190
6 female Mrs Megan 081-511-0321
7 female Ms Izzie 081-850-4070
8 male Mr Leslie 081-172-5148
9 male Mr Edgar 081-206-3946
10 female Ms Deborah 081-984-3691
# ... with 26 more rows
Please note that the radous project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.