Before we start getting and visualizing the Twitter data, let’s go ahead and take a peek at the packed barchart that will be the output of the process.
Before we can visualize any data we’ll have to gather it. There are R Twitter packages out there, such as twitteR, but I prefer to use a custom function (shown below). If you’d like to use the custom function, you’ll first need to provide your API keys/secrets.
api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
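If you would rather not paste credentials into a script, one option is to read them from environment variables (for example, set in your ~/.Renviron file). This is only a sketch: the helper read_key and the variable name TWITTER_API_KEY are hypothetical, and the placeholder value is set inline only so the snippet runs on its own.

```r
# Hypothetical helper: fetch one credential from the environment, or stop
# with a clear message if it has not been set.
read_key = function(name) {
  value = Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set")
  }
  value
}

# Placeholder value so the sketch is self-contained; in practice the
# variable would already be defined outside the script.
Sys.setenv(TWITTER_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxx")

api_key = read_key("TWITTER_API_KEY")
```

The other three values (api_secret, access_token, access_token_secret) could be read the same way under names of your choosing.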
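#custom function to get tweets by a username
get_user_tweets = function(user, n, api_key, api_secret, access_token, access_token_secret) {
  #set up oauth
  auth = httr::oauth_app("twitter", key=api_key, secret=api_secret)
  sig = httr::sign_oauth1.0(auth, token=access_token, token_secret=access_token_secret)
  base_url = "https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name="
  #loop through GETs getting max of 200 per iteration
  nLeft = n
  timeline = list()
  max_id = NULL
  while (nLeft > 0) {
    #build GET URL; pages after the first start at max_id, which is inclusive,
    #so ask for one extra tweet to make up for the repeated boundary tweet
    if (is.null(max_id)) {
      GETurl = paste0(base_url, user, "&count=", min(200, nLeft))
    } else {
      GETurl = paste0(base_url, user, "&count=", min(200, nLeft + 1), "&max_id=", max_id)
    }
    #actual GET and content extract (fail loudly on an HTTP error)
    timelineRaw = httr::GET(GETurl, sig)
    httr::stop_for_status(timelineRaw)
    timelineContent = httr::content(timelineRaw)
    #drop the boundary tweet already collected on the previous page
    if (!is.null(max_id)) {
      is_dup = vapply(timelineContent, function(ls) identical(ls$id_str, max_id), logical(1))
      timelineContent = timelineContent[!is_dup]
    }
    #stop early once the API has no more tweets to return
    if (length(timelineContent) == 0) break
    #accumulate content
    timeline = c(timeline, timelineContent)
    #find the oldest tweet on this page; keep its id as a string because
    #tweet ids are too large to paste exactly as doubles
    ids = vapply(timelineContent, function(ls) ls$id, numeric(1))
    max_id = timelineContent[[which.min(ids)]]$id_str
    nLeft = nLeft - length(timelineContent)
  }
  #never return more than the n tweets that were asked for
  return(utils::head(timeline, n))
}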
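To see the paging logic in isolation, here is a small sketch that runs without any network access. The function fake_fetch is a hypothetical stand-in for the timeline endpoint (it is not part of httr or the Twitter API) and assumes, as with max_id, that the boundary id is returned again on the next page. Plain integers play the role of tweet ids.

```r
# Hypothetical stand-in for the endpoint: returns up to `count` ids that are
# less than or equal to max_id, newest (largest) first.
fake_fetch = function(all_ids, count, max_id = Inf) {
  eligible = sort(all_ids[all_ids <= max_id], decreasing = TRUE)
  head(eligible, count)
}

# Page through the fake endpoint, at most `page_size` ids per request.
paginate_ids = function(all_ids, n, page_size = 200) {
  collected = numeric(0)
  max_id = Inf
  while (length(collected) < n) {
    page = fake_fetch(all_ids, page_size, max_id)
    # max_id is inclusive, so the boundary id comes back again; drop repeats
    page = setdiff(page, collected)
    if (length(page) == 0) break  # nothing older is left
    collected = c(collected, page)
    max_id = min(page)
  }
  head(collected, n)
}

ids = 1:450
result = paginate_ids(ids, n = 420)
length(result)
anyDuplicated(result)
```

Each pass moves max_id down to the oldest id seen so far, which is the same idea the loop in get_user_tweets relies on.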
We now have a working function to get our tweet data. In the chunk below we call the function and wrangle the data into a nice data.table structure for plotting.
#call custom function to get tweets for a given user
my_tweets = get_user_tweets(user="ASpannbauer", n=1000,
                            api_key, api_secret, access_token, access_token_secret)
#parse out the information we want into a list of data.tables
tweet_dt_list = lapply(my_tweets, function(tweet) {
  data.table::data.table(time = tweet$created_at,
                         text = tweet$text,
                         user = tweet$user$screen_name,
                         fav_n = tweet$favorite_count,
                         rt_n = tweet$retweet_count)
})
#combine list into single data.table
tweet_dt = data.table::rbindlist(tweet_dt_list)
#remove retweets
tweet_dt = tweet_dt[!grepl("^RT", text), ]
#combine count of favorites and retweets into single count
tweet_dt[,total_fav_rt := fav_n + rt_n]
#truncate tweet text for a preview of the tweet in the viz
tweet_dt[,text_preview := paste0(substr(text, 1, 20), "...")]
#order by descending popularity
tweet_dt = tweet_dt[order(-total_fav_rt), ]
#inspect head of data
head(tweet_dt[, -c("time","text")])
##           user fav_n rt_n total_fav_rt            text_preview
## 1: ASpannbauer   295   56          351 Most NSFW minute in ...
## 2: ASpannbauer   160   20          180 Analyzing emotes in ...
## 3: ASpannbauer    55   13           68 I transformed my res...
## 4: ASpannbauer    55    5           60 Playing around with ...
## 5: ASpannbauer    18   14           32 Trump Doesnt like Mo...
## 6: ASpannbauer    21    9           30 New festive post on ...
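The time column is left as the raw created_at string. If you want a real timestamp, something like the sketch below should work, assuming the v1.1 API format of "Wed Oct 10 20:19:24 +0000 2018". The example string is illustrative and not taken from the data above.

```r
# Illustrative created_at value in the assumed v1.1 format
created_at = "Wed Oct 10 20:19:24 +0000 2018"

# %a and %b match English day and month names, so parse under the C locale
# and restore the previous setting afterwards
old_locale = Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
tweet_time = as.POSIXct(created_at, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
Sys.setlocale("LC_TIME", old_locale)

format(tweet_time, "%Y-%m-%d %H:%M:%S")
```

The same format string could be applied to the whole time column at once, since as.POSIXct is vectorized.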
Before plotting with the packed barchart, let’s take a peek at the distribution of the metric we’ll be plotting. As we see in the plot below, this data is very skewed: a handful of tweets have far more favorites and retweets than the rest. This type of distribution is exactly the case the packed barchart was designed for.
plot(tweet_dt$total_fav_rt, type = 'l', ylab = "Fav|RT Count")
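A quick way to put a number on the skew is the cumulative share of interactions. The sketch below uses only the six total_fav_rt values from the head() output printed earlier, so the shares are relative to those six rows and not to the full data set.

```r
# total_fav_rt values copied from the six rows of head() output (not the full data)
top_six = c(351, 180, 68, 60, 32, 30)

# cumulative share of interactions, most popular tweet first
share = cumsum(top_six) / sum(top_six)
round(share, 3)
```

Among these six rows the most popular tweet alone accounts for nearly half of the interactions. Because tweet_dt is already sorted in descending order, the same calculation on tweet_dt$total_fav_rt would give the shares for the whole data set.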
At this point, we’re ready to use the packed barchart to see our Twitter data in a new light. To do this we call the function rPackedBar::plotly_packed_bar and specify our options.
- input_data: the name of the data.frame type object containing the data to plot
- label_column: the column in the input_data containing labels for our plotted numeric data
- value_column: the column in the input_data containing the numeric data to plot
- number_rows: the number of rows our packed barchart will contain
- plot_title: the main title to display over the chart
- xaxis_label: the title to display for the x-axis
- hover_label: title to appear in the hover information
- min_label_width: parameter to prevent text labels spilling over the bounds of the bars
- color_bar_color: color of the largest colored bars in the chart
- label_color: color of the labels to appear over the colored bars in the chart

p = rPackedBar::plotly_packed_bar(input_data = tweet_dt[total_fav_rt > 0, ],
                                  label_column = "text_preview",
                                  value_column = "total_fav_rt",
                                  number_rows = 4,
                                  plot_title = "Tweet Interactions",
                                  xaxis_label = "Favorites & RTs",
                                  hover_label = "Favs & RTs",
                                  min_label_width = .1,
                                  color_bar_color = "#00aced",
                                  label_color = "white")
plotly::config(p, displayModeBar = FALSE)