iso8601 R-package
The package has special functions for transforming ISO8601 (ISO 8601-1:2019) strings into dates, date-times and times. These functions transform the strings into the corresponding R objects: ‘Date’, ‘POSIXct’ and ‘Time’ (which is a subclass of ‘POSIXct’, see below):
library(iso8601)
iso8601todate("2019-08-17")
## [1] "2019-08-17"
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601totime("T16:15:14")
## [1] "T16:15:14"
For converting to ‘Date’ the package should accept all valid formats described by ISO8601 as shown below:
iso8601todate("2019-08-17")
## [1] "2019-08-17"
iso8601todate("2019-08")
## [1] "2019-08-01"
iso8601todate("2019")
## [1] "2019-01-01"
iso8601todate("20190817")
## [1] "2019-08-17"
iso8601todate("2019-W33-6")
## [1] "2019-08-17"
iso8601todate("2019-W33")
## [1] "2019-08-12"
iso8601todate("2019W336")
## [1] "2019-08-17"
iso8601todate("2019W33")
## [1] "2019-08-12"
iso8601todate("2019-229")
## [1] "2019-08-17"
iso8601todate("2019229")
## [1] "2019-08-17"
iso8601todate("−0009-123")
## [1] "-9-05-03"
iso8601todate("-0009")
## [1] "-9-01-01"
iso8601todate("+002019-229", ndigitsyear = 6)
## [1] "2019-08-17"
As can be seen from the examples above, for incomplete dates, the missing parts are substituted by 1 as the ‘Date’ object cannot handle incomplete dates. It is also possible to mix different formats in one character vector.
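To make the week-date examples above concrete, the following base R sketch shows one way a year-week-day string can be mapped to a calendar date. It is an illustration only and not the package's implementation; the function name `week_date_to_date` is hypothetical. It relies on the ISO 8601 rule that week 1 is the week containing 4 January, and it substitutes weekday 1 (Monday) when the weekday is missing.

```r
# Hypothetical helper, not part of the iso8601 package.
# Converts an ISO 8601 week date (year, week, weekday) to a 'Date'.
week_date_to_date <- function(year, week, weekday = 1L) {
  # ISO week 1 is the week that contains 4 January.
  jan4 <- as.Date(sprintf("%04d-01-04", year))
  # "%u" gives the ISO weekday number: 1 = Monday, ..., 7 = Sunday.
  jan4_weekday <- as.integer(format(jan4, "%u"))
  week1_monday <- jan4 - (jan4_weekday - 1L)
  week1_monday + (week - 1L) * 7L + (weekday - 1L)
}

week_date_to_date(2019, 33, 6)  # Saturday of week 33
week_date_to_date(2019, 33)     # weekday missing, so Monday is used
```

The two calls correspond to "2019-W33-6" and "2019-W33" in the examples above.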
Date-time strings consist of a date and a time separated by the character ‘T’. For the date part all complete date strings mentioned above are allowed. The time part can be specified both in extended format (with colons) and compact format, and as complete or incomplete times. In the latter case the missing parts are substituted by 0:
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15Z")
## [1] "2019-08-17 16:15:00 GMT"
iso8601todatetime("2019-08-17T1615")
## [1] "2019-08-17 16:15:00 CEST"
iso8601todatetime("2019-08-17T16Z")
## [1] "2019-08-17 16:00:00 GMT"
iso8601todatetime("+002019-08-17T16:15:14Z", ndigitsyear = 6)
## [1] "2019-08-17 16:15:14 GMT"
Fractional times are also allowed:
iso8601todatetime("2019-08-17T16:15:14,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15,24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T1615.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T1615,24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16.2539Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16,2539Z")
## [1] "2019-08-17 16:15:14 GMT"
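The arithmetic behind the fractional examples above can be sketched in base R. The fraction always applies to the last field present, so "16:15.24" means 15.24 minutes and "16.2539" means 16.2539 hours. The helper `frac_time_to_hms` is hypothetical and not the package's parser; it assumes a single time string without a time-zone indicator and truncates the result to whole seconds.

```r
# Hypothetical helper, not part of the iso8601 package.
# Converts one time string with an optional fraction to "HH:MM:SS".
frac_time_to_hms <- function(x) {
  x <- sub("^T", "", x)                  # drop the optional 'T' prefix
  x <- gsub(":", "", x, fixed = TRUE)    # extended -> compact format
  x <- sub(",", ".", x, fixed = TRUE)    # a comma is also a decimal sign
  parts <- strsplit(x, ".", fixed = TRUE)[[1]]
  whole <- parts[1]
  frac <- if (length(parts) > 1) as.numeric(paste0("0.", parts[2])) else 0

  # Read the hour, minute and second fields that are present; missing ones are 0.
  n <- nchar(whole) %/% 2
  idx <- seq_len(n)
  fields <- numeric(3)
  fields[idx] <- as.numeric(substring(whole, c(1, 3, 5)[idx], c(2, 4, 6)[idx]))

  # The fraction is expressed in the unit of the last field present.
  units <- c(3600, 60, 1)
  total <- floor(sum(fields * units) + frac * units[n])
  sprintf("%02d:%02d:%02d",
          as.integer(total %/% 3600),
          as.integer((total %/% 60) %% 60),
          as.integer(total %% 60))
}

frac_time_to_hms("T16:15.24")  # 0.24 minutes is 14.4 seconds
frac_time_to_hms("16,2539")    # 0.2539 hours is about 15 minutes 14 seconds
```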
When the date and time are in extended format, the ‘T’ can be omitted and replaced by a space:
iso8601todatetime("2019-08-17 16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15:14,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15:14.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15Z")
## [1] "2019-08-17 16:15:00 GMT"
iso8601todatetime("2019-08-17 16:15.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15,24Z")
## [1] "2019-08-17 16:15:14 GMT"
Time zones can be indicated by ‘Z’ (as in the examples above), which denotes UTC or Zulu time, or by an offset in hours or in hours and minutes. When there is no time-zone indicator, the times are assumed to be in local time. Which time zone that is has to be communicated by other means; the package assumes it is the local time zone of the system on which R is running. A positive offset indicates a time zone east of the prime meridian, whose time is ahead of UTC; a negative offset indicates a time zone west of the prime meridian, whose time is behind UTC.
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14+01:00")
## [1] "2019-08-17 15:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14±00:00")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14-01")
## [1] "2019-08-17 17:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14−00:00")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14")
## [1] "2019-08-17 16:15:14 CEST"
As shown above, when every date-time either has an offset or is in UTC, the times are converted to and shown in UTC (for which R uses the string ‘GMT’). Otherwise, the date-times are shown in the local time zone.
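The offset rule described above amounts to subtracting the offset from the stated time to obtain UTC. The base R sketch below illustrates that arithmetic; it is not the package's implementation. The helpers `offset_to_seconds` and `to_utc` are hypothetical, and they assume an ASCII sign and a date-time in complete extended format.

```r
# Hypothetical helpers, not part of the iso8601 package.
# Convert an offset such as "+01:00", "-01" or "Z" to seconds.
offset_to_seconds <- function(offset) {
  if (offset == "Z") return(0)
  sign <- if (substr(offset, 1, 1) == "-") -1 else 1
  digits <- gsub("[^0-9]", "", offset)
  hours <- as.integer(substr(digits, 1, 2))
  minutes <- if (nchar(digits) >= 4) as.integer(substr(digits, 3, 4)) else 0L
  sign * (hours * 3600 + minutes * 60)
}

# A positive offset is ahead of UTC, so UTC = stated time - offset.
to_utc <- function(local, offset) {
  stated <- as.POSIXct(local, tz = "UTC", format = "%Y-%m-%dT%H:%M:%S")
  utc <- stated - offset_to_seconds(offset)
  format(utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
}

to_utc("2019-08-17T16:15:14", "+01:00")  # one hour earlier in UTC
to_utc("2019-08-17T16:15:14", "-01")     # one hour later in UTC
```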
iso8601todatetime returns a ‘POSIXct’ object that has an additional ‘timezone’ attribute that contains the original time zones:
t <- iso8601todatetime(c(
"2019-08-17T16:15:14+01:00",
"2019-08-17T16:15:14+00",
"2019-08-17T16:15:14Z",
"2019-08-17T16:15:14",
"2019-08-17T16:15:14-05:30"
))
print(t)
## [1] "2019-08-17 17:15:14 CEST" "2019-08-17 18:15:14 CEST"
## [3] "2019-08-17 18:15:14 CEST" "2019-08-17 16:15:14 CEST"
## [5] "2019-08-17 23:45:14 CEST"
attr(t, "timezone")
## [1] "+01:00" "GMT" "GMT" "" "-05:30"
The function iso8601totime converts times (without a date). It accepts the following formats:
iso8601totime("T16:15:14")
## [1] "T16:15:14"
iso8601totime("T16:15:14,00")
## [1] "T16:15:14"
iso8601totime("T16:15:14.00")
## [1] "T16:15:14"
iso8601totime("T161514")
## [1] "T16:15:14"
iso8601totime("T161514.00")
## [1] "T16:15:14"
iso8601totime("T161514,00")
## [1] "T16:15:14"
iso8601totime("T16:15.24")
## [1] "T16:15:14"
iso8601totime("T16:15,24")
## [1] "T16:15:14"
iso8601totime("T1615.24")
## [1] "T16:15:14"
iso8601totime("T1615,24")
## [1] "T16:15:14"
iso8601totime("T16.2539")
## [1] "T16:15:14"
iso8601totime("T16,2539")
## [1] "T16:15:14"
When calling iso8601totime we know that we are dealing with times; therefore, the ‘T’ can be omitted:
iso8601totime("16:15:14")
## [1] "T16:15:14"
iso8601totime("16:15:14,00")
## [1] "T16:15:14"
iso8601totime("16:15:14.00")
## [1] "T16:15:14"
iso8601totime("16:15.24")
## [1] "T16:15:14"
iso8601totime("16:15,24")
## [1] "T16:15:14"
iso8601totime("161514")
## [1] "T16:15:14"
iso8601totime("161514,00")
## [1] "T16:15:14"
iso8601totime("161514.00")
## [1] "T16:15:14"
iso8601totime("1615")
## [1] "T16:15:00"
iso8601totime("1615.24")
## [1] "T16:15:14"
iso8601totime("1615,24")
## [1] "T16:15:14"
Time zones are ignored as these are meaningless without a date.
The object returned is of class c("Time", "POSIXct", "POSIXt"). It is therefore a subclass of ‘POSIXct’. Because a ‘POSIXct’ object encodes date-times, the times are encoded as times on 1970-01-01. The ‘Time’ class handles proper display of the object. Otherwise, it can be handled as a regular ‘POSIXct’ object.
t <- iso8601totime("T16:15:14Z")
print(t)
## [1] "T16:15:14"
class(t)
## [1] "Time" "POSIXct" "POSIXt"
class(t) <- class(t)[-1]
print(t)
## [1] "1970-01-01 16:15:14 GMT"
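The idea of a time-only subclass of ‘POSIXct’ can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's ‘Time’ class: the class name `DemoTime` and its methods are hypothetical. The time is stored as seconds on 1970-01-01 in UTC, and only the display differs from a plain ‘POSIXct’ object.

```r
# Hypothetical class "DemoTime"; this is not the package's 'Time' class.
secs <- 16 * 3600 + 15 * 60 + 14   # 16:15:14 as seconds since midnight
tm <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
class(tm) <- c("DemoTime", class(tm))

# Show only the time part, prefixed with 'T'.
format.DemoTime <- function(x, ...) {
  class(x) <- setdiff(class(x), "DemoTime")
  format(x, format = "T%H:%M:%S", tz = "UTC")
}
print.DemoTime <- function(x, ...) {
  print(format(x))
  invisible(x)
}

print(tm)

# Dropping the extra class leaves a regular 'POSIXct' on 1970-01-01.
plain <- tm
class(plain) <- class(plain)[-1]
format(plain, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Because the object is still a ‘POSIXct’, arithmetic and comparisons keep working without extra methods.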
The function iso8601todataframe parses ISO8601 strings and splits them into their separate parts. Only the parts present in at least one of the strings are returned.
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+00",
"2019229T161514",
"T16:15"
))
##       type year month day week weekday yearday hour minutes seconds
## 1     Date 2019     8  17   NA      NA      NA   NA      NA      NA
## 2     Date 2019    NA  NA   33       6      NA   NA      NA      NA
## 3 Datetime 2019     8  17   NA      NA      NA   16      15      14
## 4 Datetime 2019    NA  NA   NA      NA     229   16      15      14
## 5     Time   NA    NA  NA   NA      NA      NA   16      15      NA
##   tzoffsethours tzoffsetminutes
## 1            NA              NA
## 2            NA              NA
## 3             0               0
## 4            NA              NA
## 5            NA              NA
The ‘type’ column contains the type of ISO8601 string. For parts not present in the string, NA is returned.
It is also possible to transform the dates to one format: either year-month-day or year-day:
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
), transformdate = "toyearmonthday")
##       type year month day hour minutes seconds tzoffsethours tzoffsetminutes
## 1     Date 2019     8  17   NA      NA      NA            NA              NA
## 2     Date 2019     8  17   NA      NA      NA            NA              NA
## 3 Datetime 2019     8  17   16      15      14             1               0
## 4 Datetime 2019     8  17   16      15      14            NA              NA
## 5     Time   NA    NA  NA   16      15      NA            NA              NA
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
), transformdate = "toyearday")
##       type year yearday hour minutes seconds tzoffsethours tzoffsetminutes
## 1     Date 2019     229   NA      NA      NA            NA              NA
## 2     Date 2019     229   NA      NA      NA            NA              NA
## 3 Datetime 2019     229   16      15      14             1               0
## 4 Datetime 2019     229   16      15      14            NA              NA
## 5     Time   NA      NA   16      15      NA            NA              NA
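The relation between the year-month-day and year-day representations shown above can be checked with base R alone. The sketch below is independent of the package; the helper names `to_year_day` and `from_year_day` are hypothetical.

```r
# Hypothetical helpers, not part of the iso8601 package.
# Year-month-day -> year and day of the year (1-based).
to_year_day <- function(date) {
  c(year = as.integer(format(date, "%Y")),
    yearday = as.integer(format(date, "%j")))
}

# Year and day of the year -> year-month-day.
from_year_day <- function(year, yearday) {
  as.Date(sprintf("%04d-01-01", year)) + (yearday - 1L)
}

yd <- to_year_day(as.Date("2019-08-17"))
yd
from_year_day(yd[["year"]], yd[["yearday"]])
```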
iso8601type returns a character vector whose elements indicate the type of ISO8601 string:
iso8601type(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
))
## [1] "YMD" "YWD" "YMDTHMS±Z" "YDTHMS" "THM"
iso8601standardise transforms the dates into one standard extended format:
iso8601standardise(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17 16:15:14+01",
"2019229T161514",
"T16:15"
))
## [1] "2019-08-17" "2019-08-17" "2019-08-17T15:15:14Z"
## [4] "2019-08-17T16:15:14" "T16:15:00"
The fillmissing argument fills in missing parts (1 for dates and 0 for times), toymd transforms all dates to year-month-day, and tozulu applies any time-zone offsets and transforms the times to UTC (times in local time zones are not affected):
iso8601standardise(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17 16:15:14+01",
"2019229T161514",
"T16:15"
), fillmissing = TRUE, toymd = TRUE, tozulu = TRUE)
Other options for parsing date, time and date-time strings are listed below (none of these packages supports time strings):
parsedate has the function parse_iso_8601 that supports ISO8601 dates and date-times. The performance of this function is significantly lower than that of the functions from iso8601.
anytime has the functions anytime and anydate to convert date-time and time strings to time and date classes. It supports many of the year-month-day formats but not year-week-day or ordinal dates. It also accepts many more formats, which can be an advantage or a disadvantage.
lubridate supports many or most of the year-month-day formats.
as.Date, as.POSIXct and strptime from base R also support most of the ISO8601 formats when given an appropriate format string, except for time zones. This, however, requires that all strings have the same format.
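The same-format restriction of the base R functions can be worked around by trying several format strings in turn. The sketch below shows one possible approach using only as.Date; the helper `parse_mixed_dates` and the chosen list of formats are assumptions made for illustration, and the approach covers far fewer formats than ISO8601 defines.

```r
# Hypothetical helper, not part of base R or the iso8601 package.
# Tries each format in turn on the strings that are still unparsed.
parse_mixed_dates <- function(x, formats) {
  out <- structure(rep(NA_real_, length(x)), class = "Date")
  for (f in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- as.Date(x[todo], format = f)
  }
  out
}

parse_mixed_dates(
  c("2019-08-17", "20190817", "2019-229"),
  formats = c("%Y-%m-%d", "%Y%m%d", "%Y-%j")
)
```

The order of the formats matters, because the first format that matches a string is used for it.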