Fisseha Berhane, PhD

Data Scientist

443-970-2353 fisseha@jhu.edu CV Resume Linkedin GitHub twitter twitter

Integrating R and Tableau

Today, I started playing with Tableau, an interactive data visualization product. The fact that R and Tableau can be integrated gives the possibility of taking advantage of both tools. We can do data cleaning, advanced analytics, modeling and machine learning, etc, in R and visualize our results in Tableau. We can visualize our results in R as far as we are comfortable with the coding, however, using Tableau helps us to save time and get insights momentarily.

We can also build interactive dashboards in R with shiny and embed it in Tableau. This enables us to take advantage of the various statistical analysis and machine learning packages in R.

In this post, I will download quarterly datasets from the FDA Adverse Event Reporting System (FAERS) website, concatenate them, and do some data cleaning in R. Since the country names are two-letter codes, I will change them to the appropriate country names. Then, I will visualize the data in Tableau. Let's visualize advere event reports by country.

The data covers since the fourth quarter of 2012 to the second quarter of 2015 (at the time of writing this post).

Working with R

require(downloader)
library(dplyr)
year_start=2012
year_last=as.integer(format(Sys.Date(), "%Y"))


for (i in year_start:year_last){
    
    if (i==2012){
        url<-"http://www.nber.org/fda/faers/2012/demo2012q4.csv.zip"
        
        download(url,dest="data.zip")
        unzip ("data.zip")
    }
    
    else{
        if (i<year_last){
            j=c(1:4)
            
            for (m in j){
            
            url<-paste0("http://www.nber.org/fda/faers/",i,"/demo",i,"q",m,".csv.zip")
            
            download(url,dest="data.zip")
            unzip ("data.zip")
            }
        }
            
        else if (i==year_last){
            j=c(1:2)
            for (m in j){
                
                url<-paste0("http://www.nber.org/fda/faers/",i,"/demo",i,"q",m,".csv.zip")
                
            download(url,dest="data.zip")
            unzip ("data.zip")
            
            }
            }
            
        }
        
    }

After downloading the quartely data sets, let's concatenate them. The unzipped data set for the first quarter of 2013 is demo2013q1.csv

The demographic data contains various variables. "occr_country" is the country where the adverse event took place.

We can use the select function from the dplyr package.

demo = select(read.csv("demo2012q4.csv",stringsAsFactors=F),occr_country) 
 
 for (i in year_start+1:year_last){
          
        if (i<year_last){
             j=c(1:4)
             
             for (m in j){
                 
            demo=rbind(demo,select(read.csv(paste0("demo",i,"q",m,".csv"),stringsAsFactors=F),occr_country))
                 
             }
         }
         
         else if (i==year_last){
             j = c(1:2)
             
             for (m in j){
            demo=rbind(demo,select(read.csv(paste0("demo",i,"q",m,".csv"),stringsAsFactors=F),occr_country))
             }
         }
         
     }

The "occr_country" variable in the FAERS data is two letter country code. We can map the code to the country name using the appropriate codes from this website.

Let's change the variable name "occr_country" to "Code" to merge it with the country names and codes data.

demo=rename(demo,Code=occr_country)
c_codes<-read.csv("http://data.okfn.org/data/core/country-list/r/data.csv")

Now, let's merge the demography data with the c_codes, which contains country codes and full names.

demo=merge(demo, c_codes,by ="Code",all.x = TRUE) #left outer join
demo=rename(demo,Country=Name)

Let's calculate adverse event reports by country using the dplyr package.

demo<-demo%>%group_by(Country)%>%summarise(count = n())
demo=mutate(demo,log_count=log10(count))

Now, let's save our data as csv and use Tableau to visualize it.

write.csv(demo,'demography.csv')

Working with Tableau

comments powered by Disqus