In this post, we will create a world map that shows world cities using ggplot2. We will get the cities and their attributes from Wikipedia. To scrape the data, we will use the rvest R package, which makes web scraping straightforward.
After we get the world cities data from Wikipedia, we will use the string manipulation packages stringi and stringr to clean the latitude and longitude values.
library(rvest)    # web scraping
library(stringi)  # string manipulation
library(stringr)  # string manipulation
library(ggplot2)  # plotting
library(maps)     # world map data used by borders()
wiki=read_html("https://en.wikipedia.org/wiki/List_of_cities_by_latitude")
cities=data.frame()
# Tables 3 to 19 on the page hold the city lists (the indices depend on the page layout)
for(i in seq(3,19)){
  tbl=wiki %>%
    html_nodes("table") %>%
    .[[i]] %>%
    html_table(fill=TRUE)
  tbl=tbl[,1:5]  # keep the first five columns only
  names(tbl)=c("Latitude","Longitude","City","Province/State","Country")
  cities=rbind(cities,tbl)
}
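To see what html_nodes() and html_table() return without touching the network, here is a minimal sketch on a hand-written HTML fragment. The table contents ("City A", "City B" and their coordinates) are made up for illustration and are not taken from the Wikipedia page.

```r
library(rvest)

# A tiny hand-written page standing in for the Wikipedia article (illustrative data only)
page <- minimal_html('
  <table>
    <tr><th>Latitude</th><th>Longitude</th><th>City</th></tr>
    <tr><td>64 10 N</td><td>51 44 W</td><td>City A</td></tr>
    <tr><td>33 52 S</td><td>151 12 E</td><td>City B</td></tr>
  </table>')

tables <- html_nodes(page, "table")  # every <table> node on the page
demo <- html_table(tables[[1]])      # first table, converted to a data frame
print(demo)
```

Each table on the real page is extracted the same way; the loop above only changes which table index is read.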
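# Clean each coordinate column on its own: str_replace_all() expects a character vector, not a data frame.
# Assumes values shaped like 64°10′N (degrees, minutes, hemisphere letter).
clean_coord=function(x){
  x=str_replace_all(as.character(x), "[^[:alnum:]]", " ")  # degree and minute symbols become spaces
  x=iconv(x, "UTF-8", "ASCII", sub="")                      # drop any remaining non-ASCII characters
  stri_trim_both(x)                                          # strip leading and trailing whitespace
}
lat=clean_coord(cities[[1]])
lon=clean_coord(cities[[2]])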
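As a quick check of the cleaning pattern, the sketch below applies the same regular expression to a single coordinate. The sample value is an assumption about how the Wikipedia cells are formatted (degree sign, prime mark, hemisphere letter); it is written with Unicode escapes so the snippet does not depend on the file encoding.

```r
library(stringr)

# Hypothetical sample cell: 64, degree sign (U+00B0), 10, prime (U+2032), N
raw <- "64\u00b010\u2032N"

# Replace every character that is not a letter or digit with a space
cleaned <- str_replace_all(raw, "[^[:alnum:]]", " ")
print(cleaned)

# Split the cleaned value into degree, minute and hemisphere
parts <- unlist(str_split(cleaned, "\\s+"))
print(parts)
```

After this step each coordinate is a plain "degree minute hemisphere" string that can be split on whitespace.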
# Split each "degree minute hemisphere" string into its three parts
parse_coord=function(x){
  parts=str_split(str_trim(x), "\\s+")
  data.frame(degree=as.numeric(sapply(parts, `[`, 1)),
             minute=as.numeric(sapply(parts, `[`, 2)),
             hemisphere=sapply(parts, `[`, 3),
             stringsAsFactors=FALSE)
}
lat=parse_coord(lat)
lon=parse_coord(lon)
# Convert degrees and minutes to decimal degrees
lat[,1]=lat[,1]+lat[,2]/60
lon[,1]=lon[,1]+lon[,2]/60
# Southern latitudes are negative
lat[,1]=ifelse(lat[,3]=="S", -lat[,1], lat[,1])
# Any longitude that is not east is treated as west, so it is negative
lon[,1]=ifelse(lon[,3]!="E", -lon[,1], lon[,1])
Latitude=lat[,1]
Longitude=lon[,1]
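The conversion above boils down to degrees + minutes/60, with a negative sign for the southern and western hemispheres. A minimal base R sketch of that rule follows; dms_to_decimal is a hypothetical helper name and the sample coordinates are made up. Unlike the code above, it negates only values marked "S" or "W" rather than every longitude that is not "E".

```r
# Convert strings such as "64 10 N" (degrees, minutes, hemisphere) to decimal degrees.
# dms_to_decimal is an illustrative helper, not part of any package.
dms_to_decimal <- function(x) {
  parts <- strsplit(trimws(x), "\\s+")
  degree <- as.numeric(sapply(parts, `[`, 1))
  minute <- as.numeric(sapply(parts, `[`, 2))
  hemisphere <- sapply(parts, `[`, 3)
  value <- degree + minute / 60
  # South and west are negative by convention
  ifelse(hemisphere %in% c("S", "W"), -value, value)
}

decimal <- dms_to_decimal(c("64 10 N", "33 52 S", "151 12 E", "51 44 W"))
print(round(decimal, 4))
```

Because the function is vectorized, a whole column of coordinates can be converted in one call.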
mapWorld <- borders("world", colour="gray50", fill="lightblue") # create a layer of borders
wp <- ggplot() + mapWorld
wp
# Add one point per city
wp <- wp + geom_point(aes(x=Longitude, y=Latitude), color="blue", size=3)
wp
# Add a title and remove the axes and grid lines
wp + ggtitle("World Cities") +
  theme(axis.text.y = element_blank(),
        line = element_blank(),
        axis.text.x = element_blank(),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        plot.title = element_text(vjust=1.5, size=30, colour="red"),
        panel.border = element_rect(colour="gray70", fill=NA, size=1))
The Tableau visualization below is made using the data created above with the R code. The cities are represented by points. If you hover over the circles, you can read the city and other associated attributes.
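The post does not show how the data was handed over to Tableau. One possible route, assuming a plain CSV file is used, is sketched below. The small data frame and vectors are hypothetical stand-ins for the cities, Latitude and Longitude objects built earlier, and the file name is arbitrary.

```r
# Hypothetical stand-ins for the objects created in the post
cities <- data.frame(City = c("City A", "City B"),
                     Country = c("Country X", "Country Y"),
                     stringsAsFactors = FALSE)
Latitude <- c(64.17, -33.87)
Longitude <- c(-51.73, 151.20)

# Combine the attributes with the decimal coordinates
export <- data.frame(City = cities$City,
                     Country = cities$Country,
                     Latitude = Latitude,
                     Longitude = Longitude)

# Write to a temporary CSV file that a tool such as Tableau could read
out_file <- file.path(tempdir(), "world_cities.csv")
write.csv(export, out_file, row.names = FALSE)
```

Setting row.names = FALSE keeps R's row index out of the file, so the CSV contains only the four data columns.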
In this blog post, we scraped world cities with their latitude and longitude information from Wikipedia using the rvest package and used the stringi and stringr packages for string manipulation. Finally, we used the ggplot2 package to create a world map that shows the cities.