This blog is run by Jason Jon Benedict and Doug Beare to share insights and developments on open source software that can be used to analyze patterns and trends in in all types of data from the natural world. Jason currently works as a geospatial professional based in Malaysia. Doug lives in the United Kingdom and is currently Director of Globefish Consultancy Services which provides scientific advice to organisations that currently include the STECF [Scientific, Technical and Economic Committee for Fisheries, https://stecf.jrc.europe.eu/] and ICCAT, https://www.iccat.int/en/

Saturday, 19 July 2014

Examining seasonal & daily change in weather on Penang Island using circular plots


Key points of post

  • Circular plots are useful for summarizing weather data simultaneously exposing seasonal and daily changes;
  • The most uncomfortable (heat index) time of year for people living in Penang are the months of April and May although January and February are the hottest in absolute terms.

In this blog we demonstrate the use of circular plots which are handy for summarizing how any one variable is affected simultaneously by two others. The four circular plots below show how changes in average (based on the years 1980-2014) air temperature, wind-speed, relative humidity and heat index typically depend on both month and time of day. The circles are divided into 12 segments for each month, and into 24 concentric circles for each hour of the day, midnight being the center. The magnitudes of each variable are coded by color. The warmest months are February, March, and April, temperatures peaking between around 1pm and 4pm every day. The nights after about 11pm are clearly much cooler. Penang is only around 5 degrees north of the equator but there is nevertheless some variation around day-length and this is reflected in the relatively large light blue band on the outer most rings of the plot in October, November, and December. 

The patterns for wind speed are rather curious, presumably reflecting the Monsoons which typically switch between the North-East (December to February) and the South-West (April to August). Wind-speeds in Penang are apparently highest between January and March with a smaller peak seen in July and August.  They are also strongest between late morning and early afternoon in all seasons/months. Each day winds typically ‘get up’ around 11am blowing relatively strongly until around 5pm coinciding with the hottest part of the day.

Circular plots for selected weather variables showing how average (1980-2014) daily and seasonal patterns tend to covary


Typical seasonal daily changes in relative humidity are described by the third circular plot. Relative humidity is based on the difference between ‘dry bulb’ and ‘wet bulb’ air temperatures.  It describes how easily water can evaporate from any surface.  Since humans cool by sweating (the change in state of water as it evaporates from human skin causes a change in state from liquid to gas which requires a considerable amount of energy/heat, link to article on latent heat of evaporation) relative humidity is an important variable to consider.  High levels of relative humidity indicate that the air is already saturated with water vapour, and absorbing more will be difficult.  Similarly water will evaporate into air with low relative humidity more readily leading to improved function of the human air-conditioning system.  Air movement (or wind) also encourages evaporative loss from human skin improving cooling efficacy as anyone who has switched on a fan will know.

The combination of (dry bulb) air temperatures with data for relative humidity, and wind-speed facilitates the calculation of a ‘heat index’ which is more directly related to ‘human comfort’ than any one of its constituent variables. When the heat index is high the weather will be uncomfortable and, when low, more bearable.  In a previous blog post, we examined average air temperatures and rainfall in Penang suggesting that January and February were the worst times of year to visit Penang since they are usually the hottest and October the best since it is usually coolest. The heat index plotted in the circular plot suggests a slightly different story, however.  April and May have the highest heat indices and are probably potentially the most uncomfortable times of year to visit Penang while the entire period between July and December have similar heat indices.

To produce the plot above, we have again sourced the data from NOAA's National Climatic Data Centre with the hourly dataset coming from the Integrated Surface Database.

We also provide the R code as usual (below) if you would like to produce similar plots for your area of interest.

# Load package libraries
 
library(ggplot2)
library(scales)
library(plyr)
library(dplyr)
library(reshape2)
library(circular)
library(lubridate)
library(grid)
library(zoo)
library(gridExtra)
library(weathermetrics)
library(RColorBrewer)
 
# Setting work directory
 
setwd("d:\\ClimData")
 
# Reading and reformatting raw hourly data downloaded from NCDC
 
dat<-read.table("831677118578dat.txt",header=TRUE,fill=TRUE,na.strings=c("*","**","***","****","*****","******","0.00T*****"))
 
colnames(dat)<-tolower(colnames(dat))
 
Sys.setenv(TZ = "UTC")
dat$dates <- as.POSIXct(strptime(dat$yr..modahrmn,format="%Y%m%d%H%M"))  + 8 * 60 * 60
 
dat$year <- as.numeric(format(dat$dates,"%Y"))
 
dat$month <- as.numeric(format(dat$dates,"%m"))
 
dat$hour<-substring(as.character(dat$dates),12,13)
 
dat$min<-substr(dat$dates,15,16)
 
dat$time<-paste(dat$hour,dat$min,sep=":")
 
dat$tempc <- (dat$temp-32) * (5/9)
 
dat$dewpc <- (dat$dewp-32) * (5/9)
 
dat$tempc[dat$tempc<=10] <- NA
 
dat$tempc[dat$tempc>=40] <- NA
 
dat$dir[dat$dir == 990.0] <- NA 
 
dat$wspd <- (dat$spd)*0.44704
 
# Convert precipitation from inches to mms
 
dat$rain  <- dat$pcp24*25.4
 
# Calculate relative humidity & heat index using weathermetrics package
 
dat$rh <- dewpoint.to.humidity(t = dat$tempc, dp = dat$dewpc, temperature.metric = "celsius")
 
dat$hi <- heat.index(t = dat$tempc,rh = dat$rh,temperature.metric = "celsius",output.metric = "celsius",round = 2)
 
# Commands to reformat dates
 
dat$year <- as.numeric(as.POSIXlt(dat$dates)$year+1900)
dat$month <- as.numeric(as.POSIXlt(dat$dates)$mon+1)
dat$monthf <- factor(dat$month,levels=as.character(1:12),labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),ordered=TRUE)
dat$weekday <- as.POSIXlt(dat$dates)$wday
dat$weekdayf <- factor(dat$weekday,levels=rev(0:6),labels=rev(c("Mon","Tue","Wed","Thu","Fri","Sat","Sun")),ordered=TRUE)
dat$yearmonth <- as.yearmon(dat$dates)
dat$yearmonthf <- factor(dat$yearmonth)
dat$week <- as.numeric(format(as.Date(dat$dates),"%W"))
dat$hour <- as.numeric(format(strptime(dat$dates, format = "%Y-%m-%d %H:%M"),format = "%H"))
 
dat <- ddply(dat,.(yearmonthf),transform,monthweek=1+week-min(week))
 
# Extract data from 1980 onwards
 
dat1 <- subset(dat, year >= 1980 )
 
# Summarize data for weather variables by hour and month
 
dat2 <- ddply(dat1,.(monthf,hour),summarize, wspd = mean(wspd,na.rm=T))
 
dat3 <- ddply(dat1,.(monthf,hour),summarize, temp = mean(tempc,na.rm=T))
 
dat4 <- ddply(dat1,.(monthf,hour),summarize, hi = mean(hi,na.rm=T))
 
dat6 <- ddply(dat1,.(monthf,hour),summarize, rh = mean(rh,na.rm=T))
 
## Plot Temperature circular Plot
 
p1 = ggplot(dat3, aes(x=monthf, y=hour, fill=temp)) +
  geom_tile(colour="grey70") +
  scale_fill_gradientn(colours = c("#99CCFF","#81BEF7","#FFFFBD","#FFAE63","#FF6600","#DF0101"),name="Temperature\n(Degree C)\n")+
  scale_y_continuous(breaks = seq(0,23),
                     labels=c("12.00am","1:00am","2:00am","3:00am","4:00am","5:00am","6:00am","7:00am","8:00am","9:00am","10:00am","11:00am","12:00pm",
                              "1:00pm","2:00pm","3:00pm","4:00pm","5:00pm","6:00pm","7:00pm","8:00pm","9:00pm","10:00pm","11:00pm")) +
  coord_polar(theta="x") +
  ylab("HOUR OF DAY")+
  xlab("")+
  ggtitle("Temperature")+
  theme(panel.background=element_blank(),
  axis.title.y=element_text(size=10,hjust=0.75,colour="grey20"),
  axis.title.x=element_text(size=7,colour="grey20"),
  panel.grid=element_blank(),
  axis.ticks=element_blank(),
  axis.text.y=element_text(size=5,colour="grey20"),
  axis.text.x=element_text(size=10,colour="grey20",face="bold"),
  plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
  plot.margin = unit(c(-0.25,0.1,-1,0.25), "in"),
  legend.key.width=unit(c(0.2,0.2),"in"))
 
p1
 
 
## Plot Wind Speed circular plot
 
p2 = ggplot(dat2, aes(x=monthf, y=hour, fill=wspd)) +
  geom_tile(colour="grey70") +
  scale_fill_gradientn(colours = rev(topo.colors(7)),name="Wind Speed\n(meter/sec)\n")+
  scale_y_continuous(breaks = seq(0,23),
                     labels=c("12.00am","1:00am","2:00am","3:00am","4:00am","5:00am","6:00am","7:00am","8:00am","9:00am","10:00am","11:00am","12:00pm",
                              "1:00pm","2:00pm","3:00pm","4:00pm","5:00pm","6:00pm","7:00pm","8:00pm","9:00pm","10:00pm","11:00pm")) +
  coord_polar(theta="x") +
  ylab("HOUR OF DAY")+
  xlab("")+
  ggtitle("Wind Speed")+
  theme(panel.background=element_blank(),
  axis.title.y=element_text(size=10,hjust=0.75,colour="grey20"),
  axis.title.x=element_text(size=7,colour="grey20"),
  panel.grid=element_blank(),
  axis.ticks=element_blank(),
  axis.text.y=element_text(size=5,colour="grey20"),
  axis.text.x=element_text(size=10,colour="grey20",face="bold"),
  plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
  plot.margin = unit(c(-0.25,0.25,-1,0.25), "in"),
  legend.key.width=unit(c(0.2,0.2),"in"))
 
p2
 
 
# Plot Heat Index circular plor
 
p3 = ggplot(dat4, aes(x=monthf, y=hour, fill=hi)) +
  geom_tile(colour="grey70") +
  scale_fill_gradientn(colours = c("#99CCFF","#81BEF7","#FFFFBD","#FFAE63","#FF6600","#DF0101"),name="Temperature\n(Degree C)\n")+
  scale_y_continuous(breaks = seq(0,23),
                     labels=c("12.00am","1:00am","2:00am","3:00am","4:00am","5:00am","6:00am","7:00am","8:00am","9:00am","10:00am","11:00am","12:00pm",
                              "1:00pm","2:00pm","3:00pm","4:00pm","5:00pm","6:00pm","7:00pm","8:00pm","9:00pm","10:00pm","11:00pm")) +
  coord_polar(theta="x") +
  ylab("HOUR OF DAY")+
  xlab("")+
  ggtitle("Heat Index")+
  theme(panel.background=element_blank(),
  axis.title.y=element_text(size=10,hjust=0.75,colour="grey20"),
  axis.title.x=element_text(size=7,colour="grey20"),
  panel.grid=element_blank(),
  axis.ticks=element_blank(),
  axis.text.y=element_text(size=5,colour="grey20"),
  axis.text.x=element_text(size=10,colour="grey20",face="bold"),
  plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
  plot.margin = unit(c(-0.5,0.25,-0.25,0.25), "in"),
  legend.key.width=unit(c(0.2,0.2),"in"))
 
p3
 
 
# Plot Relative Humidity circular plor
 
col<-brewer.pal(11,"Spectral")
 
p4 = ggplot(dat6, aes(x=monthf, y=hour, fill=rh)) +
  geom_tile(colour="grey70") +
  scale_fill_gradientn(colours=col,name="Relative Humidity\n(%)\n")+
  scale_y_continuous(breaks = seq(0,23),
                     labels=c("12.00am","1:00am","2:00am","3:00am","4:00am","5:00am","6:00am","7:00am","8:00am","9:00am","10:00am","11:00am","12:00pm",
                              "1:00pm","2:00pm","3:00pm","4:00pm","5:00pm","6:00pm","7:00pm","8:00pm","9:00pm","10:00pm","11:00pm")) +
  coord_polar(theta="x") +
  ylab("HOUR OF DAY")+
  xlab("")+
  ggtitle("Relative Humidity")+
  theme(panel.background=element_blank(),
  axis.title.y=element_text(size=10,hjust=0.75,colour="grey20"),
  axis.title.x=element_text(size=7,colour="grey20"),
  panel.grid=element_blank(),
  axis.ticks=element_blank(),
  axis.text.y=element_text(size=5,colour="grey20"),
  axis.text.x=element_text(size=10,colour="grey20",face="bold"),
  plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
  plot.margin = unit(c(-0.5,0.1,-0.25,0.25), "in"),
  legend.key.width=unit(c(0.2,0.2),"in"))
 
p4
 
## Plot and export to png file
 
png(file="Weather_Variables_Circular_Plot.png",width = 12, height = 10, units = "in",
    bg = "white", res = 400, family = "", restoreConsole = TRUE,
    type = "cairo")
 
grid.arrange(p1,p2,p4,p3,nrow=2,main=textGrob("\nWeather Variables by Month and Hour\n(Bayan Lepas Weather Station)",
gp=gpar(fontsize=18,col="grey20",fontface="bold")),sub=textGrob("Source: NOAA National Climatic Data Centre (NCDC)\n",
                                                                                                              gp=gpar(fontsize=9,col="grey20",fontface="italic")))
 
dev.off()
Created by Pretty R at inside-R.org