This blog is run by Jason Jon Benedict and Doug Beare to share insights and developments on open source software that can be used to analyze patterns and trends in all types of data from the natural world. Jason currently works as a geospatial professional based in Malaysia. Doug lives in the United Kingdom and is currently Director of Globefish Consultancy Services, which provides scientific advice to organisations that currently include the STECF [Scientific, Technical and Economic Committee for Fisheries, https://stecf.jrc.europe.eu/] and ICCAT, https://www.iccat.int/en/

Friday, 30 January 2015

The recent deluge in Malaysia - using raincpc to map extreme rainfall events

Key point of post


  • To describe an application of raincpc to map the rainfall that led to the recent (December 2014) floods in Malaysia

Raincpc (http://cran.r-project.org/web/packages/raincpc/vignettes/raincpc_demo.html) is a new library for R that exploits the Climate Prediction Center’s (CPC, www.cpc.ncep.noaa.gov) global (1979 to present, 50km resolution) datasets for precipitation. It renders CPC’s rainfall data more readily available for plotting and visualization, allowing any user to conveniently side-step problems relating to changing data formats, file-naming conventions etc. And all this free of charge!

We thought it would be fun to demonstrate the use of raincpc, focusing on Malaysia, which experienced devastating floods over the Christmas and New Year period, leading to the evacuation of thousands of people. The damage has been estimated to have cost ~ 2 billion RM. Please see the following links - http://en.wikipedia.org/wiki/2014%E2%80%9315_Malaysia_floods and http://reliefweb.int/report/malaysia/asean-flash-update-northeast-monsoon-flood-24-december-2014.



In the plot below we used raincpc to show the amount of rain that fell over south-east Asia between 17th and 24th December 2014. It confirms that rainfall was indeed particularly heavy along the east coast of peninsular Malaysia, and also over northern Sumatera. Penang was certainly wet during December but the island had nothing like the amount of rainfall endured by communities on Malaysia’s east coast.

We do not know what caused the extreme rainfall that led to the flooding. Meteorologists think that it is related to the 'Madden-Julian Oscillation' (http://www.themalaysianinsider.com/malaysia/article/malaysia-could-see-more-severe-floods-like-in-kelantan-say-experts) and its interaction with the north-east Monsoon. Very heavy rain is of course common in the tropics, but it doesn't necessarily lead to flooding if drainage is adequate. Some experts think that rampant deforestation in Malaysia has led to more siltation of rivers, in effect blocking Malaysia's drains, and that this exacerbates the impact of rainfall events (https://www.youtube.com/watch?v=r_eZUxgoxCw).

As usual the R-code for producing the map is outlined below.
## Load package libraries
 
library(raincpc)
library(SDMTools)
library(raster)
library(ggplot2)
library(rgdal)
library(grid)
library(maptools)
 
# Set working directory
 
setwd("D:/ClimData/")
 
## Get raw CPC rain data - data has a 2 day lag
 
cpc_get_rawdata(2014,12,17,2014,12,24) 
 
## Read raw CPC rain data into raster grids
 
rain1 <- cpc_read_rawdata(2014, 12, 17)
rain2 <- cpc_read_rawdata(2014, 12, 18)
rain3 <- cpc_read_rawdata(2014, 12, 19)
rain4 <- cpc_read_rawdata(2014, 12, 20)
rain5 <- cpc_read_rawdata(2014, 12, 21)
rain6 <- cpc_read_rawdata(2014, 12, 22)
rain7 <- cpc_read_rawdata(2014, 12, 23)
rain8 <- cpc_read_rawdata(2014, 12, 24)
 
# Combine multiple day rasters
 
rain_tot <- rain1 + rain2 + rain3 + rain4 + rain5 + rain6 + rain7 + rain8
 
# Get summary of raster grid
 
print(rain_tot)
 
# Convert a RasterLayer into a data frame of lons, lats and rain values for ggplot
 
raster_ggplot <- function(rastx) {
 
require(SDMTools)
 
  stopifnot(class(rastx) == "RasterLayer")
 
  gfx_data <- getXYcoords(rastx)
  # lats need to be flipped
  gfx_data <- expand.grid(lons = gfx_data$x, lats = rev(gfx_data$y), 
                          stringsAsFactors = FALSE, KEEP.OUT.ATTRS = FALSE)
  gfx_data$rain <- rastx@data@values
 
  return (gfx_data)
}
 
rain_gg <- raster_ggplot(rain_tot)
 
# Read shapefile of country boundaries (shapefiles can be downloaded from http://thematicmapping.org/downloads/world_borders.php)
 
bounds <- readOGR(dsn="D:/Data/World_Borders", layer="TM_WORLD_BORDERS-0.3")
 
## Extents of ggplot map
 
xmin<-95
xmax<-120
ymin<--10
ymax<-15
 
interval <-(xmax-xmin)/5
 
lon_vals <- seq(xmin, xmax, 0.5)
lat_vals <- seq(ymin, ymax, 0.5)
 
 
# Set theme options
 
theme_opts <- list(theme(panel.grid.minor = element_blank(),
                         panel.grid.major = element_blank(),
                         panel.background = element_rect(fill="grey95"),
                         panel.border = element_rect(colour="black"),
                         axis.line = element_blank(),
                         axis.text.x = element_blank(),
                         axis.text.y = element_blank(),
                         axis.ticks = element_blank(),
                         axis.title.x = element_blank(),
                         legend.key.size=unit(0.35,"in"),
                         legend.key.width=unit(0.15,"in"),
                         legend.text=element_text(size=14,family="Myriad Pro Cond"),
                         legend.title=element_text(size=16,family="Myriad Pro Cond"),
                         plot.title = element_text(size=23,face="bold",vjust = -10,hjust=0.96,family="Graph Black"),
                         legend.position = c(0.17, 0), 
                         legend.justification = c(1, 0), 
                         legend.background = element_blank(),
                         axis.title.y = element_blank()))
 
# Plot rainfall map
 
rf <-  ggplot()+
       geom_raster(data=rain_gg,aes(x=lons,y=lats,fill=rain),alpha=0.8) +
       scale_fill_gradientn(colours=c("#FFFFFF","#ADD8E6","#95BBE9","#7E9EEC","#6781F0","#5064F3","#3948F6","#222BFA","#0B0EFD","#0A02FE","#1F06FC","#350AFA","#4A0EF8","#6013F6","#7517F3"),limits=c(0,1200),na.value=NA, name="Rainfall (mm)\n")+
       geom_polygon(data=bounds, aes(long,lat, group=group),colour="grey30",fill="transparent",size=0.35)+
       coord_equal(xlim=c(xmin,xmax),ylim=c(ymin,ymax)) + theme_bw()+
       ggtitle("Extreme rainfall event over Malaysia\n(17th to 24th of December 2014)\n")+
       xlab("") + ylab("")+ theme_opts +
       annotate("text", x = 115.00, y = -9.5, label = "Data source: Climate Prediction Center - NOAA (2014)",size=5,family="Myriad Pro Cond") 
 
plot(rf)
 
# Save rainfall map to png file
 
ggsave(rf,file="D:/ClimData/CPC_Extreme_Rainfall_Event_MYS_Dec2014.png",dpi=500,w=10,h=10,unit="in",type="cairo-png")
Created by Pretty R at inside-R.org
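
Adding eight separately named rasters by hand makes it easy to leave a day out. A minimal sketch of an alternative is to collect the daily grids in a list and sum them with Reduce. To keep the example runnable without downloading any data, read_day below is a hypothetical stand-in that returns a small matrix; with raincpc one would presumably call cpc_read_rawdata(2014, 12, d) in its place, since the raster package defines `+` for RasterLayer objects.

```r
# Hypothetical stand-in for cpc_read_rawdata(): returns a 2 x 3 grid
# whose cells all hold the day number, so the total is easy to check.
read_day <- function(yr, mo, day) {
  matrix(day, nrow = 2, ncol = 3)
}

days <- 17:24

# Read every day into a list, then add the grids cell by cell
rain_list <- lapply(days, function(d) read_day(2014, 12, d))
rain_tot  <- Reduce(`+`, rain_list)

# Each cell should equal 17 + 18 + ... + 24
print(rain_tot[1, 1])
print(sum(days))
```

Because the days are generated from a single vector, changing the period only means changing 17:24.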

Thursday, 8 January 2015

2014’s weather in Penang: a brief summary

Key points of post:

  • 2014 was the 2nd warmest year in Penang since 1975.
  • During 2014 average daily air temperatures exceeded record highs (since 1975) on 54 days.
  • Nearly 50% of days in June 2014 broke record temperature highs.
  • Two unusually cold days were experienced in late December 2014.
  • 2014 was also the driest since 2005.

We’ve been blogging about weather in Penang on and off for a year now and thought it would be interesting to review the past year as we move into 2015.

In March 2014, we described the unusually dry start to 2014, the forest fires on Penang Island, and the weak negative relationship between rainfall and temperature.  In a subsequent blog post, we speculated on whether the total cumulative rainfall that fell in Penang in 2014 would be able to catch up with more ‘normal’ amounts.  Since then, we’ve experienced a rather wet December in Penang which reflects more substantial precipitation elsewhere.

The plot below summarizes the seasonal change in air temperature at Penang International Airport based on daily observations.  The 365 vertical black lines are the average air temperatures each day, ± 2 standard errors of the mean, and represent the ‘normal’ temperature ranges for Penang based on 40 years of observations (i.e. 1975 to 2014).  The buff colored lines denote the range between the record temperature highs and lows each day.


The solid black line describes the average daily air temperature for 2014. The red circles on this line represent days in 2014 when the average temperature exceeded the 40 year record high, i.e. they were exceptionally warm days. Similarly, the blue ones denote days which were extraordinarily ‘cold’.
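
As a rough illustration of how such per-day bands can be computed, the sketch below uses invented temperatures for a single calendar day across 40 years (these are not GSOD values). It computes a band of two standard deviations, the narrower band of two standard errors of the mean, which is the form used in the plotting code further down (sd(x)/sqrt(length(x))), and the record low and high.

```r
set.seed(1)
# Invented mean temperatures (deg C) for one calendar day over 40 years
x <- rnorm(40, mean = 27.5, sd = 0.8)

m  <- mean(x)
s  <- sd(x)
se <- s / sqrt(length(x))

# Band of +/- 2 standard deviations around the mean
band_sd <- c(lower = m - 2 * s, upper = m + 2 * s)
# Band of +/- 2 standard errors of the mean
band_se <- c(lower = m - 2 * se, upper = m + 2 * se)
# Record low and high for that day
rec <- range(x)

print(round(band_sd, 2))
print(round(band_se, 2))
print(round(rec, 2))
```

With 40 years of data the standard-error band is roughly six times narrower than the standard-deviation band, which is worth bearing in mind when reading the 'normal' range off the plot.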

Overall 2014 was very warm, with the black line well above the ‘normal’ range for most months. June and July 2014 were unusually hot in Penang.  [Luckily I was in the UK on leave at that time!]  To put this into perspective, average temperatures broke record highs on 14 of the 30 days in June 2014, i.e. on nearly 50% of days.  Over the whole of 2014, average daily air temperatures exceeded the 40 year record high on an amazing 54 days (~15% of all days).
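
The percentages quoted above follow directly from the day counts; a quick check in R, using only the figures given in the text:

```r
days_in_june <- 30
record_june  <- 14   # June 2014 days that broke the record high
days_in_year <- 365
record_year  <- 54   # 2014 days that broke the record high

pct_june <- 100 * record_june / days_in_june
pct_year <- 100 * record_year / days_in_year

print(round(pct_june, 1))  # 46.7
print(round(pct_year, 1))  # 14.8
```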

Those of us who were in Penang during Christmas 2014 will recall the unusual amount of rain which, certainly in the tropics, tends to depress air temperatures. This feature of Penang’s recent climate is also well captured by our graphic, the black line being below the normal range for the last 2 weeks of December when 2 days also experienced record lows. Indeed I was on Monkey Beach on 23rd December. It felt more like Scotland and I wished I’d brought a substantial anorak.

Cumulative rainfall for the last decade is plotted below. It shows that 2014 never caught up and was an unusually dry year overall. The average rainfall in Penang ranges from 2250 to 2900 mm annually but 2014's annual rainfall was well below 2000 mm! The high temperatures observed in June also coincided with a period of no rain (see the flat line during June in the plot below).
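
The cumulative curves are simply running totals of daily rainfall that restart at the beginning of each year. A minimal base R sketch with invented numbers is given here; the full code further down performs the same step with plyr's ddply.

```r
# Invented daily rainfall (mm) spanning the end of one year and
# the start of the next
rain_df <- data.frame(
  date = as.Date(c("2013-12-30", "2013-12-31",
                   "2014-01-01", "2014-01-02", "2014-01-03")),
  rain = c(10, 5, 0, 12.5, 2.5)
)
rain_df$year <- as.numeric(format(rain_df$date, "%Y"))

# Running total within each year; it restarts on 1 January
rain_df$cumRain <- ave(rain_df$rain, rain_df$year, FUN = cumsum)

print(rain_df)
```

A dry spell shows up as a run of identical cumRain values, which is what produces a flat stretch in a cumulative plot.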


The temperature plot produced above is based on Tufte's illustration of New York's weather in 2003, published in the New York Times on January 4, 2004 and also in his classic book Visual Display of Quantitative Information, 2nd Ed. (page 30).

The code presented below, which was used to produce the temperature plot, has been modified slightly from the code included in the post published on RPubs by Brad Boehmke. The temperature and precipitation data used for the plots above were obtained from the usually dependable NOAA NCDC's Global Summary of the Day (GSOD).
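
Before plotting, the GSOD fields have to be converted to metric units. The sketch below shows the conversions on a few made-up values. It assumes the GSOD conventions as we understand them: temperature in degrees Fahrenheit, precipitation in inches followed by an optional flag letter, and 99.99 as the missing-value code for precipitation. Unlike the scripts below, which take the first four characters of the field with substr, this sketch strips the flag letter with sub.

```r
# Made-up raw values laid out like GSOD fields
temp_f   <- c(83.1, 80.6)
prcp_raw <- c("0.08G", "99.99", "1.50D")

# Fahrenheit to Celsius
temp_c <- (temp_f - 32) * 5 / 9

# Split the precipitation field into amount (inches) and flag letter
prcp_in   <- as.numeric(sub("[A-I]$", "", prcp_raw))
prcp_flag <- sub("^[0-9.]+", "", prcp_raw)

# Treat the missing-value code as NA (assumed to be 99.99)
prcp_in[prcp_in == 99.99] <- NA

# Inches to millimetres
rain_mm <- prcp_in * 25.4

print(round(temp_c, 1))
print(prcp_flag)
print(rain_mm)
```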

# Code to produce Temperature Plot
 
# Load required libraries
 
library(dplyr)
library(tidyr)
library(ggplot2)
 
# Load font
 
windowsFonts(GraphBlack="TT Graph Black")
 
# Set working directory
 
setwd("D:/ClimData/")
 
# Read weather data downloaded from NOAA NCDC GSOD
 
dat<-read.table("CDO1553156579351.txt",header=F,skip=1)
 
# Rename columns
 
colnames(dat)<-c("stn","wban","yearmoda","temp","tempc","dewp","dewpc","slp","slpc","stp","stpc","visib","visibc","wdsp","wdspc","mxspd","gust","maxtemp","mintemp","prcp","sndp","frshtt")
 
# Reformat columns
 
dat$yearmoda <- strptime(dat$yearmoda,format="%Y%m%d")
dat$tempdc <- (dat$temp-32) * (5/9)
 
dat$year <- as.numeric(format(dat$yearmoda,"%Y"))
dat$month <- as.numeric(format(dat$yearmoda,"%m"))
dat$day <- as.numeric(format(dat$yearmoda,"%d"))
 
temp <- dat[c(23,24,25,26)]
 
names(temp) <- c("temp", "year", "month", "day")
 
temp <- temp %>% group_by(year) %>% mutate(daynum = seq_along(year))
 
# Set up plain chart with min-max range and 95% CI
 
(p <- ggplot(temp, aes(x = daynum, y = temp)) + 
      stat_summary(geom = "linerange", 
      fun.ymin = min, 
      fun.ymax = max, 
      color = "wheat2") +
      stat_summary(geom = "linerange", 
      fun.ymin = function(x) mean(x) - 2 * sd(x)/sqrt(length(x)),
      fun.ymax = function(x) mean(x) + 2 * sd(x)/sqrt(length(x)), 
      color = "wheat4") + 
      geom_line(data = filter(temp, year == 2014)))
 
# Data frame containing all days in 2014 with extreme weather
 
df_maxmin <- temp %>%
             group_by(daynum) %>%
             mutate(max_temp = max(temp), 
             min_temp = min(temp)) %>%
             filter(year == 2014, (temp %in% c(max_temp, min_temp))) %>%
             mutate(max_min = temp == max_temp) # Dummy variable to be mapped to color
 
head(df_maxmin)
 
# Data frame with x-axis breaks and labels
 
df_xaxis <- temp %>% filter(year == 2014, month != lead(month)) %>%     # Last day of month
            mutate(days_in_month = daynum - lag(daynum),                # Days in month
            midpoint = lag(daynum) + days_in_month / 2)                 # Month midpoints
 
df_xaxis$midpoint[1] <- 31 / 2
 
head(df_xaxis)
 
 
(p <- p  +
      geom_vline(xintercept = 0, color = "wheat4", size = 1) +
      geom_hline(yintercept = seq(22, 32, 2), color = "white") +
      geom_vline(xintercept = df_xaxis$daynum, 
      color = "wheat4", linetype = "dotted", size = 0.5) +
      geom_point(data = df_maxmin, aes(color = max_min), show_guide = FALSE))
 
 
(p <- p +
      scale_x_continuous(expand = c(0,0), labels = month.name,
      breaks = c(df_xaxis$midpoint, df_xaxis$daynum[11] + (31/2))) +
      scale_y_continuous(expand = c(0,0), breaks = seq(22, 32, 2),
      labels = function(x) parse(text = paste0(x, "*degree"))) +
      scale_color_manual(values = c("blue3", "firebrick3")))
 
 
(p <- p + theme(axis.ticks = element_blank(), 
      panel.grid = element_blank(),
      panel.background = element_blank(),
      panel.border = element_blank(),
      axis.text = element_text(color = "gray30"),
      plot.title = element_text(face = "bold", hjust = 0.012, 
      vjust = 0.8, color = "#3C3C3C", size = 25,family="Graph Black")) +
      labs(x = NULL, y = NULL, title = "Penang's Weather in 2014"))
 
 
 
desc <- "Data represents average daily temperatures. Temperature data used starts from 
January 1, 1975. Average temperature for the year was 28.4° making 2014
the 2nd warmest year since 1975" %>% 
 
strwrap(width = 0.75 * getOption("width")) %>% 
paste0(collapse = "\n")
 
# Data frame with annotations
 
df_annotate <- data_frame(
               daynum = c(17, 287), temp = c(24.5, 30.5), max_min = c(FALSE, TRUE), 
               label = c("We had 4 days that were the\ncoldest since 1975", 
               "We had 54 days that were\nthe hottest since 1975"))
 
(p <- p + 
      annotate("text", x = 5, y = 31.7, size = 4, fontface = "bold", 
      hjust = 0, vjust = 0, label = "Temperature",family="Clear Sans") +
      annotate("text", x = 5, y = 31.6, size = 3, color = "gray30", 
      hjust = 0, vjust = 1, label = desc,fontface = "bold",family="Clear Sans") +
      geom_segment(data = df_annotate,aes(x = c(15, 285), xend = c(10, 282), 
      y = c(24.5, 30.5), yend = c(25.5, 29.8),
      color = c(FALSE, TRUE)), show_guide = FALSE) + 
      geom_text(data = df_annotate, aes(color = max_min, label = label), 
      size = 3, hjust = 0, show_guide = FALSE,family="Clear Sans",fontface="bold"))
 
# Data frame with legend label coordinates
 
df_leg_text <- data_frame(daynum = c(186, 145, 184, 184), 
               temp = c(23.5, 23.5, 24,23), 
               label = c("NORMAL RANGE", "2014 TEMPERATURE", 
               "RECORD HIGH", "RECORD LOW"))
 
# Data frame with legend shape coordinates
 
df_leg_seg <- data_frame(daynum = c(181, 181, 183, 183, 185), 
              xend = c(181, 181, 185, 185, 185),
              temp = c(23, 23.25, 23.75, 23.25, 23.25),
              yend = c(24, 23.75, 23.75, 23.25, 23.75), 
              size = c(3, 3, 0.5, 0.5, 0.5), 
              color = c("wheat2", rep("wheat4", 4)))
 
p1 <- p + 
      geom_segment(data = df_leg_seg, aes(xend = xend, yend = yend), 
      size = df_leg_seg$size, color = df_leg_seg$color) +
      geom_line(data = data_frame(daynum = seq(175, 182), temp = rnorm(8,23.5,0.15))) +
      geom_text(data = df_leg_text, aes(label = label), hjust = 0, size = 2,fontface = "bold",family="Clear Sans")
 
p1
 
# Save plot to png
 
ggsave(p1,file="PenangTemps_1975-2014.png",dpi=500,w=12,h=6,unit="in",type="cairo-png")


#####################################################################

# Code to produce cumulative precipitation plot
# Load required libraries
 
library(plyr)
library(dplyr)   # provides %>%, group_by, filter, lead and lag used below; load after plyr
library(ggplot2)
library(lubridate)
library(date)
library(ggthemes)
 
# Setting work directory
 
setwd("d:\\ClimData")
 
# Reading and reformatting GSOD raw data downloaded from NCDC
 
dat<-read.table("CDO1553156579351.txt",header=F,skip=1)
 
colnames(dat)<-c("stn","wban","yearmoda","temp","tempc","dewp","dewpc","slp","slpc","stp","stpc","visib","visibc","wdsp","wdspc","mxspd","gust","maxtemp","mintemp","prcp","sndp","frshtt")
 
dat$yearmoda <- strptime(dat$yearmoda,format="%Y%m%d")
 
dat$prcp <- as.character(dat$prcp)
dat$prcp1 <-as.numeric(substr(dat$prcp,1,4))
dat$prcpflag <- substr(dat$prcp,5,5)
 
# Convert precipitation from inches to mm
 
dat$rain  <- dat$prcp1*25.4
 
# Remove erroneous values
 
dat$rain[dat$rain > 1000 ] <- NA
 
dat$year <- as.numeric(format(dat$yearmoda,"%Y"))
dat$month <- as.numeric(format(dat$yearmoda,"%m"))
dat$day <- as.numeric(format(dat$yearmoda,"%d"))
 
# Add a Date column (used below to get the cumulative sum of rain per year)
 
dat$date<-as.Date(dat$yearmoda)
 
# Subsetting required period
 
dat2 <- subset(dat, year >= 2005 )
 
# Extracting required columns for transforming data
 
dat3 <- dat2[, c(25,29)]
 
# Replace na's with 0's for ddply function
 
dat3$rain[is.na(dat3$rain)] <- 0
 
dat3 <- ddply(dat3,.(year(date)),transform, cumRain = cumsum(rain))
 
dat4 <- ddply(dat3,.(date,year(date)),summarize, max = max(cumRain))
 
dat5 <- dat4[c(diff(as.numeric(substr(dat4$date, 9, 10))) < 0, TRUE), ]
 
dat5$year <- as.numeric(format(dat5$date,"%Y"))
dat5$month <- as.numeric(format(dat5$date,"%m"))
dat5$day <- as.numeric(format(dat5$date,"%d"))
 
# Calculate julian day
 
dat5$jday <- strptime(dat5$date, "%Y-%m-%d")$yday+1
 
# Data frame with x-axis breaks and labels
 
dataxis <- dat5 %>% group_by(year) %>% mutate(daynum = seq_along(year))
 
df_xaxis <- dataxis %>% filter(year == 2014, month != lead(month)) %>%  # Last day of month
            mutate(days_in_month = daynum - lag(daynum),                # Days in month
            midpoint = lag(daynum) + days_in_month / 2)                 # Month midpoints
 
df_xaxis$midpoint[1] <- 31 / 2
 
# Plot cumulative rainfall
 
cr<-  ggplot(dat3, aes(x = yday(date), y = cumRain, color = factor(year(date)))) +
      geom_line(size=0.5,linetype='solid') + geom_point(size=1.2) + theme_bw() +
      ggtitle("Penang's Cumulative Rainfall by Year (2005 - 2014)") + 
      guides(color = guide_legend(title = "Year", title.position = "top")) +  
      geom_hline(yintercept = seq(0,3000, by=500), color = "wheat4",linetype="dotted",size=0.5) +
      geom_vline(xintercept = df_xaxis$jday, color = "wheat4", linetype = "dotted", size = 0.5) +
      geom_vline(xintercept = 0, color = "grey20", size = 1) + 
      scale_x_continuous(expand = c(0, 0), limits=c(0,380),
      breaks = c(15,45,75,105,135,165,195,228,258,288,320,350),
      labels = c("January", "February", "March", "April",
      "May", "June", "July", "August", "September",
      "October", "November", "December"))+
      scale_y_continuous(breaks=seq(0,3000,by=500))+ 
      xlab("") + ylab("Rainfall (mms)\n")+ 
      theme(panel.border = element_rect(colour="grey20",size=0.5),
      panel.grid.major = element_blank(),
      panel.grid.minor = element_blank(),
      axis.ticks.x=element_blank(),
      legend.position="right",
      axis.title.y=element_text(size=14,face="plain",family="Clear Sans"),
      axis.text.x=element_text(size=12,face="plain",family="Clear Sans"),
      axis.text.y=element_text(size=12,face="plain",family="Clear Sans"),
      legend.text=element_text(size=10,face="plain",family="Clear Sans"),
      legend.title=element_text(size=10,face="bold",family="Clear Sans"),
      plot.title=element_text(size=20,face="bold",family="Clear Sans",hjust = 0.012, vjust = 1.2),
      legend.key=element_blank()) 
 
cr <- cr + geom_text(data = subset(dat5, jday > 350 ), (aes(x = yday(date), y = max, label = year(date))),size=4,vjust=-0.2, hjust=-0.2,fontface="bold",family="Clear Sans")
 
cr 
 
# Save plot to png
 
ggsave(cr, file="Cumulative_RF_Penang_r1.png", dpi=500,width=15, height=7,type = "cairo-png")