This blog is run by Jason Jon Benedict and Doug Beare to share insights and developments on open source software that can be used to analyze patterns and trends in in all types of data from the natural world. Jason currently works as a geospatial professional based in Malaysia. Doug lives in the United Kingdom and is currently Director of Globefish Consultancy Services which provides scientific advice to organisations that currently include the STECF [Scientific, Technical and Economic Committee for Fisheries, https://stecf.jrc.europe.eu/] and ICCAT, https://www.iccat.int/en/

Showing posts with label Monsoon. Show all posts
Showing posts with label Monsoon. Show all posts

Friday, 6 March 2015

Will I choke? – when is the best time for outdoor activities on Penang Island

Key points of post
  • To assess whether air pollution index (API) levels in Penang during 2014 are unhealthy
  • To assess how API varies seasonally and diurnally

Last year our blog post on air quality on Penang Island was particularly popular. In that post we used a comparatively crude ‘visibility index’ collected at Penang International Airport since 2001.  The data suggested that pollution (visibility) had gotten worse since 2001 and that it tended to be worse in June and July.

We discovered recently, however, that the Malaysian Government’s Department of the Environment (DOE) collects much more detailed data on air pollution at 52 stations across the country, see http://bit.ly/1wUMxYG.  Conveniently, one of the stations is located at Universiti Sains Malaysia (USM) and we've managed to acquire this data scraped off the DOE site by Keith Rozario (nice work Keith!) using python and MySQL. He provides a nice summary of the nationwide API data from August 2013 to February 2015 and the link to all the scripts he used (on github) to scrape the data on his site - please see http://bit.ly/1EnwTLs. He was also generous enough to make a zipped copy of the dataset available for the public to use under the Creative Commons Attribution 4.0 license.

Inspired by an analysis (http://bit.ly/1B6o2eM) by ‘Typhoon Tony’ from Taiwan, we sought to investigate the magnitude of air pollution on Penang Island, and how it varies seasonally and diurnally

Air pollution is typically quantified using the air pollution index (API) which is calculated from the following five major air pollutants: sulphur dioxide, nitrogen dioxide, carbon monoxide, particulate matter, and ozone, for details see http://bit.ly/1wUMxYG. APIs between 101 and 200 are considered to be ‘unhealthy’ especially for those with existing heart and/or lung problems.

So how does Penang stack up? We’re always hearing how terrible pollution is on the island, but is it actually true ? In the ‘calendar’ plot below, we merged the API data with hourly observations on wind-speed and direction from Weather Underground.  The seasonality in API is clear. We had high API values in February and March 2014, but the highest were in June and July which corresponds to the south-west monsoon transporting smoke from forest fires in Sumatera.  


We summarize diurnal variability in API (and to a certain extent seasonality too) in the ‘rose’ plot below.  It suggests that only rarely can air pollution on the island be considered ‘unhealthy’ since it is typically below 100.  If you are, however, concerned your best bet for strenuous outdoor activities is during the night or in the morning up until 1pm when levels of API start to increase.  The least healthy time of day to be out is rush hour between 4pm and 6pm. You may also want to avoid February, March, June and July!



Do note, however, that the API monitoring station at USM is relatively isolated from the island’s urban pell-mell and may not be a good overall reflection/summary of air pollution on the island. 

The R code used to produce the plots shown above are provided below. We are currently working on trying to scrape the API data off the Malaysian DOE website using R (which would make things a lot more convenient!) and once we've managed to do that, we will update the code accordingly.

# Load required libraries
 
library(ggplot2)
library(reshape)
library(lubridate)
library(zoo)
library(openair)
library(scales)
library(grid)
library(extrafont)
library(weatherData)
 
loadfonts()
 
# Set working directory
 
setwd("D:/Projects/AP/ReadingsByRegion/")
 
# Read data from csv file 
 
df <- read.table('USM.csv',header=F,sep=":")
 
# Rename columns
 
colnames(df) <- c("station_id","station_name","dates","hour","api","unk","unk1")
 
# Remove last 2 columns - not relevant information
 
df <- df[-c(6,7)]
 
# Add minute column
 
df$minute <- as.character("00")
 
df$time <- as.character(paste(df$hour,df$minute,sep = ":" ))
 
df$date <- as.POSIXct(paste(df$dates,df$time, format = "%Y-%m-%d %H:%M" ))
 
# Convert 0 to NA
 
df$api[df$api == 0] <- NA
 
# Date and time columns
 
df$year<-as.numeric(as.POSIXlt(df$date)$year+1900)
df$month<-as.numeric(as.POSIXlt(df$date)$mon+1)
df$monthf<-factor(df$month,levels=as.character(1:12),labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),ordered=TRUE)
df$weekday = as.numeric(format(as.POSIXlt(df$date),"%u"))
df$yearmonth<-as.yearmon(df$dates)
df$yearmonthf<-factor(df$yearmonth)
df$week <- as.numeric(format(as.POSIXlt(df$date),"%W"))
df$day = strftime(df$date,'%A')
df$jday <- strptime(df$date, "%Y-%m-%d")$yday+1
df$hour <- as.numeric(format(strptime(df$date, format = "%Y-%m-%d %H:%M"),format = "%H"))
 
 
##### Creating roseplot of pollutant magnitude over 24 hours #####
 
# Subset data for 2014
 
df2014 <- subset(df,year==2014)
 
# Set Color Table
 
colortable = c("#99FFFF", "#00FFFF", "#00FF00", "#CCFF33", "#FFFF00",
               "#FFCC00", "#FF6600", "#FF3333", "#FF33CC", "#660033")
 
# Cut data into ten parts
 
STN = "USM"
Time_of_Day = df2014$hour[df2014$station_name==STN]
mag = cut_number(round(df2014$api,100)[df2014$station_name==STN],n = 10)
rosedata = data.frame(dir=Time_of_Day,mag=mag)
 
# Plot rose chart
 
rc <- ggplot(rosedata,aes(x=Time_of_Day,fill=mag))+  geom_bar(binwidth = 1, drop = TRUE) +
      coord_polar() + xlab("") + ylab("") + 
      scale_x_continuous(breaks = seq(0,23),
      labels=c("12.00am","1:00am","2:00am","3:00am","4:00am","5:00am","6:00am",
      "7:00am","8:00am","9:00am","10:00am","11:00am","12:00pm","1:00pm",
      "2:00pm","3:00pm","4:00pm","5:00pm","6:00pm","7:00pm","8:00pm","9:00pm",
      "10:00pm","11:00pm")) +
      ggtitle("\nDaily API readings in 2014") +  scale_fill_manual(values=colortable,name="API Reading")+
      theme(panel.background=element_blank(),
      axis.title.y=element_text(size=10,hjust=0.75,colour="grey20"),
      axis.title.x=element_text(size=7,colour="grey20"),
      legend.title=element_text(size=13,colour="grey20",face="bold",family="Myriad Pro"),
      legend.text=element_text(size=11,colour="grey20",face="bold",family="Myriad Pro"),
      panel.grid=element_blank(),
      axis.ticks=element_blank(),
      axis.text.y=element_text(size=10,colour="grey20",family="Clear Sans"),
      axis.text.x=element_text(size=10,colour="grey20",face="bold",family="Clear Sans"),
      plot.title = element_text(lineheight=1, face="bold",size = 20, colour = "grey20",family="Graphik-Black"),
      plot.margin = unit(c(-0.5,0.25,-0.25,0.25), "in"),
      legend.key.width=unit(c(0.2,0.2),"in"))
 
rc
 
# Save plot to png
 
ggsave(file="D:/rc.png",dpi=300,w=10,h=8,unit="in",type="cairo-png")
 
 
### Reading weather data from Weather Underground using weatherData library ####
 
# Get weather data for 2014
 
we <- getWeatherForDate("WMKP", "2014-01-01","2015-12-31", opt_detailed=T, opt_custom_columns=T, custom_columns=c(1,6,7,8,9,13,14))
 
# Rename columns
 
colnames(we) <- c("date","date_myt","vis_km","wda","ws","gspd_kmh","wd","date_utc")
 
# Create date and time columns
 
we$date<-as.POSIXct(we$date)
we$hour <- as.numeric(format(strptime(we$date, format = "%Y-%m-%d %H:%M"),format = "%H"))
we$min <- as.numeric(format(strptime(we$date, format = "%Y-%m-%d %H:%M"),format = "%M"))
 
# Only use data on the hour
 
we_df<- subset(we, min == 0)
 
# Remove duplicate data
 
we_df$date[duplicated(we_df$date)]
 
# Merge API and weather data
 
we_api <- merge(we_df, df2014, by=c("date","date")) 
 
# Reformat yearmonth column for use in openair library
 
we_api$yearmonth <- as.factor(we_api$yearmonth)
 
# Reformat wind speed column
 
we_api$ws[we_api$ws == "Calm"] <- 0
we_api$ws <- as.numeric(we_api$ws)
 
# Plot 'calendar plot' and save as png
 
png(filename = "D:/Projects/AP/USM_API_2014_CalendarPlot.png",height=8,width=10,
    bg = "white",units='in', res = 400, family = "",  type = "cairo-png")
 
calendarPlot(we_api, pollutant = "api",year="2014",
             main = "Daily average API readings at USM\n
             with wind-speed scaled wind direction overlay (2014)\n",
             key.header = "API\n(Air Pollutant Index)",
             cols="heat", annotate = "ws")
 
dev.off()
Created by Pretty R at inside-R.org

Saturday, 8 November 2014

Come hell with high water - Sea level rises at two stations in Bangladesh

Key points of post
  • Bangladesh is low-lying
  • Many people live along its river and coasts
  • These communities are vulnerable to rises in sea-level which is predicted to occur as a consequence of climate change

Coastal Bangladesh is extremely low-lying and its coastal communities in particular are vulnerable from any future rises in sea level which can cause flooding. Jason and I wondered, therefore, if there were any time-series on sea-level for Bangladesh available, and what information they might contain.

Via Google we discovered the University of Hawaii Sea Level Centre, which monitors seawater levels at stations throughout the world, exploiting its network and relationships with a range of local partners.  In this case it’s the Bangladesh Inland Water Transport Authority who provided the original data.


Here we plot average daily sea level available on the UHSLC site for two stations (please map below) at Charchanga (at the mouth of the Meghna River) and Hiron Point (in the Sundarbans). [Note there are data for 5 other stations in Bangladesh but the time-series data available were, unfortunately, too short a duration]. The data are also color-coded to correspond to the six seasons of Bangladesh (winter, spring, summer, monsoon, autumn, & late autumn). 

MapID1f246f024be6
Data: stations.google • Chart ID: MapID1f246f024be6googleVis-0.5.6
R version 3.1.2 (2014-10-31) • Google Terms of UseDocumentation and Data Policy

The first interesting feature, noted at both locations, is the pronounced seasonality with peaks in sea-level being recorded in the  monsoon, autumn and late autumn periods.  Winter and spring correspond to lowest average daily water levels at both locations.  The seasonal variation in sea-level is due primarily to the effect of the Monsoon; ie. the south-westerly winds tend to cause the seas to ‘pile-up’ along the coast of Bangladesh, while the flow from the major rivers pouring through the delta into the Bay of Bengal increases substantially also as a result of the Monsoon. 

The most worrying aspect of these data is the substantial change in long-term trend summarized here with a simple linear model (red line).  At Charchanga average daily sea-levels recorded rose by approximately 7 mm's per year between 1980 and 2000 while at Hiron Point the figure was approximately 5 mm's per year (1977 to 2003). Unfortunately more recent data were unavailable from the Sea Level Center and we are currently unsure if these trends are continuing. But please if you know of more data for Bangladesh let us know.

Certain aspects of the series are, however, rather curious and it’s not a straightforward story of increasing trends.  At Hiron Point, for example, extreme sea-level events (say above 2600 mm's) were relatively frequently between 1980 and 1995 whereas, since 1996 they have been observed less often. Contrast this with Charchanga where there were some exceptional peaks recorded in 1996, ’97, and ’98. 

All such changes could have profound implications for the livelihoods of coastal Bangladeshis.

Details of how to obtain the data and plot the graphs here in R are demonstrated below.  

As mentioned earlier in the post, the data was obtained from the University of Hawaii Sea Level Centre and we have used the research quality dataset which can be found at this link - http://uhslc.soest.hawaii.edu/data/download/rq

You can download the dataset in various formats which include csv and NetCDF and the associated metadata for each station is also there for you to check of any inconsistencies in the data collection or station relocation and instrumentation issues. The data can also be downloaded at two different time scales, which is hourly and daily. 

# Load required libraries
 
library(ggplot2)
library(scales)
library(grid)
library(plyr)
library(lubridate)
library(zoo)
 
# Set working directory
 
setwd("D:/ClimData/SeaLevel")
 
# Read csv file
 
sl<-read.csv("rqd0138a.csv",header=FALSE)
 
# Rename columns
 
colnames(sl)<-c("year","month","day","sl_mm")
 
# Format date columns
 
sl$date <- as.Date(paste(sl$year,sl$month,sl$day),format="%Y%m%d")
sl$month <- as.numeric(format(sl$date,"%m"))
sl$year <- as.numeric(format(sl$date,"%Y"))
sl$monthf <- factor(sl$month,levels=as.character(1:12),labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),ordered=TRUE)
sl$mday <- strptime(sl$date, "%Y-%m-%d")$mday
sl$jday <- strptime(sl$date, "%Y-%m-%d")$yday+1
sl$daymth <- as.character(paste(sl$month,sl$day,sep="-"))
sl$daymth <-as.Date(sl$daymth,format="%m-%d")
 
# Classify data into seasons
 
sl$season <- "Season"
 
sl$season[sl$month == 1 & sl$mday >= 1 |  sl$month == 2 & sl$mday <= 13| sl$month == 1]<-'Winter'
sl$season[sl$month == 2 & sl$mday >= 14 |  sl$month == 4 & sl$mday <= 14 | sl$month == 3]<-'Spring'
sl$season[sl$month == 4 & sl$mday >= 15 |  sl$month == 6 & sl$mday <= 14 | sl$month == 5]<-'Summer'
sl$season[sl$month == 6 & sl$mday >= 15 |  sl$month == 8 & sl$mday <= 17 | sl$month == 7]<-'Monsoon'
sl$season[sl$month == 8 & sl$mday >= 18 |  sl$month == 10 & sl$mday <= 18| sl$month == 9]<-'Autumn'
sl$season[sl$month == 10 & sl$mday >= 19 |  sl$month == 12 & sl$mday <= 16| sl$month == 11]<-'Late Autumn'
sl$season[sl$month == 12 & sl$mday >= 17 |  sl$month == 12 & sl$mday <= 31| sl$month == 1]<-'Winter'
 
sl$season = factor(sl$season, c("Winter", "Spring", "Summer", "Monsoon","Autumn","Late Autumn"))
 
## Plot Sea Level
 
hp_sl <-   ggplot(sl, aes(date, sl_mm,colour=season))+
           #geom_line(size=0.5)+
           geom_point(shape=5,size=1)+
           geom_smooth(method="lm",size=0.5,col="red")+
           scale_x_date(name="\n\n\n Source: University of Hawaii Sea Level Centre / Bangladesh Inland Water Transport Authority (BIWTA) - 2014",labels=date_format("%Y"),breaks = date_breaks("2 years"))+
           ylab("Milimetres (mm)\n")+
           xlab("\nYear")+
           theme_bw()+
           ggtitle("Sea Level at Charchanga - Bangladesh (1980-2000)\n")+
           theme(plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
           panel.border = element_rect(colour = "black",fill=F,size=1),
           panel.grid.major = element_line(colour = "grey",size=0.25,linetype='longdash'),
           panel.grid.minor = element_blank(),
           axis.title.y=element_text(size=11,colour="grey20"),
           axis.title.x=element_text(size=9,colour="grey20"),
           panel.background = element_rect(fill = NA,colour = "black"))
 
hp_sl
 
# Get gradient and add to plot
 
m <- lm(sl_mm~year, data=sl )
ms <- summary(m)
 
slope <- coef(m)[2]
lg <- list(slope = format(slope, digits=3))
eq <- substitute(italic(Gradient)==slope,lg)
eqstr <-as.character(paste(as.expression(eq),"/year"))
hp_sl <- hp_sl + annotate(geom="text",as.Date(-Inf, origin = '1970-01-01'), y = Inf, 
         hjust = -0.1, vjust = 2, label = eqstr,parse = TRUE,size=3)
 
hp_sl
 
# Save plot to png
 
ggsave(hp_sl, file="Charchanga_SeaLevel_Plot_Seasons.png", width=10, height=6,dpi=400,unit="in",type="cairo")
 
# Code to produce html code of embedded sea level stations map using googleVis
 
# Load libraries
 
library(RCurl)
library(XML)
library(leafletR)
library(googleVis)
 
# Convert html table into data frame
 
theurl <- "http://uhslc.soest.hawaii.edu/data/download/rq"
tables <- readHTMLTable(theurl)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
 
tbl <- tables[[which.max(n.rows)]]
 
bgd.tbl <- subset(tbl, Country =="Bangladesh")
 
bgd.tbl$Latitude <- as.numeric(levels(bgd.tbl$Latitude)[bgd.tbl$Latitude])
bgd.tbl$Longitude <- as.numeric(levels(bgd.tbl$Longitude)[bgd.tbl$Longitude])
 
google.location <- paste(bgd.tbl$Latitude, bgd.tbl$Longitude, sep = ":")
stations.google <- data.frame(bgd.tbl, google.location)
 
# Plot map
 
map <- gvisMap(data = stations.google, locationvar = "google.location",tipvar = "Location",
       options=list(showTip=TRUE, enableScrollWheel=TRUE,mapType='terrain', useMapTypeControl=TRUE,width=100,height=400,
       icons=paste0("{","'default': {'normal': 'http://i.imgur.com/f3q6Oaj.gif',\n",
       "'selected': 'http://i.imgur.com/f3q6Oaj.gif'","}}")))
 
plot(map)
Created by Pretty R at inside-R.org

Friday, 8 August 2014

Blowing in the wind - How many times must we keep experiencing the haze before the Governments do something about it?

Key points of post

  • Visibility in Penang is affected by the seasonal monsoon winds
  • Visibility has got worse since 2001.
  • The governments of south-east Asia must urgently solve this potentially serious health issue.
Every year South-East Asians suffer from the effects of fires burning in the forests of Indonesia. The more intense fires are lit deliberately to clear land for palm oil plantations. It is illegal to use fire to clear land in Indonesia, but the practice is unfortunately widespread, leading to habitat destruction and loss of biodiversity. A satellite image captured by the Moderate Resolution Imaging Spectroradiomater (MODIS) from NASA below for March 2014 illustrates when the fires were particularly bad (fires shown in red). 

Source: NASA Earth Observatory (2014) - http://1.usa.gov/1kqfdYA

Most fires occur during the dry season between April and October. The fires in 2014 started in February, and actually forced Indonesia to declare a state of emergency because of poor air quality. 

But what about the situation for us in Penang ? Our island is subject to two seasonal monsoon winds: the north-east and the south-west. These patterns are clear from the wind-rose diagram below which shows the average monthly wind-speeds and directions for the period 1949-2014. 


Given the location of Indonesia relative to Penang, it follows that visibility will be worst during the south-west monsoon which blows hardest, and most consistently, between May and August each year. Conversely we might expect visibility to be best during the north-east monsoon. This is indeed exactly what we observe. Average visibility readings from Bayan Lepas International Airport by week and year 2001-2014 are plotted below. The white squares represent good visibility and cleaner air; the dark squares poor visibility and high levels of air pollution.


It is clear that visibility has gotten worse since 2001, ie. the numbers of white squares have decreased. The pattern is also strongly seasonal which reflects: (a) the monsoons; and (b) the fact that fires are used more in Indonesia’s dry season. The relatively long periods of good visibility (>10 km's) that Penangites must have enjoyed especially during January, November and December seem now to be a thing of the past. I’m (Doug) a keen hiker regularly gazing down on the Island from the heights of Tea Station 5, 89, and Penang Hill. I’ve only been here since January 2012 and, judging from these data, I guess I've probably never seen a really good view!

Let’s hope the Government of Malaysia / Penang will continue to apply pressure and work with the Indonesians and other countries in the region to solve this terrible problem.

If you're keen on producing similar plots like the ones we've included in this blog post, you can use the code provided below. The wind-rose plot was produced using the 'openair' package for R which is an open-source air pollution analysis tool. There are a number of useful plots that can be produced using this package to show magnitude and direction of wind / air pollutants which you can further explore on your own. We also provide you with the code to produce the 'heatmap' plot of the visibility data which was pulled into R from Weather Underground (using the 'weatherData' package).  

# Load libraries
 
library(ggplot2)
library(RColorBrewer)
library(openair)
library(weatherData)
library(mgcv)
library(scales)
library(plyr)
library(reshape2)
library(circular)
library(gridExtra)
library(lubridate)
library(weathermetrics)
library(zoo)
 
# Setting work directory
 
setwd("d:\\ClimData")
 
# Reading and reformatting raw daily data downloaded from NCDC
 
dat<-read.table("2422706962434dat.txt",header=TRUE,fill=TRUE,na.strings=c("*","**","***","****","*****","******","0.00T*****"))
 
colnames(dat)<-tolower(colnames(dat))
 
Sys.setenv(TZ = "UTC")
dat$date <- as.POSIXct(strptime(dat$yr..modahrmn,format="%Y%m%d%H%M"))  + 8 * 60 * 60
 
dat$dir[dat$dir == 990.0] <- NA 
 
# Convert windspeed in miles/hour to metres/second
 
dat$wspd <- (dat$spd)*0.44704
 
# Create columns for year, month and dat
 
dat$year <- as.numeric(format(dat$date,"%Y"))
 
dat$month <- as.numeric(format(dat$date,"%m"))
 
dat$day <- as.numeric(format(dat$date,"%d"))
 
# Subset year of interest
 
datsub <- subset(dat, year >= 1949 )
 
# Remove NA's
 
datsub$dir[is.na(datsub$dir)] <- 0
 
# Plot and produce png file of Wind Rose
 
png(filename = "Monthly_WindRose_Penang_1949-2014.png",height=10,width=10,
    bg = "white",units='in', res = 600, family = "", restoreConsole = TRUE,
    type = "cairo-png")
 
windRose(datsub,ws="wspd",wd="dir",bias.corr = TRUE,border=TRUE,type = "month",width = 0.5,grid.line = 10,
         statistic = "prop.mean",offset = 10, paddle =FALSE,cols="hue",annotate=FALSE,auto.text=FALSE,
         ws.int=2,breaks=c(0,2,4,6,8,10,12),key = TRUE, key.footer = "(meter/second)", key.position = "bottom",na.action=NULL,
         key.header = "Wind Speed",main="       Average Monthly Wind Rose plots for Penang (Bayan Station)\n from years 1949 - 2014 \n",font.main=1,
         sub="\nData source: Integrated Surface Database (ISD) - National Climatic Data Centre (NCDC)",font.sub=2)
 
dev.off()
 
###########################################
#### Produce visibility 'heatmap' plot ####
###########################################
 
### Find required 4 letter station code using function below
 
# Example getStationCode("Penang") or getStationCode("Honiara")
 
# Information can also be acquired from the following link - http://weather.rap.ucar.edu/surface/stations.txt
 
getStationCode("Butterworth")
 
# Use station code below to get required plot data and parameters
 
station.id="WMKP"
 
s<-getStationCode(station.id)
 
s1<-(strsplit(s,split= " "))[[1]]
 
station.name<-paste(s1[4],s1[5])
 
### Getting summarized weather data for WU
 
ws2001<-getSummarizedWeather(station.id, "2001-01-01", "2001-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2002<-getSummarizedWeather(station.id, "2002-01-01", "2002-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2003<-getSummarizedWeather(station.id, "2003-01-01", "2003-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2004<-getSummarizedWeather(station.id, "2004-01-01", "2004-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2005<-getSummarizedWeather(station.id, "2005-01-01", "2005-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2006<-getSummarizedWeather(station.id, "2006-01-01", "2006-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2007<-getSummarizedWeather(station.id, "2007-01-01", "2007-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2008<-getSummarizedWeather(station.id, "2008-01-01", "2008-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2009<-getSummarizedWeather(station.id, "2009-01-01", "2009-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2010<-getSummarizedWeather(station.id, "2010-01-01", "2010-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2011<-getSummarizedWeather(station.id, "2011-01-01", "2011-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2012<-getSummarizedWeather(station.id, "2012-01-01", "2012-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2013<-getSummarizedWeather(station.id, "2013-01-01", "2013-12-31", opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
ws2014<-getSummarizedWeather(station.id, "2014-01-01", Sys.Date(), opt_custom_columns=T,
custom_columns=c(1:23), opt_verbose=T)
 
ws<-rbind(ws2001,ws2002,ws2003,ws2004,ws2005,ws2006,ws2007,ws2008,ws2009,ws2010,ws2011,ws2012,ws2013,ws2014) 
 
summary(ws)
 
str(ws)
 
### List of variables from Wunderground ###
 
# [1] "MYT"                        "Max_TemperatureC"           "Mean_TemperatureC"          "Min_TemperatureC"          
# [5] "Dew_PointC"                 "MeanDew_PointC"             "Min_DewpointC"              "Max_Humidity"              
# [9] "Mean_Humidity"              "Min_Humidity"               "Max_Sea_Level_PressurehPa"  "Mean_Sea_Level_PressurehPa"
# [13] "Min_Sea_Level_PressurehPa"  "Max_VisibilityKm"           "Mean_VisibilityKm"          "Min_VisibilitykM"          
# [17] "Max_Wind_SpeedKm_h"         "Mean_Wind_SpeedKm_h"        "Max_Gust_SpeedKm_h"         "Precipitationmm"           
# [21] "CloudCover"                 "Events"                     "WindDirDegrees"            
 
colnames(ws)<-c("date","date1","maxtemp","meantemp","mintemp","dewp","meandewp","maxdewp","maxhum","meanhum","minhum","maxslp","meanslp",
                "minslp","maxvsb","meanvsb","minvsb","maxwspd","meanwspd","maxgust","prcp","cc","events","wd")
 
## Adding date columns

ws$dates <- as.Date(ws$date)
ws$year <- as.numeric(as.POSIXlt(ws$dates)$year+1900) ws$month <- as.numeric(as.POSIXlt(ws$dates)$mon+1) ws$monthf <- factor(ws$month,levels=as.character(1:12),labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),ordered=TRUE) ws$weekday <- as.POSIXlt(ws$dates)$wday ws$weekdayf <- factor(ws$weekday,levels=rev(0:6),labels=rev(c("Mon","Tue","Wed","Thu","Fri","Sat","Sun")),ordered=TRUE) ws$yearmonth <- as.yearmon(ws$date) ws$yearmonthf <- factor(ws$yearmonth) ws$week <- as.numeric(format(as.Date(ws$dates),"%W")) ws$weekf<- factor(ws$week) ws$jday<-yday(ws$dates)   # Define colour palette   col<-c("black","grey10","grey20","grey50","white")   # Plot 'heatmap'   v <- ggplot(data=ws,aes(x=week,y=year,fill=meanvsb))+ geom_tile(colour="black",size=0.65)+ theme_bw()+ scale_fill_gradientn(colours=col,name="Visibility\n(km)\n")+coord_equal(ratio=1)+ ylab("YEAR\n")+ xlab("\nWEEK OF YEAR\n\nSource: Weather Underground (2014)")+ scale_y_continuous(expand = c(0,0),breaks = seq(2000, 2015, 1)) + scale_x_discrete(expand = c(0,0),breaks = seq(0,52,2))+ ggtitle("Average weekly visibility in Penang\n")+ theme(panel.background=element_rect(fill="transparent"), panel.border=element_blank(), axis.title.y=element_text(size=10,colour="grey20"), axis.title.x=element_text(size=10,colour="grey20"), axis.text.y=element_text(size=10,colour="grey20",face="bold"), axis.text.x=element_text(size=10,colour="grey20",face="bold"), plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"), panel.grid.major = element_blank(), legend.key.width=unit(c(0.2,0.2),"in"))   v   # Save plot to png   ggsave(v, file="Penang_Weekly_Average_Visibility.png", width=15, height=5,dpi=400,type = "cairo-png")
Created by Pretty R at inside-R.org