This blog is run by Jason Jon Benedict and Doug Beare to share insights and developments on open source software that can be used to analyze patterns and trends in in all types of data from the natural world. Jason currently works as a geospatial professional based in Malaysia. Doug lives in the United Kingdom and is currently Director of Globefish Consultancy Services which provides scientific advice to organisations that currently include the STECF [Scientific, Technical and Economic Committee for Fisheries, https://stecf.jrc.europe.eu/] and ICCAT, https://www.iccat.int/en/

Tuesday, 16 December 2014

13 decades on...has Penang gotten warmer?

  • Here we compare recent temperature data for Penang with a report written in 1885.
  • We find that air temperatures are higher now in the afternoon and evening
  • Morning air temperatures seem to be about the same now as they were 130 years ago.
  • There are sources of bias which may affect the interpretation, but about which we can do nothing.

Last week we received the email below from Mike Gibby (mikegibby@gmail.com).

“Hi Jason, Doug

I found your website / blog on Temperature Change in Penang while searching for info on rain, and more especially on rain gauges in Penang, and wondered if you could give any advice or help?

I'm part of a team who are researching for a guide to hiking trails in Penang; most of these trails were developed by farmers for access, but there are some trails that only 'stitch together' the sites of rain gauges, so they must have been created for data collection in the old, pre-digital days. An example here would be what is described as 'the longest trail in Penang', that runs from Telok Bahang to Penang Hill, via Bukit Laksamana.

So, my question is, can you point me at any sources of information, quite probably of the colonial era, on the setting up, siting etc of rain gauges?
In case you don't know it, I attach an article on the climate in Penang in 1885.

I look forward to hearing from you- best wishes !

Mike Gibby”

We replied to Mike that we didn’t know anything about the rain gauges or the old-colonial data collection protocols or policies.

We were, however, very interested in the 1885 weather report that Mike attached. We’ve since discovered that the report is available from the Malaysian Branch of the Royal Asiatic Society.  The 1885 report was compiled by Mr Thomas Irvine Rowell (Principal Civil Medical Officer) in Singapore in early 1886, about the same time Karl Benz patented the first gasoline-driven automobile and covers Singapore, Penang, ‘Province Wellesley’ and Malacca. 

Given our interest in Penang’s current and past weather, we decided to compare recent air temperature measurements with those taken 130 years previously in 1885 (please see below).



In the figure below we’ve plotted annual patters for six different Temperature parameters (mean monthly, maximum monthly, minimum monthly, 9am, 3pm, & 9pm) for both 1885 (blue lines) and the period 2003-2013 (red lines).  Monthly mean temperatures are around 28°C. March and April Penang average temperatures appeared to be a little hotter in 1885 than we’ve seen more recently, but for the rest of the year air Temperatures were fairly similar.  Recent maximum air temperatures are higher than they were in 1885 and minimum temperatures lower. We think, however, that this is probably caused by confounding/bias, ie. in 1885 data were only collected 3 times daily (9am, 3pm, & 9pm) whereas nowadays we can make observations, continuously, at 5 minute intervals and publish them instantly around the world! 



Given such problems of bias, the most meaningful comparisons we can make between these datasets are those observations taken at 9am, 3pm, and 9pm.  At 9am, temperatures in 1885 were similar to current measurements.  In the afternoon (3pm, typically around the hottest time of day in Penang) and evening (9pm), however, Temperatures are higher now in all months by about 1°C

So, in conclusion, if we assume that the measuring equipment is accurate, current air temperatures are around 1°C warmer in Penang in the afternoons and evenings, while Mr Rowell would probably have experienced very similar morning temperatures to us now!

These differences are less pronounced than we’d expected.

[Note: We’ve not addressed the potentially serious issue of ‘spatial bias’ and the locations that weather observations were made by us are not the same as those used by Mr Rowell].

The R code we used to produce the above line plots is outlined below and the 2003-2013 temperatures were downloaded from the NOAA NCDC's Global Summary of the Day (GSOD).

# Load required libraries
 
library(zoo)
library(lubridate)
library(plyr)
library(ggplot2)
library(grid)
library(XLConnect)
library(reshape2)
library(extrafont)
 
# Set working directory
 
setwd("D:/ClimData")
 
# Read table of 1885 Temperatures
 
temp1885<-readWorksheet(loadWorkbook("1885_Temps.xlsx"),sheet=1)
 
temps1885 <- temp1885[c(1,2,10,11,12,13,14,15)]
 
colnames(temps1885) <- c("month","monthf","temp_9am","temp_3pm","temp_9pm","mon_mean","mon_max","mon_min")
 
temps1885$year <- "1885"
 
temps1885_df <- temps1885[c(1,2,9,3:8)]
 
 
# Read data from NCDC
 
dat<-read.table("2110827004508dat.txt",header=TRUE,fill=TRUE,na.strings=c("*","**","***","****","*****","******","0.00T*****"))
 
colnames(dat)<-tolower(colnames(dat))
 
# Convert hourly data from UTC to local time zone
 
Sys.setenv(TZ = "UTC")
dat$dates <- as.POSIXct(strptime(dat$yr..modahrmn,format="%Y%m%d%H%M"))  + 8 * 60 * 60
 
# Convert temperatures in Degree Fahrenheit to Degree Celcius
 
dat$tempc <- (dat$temp-32) * (5/9)
 
dat$tempc[dat$tempc<=10] <- NA
dat$tempc[dat$tempc>=40] <- NA 
 
# Extract times and dates and reformat data
 
dat$year <- as.numeric(as.POSIXlt(dat$dates)$year+1900)
dat$month <- as.numeric(as.POSIXlt(dat$dates)$mon+1)
dat$monthf <- factor(dat$month,levels=as.character(1:12),labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),ordered=TRUE)
dat$jday <- yday(dat$dates)
dat$hour <- as.numeric(format(strptime(dat$dates, format = "%Y-%m-%d %H:%M"),format = "%H"))
dat$minute <- as.numeric(format(strptime(dat$dates, format = "%Y-%m-%d %H:%M"),format = "%M"))
 
dat_01 <- subset(dat, year >= 2003 & year <= 2013 )
 
dat_02 <- dat_01[c(34,35,36,37,38,39,40,41)]
 
dat_9am <- subset(dat_02,hour == 9 & minute == 0 )
dat_3pm <- subset(dat_02,hour == 15 & minute == 0 )
dat_9pm <- subset(dat_02,hour == 21 & minute == 0 )
 
dat_9am_melt<- ddply(dat_9am, .(month,monthf), summarize, temp_9am=mean(tempc,na.rm=T))
dat_3pm_melt<- ddply(dat_3pm, .(month,monthf), summarize, temp_3pm=mean(tempc,na.rm=T))
dat_9pm_melt<- ddply(dat_9pm, .(month,monthf), summarize, temp_9pm=mean(tempc,na.rm=T))
 
dat_hourly <- cbind(dat_9am_melt,dat_3pm_melt[3],dat_9pm_melt[3])
 
dat_melt<- ddply(dat_01, .(month,monthf), summarize, mon_mean=mean(tempc,na.rm=T),mon_max=max(tempc,na.rm=T),mon_min=min(tempc,na.rm=T))
 
dat_yrs <- cbind(dat_melt,dat_hourly[-c(1,2)]) 
 
dat_yrs$year <- "10 Year Average (2003-2013)"
 
tempcomb <- rbind(dat_yrs,temps1885_df)
 
tempcomb_df <- melt(tempcomb, id.vars=c("month","monthf","year"),variable.name="type",value.name="temps")
 
# Plot data
 
p <- ggplot(data=tempcomb_df, aes(x=monthf,temps, colour=year))+
     geom_point(data=subset(tempcomb_df,year!=1885),size=3, alpha=0.5,aes(group=year))+ 
     geom_line(data=subset(tempcomb_df,year!=1885),size=1, alpha=0.5,aes(group=year))+
     geom_point(data=subset(tempcomb_df,year==1885),size=3, alpha=0.5,aes(group=year))+ 
     geom_line(data=subset(tempcomb_df,year==1885),size=1, alpha=0.5,aes(group=year))+
     geom_text(aes(label=round(temps,1)),hjust=1, vjust=-1,size=2.5)+
     facet_wrap(~type)+
     xlab("")+ ylab ("Temperature (Degree C)\n") +
     labs(title="Comparison of year 1885 and 10 year monthly average (2003-2013) temperatures in Penang\n")+
     theme(panel.background = element_rect(fill= "transparent"),
     plot.title = element_text(lineheight=1.2, face="bold",size = 13, colour = "grey20",family="Clear Sans"),
     panel.border = element_rect(colour = "grey90",fill=F,size=0.5),
     panel.grid.major.x = element_blank(),panel.grid.minor = element_blank(),
     axis.line=element_blank(),axis.title.x=element_blank(),
     axis.text.x=element_text(face="bold",size = 10, colour = "grey20",family="Clear Sans"),
     axis.title.y=element_text(face="bold",size = 10, colour = "grey20",family="Clear Sans"),
     axis.text.y=element_text(face="bold",size = 8, colour = "grey20",family="Clear Sans"),
     legend.title = element_text(colour="black", size=12, face="bold",family="Clear Sans"),
     legend.text = element_text(colour="black", size=10, family="Clear Sans"),
     legend.position="bottom",
     axis.ticks=element_line(size=0.5,colour="grey20"),panel.grid.major.y = element_line(colour="grey90",linetype="longdash",size=1))+
     guides(color = guide_legend(title = "Year", title.position = "top"))
 
p
 
 
# Save plot to png
 
ggsave(p,file="PenangTemps_1885_10yrAverage.png",dpi=500,w=12,h=8,unit="in",type="cairo-png")
Created by Pretty R at inside-R.org

Monday, 8 December 2014

Twittering about climate change and Bangladesh

Points of post:
  • To describe how twitteR can be used, together with Word Clouds (wordcloud), to reveal what people are twittering about.
  • To show how rapidly interest in various subjects can change.

Twitter (https://twitter.com/) is a social media network allowing users to send short (140 characters) messages called ‘tweets’. Twitter started up in 2006 and has had phenomenal success with 340 million tweets sent each day in 2012. 

In an earlier post on our blog, we described how archives of Google internet searches can tell us what people are interested in, and how these interests change over time. It is also possible to assess what factors might be affecting this interest, e.g. a disease outbreak or climatic disaster. Similarly Twitter might be viewed as a massive, passive opinion poll. 

Again wonderful R has libraries to analyze the worlds’ tweets (twitteR) which can be useful to perform social media mining and to reveal what people are thinking (twittering) about; whether it’s Brad Pitts’ facial hair (http://www.dailymail.co.uk/tvshowbiz/article-2792839/brad-pitt-continues-wear-rakish-moustache-long-sideburns-grew-sea-new-york.html) or climate variability and change. 

Jason and I are very interested in how climate change will affect the people of Bangladesh, in particular, and we’ve already explored the temporal and spatial dependence of various weather variables such as temperature and sea-level in the country. 

Hence we experimented by inserting the two keywords, ‘climate-change’ and ‘Bangladesh’ as our search parameters into twitteR. Our aim was to reveal with what other words or phrases these keywords were associated. ‘Word Clouds’ are a useful way to summarize the results. In Word cloud graphical representations, font sizes are scaled according to the frequencies at which words occur; in this case how often they are ‘tweeted’. Below is a Word cloud for September 2014 which shows that ‘URBANISATION’ and ‘ROUND TABLE’ occurred most often in tweets.



By December 2014, however, when we tried again, the identical keywords were associated with ‘FLOATING GARDENS’, and the increasingly nebulous ‘ADAPTATION’ and ‘RESILIENCE’.


Clearly social media such as Twitter, combined with R’s software packages and plotting facilities, are fantastic tools for understanding the often capricious winds affecting the movement of global public interest and opinion.

To carry out your own social media mining using the twitteR package, you will first need to have an account on twitter and create an 'app'. Please see this and this link to get a better idea on how to get that done.

You will also find information in the above links on the 'twitter authentication' process which is also required to be able to perform the keyword searches using twitteR.

The code used to produce the above word clouds is outlined below.

# Load required libraries
library(RCurl)
library(stringr)
library(tm)
library(wordcloud)
library(RColorBrewer)
library(twitteR)
library(streamR)
library(grid)
library(ggplot2)
library(wesanderson)
 
# Load credentials
load("D:/ClimData/Twitter/twitter authentification.Rdata")
registerTwitterOAuth(twitCred)
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
 
tweets <- searchTwitter("Climate Change Bangladesh", n=1500, lang="en") 
 
tweets.text <- sapply(tweets, function(x) x$getText())
 
# Remove non alphanumeric characters
tweets.text <- gsub("[^a-zA-Z0-9 ]","",tweets.text)
 
# Convert all text to lower case
tweets.text <- tolower(tweets.text)
 
# Replace blank space (“rt”)
tweets.text <- gsub("rt", "", tweets.text)
 
# Replace @UserName
tweets.text <- gsub("@\\w+", "", tweets.text)
 
# Remove punctuation
tweets.text <- gsub("[[:punct:]]", "", tweets.text)
 
# Remove links
tweets.text <- gsub("http\\w+", "", tweets.text)
 
# Remove tabs
tweets.text <- gsub("[ |\t]{2,}", "", tweets.text)
 
# Remove blank spaces at the beginning
tweets.text <- gsub("^ ", "", tweets.text)
 
# Remove blank spaces at the end
tweets.text <- gsub(" $", "", tweets.text)
 
# remove word amp
tweet.text <- gsub("amp","",tweets.text)
 
# Create corpus
tweets.text.corpus <- Corpus(VectorSource(tweets.text))
 
# Clean up by removing stop words
tweets.text.corpus <- tm_map(tweets.text.corpus, function(x)removeWords(x,stopwords()))
 
# Create document term matrix applying some transformations
tdm = TermDocumentMatrix(tweets.text.corpus,
      control = list(removePunctuation = TRUE,
      stopwords = c("climate change","bangladesh","climate", "change"), stopwords("english"),
      removeNumbers = TRUE, tolower = TRUE))
 
# Define tdm as matrix
m = as.matrix(tdm)
 
# Get word counts in decreasing order
word_freqs = sort(rowSums(m), decreasing=TRUE) 
 
# Create a data frame with words and their frequencies
dm = data.frame(word=names(word_freqs), freq=word_freqs)
 
# Plot and save the image in png format
png("BGD_ClimateChange_Dec2014.png", width=5, height=5, units="in", res=500)
 
wordcloud(dm$word, dm$freq, random.order=FALSE, min.freq = 2,scale=c(4,0.5),max.words = 100,colors=wes.palette(5,"Darjeeling"))
 
dev.off()
Created by Pretty R at inside-R.org

Friday, 5 December 2014

Typhoon Hagupit fast approaching the Philippines

Point of post:
  • rNOMADS can be used to map near real-time meteorological phenomena such as storms (typhoons/hurricanes/cyclones)

In our last post, we demonstrated how the rNOMADS package can be used to describe the path of tropical cyclone that hit Philippines around a year ago. The data archives underpinning rNOMADS are, however, also available in near real-time allowing us to easily follow weather events as they actually unfold. The path of Typhoon Hagupit (Ruby) which is bearing down on the Philippines now (5 December 2014) is animated below.


Given the devastation caused by Typhoon Haiyan that occured around this time in 2013, we are very concerned for the safety of people in the area. Evacuations are already taking place and we are keeping our fingers crossed for the communities potentially in its path.

As for now, meteorologists are still unsure (cannot be predicted accurately) where exactly Hagupit will make landfall.

Note: The code to produce the plots used in the animation above is the same as the ones we provided in our last post with only a change in the time periods the data was extracted from NOMADS. 

Tuesday, 2 December 2014

Mapping Typhoon Haiyan/Yolanda with rNOMADS


The purpose of this article is to:

  • Promote the utility of the recently published rNOMADS software package which links directly to the climate model archive maintained by NOAA.
  • Demonstrate the use of rNOMADS for plotting global surface temperatures and wind-speeds.
  • Show how we used rNOMADS to map the extent and track of Typoon Yolanda/Haiyan.

rNOMADS allows R users to connect directly with the National Oceanic and Atmospheric Administration's (NOAA) Operational Model Archive and Distribution System (NOMADS). It allows rapid and efficient downloading of a vast range of different types of meteorological data from both global and regional weather models. Data for the past are available in addition to data for all sorts of forecasts.

Global surface temperatures from the NOAA archives (details of actual model used) are shown in the map below for 18:00 GMT on 29th October 2014. The band of high sea surface temperatures associated with the tropics is clear. It is also interesting to note the influence of the ‘upwelling zones’ on the western coasts of the southern American and Africa continents which tend to force cold water northwards. On land the relatively cool areas associated with the Andes and the Ethiopian Plateau are also visible.


We follow up global temperatures with the map of global windspeeds at ground level below. The ‘Roaring Forties’, ‘Furious Fifties’ and ‘Screaming Sixties’ are very clearly discernible. So too the ‘Doldrums’, or areas of calm wind which hug the equator . You can also make out Cyclone Nilofar in the north Indian Ocean which was about to strike the west coast of India.


The gif animation below utilizes the same data source to summarize the path of Typhoon Haiyan/Yolanda which hit the Philippines in November 2013 causing great devastation and loss of life.  The animation depicts wind-speed (km/hr) at 3 time points: 15:00 GMT and 21:00 GMT on 7th November 2013 and then early morning (03:00 GMT) on 8th November. Each raster grid is roughly 50 km's by 50 km's (0.5 degrees by 0.5 degrees) in extent, so the whole storm was around 650 km's across.  At its height wind-speeds exceeded 150 km/hr and the ‘eye of the storm’ is also clearly visible as the blue dot in the very centre.  


As usual, the R code is provided below if you are interested in producing similar plots for an area or time of interest. You can get plenty of information and code examples for the rNOMADS package on CRAN (link provided earlier in the post) as well as at the developers' blog posts on the package at the following link - https://bovineaerospace.wordpress.com/tag/rnomads/

# Load libraries
 
library(rNOMADS) 
library(GEOmap)
library(aqfig)
library(rgdal)
 
# Define levels
 
levels <- c("surface", "2 m above ground", "10 m above ground", "300 mb")
 
# Define variables - temperature and wind
 
variables <- c("TMP", "UGRD", "VGRD")
 
# Define additional model inputs
 
abbrev <- "gfsanl"
model.date <- 20141029
model.run <- 06
pred <- 3
 
# Get the data
 
grib.info <- ArchiveGribGrab(abbrev, model.date,model.run, pred, file.type = "grib2")
 
# Read data into R
 
grib.data <- ReadGrib(grib.info$file.name, levels, variables)
 
resolution <- c(0.5, 0.5) #Resolution of the model
 
# Make an array for easier manipulation
 
atmos <- ModelGrid(grib.data, resolution)
 
# Set up display
 
# Plot temperature at 2 m above ground
 
li <- which(atmos$levels == "2 m above ground")
vi <- which(atmos$variables == "TMP")
colormap <- rev(rainbow(500, start = 0 , end = 5/6))
 
# Read world boundaries
 
world <- readOGR(dsn="D:/Data/ne_10m_admin_0_countries", layer="ne_10m_admin_0_countries")
 
png(file = "world_surface_temp.png", width = 9, height = 6, res=400, type="cairo-png",units="in", bg="white",restoreConsole = TRUE)
 
image(atmos$x , sort(atmos$y), atmos$z[li,vi,,], col = colormap,
      xlab = "Longitude", ylab = "Latitude",
      main = paste("World Temperature at Ground Level (deg C):", atmos$fcst.date ,"GMT"))
 
# Plot coastlines
 
plot(world, border = "black", add = TRUE, MAPcol = NA)
 
# Plot legend, convert to Celsius
 
vertical.image.legend(col=colormap, zlim = range(atmos$z[li,vi,,] - 273.15))
dev.off()
 
# Plot wind magnitude at 10 m above ground
 
li <- which(atmos$levels == "10 m above ground")
vi.zonal <- which(atmos$variables == "UGRD") # East-West wind
vi.merid <- which(atmos$variables == "VGRD") # North-South wind
 
wind.mag <- sqrt(atmos$z[li,vi.zonal,,]^(2) + atmos$z[li,vi.merid,,]^(2))
colormap <- rev(rainbow(500, start = 0 , end = 5/6))
 
png(file = "world_surface_wind.png", width = 9, height = 6, res=400, type="cairo-png",units="in", bg="white",restoreConsole = TRUE)
 
image(atmos$x, sort(atmos$y), wind.mag, col = colormap,
      xlab = "Longitude", ylab = "Latitude",xlim=c(100,150),ylim=c(-5,20),
      main = paste("World Winds at Ground Level (km/hr):", atmos$fcst.date, "GMT"))
 
# Plot coastlines
 
plot(world, border = "black", add = TRUE, MAPcol = NA)
 
# Plot legend, convert to km/hr
 
vertical.image.legend(col=colormap, zlim = range(wind.mag * 3.6))
 
dev.off()
Created by Pretty R at inside-R.org

Wednesday, 26 November 2014

Trending climate-related keywords with GTrendsR





Key points of post

  • Google stores data on millions of internet searches throughout the world and these data are publicly available. 
  • Google trends (http://www.google.com/trends/?hl=en-GB)’ is a fascinating system that summarizes information from all these searches revealing what subjects are ‘trending’ worldwide.
  • Conveniently R has a library (GTrendsR) capable of extracting, analyzing and plotting these data. 
  • Here we explore time series data (2004-2014) summarizing Google searches for the keywords, ‘climate-change’, ‘global-warming’, ‘ocean-acidification’ and ‘sea-level rise’
  • All four series have trends and have either unimodal or bimodal seasonal cycles.
The other day Jason decided to play around (nerdy I know) with R’s relatively new GTrendR library and analyze the time-series statistics for the following four key words: ‘climate change’, ‘ocean acidification’, ‘global warming’, and ‘sea-level rise’. The results are displayed below for the last decade (details of how Google calculates these statistics are available here: https://support.google.com/trends/answer/4355164?hl=en). 


First let’s talk about what has happened to the ‘popularity’ of these searches over the long-term, ie. between 2004 and 2014. Searches for ‘climate change’ were fairly static between 2004 and 2008 and have since increased in popularity. ‘Global warming’ was similarly stable but then fell. In the past we think that ‘global warming’ was the more commonly used term to describe ‘anthropogenic heating’. As the wider consequences of increasing temperatures for the entire global weather and climate system became increasingly more recognized, however, it became clear that a more general term was needed and hence, ‘climate-change’.

Searches for ‘ocean-acidification (OA)’ – a product of CO2 pollution – and also known as the ‘ugly twin’ of climate-change have not changed substantially over the last decade or so, apart from a massive spike in mid-2013. My pal Prof. Hall-Spencer in Plymouth suggested the spike might be due to interest around this article (http://www.nbcnews.com/video/ann-curry-reports/54882960#54882960), and/or the fact that ocean acidification started getting discussed a lot in Washington DC in mid 2012. Furthermore some charitable organizations advertised research funding calls for OA research at that time.

Interest in sea-level rise (SLR) started to increase in 2011 and we have no idea why. Suggestions welcome!

The time-series displayed here are all available at weekly resolution. What particularly interests us in our time-series work is seasonality. Both the ocean-acidification and sea-level rise series are strongly seasonal with a single peak each year (unimodal) occurring between June and July. 

What causes this seasonality? Is it related to educational cycles in the northern hemisphere dictating when students undertake science related projects? Are there more climate change conferences in the northern hemisphere summer? Or perhaps it’s because there’s less football on TV distracting people from more pressing concerns? 

The other two series (‘climate change and ‘global warming’) are also seasonal but are different in having two peaks (bimodal). This is difficult to see in the plot above so we plotted the same data on another graph for a shorter time-period (2010-2014 below) which shows clearly that searches for climate-change most popular in March and October each year with troughs in January and July ? Does anyone have any idea why ? 


Could it be a confounding between seasonal educational cycles in the northern hemispheres (NH) and southern hemispheres (SH)? The argument might go like this: (i) Numbers of internet searches are far greater in developed countries; (ii) ‘The majority of developed countries are in the northern hemisphere; (iii) ‘Climate-change’ and ‘global-warming’ are much more popularly searched than the other two terms; (iv) The overwhelming predominance of northern-hemisphere searches somehow ‘drowns-out’ any seasonal signal from the south; and (v) The bimodal seasonal signals we see for OA and SLR are a combination of separate unimodal cycles from the NH and SHs. 

Clearly we need to subset and examine these data by location. This will form the subject of future blog articles.

In the meantime please feel free to suggest any alternative explanations.

As usual the R-code describing the extraction and plotting of the data is outlined below.

# Load required libraries
 
library(gtrend)
library(dplyr)
library(ggplot2)
library(scales)
library(ggthemes)
 
# Define terms to use for trend searches
 
terms <- c("Climate Change","Ocean Acidification","Sea Level Rise","Global Warming")
 
out <- gtrend_scraper("youremail@gmail.com", "yourpassword", terms) ## replace with your own google username and password
 
out %>%  trend2long() %>% plot() 
 
# Get plot of trends
 
a <- out %>%
     trend2long() %>%
     ggplot(aes(x=start, y=trend, color=term)) +
     xlab("\nYear") + ylab("Trend\n")+
     geom_line() + theme_stata()+
     facet_wrap(~term) + ggtitle("Phrase Search Trends on Google\n")+
     guides(color=FALSE)
 
a
 
# Save file to png
 
ggsave(a,file="GoogleTrend_Climate.png",dpi=500,w=10,h=6,unit="in",type="cairo-png")
 
 
# Extract just trends of 'climate change' search for years 2010 to 2014 and plot
# to observe seasonality
 
 
dat <- out[[1]][["trend"]]
colnames(dat)[3] <- "trend"
 
dat2 <- dat[dat[["start"]] > as.Date("2010-01-01"),]
 
rects <- dat2  %>% 
mutate(year=format(as.Date(start), "%y")) %>%
group_by(year) %>%
summarize(xstart = as.Date(min(start)), xend = as.Date(max(end)))
 
c <- ggplot() +
     geom_rect(data = rects, aes(xmin = xstart, xmax = xend, ymin = -Inf, 
     ymax = Inf, fill = factor(year)), alpha = 0.4) + theme_bw()+
     ylab ("Trend\n") + xlab("\nDate")+
     ggtitle("Search Trends of 'Climate Change' on Google (2010-2014)\n")+
     geom_line(data=dat2, aes(x=start, y=trend), size=0.75,colour="grey20",type="dashed") + 
     geom_point(data=dat2, aes(x=start, y=trend), size=1.25,colour="grey20") + 
     scale_x_date(labels = date_format("%b-%Y"), 
     breaks = date_breaks("3 months"),
     expand = c(0,0), 
     limits = c(as.Date("2010-01-01"), as.Date("2014-12-31"))) +
     stat_smooth(data=dat2,aes(x=start, y=trend), col="blue",method = "loess",span=0.1,se=T,size=1,linetype='twodash',alpha=0.5,fill="grey60")+
     scale_fill_discrete(guide = FALSE)+
     theme(axis.text.x = element_text(angle = -30, hjust =0,size=10),
     plot.title = element_text(lineheight=1.2, face="bold",size = 14, colour = "grey20"),
     panel.border = element_rect(colour = "black",fill=F,size=0.5),
     panel.grid.major = element_line(colour = "grey",size=0.25,linetype='longdash'),
     panel.grid.minor = element_blank(),
     axis.title.y=element_text(size=10,colour="grey20"),
     axis.title.x=element_text(size=10,colour="grey20"),
     axis.text.y=element_text(size=10,colour="grey20"),
     panel.background = element_rect(fill = NA,colour = "black"))
 
c
 
# Save plot to file
 
ggsave(c,file="GoogleTrend_ClimateChange2010-14.png",dpi=500,w=10,h=6,unit="in",type="cairo-png")
Created by Pretty R at inside-R.org