#21 – Find significant relationships in data with a CoCo Matrix


The CoCo Matrix (correlation coefficient matrix) is an R script that takes a table with multiple named variables, calculates the correlation coefficient between each pair of variables, determines which coefficients are statistically significant, and displays the result as a grid plot. I created the CoCo Matrix to cross-correlate tables with a large number of variables and quickly see where the important correlations are.
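
To make the idea concrete, here is a minimal sketch of the same calculation in base R. It is not the CoCo Matrix script itself: it assumes R's built-in `mtcars` data instead of your own table, a hard-coded threshold of 0.349 (the two-tailed critical r for n = 32 at the 5% level, taken from a standard table), and it compares the absolute value of r so that strong negative correlations are also flagged.

```r
# Illustrative only: built-in data, hypothetical variable names
vars <- mtcars[, c("mpg", "hp", "wt", "qsec")]

r <- cor(vars)        # Pearson correlation coefficient for every pair of columns
diag(r) <- NA         # like-like pairs (e.g. mpg vs mpg) carry no information

threshold <- 0.349    # assumed critical r for n = 32 (df = 30), two-tailed, alpha = 0.05
significant <- abs(r) > threshold   # grid of TRUE / FALSE, with NA on the diagonal

print(round(r, 2))
print(significant)
```

The `significant` grid is the information that the CoCo Matrix turns into coloured points.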

Using the CoCo Matrix

The R file can be downloaded here or copied from the textbox at the end of this post.

  1. Find the critical value of r for your dataset. If you know the number of samples (n), the degrees of freedom (df) are n - 2. Use this table to look up the value of r above which a correlation is significant, then set “p” at the top of the code to that value. (Despite its name, “p” holds the critical r value, not a probability.) If you don’t know n, run the code once and type “n” into the console.
  2. If you want, customise the colours in the customisation area of the code.
  3. Run the code. A dialogue box will ask for a file. Alternatively, edit the code to point directly at the file you want to use.
  4. Voila!
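
If you would rather not look the threshold up in a table (step 1), it can also be calculated. The sketch below is an addition rather than part of the CoCo Matrix script: `critical_r` is a hypothetical helper, and it assumes a two-tailed test of a Pearson correlation, for which the critical value follows from the t distribution as r = t / sqrt(t^2 + df).

```r
# Hypothetical helper: critical Pearson r for a two-tailed test.
# n     = number of samples
# alpha = significance level (0.05 assumed by default)
critical_r <- function(n, alpha = 0.05) {
  df <- n - 2                         # degrees of freedom, as in step 1
  t_crit <- qt(1 - alpha / 2, df)     # two-tailed critical t value
  t_crit / sqrt(t_crit^2 + df)        # convert the t value to a correlation coefficient
}

critical_r(30)   # about 0.361 for n = 30 (df = 28)
```

The returned number is what you would assign to “p” at the top of the script.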

This is a very rough script, and I intend to improve it when I have the time. If you have any suggestions for improvements, please comment below or get in touch with me.

# CoCo Matrix version 1.0
# Written by Darren J. Wilkinson
# wilkinsondarren.wordpress.com
# d.j.wilkinson@ed.ac.uk
#
# The "CoCo Matrix" visualises the correlation coefficients for a given set of data.
# Like-Like correlations are given NA values (e.g. Height vs Height = NA). For the moment
# duplicates such as Height vs. Weight and Weight vs. Height remain. At some point I'll 
# provide an update that removes duplicates like that.
#
# Please feel free to edit the code, and if you make any improvements please let me know
# either on wilkinsondarren.wordpress.com or send me an email at d.j.wilkinson@ed.ac.uk

# Packages -------
library (cwhmisc)
library (ggplot2)
library (grid)
library (scales)
# ----------------

# Plot Customisation ----------------------------------------------------------
# (for good colour suggestions visit colourlovers.com)
col.significant = "#556270"			# Colour used for significant correlations
col.notsignificant = "lightgrey"		# Colour used for non-significant correlations
col.na = "white"						# Colour used for NA values
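e1 = c("nb", "ta", "ba", "rb", "hf", "zr", "yb", "y", "th", "u")
# [Code is missing from the source at this point. It presumably set p, loaded the
#  data, and opened the loops over h and i that compute each coefficient, temp.]
		if (temp > p) {s = "Significant"}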
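		if (temp <= p) {s = "Not Significant"}	# <= so a value exactly at the threshold still gets a label
		if (temp == 1) {s = NA}					# like-like correlation (e.g. Height vs Height)
		if (temp == 1) {temp = NA}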
		results[h,i] = temp
		plot.data[r,4] = s
		plot.data[r,3] = temp
		plot.data[r,2] = h
		plot.data[r,1] = i
	}

}

# Open a new graphics window (a Quartz window on a Mac)
dev.new (
	width = 12, 
	height = 9
	)

# Plot the matrix
ggplot (data = plot.data, aes (x = x, y = y)) + 

geom_point (aes (colour = sig), size = 20) + 

scale_x_continuous (labels = e1, name = "", breaks = c(1:n.e1)) +

scale_y_continuous (labels = e1, name = "", breaks = c(1:n.e1)) +

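scale_colour_manual (values = c("Not Significant" = col.notsignificant, "Significant" = col.significant), na.value = col.na) + # NA is not a factor level, so its colour is set with na.value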

labs (title = "CoCo Matrix v1.0")+

theme (
	plot.title = element_text (vjust = 3, size = 20, colour = "black"), #plot title
	plot.margin = unit (c(3, 3, 3, 3), "lines"), #adjust the margins of the entire plot
	plot.background = element_rect (fill = "white", colour = "black"),
	panel.border = element_rect (colour = "black", fill = NA, size = 1), #change the colour of the axes to black; fill = NA keeps the panel visible
	panel.grid.major = element_blank (), # remove major grid
	panel.grid.minor = element_blank (),  # remove minor grid
	panel.background = element_rect (fill = "white"), #makes the background transparent (white) NEEDED FOR INSIDE TICKS
	legend.background = element_rect (colour = "black", size = 0.5, fill = "white"),
	legend.justification = c(0, 0),
	#legend.position = c(0, 0), # put the legend INSIDE the plot area
	legend.key = element_blank (), # switch off the rectangle around symbols in the legend
	legend.box.just = "bottom",
	legend.box = "horizontal",
	legend.title = element_blank (), # switch off the legend title
	legend.text = element_text (size = 15, colour = "black"), #sets the attributes of the legend text#
	axis.title.x = element_text (vjust = -2, size = 20, colour = "black"), #change the axis title
	axis.title.y = element_text (vjust = -0.1, angle = 90, size = 20, colour = "black"), #change the axis title
	axis.text.x = element_text (size = 17, vjust = -0.25, colour = "black"), #change the axis label font attributes
	axis.text.y = element_text (size = 17, hjust = 1, colour = "black"), #change the axis label font attributes#
	axis.ticks = element_line (colour = "black", size = 0.5), #sets the thickness and colour of axis ticks
	axis.ticks.length = unit(-0.25 , "cm"), #setting a negative length plots inside, but background must be FALSE colour
	axis.ticks.margin = unit(0.5, "cm") # the margin between the ticks and the text
	)

# Print data tables in the console
results
plot.data
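
The header comment of the script notes that duplicate pairs (Height vs. Weight and Weight vs. Height) are still plotted. One possible way to drop them, shown here as a sketch rather than as the planned update, is to blank out the diagonal and upper triangle of the correlation matrix before converting it to the long format used for plotting. The example assumes R's built-in `mtcars` data, and the name `unique_pairs` is hypothetical.

```r
# Illustrative only: built-in data, not the table used by the script
r <- cor(mtcars[, c("mpg", "hp", "wt")])

# Blank the diagonal (like-like pairs) and the upper triangle (mirror-image duplicates)
r[upper.tri(r, diag = TRUE)] <- NA

# Long format: one row per remaining pair, with columns Var1, Var2 and r
unique_pairs <- na.omit(as.data.frame(as.table(r), responseName = "r"))
print(unique_pairs)
```

With three variables this leaves three rows, one for each distinct pair.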