#21 – Find significant relationships in data with a CoCo Matrix


Screen Shot 2013-08-19 at 08.47.17

The CoCo Matrix (correlation coefficient matrix) is a script for R that takes a table headed with multiple variables and calculates the correlation coefficients between each of the variables, determines which are statistically significant, and represents them visually in a grid-plot. I created the CoCo Matrix to cross correlate a table with a large number of variables to quickly assess where important correlations could be found.

Screen Shot 2013-08-19 at 08.47.27

Using the CoCo Matrix

The R file can be downloaded here or copied from the textbox at the end of this post.

  1. If you know the number of samples in your dataset (n) then degrees of freedom (df) = n-2. Use this table to find the R value above which significant values lie. In the code, at the top you should change the value of “p” as per the value you just looked up. If you don’t know the value for n then run the code once and type “n” into the console.
  2. If you want, customise the colours in the customisation area of the code
  3. Run the code. A dialogue box will request a file. Alternatively replace the code to direct to the file you want to use.
  4. Voila!

This is a very rough script I wrote, and I intend to make it a lot better at some point when I have the time. If you have any suggestions  for improvements then please comment below or get in touch with me.

# CoCo Matrix version 1.0
# Written by Darren J. Wilkinson
# wilkinsondarren.wordpress.com
# d.j.wilkinson@ed.ac.uk
#
# The "CoCo Matrix" visualises the correlation coefficients for a given set of data.
# Like-Like correlations are given NA values (e.g. Height vs Height = NA). For the moment
# duplicates such as Height vs. Weight and Weight vs. Height remain. At some point I'll 
# provide an update that removes duplicates like that.
#
# Please feel free to edit the code, and if you make any improvements please let me know
# either on wilkinsondarren.wordpress.com or send me an email at d.j.wilkinson@ed.ac.uk

# Packages -------
library (cwhmisc)
library (ggplot2)
library (grid)
library (scales)
# ----------------

# Plot Customisation ----------------------------------------------------------
# (for good colour suggestions visit colourlovers.com)
col.significant = "#556270"			# Colour used for significant correlations
col.notsignificant = "lightgrey"		# Colour used for non-significant correlations
col.na = "white"						# Colour used for NA values
e1 = c("nb", "ta", "ba", "rb", "hf", "zr", "yb", "y", "th", "u")   #  p) {s = "Significant"}
		if (temp < p) {s = "Not Significant"}
		if (temp == 1) {s = NA}
		if (temp == 1) {temp = NA}
		results[h,i] = temp
		plot.data[r,4] = s
		plot.data[r,3] = temp
		plot.data[r,2] = h
		plot.data[r,1] = i
	}

}

# Open new quartz window
dev.new (
	width = 12, 
	height = 9
	)

# Plot the matrix
ggplot (data = plot.data, aes (x = x, y = y)) + 

geom_point (aes (colour = sig), size = 20) + 

scale_x_continuous (labels = e1, name = "", breaks = c(1:n.e1)) +

scale_y_continuous (labels = e1, name = "", breaks = c(1:n.e1)) +

scale_colour_manual (values = c(col.notsignificant, col.significant, col.na)) +

labs (title = "CoCo Matrix v1.0")+

theme (
	plot.title = element_text (vjust = 3, size = 20, colour = "black"), #plot title
	plot.margin = unit (c(3, 3, 3, 3), "lines"), #adjust the margins of the entire plot
	plot.background = element_rect (fill = "white", colour = "black"),
	panel.border = element_rect (colour = "black", fill = F, size = 1), #change the colour of the axes to black
	panel.grid.major = element_blank (), # remove major grid
	panel.grid.minor = element_blank (),  # remove minor grid
	panel.background = element_rect (fill = "white"), #makes the background transparent (white) NEEDED FOR INSIDE TICKS
	legend.background = element_rect (colour = "black", size = 0.5, fill = "white"),
	legend.justification = c(0, 0),
	#legend.position = c(0, 0), # put the legend INSIDE the plot area
	legend.key = element_blank (), # switch off the rectangle around symbols in the legend
	legend.box.just = "bottom",
	legend.box = "horizontal",
	legend.title = element_blank (), # switch off the legend title
	legend.text = element_text (size = 15, colour = "black"), #sets the attributes of the legend text#
	axis.title.x = element_text (vjust = -2, size = 20, colour = "black"), #change the axis title
	axis.title.y = element_text (vjust = -0.1, angle = 90, size = 20, colour = "black"), #change the axis title
	axis.text.x = element_text (size = 17, vjust = -0.25, colour = "black"), #change the axis label font attributes
	axis.text.y = element_text (size = 17, hjust = 1, colour = "black"), #change the axis label font attributes#
	axis.ticks = element_line (colour = "black", size = 0.5), #sets the thickness and colour of axis ticks
	axis.ticks.length = unit(-0.25 , "cm"), #setting a negative length plots inside, but background must be FALSE colour
	axis.ticks.margin = unit(0.5, "cm") # the margin between the ticks and the text
	)

# Print data tables in the console
results
plot.data

#12 Under Pressure


After digesting eclogite in my bombs recently, I’ve really started to question one assumption. One of the reasons that I’ve been using the bombs in the first place is that I know that the high temperature and confined pressure reduces the boiling point of the acids used for sample digestion, thus meaning that effective digestion can take place at higher temperatures without the acids gassing off before dissolution is complete. Having said that, however, I have no definitive idea about the pressures generated in the bombs. This is because there is no simple easy way to know, as the pressure attained is a function of not only the ‘air space’ left in the bomb (that is the capacity of the bomb minus the volume of acid and sample added), but also of the composition of the liquid being heated.

After hunting in the manual for the bombs, I found some patchy data on the pressures assumed to be generated in the bombs by a small selection of liquids when present as a 10 ml volume in a 23 ml space. They are:

Comparing the pressures generated in 23 ml bombs by 10 ml of different liquids.

I thought I might plot up a simple graph comparing the pressure (P) and temperature (T) for these liquids in the bombs (above). The exercise also proved to be a good exercise to practice my skills in R programming. As you can see, even between the four different liquids, there’s quite a large variation in the pressure to be expected. Missing from this plot, however, are the two acids (HF and HNO3) which I use in my bomb dissolution. Who knows how they behave in these bombs? Well nobody, apparently. For the life of me I can’t find any data on those acids so as yet I’m unsure as to the actual pressures generated.

I’m almost certain that HNO3 alone generates pressures >> 2000 psi, as when cleaning the bombs, often a significant proportion of the acid is lost, i.e. the pressure is so high the top of the bomb (which is spring loaded) opens and some gas is released until it reseals. This does not happen when a sample is present in a mix of both HF and HNO3, suggesting that this mix generates a pressure somewhat lower. How much lower, however is the golden question which I am as yet unable to answer. As far I can see at the moment, there can be no sure way of knowing without conducting direct pressure measurements in the bombs, but watch this space as I might be able to do some modelling if I got my hands on some data!