Looking at read depth per site using pileup
After spot checking the distribution of read depth per site in individual libraries, we wanted to do more thorough quality checking by looking at stats for all the libraries. For what I am looking at here I am using mpileup data. To make the mpileup files I randomly subsampled 0.5% of the sites from all the TRUE TRUE D84A contigs that were over 2.5kb in length. Thus there should be minimal influence of microbial DNA.

I first looked at median read depth for all the clones. This is sorted by median read depth and by year/plate, where red is the SpFall2016 libraries and turquoise is the Sp2017 libraries. You can see from this that six of the SpFall2016 libraries have very low read depths.

I next worked on simulating Poisson distributions for each sample, using the number of sites for each sample as the number of draws and the median read depth as the rate parameter. Here are some examples below of distributions of observed versus simulated read depth for a couple of cl...
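The Poisson simulation described above could be sketched roughly as follows. This is only an illustration, not the code used for the plots: it assumes the per-site depths for one library have already been pulled out of the mpileup file into a list, the function names (`poisson_draw`, `simulate_depths`, `depth_histogram`) are hypothetical, and the `observed` values are made up. It uses Knuth's multiplication method for the Poisson draws so that it only needs the Python standard library.

```python
import math
import random
from collections import Counter
from statistics import median


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    # Keep multiplying uniforms until the product drops to exp(-lam) or below.
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_depths(observed, seed=0):
    """Simulate one depth per observed site, with rate = median observed depth."""
    rng = random.Random(seed)
    lam = median(observed)
    return [poisson_draw(lam, rng) for _ in observed]


def depth_histogram(depths):
    """Return {depth: number of sites}, ordered by depth."""
    return dict(sorted(Counter(depths).items()))


# Hypothetical per-site depths for one library (illustrative, not real data).
observed = [0, 3, 8, 9, 10, 10, 11, 12, 12, 13, 15, 40]
simulated = simulate_depths(observed)

print("observed :", depth_histogram(observed))
print("simulated:", depth_histogram(simulated))
```

Comparing the two histograms for each library is one way to see where the observed depths have heavier tails (very low or very high coverage sites) than a Poisson with the same median would predict.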