Hi-C data analysis is still a relatively new field in genomics. The data itself is large and expensive to generate, which means that datasets and exploration of the data are still immature compared to other technologies like RNA-seq. Here, I discuss aggregate peak analysis, a commonly used but poorly documented analytical technique to verify identified features in Hi-C data.
Differential analysis using sequencing data is, at its heart, a very simple idea that involves a lot of complicated statistics. This makes explaining the simple idea to newcomers in bioinformatics very difficult. Here, I want to break down the motivation behind differential analysis and explain where the complicated statistics come from.
The Central Limit Theorem is a pillar of statistics. We can apply the proof of the CLT to understand how different estimators converge in distribution with large sample sizes.
Mathematical notation is a signature of math. Almost anyone can recognize it instantly, even if they don't know what it means. I want to talk a bit about why notation is useful, why it can be confusing, and tackle some often-confusing examples in statistics using clear notation.
I made a command line tool called Quill for keeping track of financial statements across various accounts. Here's a breakdown of how I developed the solution.
Like many academics, I've started giving presentations and tutorial sessions remotely. Here are some brief tips from my experience and resources for giving good lectures.
Some thoughts about what I'd like students taking the biostatistics course I'm TA'ing to take away from the class.
A brief look at potential functions in three dimensions, and how Poincaré's lemma can make it easier to solve for vector potentials.
Why do physicists talk about symmetries and conservation laws all the time? It's because that's what a good scientific theory looks like.
A cursory look at the economics of scientific software, and the implications for its usability and longevity.