Articles by tag: science
Science and statistics are hard. There are lots of ways for things to go wrong, and it's important to remember that when looking at p-values and hypothesis tests.
Relying on first-hand evidence is important for building an argument. Here I discuss how citation limits in journals may entice readers to read review papers rather than the original articles, and why doing the diligent work of reading the originals is important for research.
Scientific writing is about communication. Colour is an effective visual communication tool, but tough to use well in scientific figures. Here are some notes on how to do it well, what tools to use, and what examples to use as inspiration.
I took many notes throughout my undergraduate degree in mathematical physics at the University of Waterloo. Once I finished my last exam, I decided to digitize all of my notes and discard the physical copies. Below, you can find all my notes from all of my classes at the time.
Hi-C data analysis is still a relatively new field in genomics. The data itself is quite large and expensive to make, which means datasets and exploration of the data are still immature compared to other technologies like RNA-seq. Here, I discuss aggregate peak analysis, a commonly used but poorly documented analytical technique to verify identified features in Hi-C data.
Differential analysis using sequencing data is, at its heart, a very simple idea that involves a lot of complicated statistics. This makes explaining the simple idea to newcomers in bioinformatics very difficult. Here, I want to break down the motivation behind differential analysis and explain where the complicated statistics come from.
The Central Limit Theorem is a pillar of statistics. We can apply the proof of the CLT to understand how different estimators converge in distribution with large sample sizes.
Like many academics, I've started giving presentations and tutorial sessions remotely. Here are some brief tips from my experience and resources for giving good lectures.
Some thoughts about what I'd like students taking the biostatistics course I'm TA'ing to take away from the class.
Why do physicists talk about symmetries and conservation laws all the time? It's because that's what a good scientific theory looks like.
A cursory look at the economics of scientific software, and the implications for its usability and longevity.
Working with annotated genomes is not always an easy process. Here, I detail how to create tabular annotation data from GENCODE that can be readily used in any analysis.
How incomprehensible machine learning models answer questions without providing the solutions we desire.
HiGlass is an interactive genome browser that's particularly useful for Hi-C data. Here, I describe how to create your own genome annotation file for HiGlass, allowing you to more easily display your work, regardless of the organism you work in.
On May 1, 2020, Harvard published a report about the relationship between Harvard faculty and Jeffrey Epstein, detailing the numerous interactions, gifts, and acts of questionable behaviour or outright misconduct surrounding the now-deceased "scientific philanthropist".
Journal articles are one way in which scientific research is disseminated. But they're not how one learns how to do science.
There are many reasons to be excited about scientific progress in the biological sciences, especially if you're a mathematician of almost any kind.
I offer 10 practical suggestions for designing robust, intuitive, and user-friendly software tools for bioinformatics.
Many academic posters look boring: white backgrounds, black text, some shade of neutral blue as an accent colour, etc. I've designed some posters with dark backgrounds, and I've learned a thing or two from making them that I'd like to share.
I'll go against the current trend and say that I don't think you should use Twitter as a tool for working in academia.
I want to highlight how clever the derivation of Tajima's statistic is, and a great idea he puts forward in his 1989 paper.
A cautionary tale about trusting data from another source.
"Read coverage" in high-throughput sequencing is a bit of an ambiguous term. Here, I make the argument for using the analogous term "support", coming from set theory and its interpretation.
Making high-quality bioinformatics software is hard. Installing and using it shouldn't be, though. Here's a detailed description of all the work I did to try to install the ChAMP package.
A quick method for keeping updated on works published by specific authors using PubMed's not-so-well-known RSS feature.
Some thoughts about using Git's branching model for clean and clear scientific analysis.
I ported PubPeer's Chrome extension for Microsoft Edge.
A video project for my fluid dynamics class in undergrad.
A brief description and report of my work at the Institute for Quantum Computing during an undergraduate research term.