My (updated) DL workflow for 2023

Posted on Sun 04 June 2023 in Programming • Tagged with Programming, Machine Learning, Deep Learning

Remember last semester, when I said this?

That went pretty well. Much better than I expected it to go :) I also used the HPC DL environment a lot, across the 4 courses that I took (COL380, COL772, COL775 and COD310). I used it so heavily that at one point I had four …


Continue reading

Batch Normalization

Posted on Thu 16 February 2023 in Mathematics • Tagged with Mathematics, Machine Learning, Deep Learning

$$ \require{physics} \newcommand{\B}{\mathcal{B}}$$

Batch Normalization was proposed by Ioffe and Szegedy in 2015, and it spawned several normalization techniques that are used in SOTA models today (layer norm, weight norm, etc.). Batch normalization normalizes the output of each layer based on the mean and variance of the …


Continue reading

L2 regularization intuition

Posted on Sun 22 January 2023 in Mathematics • Tagged with Mathematics, Machine Learning, Deep Learning

A nice intuition for L2 regularization comes from having a prior on the distribution of parameters: the prior assumes that the parameters are close to zero. Let's assume that the prior is $\mathcal{N}(0, \Sigma)$. The MAP estimate of the parameters would then be

$$\begin{align} \theta_{\text{MAP …


Continue reading

Optimizers, Part 1

Posted on Mon 02 January 2023 in Programming • Tagged with Programming, Machine Learning, Deep Learning

Happy New Year! This was supposed to be a long one, so sit back and grab a chocolate (and preferably view this on your laptop).

Some optimization algorithms. Click on a colour in the legend to hide/show it

Table of Contents

  1. Introduction

Continue reading

My DL workflow for 2023

Posted on Thu 29 December 2022 in Programming • Tagged with Programming, Machine Learning, Deep Learning

I've pretty much zeroed in on Deep Learning at this point and, putting my money where my mouth is, will be taking both COL772 (Natural Language Processing) and COL775 (Deep Learning) next semester.

Along with Operating Systems, Parallel Programming and Theory of Computation.

Why a workflow?

I'll need to train …


Continue reading