We implement a multilayer perceptron (MLP) character-level language model. In this video we also introduce many basics of machine learning (e.g. model training, learning rate tuning, hyperparameters, evaluation, train/dev/test splits, under/overfitting).
- makemore on github:
- jupyter notebook I built in this video:
- Colab notebook (new)!!!:
- Bengio et al. 2003 MLP language model paper (pdf):
Useful links:
- PyTorch internals ref
- E01: Tune the hyperparameters of the training to beat my best validation loss of 2.2
- E02: I was not careful with the initialization of the network in this video. (1) What is the loss you’d get if the predicted probabilities at initialization were perfectly uniform? What loss do we achieve? (2) Can you tune the initialization to get a starting loss that is much more similar to (1)?
- E03: Read the Bengio et al. 2003 paper (link above), implement and try any idea from the paper. Did it work?
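For part (1) of E02, the reference value follows from the definition of negative log likelihood: a model that assigns probability 1/V to each of V possible next characters has a loss of -log(1/V) = log V, whatever the data. Below is a minimal sketch of that arithmetic. The vocabulary size of 27 (26 lowercase letters plus one boundary token) is an assumption, not stated in this description, and `uniform_nll` is a hypothetical helper name used only for illustration.

```python
import math

def uniform_nll(vocab_size: int) -> float:
    """Negative log likelihood when every character gets probability 1/vocab_size."""
    return -math.log(1.0 / vocab_size)

# Assumed vocabulary size: 26 letters + 1 boundary token (an assumption).
baseline = uniform_nll(27)
print(f"uniform-prediction loss: {baseline:.4f}")
```

Comparing the loss printed at the first training step against this baseline shows how far the initial predictions are from uniform.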
00:00:00 intro
00:01:48 Bengio et al. 2003 (MLP language model) paper walkthrough
00:09:03 (re-)building our training dataset
00:12:19 implementing the embedding lookup table
00:18:35 implementing the hidden layer + tensor internals: storage, views
00:29:15 implementing the output layer
00:29:53 implementing the negative log likelihood loss
00:32:17 summary of the full network
00:32:49 introducing and why
00:37:56 implementing the training loop, overfitting one batch
00:41:25 training on the full dataset, minibatches
00:45:40 finding a good initial learning rate
00:53:20 splitting up the dataset into train/val/test splits and why
01:00:49 experiment: larger hidden layer
01:05:27 visualizing the character embeddings
01:07:16 experiment: larger embedding size
01:11:46 summary of our final code, conclusion
01:13:24 sampling from the model
01:14:55 Google Colab (new!!) notebook advertisement
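The forward pass outlined in the chapters above (embedding lookup, hidden layer, output layer, negative log likelihood loss) can be sketched roughly as follows. This is not the notebook's code: it swaps PyTorch tensors for plain Python lists so it runs without dependencies, it does no training, and every size (`VOCAB`, `CONTEXT`, `EMB`, `HIDDEN`) as well as the names `rand_matrix` and `forward` are assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

VOCAB = 27    # assumed vocabulary size
CONTEXT = 3   # assumed number of previous characters fed to the model
EMB = 2       # assumed embedding dimension
HIDDEN = 8    # assumed hidden layer width

def rand_matrix(rows, cols):
    """Small random weights, stored as a list of rows."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

C = rand_matrix(VOCAB, EMB)               # embedding lookup table
W1 = rand_matrix(CONTEXT * EMB, HIDDEN)   # hidden layer weights
b1 = [0.0] * HIDDEN
W2 = rand_matrix(HIDDEN, VOCAB)           # output layer weights
b2 = [0.0] * VOCAB

def forward(context, target):
    """Return (probs, loss) for one example.

    context: list of CONTEXT character indices; target: index of the next character.
    """
    # Embedding lookup, then concatenate the context embeddings into one vector.
    x = [v for idx in context for v in C[idx]]
    # Hidden layer with a tanh nonlinearity.
    h = [math.tanh(sum(x[i] * W1[i][j] for i in range(len(x))) + b1[j])
         for j in range(HIDDEN)]
    # Output layer: one logit per character in the vocabulary.
    logits = [sum(h[j] * W2[j][k] for j in range(HIDDEN)) + b2[k]
              for k in range(VOCAB)]
    # Softmax, subtracting the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Negative log likelihood of the correct next character.
    loss = -math.log(probs[target])
    return probs, loss

probs, loss = forward([0, 5, 13], 13)
print(len(probs), round(sum(probs), 6), loss)
```

With small random weights like these, the predicted distribution is close to uniform, so the loss on a single example should land near log(VOCAB).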