All Posts
2024
Calculating the Cost of a Google Deepmind Paper
July 30, 2024

DeepSeek Core Readings 0 - Coder
June 30, 2024

DeepSeek Core Readings 1 - LLM
June 23, 2024

Basic tips for remaining conscious
April 12, 2024

2023
2023
December 31, 2023

Rough thoughts on Mixtral vs Open Source
December 13, 2023

Knowing Enough About MoE to Explain Dropped Tokens in GPT-4
August 9, 2023

Non-determinism in GPT-4 is caused by Sparse MoE
August 5, 2023

Dumped Blog Ideas
July 2, 2023

Why can TorToiSe be fine-tuned?
February 16, 2023