machine learning
2024
Calculating the Cost of a Google Deepmind Paper
July 30, 2024
DeepSeek Core Readings 0 - Coder
June 30, 2024
DeepSeek Core Readings 1 - LLM
June 23, 2024
DeepSeek Core Readings
June 23, 2024
2023
Rough thoughts on Mixtral vs Open Source
December 13, 2023
Knowing Enough About MoE to Explain Dropped Tokens in GPT-4
August 9, 2023
Non-determinism in GPT-4 is caused by Sparse MoE
August 5, 2023
Why can TorToiSe be fine-tuned?
February 16, 2023
Why can't TorToiSe be fine-tuned?
February 11, 2023
Fast (5x) Inference with TorToiSe-TTS
February 5, 2023