
introducing silk mulberry 1.5
silk mulberry 1.5 is the fastest, most cost-effective model in the silk family. it's designed for real-time voice applications, and its headline feature is voice design.
by rumik research team

encoder free models and the bitter lesson
we keep seeing sutton's bitter lesson play out across modality interfaces. the pattern is always the same: we hand-design some structure to get data into a model (an encoder, a codec, or some special front end), and then once models get big enough we strip most of it away and feed the model something closer to the original signal.
by pulsating genius

optimizing snake1d activation kernel in triton
how far can an activation kernel really be pushed on modern hardware?
this worklog explores that question through the lens of snake1D, a core activation used in SNAC, a multi-scale neural audio codec, starting from a straightforward pytorch eager implementation.
by andy

How to Reverse Engineer Neural Networks
here’s a fun fact: nobody fully understands why large language models work. we know the math, we know the architecture, we can train them. but ask “why did it output this specific token?”
by suryansh

muP (maximal update parameterization) does not solve horizon scaling
many researchers assume that if they use muP (a method by Yang et al., 2022, for transferring hyperparameters across model sizes), they are safe from LR-tuning headaches.
it turns out this does not hold when scaling token counts.
by andy


