Commits : Listings

Analyzed about 23 hours ago, based on code collected about 23 hours ago.
Feb 26, 2025 — Feb 26, 2026
Commit Message Contributor Files Modified Lines Added Lines Removed Code Location Date
update expected results due to numerical changes in swiglu
NGC92
as Erik Schultheis
5 days ago
fast reciprocal for swiglu
NGC92
as Erik Schultheis
5 days ago
refactor swiglu with utility function
NGC92
as Erik Schultheis
5 days ago
explicit stream arg
NGC92
as Erik Schultheis
5 days ago
update test targets due to new loss summation numerics
NGC92
as Erik Schultheis
9 days ago
Fixes
NGC92
as Erik Schultheis
9 days ago
Fixes
NGC92
as Erik Schultheis
9 days ago
fix python interface
NGC92
as Erik Schultheis
9 days ago
make python interface return full loss for now
NGC92
as Erik Schultheis
9 days ago
logging loss@1k
NGC92
as Erik Schultheis
9 days ago
allow inspecting the loss over a subset of sequence positions
NGC92
as Erik Schultheis
9 days ago
grouped loss sum kernel
NGC92
as Erik Schultheis
9 days ago
handle unknown devices safely in multi-gpu setup.
NGC92
as Erik Schultheis
9 days ago
check that TMA kernel has been compiled for fused_classifier_dispatch
NGC92
as Erik Schultheis
9 days ago
add B200 to SOL list
NGC92
as Erik Schultheis
9 days ago
fix wandb watcher end condition
NGC92
as Erik Schultheis
10 days ago
stricter tracking of data-loader state in checkpoints
NGC92
as Erik Schultheis
about 1 month ago
add option to re-initialize dataloader when continuing a training run, e.g., for mid-training when the dataset changes
NGC92
as Erik Schultheis
about 1 month ago
adjust tests
NGC92
as Erik Schultheis
about 1 month ago
better error when safetensor loading fails: show the file name
NGC92
as Erik Schultheis
about 1 month ago
bugfix for sharded optimizer states
NGC92
as Erik Schultheis
about 1 month ago
more flexible training continuation
NGC92
as Erik Schultheis
about 1 month ago
remove epoch logging -> log total progress instead
NGC92
as Erik Schultheis
about 1 month ago
use fp16 for rope frequencies to reduce rounding errors
NGC92
as Erik Schultheis
about 1 month ago
added llama3 shapes
NGC92
as Erik Schultheis
about 2 months ago
more optimizer generalization
NGC92
as Erik Schultheis
about 2 months ago
also handle non-block weights
NGC92
as Erik Schultheis
about 2 months ago
move buffer allocation to generic optimizer
NGC92
as Erik Schultheis
about 2 months ago
add a generic TensorContainer implementation
NGC92
as Erik Schultheis
about 2 months ago
fixes
NGC92
as Erik Schultheis
about 2 months ago