GitHub
Skills harvested from GitHub repositories
14810 skills available
verl
verl is a flexible, efficient, and production-ready RL training library for large language models from ByteDance's Seed ...
grpo-rl-training
Expert-level guidance for implementing Group Relative Policy Optimization (GRPO) using the Transformer Reinforcement Lea...
ray-data
Distributed data processing library for ML and AI workloads. Use Ray Data when: Processing large datasets (>100GB) for M...
langchain
The most popular framework for building LLM-powered applications. Use LangChain when: Building agents with tool calling ...
tensorrt-llm
NVIDIA's open-source library for optimizing LLM inference with state-of-the-art performance on NVIDIA GPUs. Use TensorRT...
instructor
Use Instructor when you need to: extract structured data from LLM responses reliably; validate outputs against Pydantic s...
rigor-reviewer
You are an objective research reviewer for Agent-Native Research Artifacts. You receive an ARA directory path and produc...
pytorch-fsdp2
This skill teaches a coding agent how to add PyTorch FSDP2 to a training loop with correct initialization, sharding, mix...
mamba
Mamba is a state-space model architecture achieving O(n) linear complexity for sequence modeling. Installation: pip inst...
faiss
Facebook AI's library for billion-scale vector similarity search. Use FAISS when: Need fast similarity search on large v...
rwkv
RWKV (RwaKuv) combines Transformer parallelization (training) with RNN efficiency (inference). Installation: pip install...
ray-train
Ray Train scales machine learning training from single GPU to multi-node clusters with minimal code changes. Installatio...
openvla-oft
Fine-tuning and evaluation workflows for OpenVLA-OFT and OpenVLA-OFT+ from the official openvla-oft codebase. Covers bla...
phoenix
Open-source AI observability and evaluation platform for LLM applications with tracing, evaluation, datasets, experiment...
saelens
SAELens is the primary library for training and analyzing Sparse Autoencoders (SAEs) - a technique for decomposing polys...
speculative-decoding
Use Speculative Decoding when you need to: speed up inference by 1.5-3.6× without quality loss; reduce latency for real-t...
gptq
Post-training quantization method that compresses LLMs to 4-bit with minimal accuracy loss using group-wise quantization...
pinecone
The vector database for production AI applications. Use when: need managed, serverless vector database; production RAG ap...
nemo-evaluator
NeMo Evaluator SDK evaluates LLMs across 100+ benchmarks from 18+ harnesses using containerized, reproducible evaluation...
flash-attention
Flash Attention provides 2-4x speedup and 10-20x memory reduction for transformer attention through IO-aware tiling and ...
transformer-lens
TransformerLens is the de facto standard library for mechanistic interpretability research on GPT-style language models....
whisper
OpenAI's multilingual speech recognition model. Use when: speech-to-text transcription (99 languages); podcast/video tran...
research-manager
You are the Live PM — a post-task research recorder. You run ONLY at the END of a coding session, after the user's reque...
trl-fine-tuning
TRL provides post-training methods for aligning language models with human preferences. Installation: pip install trl tr...