14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed. (Python; updated Mar 20, 2026)
[TMLR 2026] Survey: https://arxiv.org/pdf/2507.20198
📚 Collection of token-level model compression resources.
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
Token-Oriented Object Notation - A compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation)
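The core idea behind compact formats like Token-Oriented Object Notation is to avoid repeating structural overhead (keys, braces, quotes) for every object when sending uniform records to an LLM. A minimal sketch of that idea in Python — this is an illustration of the key-deduplication principle, not the official TOON grammar, and `compact_encode` is a hypothetical helper name:

```python
import json

def compact_encode(rows):
    """Encode a list of uniform dicts as one header line plus value rows.

    Illustrative sketch only (NOT the official TOON spec): the keys are
    stated once in a header instead of being repeated per object, which
    is what cuts the token count versus plain JSON.
    """
    if not rows:
        return ""
    keys = list(rows[0])
    lines = [",".join(keys)]  # header written once
    for row in rows:
        lines.append(",".join(str(row[k]) for k in keys))
    return "\n".join(lines)

data = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}]
compact = compact_encode(data)
verbose = json.dumps(data)
# The compact form repeats no keys, so it is shorter than the JSON form.
assert len(compact) < len(verbose)
```

For arrays of uniform objects the savings grow with the number of rows, since the per-object key overhead is paid exactly once.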
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
You say it. AutoCode builds it. 38 professional skills, persistent memory, 60%+ dev cost savings. Zero dependencies. Free forever.
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
[ICLR 2026] MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding
😎 Awesome papers on token redundancy reduction
AI gateway with token compression for Claude Code, Codex, and more
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Rust Local Token Compression Proxy for coding agents, built solo for GenAI Genesis 2026. 🏆 1st Google Sustainability Hack
Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model
[Arxiv 2025 Preprint] HiPrune, a training-free visual token pruning method for VLM acceleration.
Hardened Docker container & Compose setup for openclaw
⚡ Compress Claude Code context by 60-90%. Six noise filters RTK doesn't have.