OSS & Tools — Megadose

43 recent items

SenseTime releases SenseNova U1 models on HuggingFace
TestingCatalog · Apr 29, 2026

SenseTime released SenseNova-U1, open multimodal models unifying image understanding and generation using a novel architecture without visual encoders or VAEs, now available on HuggingFace.
DeepSeek released 'Thinking-with-Visual-Primitives' framework
r/LocalLLaMA · Apr 30, 2026

DeepSeek, Peking University, and Tsinghua release 'Thinking with Visual Primitives,' a multimodal reasoning framework that elevates spatial tokens—coordinates and bounding boxes—into minimal units of thought, enabling models to 'point' within images during chain-of-thought reasoning.
A Unified Framework of Hyperbolic Graph Representation Learning Methods
ArXiv · AI/CL/LG · Apr 30, 2026

A unified open-source framework integrates multiple hyperbolic graph embedding methods under a common optimization interface, enabling consistent training, visualization, and evaluation across methods that were previously fragmented.
Mistral Medium 3.5 is here
r/LocalLLaMA · Apr 29, 2026

Mistral released Medium 3.5 (128B) on HuggingFace, a mid-tier model in its lineup that is available for local deployment.
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
ArXiv · AI/CL/LG · Apr 29, 2026

Implements speculative decoding in NeMo-RL with vLLM backend to accelerate RL post-training rollouts for frontier language models, supporting both synchronous and asynchronous pipelines.
microsoft/VibeVoice
Simon Willison · Apr 27, 2026

Microsoft released VibeVoice, a Whisper-style open-source speech-to-text model with built-in speaker diarization, MIT licensed and available via MLX for local Mac inference.
OpenAI releases Symphony, an open-source spec for agent orchestration that turns a project-management board like Linear into a control plane for coding agents (OpenAI)
Techmeme · Apr 28, 2026

OpenAI releases Symphony, an open-source agent orchestration spec enabling tools like Linear to serve as control planes for coding agents, standardizing multi-agent coordination.
Microsoft Presents "TRELLIS.2": An Open-Source, 4B-Parameter, Image-To-3D Model Producing Up To 1536³ PBR Textured Assets, Built On Native 3D VAEs With 16× Spatial Compression, Delivering Efficient, Scalable, High-Fidelity Asset Generation.
r/LocalLLaMA · Apr 27, 2026

Microsoft released TRELLIS.2, a 4B-parameter open-source image-to-3D model with native 3D VAEs achieving 16× spatial compression to generate high-fidelity PBR-textured assets up to 1536³ resolution.
Xiaomi open sources MiMo-V2.5 and MiMo-V2.5-Pro under the MIT License, saying both models are among the most efficient available for agentic "claw" tasks (Carl Franzen/VentureBeat)
Techmeme · Apr 27, 2026

Xiaomi open-sources MiMo-V2.5 and V2.5-Pro under MIT license, claiming both are among the most efficient models for agentic 'claw' (UI automation) tasks, positioning them as cost-effective alternatives for automation workloads.
An open-source spec for orchestration: Symphony
OpenAI Blog · Apr 27, 2026

OpenAI releases Symphony, an open-source spec for orchestrating Codex agents that integrates with issue trackers to function as always-on agent systems.
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
HF Daily Papers · Apr 27, 2026

Diffusion Templates is a unified open plugin framework decoupling base-model inference from controllable capability injection, enabling reusable infrastructure across diffusion backbones.
Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis
ArXiv · AI/CL/LG · Apr 27, 2026

SpecValidator is a parameter-efficient fine-tuned classifier detecting three types of defective task descriptions (lexical vagueness, under-specification, syntax-formatting) in code generation prompts, outperforming GPT-5-mini.
DeepSeek Launches New-Generation V4 Models
The Information · Apr 24, 2026

DeepSeek releases V4, its new generation of open-source AI models with enhanced reasoning and coding capabilities, the first major release since January's R1 sensation.
TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction
HF Daily Papers · Apr 24, 2026

TexOCR is a 2B-parameter model for reconstructing scientific PDFs into compilable LaTeX, trained on a new benchmark (TexOCR-Train) and evaluated via RL with LaTeX unit tests enforcing compilability.
Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do
HN · Agents · Apr 25, 2026

An open-source memory layer lets AI agents retain conversation context and user preferences across sessions, matching Claude.ai and ChatGPT memory capabilities.
Learning Evidence Highlighting for Frozen LLMs
HF Daily Papers · Apr 24, 2026

HiLight trains a lightweight Emphasis Actor via reinforcement learning to highlight decisive evidence in context without modifying the frozen solver.
Tencent Releases Hy3 preview - Open Source 295B 21B Active MoE
r/LocalLLaMA · Apr 23, 2026

Tencent releases Hy3, a 295B open-source MoE model with 21B active parameters per token, entering the crowded open-weights frontier.
Alibaba launches Qwen3.6-27B, an open-weight dense model with 27B parameters, saying it surpasses Qwen3.5-397B-A17B on major coding benchmarks (Qwen)
Techmeme · Apr 22, 2026

Alibaba releases Qwen3.6-27B, an open-weight dense model that claims to outperform the larger MoE Qwen3.5-397B-A17B on major coding benchmarks.
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
Simon Willison · Apr 22, 2026

Qwen3.6-27B is a new 27B dense open-weight model from Alibaba that claims flagship-level coding performance competitive with its own 397B MoE predecessor, now available on Hugging Face in full and quantized formats.
Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows
ArXiv · AI/CL/LG · Apr 23, 2026

Tool Attention replaces MCP's stateless eager schema injection with gated tool-level attention, eliminating 10k-60k token overhead in multi-server agentic deployments.
Introducing OpenAI Privacy Filter
OpenAI Blog · Apr 22, 2026

OpenAI released an open-weight PII detection and redaction model with state-of-the-art accuracy, available for download and integration.
DeepSeek has released DeepEP V2 and TileKernels.
r/LocalLLaMA · Apr 23, 2026

DeepSeek released DeepEP V2 and TileKernels, open-source infrastructure for distributed MoE training communication and kernel optimization.
Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing
ArXiv · AI/CL/LG · Apr 22, 2026

Auto-ART provides structured synthesis of adversarial robustness research (2020–2026) plus an open-source framework with 50+ attacks, 28 defenses, and compliance mapping to EU AI Act, NIST AI RMF, and OWASP LLM Top 10.
Qwen3.6-27B released!
r/LocalLLaMA · Apr 22, 2026

Qwen3.6-27B, an open-weights model, was released on HuggingFace as a mid-sized variant in the Qwen3.6 series available for local deployment.
VLA Foundry: A Unified Framework for Training Vision-Language-Action Models
ArXiv · AI/CL/LG · Apr 21, 2026

VLA Foundry releases an open-source unified training framework for LLM, VLM, and VLA models from scratch to action fine-tuning, with trained models on LBM Eval.
Moonshot introduces Kimi K2.6, an open-weight model that it says shows strong improvements in long-horizon coding tasks, available under a modified MIT License (Kimi AI)
Techmeme · Apr 20, 2026

Moonshot releases Kimi K2.6 open-weight model with improved coding, long-horizon execution, and agent swarm capabilities under a modified MIT license.
LLM Safety From Within: Detecting Harmful Content with Internal Representations
HF Daily Papers · Apr 20, 2026

SIREN identifies safety neurons via linear probing across internal layers and combines them with adaptive weighting, outperforming open-source guard models using 250x fewer parameters.
llama.cpp speculative checkpointing was merged
r/LocalLLaMA · Apr 19, 2026

llama.cpp merged speculative checkpointing via ngram drafts, yielding 0-50% speedups for coding tasks depending on acceptance rates.
CCCL: In-GPU Compression-Coupled Collective Communication
ArXiv · AI/CL/LG · Apr 19, 2026

CCCL is an in-GPU compression-coupled collective communication library that achieves up to 3x NVLink bandwidth by fusing compression kernels directly into NCCL without user-side code changes.
Cloudflare open-sources lossless LLM compression tool
r/LocalLLaMA · Apr 18, 2026

Cloudflare open-sourced Unweight, a lossless compression system for LLMs that achieves 15-22% model size reduction and saves ~3GB VRAM on Llama-3.1-8B/H100, with GPU kernels on GitHub and a technical paper.
Alibaba unveils Qwen3.6-35B-A3B, an open-weight MoE model with 35B total and 3B active parameters, saying it rivals larger dense models in agentic coding tasks (Qwen)
Techmeme · Apr 17, 2026

Alibaba released Qwen3.6-35B-A3B, an open-weight MoE model with 35B total/3B active parameters that claims to match larger dense models on agentic coding tasks.
ChemGraph-XANES: An Agentic Framework for XANES Simulation and Analysis
ArXiv · AI/CL/LG · Apr 17, 2026

ChemGraph-XANES is an agentic LangGraph/LangChain framework automating XANES spectroscopy workflows from natural language task specification through FDMNES execution to curated data outputs.
Mozilla Announces "Thunderbolt" As An Open-Source, Enterprise AI Client
r/LocalLLaMA · Apr 16, 2026

Mozilla announced Thunderbolt, an open-source enterprise AI client, expanding Mozilla's presence beyond Firefox into the AI tooling space.
OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis
HF Daily Papers · Apr 16, 2026

OpenMobile releases an open-source framework for synthesizing mobile agent task instructions and trajectories, including a scalable pipeline using global environment memory and a policy-switching rollout strategy.
Cloudflare's AI Platform: an inference layer designed for agents
HN · Agents · Apr 16, 2026

Cloudflare launched an AI inference platform purpose-built for agents, positioning edge infrastructure as the backbone for autonomous AI workloads.
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
ArXiv · AI/CL/LG · Apr 15, 2026

TREX automates full LLM fine-tuning via multi-agent collaboration (Researcher + Executor) modeled as a search tree, covering literature research through training and evaluation.
Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents
ArXiv · AI/CL/LG · Apr 14, 2026

Researchers introduce dual-trace memory encoding for LLM agents, pairing facts with narrative scene traces, achieving 73.7% vs 53.5% accuracy on LongMemEval benchmark.
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
HF Daily Papers · Apr 13, 2026

ClawGUI is an open-source framework providing unified infrastructure for training, evaluating, and deploying GUI agents with validated RL training support and consistent evaluation protocols.
Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure
HF Daily Papers · Apr 13, 2026

Sema Code decouples AI coding agent engines from delivery layers, publishing as a standalone npm library that any runtime can drive programmatically.
Show HN: We fingerprinted 178 AI models' writing styles and similarity clusters
HN · Show HN AI · Apr 8, 2026

Researchers fingerprint 178 AI models' writing styles and identify similarity clusters, enabling model attribution and detection capabilities.
SkVM: Compiling Skills for Efficient Execution Everywhere
HF Daily Papers · Apr 6, 2026

SkVM proposes treating LLM agent skills as compilable code, analyzing 118K skills to build capability profiles that enable portable, consistent execution across different model-harness pairs.
SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems
HF Daily Papers · Apr 6, 2026

SuperLocalMemory V3.3 introduces FRQAD and biologically-inspired forgetting for local-first agent memory systems, enabling zero-cloud LLM memory with multi-channel retrieval.
Gemma 4: Byte for byte, the most capable open models
Google DeepMind · Apr 2, 2026

Google DeepMind releases Gemma 4, its most capable open model family for advanced reasoning and agentic workflows.