OSS & Tools

Latest OSS & Tools on Megadose. AI news ranked, decayed, deduped.

43 recent items

  1. SenseTime releases SenseNova U1 models on HuggingFace
    TestingCatalog ·
    SenseTime released SenseNova-U1, open multimodal models unifying image understanding and generation using a novel architecture without visual encoders or VAEs, now available on HuggingFace.
  2. DeepSeek released 'Thinking-with-Visual-Primitives' framework
    r/LocalLLaMA ·
    DeepSeek, Peking University, and Tsinghua release 'Thinking with Visual Primitives,' a multimodal reasoning framework that elevates spatial tokens—coordinates and bounding boxes—into minimal units of thought, enabling models to 'point' within images during chain-of-thought reasoning.
  3. A Unified Framework of Hyperbolic Graph Representation Learning Methods
    ArXiv · AI/CL/LG ·
    A unified open-source framework integrates multiple hyperbolic graph embedding methods under a common optimization interface, enabling consistent training, visualization, and evaluation across methods that were previously fragmented.
  4. Mistral Medium 3.5 is here
    r/LocalLLaMA ·
    Mistral released Medium 3.5 (128B) on HuggingFace, a mid-tier model slot in their lineup with availability for local deployment.
  5. Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
    ArXiv · AI/CL/LG ·
    Implements speculative decoding in NeMo-RL with vLLM backend to accelerate RL post-training rollouts for frontier language models, supporting both synchronous and asynchronous pipelines.
  6. microsoft/VibeVoice
    Simon Willison ·
    Microsoft released VibeVoice, a Whisper-style open-source speech-to-text model with built-in speaker diarization, MIT licensed and available via MLX for local Mac inference.
  7. OpenAI releases Symphony, an open-source spec for agent orchestration that turns a project-management board like Linear into a control plane for coding agents (OpenAI)
    Techmeme ·
    OpenAI releases Symphony, an open-source agent orchestration spec enabling tools like Linear to serve as control planes for coding agents, standardizing multi-agent coordination.
  8. Microsoft presents "TRELLIS.2": an open-source, 4B-parameter, image-to-3D model producing up to 1536³ PBR-textured assets, built on native 3D VAEs with 16× spatial compression, delivering efficient, scalable, high-fidelity asset generation
    r/LocalLLaMA ·
    Microsoft released TRELLIS.2, a 4B-parameter open-source image-to-3D model with native 3D VAEs achieving 16× spatial compression to generate high-fidelity PBR-textured assets up to 1536³ resolution.
  9. Xiaomi open sources MiMo-V2.5 and MiMo-V2.5-Pro under the MIT License, saying both models are among the most efficient available for agentic "claw" tasks (Carl Franzen/VentureBeat)
    Techmeme ·
    Xiaomi open-sources MiMo-V2.5 and V2.5-Pro under MIT license, claiming both are among the most efficient models for agentic 'claw' (UI automation) tasks, positioning them as cost-effective alternatives for automation workloads.
  10. An open-source spec for orchestration: Symphony
    OpenAI Blog ·
    OpenAI releases Symphony, an open-source spec for orchestrating Codex agents that integrates with issue trackers to function as always-on agent systems.
  11. Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
    HF Daily Papers ·
    Diffusion Templates is a unified open plugin framework decoupling base-model inference from controllable capability injection, enabling reusable infrastructure across diffusion backbones.
  12. Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis
    ArXiv · AI/CL/LG ·
    SpecValidator is a parameter-efficient fine-tuned classifier detecting three types of defective task descriptions (lexical vagueness, under-specification, syntax-formatting) in code generation prompts, outperforming GPT-5-mini.
  13. DeepSeek Launches New-Generation V4 Models
    The Information ·
    DeepSeek releases V4, its new generation open-source AI models with enhanced reasoning and coding capabilities, the first major release since January's R1 sensation.
  14. TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction
    HF Daily Papers ·
    TexOCR is a 2B-parameter model for reconstructing scientific PDFs into compilable LaTeX, trained on a new benchmark (TexOCR-Train) and evaluated via RL with LaTeX unit tests enforcing compilability.
  15. Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do
    HN · Agents ·
    An open-source memory layer lets AI agents retain conversation context and user preferences across sessions, matching the built-in memory features of Claude.ai and ChatGPT.
  16. Learning Evidence Highlighting for Frozen LLMs
    HF Daily Papers ·
    HiLight trains a lightweight Emphasis Actor via reinforcement learning to highlight decisive evidence in context without modifying the frozen solver.
  17. Tencent Releases Hy3 Preview - Open-Source 295B MoE with 21B Active Parameters
    r/LocalLLaMA ·
    Tencent releases Hy3, a 295B open-source MoE model with 21B active parameters per token, entering the crowded open-weights frontier.
  18. Alibaba launches Qwen3.6-27B, an open-weight dense model with 27B parameters, saying it surpasses Qwen3.5-397B-A17B on major coding benchmarks (Qwen)
    Techmeme ·
    Alibaba releases Qwen3.6-27B, an open-weight dense model that claims to outperform the larger MoE Qwen3.5-397B-A17B on major coding benchmarks.
  19. Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
    Simon Willison ·
    Qwen3.6-27B is a new 27B dense open-weight model from Alibaba that claims flagship-level coding performance competitive with its own 397B MoE predecessor, now available on Hugging Face in full and quantized formats.
  20. Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows
    ArXiv · AI/CL/LG ·
    Tool Attention replaces MCP's stateless eager schema injection with gated tool-level attention, eliminating 10k-60k token overhead in multi-server agentic deployments.
  21. Introducing OpenAI Privacy Filter
    OpenAI Blog ·
    OpenAI released an open-weight PII detection and redaction model with state-of-the-art accuracy, available for download and integration.
  22. DeepSeek has released DeepEP V2 and TileKernels
    r/LocalLLaMA ·
    DeepSeek released DeepEP V2 and TileKernels, open-source infrastructure for distributed MoE training communication and kernel optimization.
  23. Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing
    ArXiv · AI/CL/LG ·
    Auto-ART provides structured synthesis of adversarial robustness research (2020–2026) plus an open-source framework with 50+ attacks, 28 defenses, and compliance mapping to EU AI Act, NIST AI RMF, and OWASP LLM Top 10.
  24. Qwen3.6-27B released!
    r/LocalLLaMA ·
    Qwen3.6-27B open-weights model released on HuggingFace, a mid-sized variant in the Qwen3 series available for local deployment.
  25. VLA Foundry: A Unified Framework for Training Vision-Language-Action Models
    ArXiv · AI/CL/LG ·
    VLA Foundry releases an open-source unified training framework for LLM, VLM, and VLA models from scratch to action fine-tuning, with trained models on LBM Eval.
  26. Moonshot introduces Kimi K2.6, an open-weight model that it says shows strong improvements in long-horizon coding tasks, available under a modified MIT License (Kimi AI)
    Techmeme ·
    Moonshot releases Kimi K2.6 open-weight model with improved coding, long-horizon execution, and agent swarm capabilities under a modified MIT license.
  27. LLM Safety From Within: Detecting Harmful Content with Internal Representations
    HF Daily Papers ·
    SIREN identifies safety neurons via linear probing across internal layers and combines them with adaptive weighting, outperforming open-source guard models using 250x fewer parameters.
  28. llama.cpp speculative checkpointing was merged
    r/LocalLLaMA ·
    llama.cpp merged speculative checkpointing via ngram drafts, yielding 0-50% speedups for coding tasks depending on acceptance rates.
  29. CCCL: In-GPU Compression-Coupled Collective Communication
    ArXiv · AI/CL/LG ·
    CCCL is an in-GPU compression-coupled collective communication library that achieves up to 3× effective NVLink bandwidth by fusing compression kernels directly into NCCL, with no user-side code changes.
  30. Cloudflare open-sources lossless LLM compression tool
    r/LocalLLaMA ·
    Cloudflare open-sourced Unweight, a lossless compression system for LLMs that achieves 15-22% model size reduction and saves ~3GB VRAM on Llama-3.1-8B/H100, with GPU kernels on GitHub and a technical paper.
  31. Alibaba unveils Qwen3.6-35B-A3B, an open-weight MoE model with 35B total and 3B active parameters, saying it rivals larger dense models in agentic coding tasks (Qwen)
    Techmeme ·
    Alibaba released Qwen3.6-35B-A3B, an open-weight MoE model with 35B total/3B active parameters that claims to match larger dense models on agentic coding tasks.
  32. ChemGraph-XANES: An Agentic Framework for XANES Simulation and Analysis
    ArXiv · AI/CL/LG ·
    ChemGraph-XANES is an agentic LangGraph/LangChain framework automating XANES spectroscopy workflows from natural language task specification through FDMNES execution to curated data outputs.
  33. Mozilla Announces "Thunderbolt" as an Open-Source Enterprise AI Client
    r/LocalLLaMA ·
    Mozilla announced Thunderbolt, an open-source enterprise AI client, expanding its presence beyond Firefox into AI tooling.
  34. OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis
    HF Daily Papers ·
    OpenMobile releases an open-source framework for synthesizing mobile agent task instructions and trajectories, including a scalable pipeline using global environment memory and a policy-switching rollout strategy.
  35. Cloudflare's AI Platform: an inference layer designed for agents
    HN · Agents ·
    Cloudflare launched an AI inference platform purpose-built for agents, positioning edge infrastructure as the backbone for autonomous AI workloads.
  36. TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
    ArXiv · AI/CL/LG ·
    TREX automates full LLM fine-tuning via multi-agent collaboration (Researcher + Executor) modeled as a search tree, covering literature research through training and evaluation.
  37. Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents
    ArXiv · AI/CL/LG ·
    Researchers introduce dual-trace memory encoding for LLM agents, pairing factual records with narrative scene traces and improving accuracy on the LongMemEval benchmark to 73.7% from a 53.5% baseline.
  38. ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
    HF Daily Papers ·
    ClawGUI is an open-source framework providing unified infrastructure for training, evaluating, and deploying GUI agents with validated RL training support and consistent evaluation protocols.
  39. Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure
    HF Daily Papers ·
    Sema Code decouples AI coding agent engines from delivery layers, publishing as a standalone npm library that any runtime can drive programmatically.
  40. Show HN: We fingerprinted 178 AI models' writing styles and similarity clusters
    HN · Show HN AI ·
    Researchers fingerprint 178 AI models' writing styles and identify similarity clusters, enabling model attribution and detection capabilities.
  41. SkVM: Compiling Skills for Efficient Execution Everywhere
    HF Daily Papers ·
    SkVM proposes treating LLM agent skills as compilable code, analyzing 118K skills to build capability profiles that enable portable, consistent execution across different model-harness pairs.
  42. SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems
    HF Daily Papers ·
    SuperLocalMemory V3.3 introduces FRQAD and biologically-inspired forgetting for local-first agent memory systems, enabling zero-cloud LLM memory with multi-channel retrieval.
  43. Gemma 4: Byte for byte, the most capable open models
    Google DeepMind ·
    Google DeepMind releases Gemma 4, its most capable open model family for advanced reasoning and agentic workflows.
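A few of the items above describe mechanisms concrete enough to sketch. Item 20's lazy schema loading targets a specific cost: MCP-style setups eagerly inject every tool's full JSON schema into each request, which the paper pegs at 10k-60k tokens in multi-server deployments. The core idea can be sketched as a registry that exposes only one-line stubs up front and serves a full schema only when a tool is actually selected (a hypothetical illustration, not the paper's implementation; all names here are invented):

```python
import json

class LazyToolRegistry:
    """Expose one-line tool stubs by default; serve full JSON schemas on demand."""

    def __init__(self):
        self._tools = {}  # name -> (description, schema dict)

    def register(self, name, description, schema):
        self._tools[name] = (description, schema)

    def stub_prompt(self):
        """Cheap listing injected into every request: names plus one-liners only."""
        return "\n".join(f"- {name}: {desc}"
                         for name, (desc, _) in sorted(self._tools.items()))

    def load_schema(self, name):
        """Full schema, fetched lazily once the model picks a tool."""
        return json.dumps(self._tools[name][1])

reg = LazyToolRegistry()
reg.register("search", "web search",
             {"type": "object", "properties": {"query": {"type": "string"}}})
reg.register("calc", "evaluate arithmetic",
             {"type": "object", "properties": {"expr": {"type": "string"}}})

print(reg.stub_prompt())        # two short lines instead of two full schemas
print(reg.load_schema("calc"))  # schema injected only for the chosen tool
```

The prompt cost of the stub listing grows with the number of tools but not with schema size, which is where the claimed savings come from.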
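Item 28's ngram-draft speculative decoding in llama.cpp follows the standard draft-and-verify pattern: propose upcoming tokens cheaply by matching the trailing n-gram against earlier context, then have the target model verify them, so the speedup tracks the acceptance rate (hence the reported 0-50% range). A minimal model-free sketch, with a toy deterministic "target model" standing in for a real LLM (illustrative names, not the llama.cpp API):

```python
from typing import Callable, List

def ngram_draft(ctx: List[int], n: int = 2, k: int = 4) -> List[int]:
    """Propose up to k tokens by matching the trailing n-gram earlier in ctx."""
    if len(ctx) < n:
        return []
    tail = ctx[-n:]
    # Search for the most recent earlier occurrence of the trailing n-gram.
    for i in range(len(ctx) - n - 1, -1, -1):
        if ctx[i:i + n] == tail:
            return ctx[i + n:i + n + k]  # tokens that followed it last time
    return []

def speculative_decode(target_next: Callable[[List[int]], int],
                       ctx: List[int], steps: int) -> List[int]:
    """Accept the longest verified draft prefix; fall back to normal decoding."""
    ctx = list(ctx)
    produced = 0
    while produced < steps:
        for tok in ngram_draft(ctx):
            if produced >= steps:
                break
            if target_next(ctx) == tok:  # verification (batched in practice)
                ctx.append(tok)
                produced += 1
            else:
                break                    # first mismatch ends the draft
        if produced < steps:
            ctx.append(target_next(ctx))  # one ordinary decode step
            produced += 1
    return ctx

# Toy "target model": emits a fixed period-3 pattern, so ngram drafts from
# the repeated context are frequently accepted.
pattern = [1, 2, 3]
target = lambda ctx: pattern[len(ctx) % 3]
out = speculative_decode(target, [1, 2, 3, 1, 2], steps=6)
print(out)  # [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
```

On repetitive text (code, boilerplate) drafts are long and often accepted; on novel text the draft is rejected immediately and decoding degrades gracefully to the normal one-token-per-step path, which matches the "depending on acceptance rates" caveat in the item.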
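Item 29's compression-coupled collectives win by shrinking bytes on the wire: quantizing fp32 values to int8 before a collective cuts transfer volume to roughly a quarter, at the cost of a decompress step at the receiver. A pure-Python toy of the compress/reduce/decompress round trip (the real CCCL fuses these steps as GPU kernels inside NCCL; this is only the arithmetic of the idea):

```python
def compress(vec, scale=127.0):
    """Quantize floats to the int8 range: roughly 4x fewer bytes on the wire."""
    m = max(abs(x) for x in vec) or 1.0      # per-vector max as the scale factor
    return m, [round(x / m * scale) for x in vec]

def decompress(m, q, scale=127.0):
    return [v / scale * m for v in q]

def allreduce_sum(rank_vectors):
    """Simulated all-reduce with compression applied to each rank's send."""
    total = [0.0] * len(rank_vectors[0])
    for vec in rank_vectors:
        m, q = compress(vec)                 # sender-side compression
        recv = decompress(m, q)              # receiver-side decompression
        total = [a + b for a, b in zip(total, recv)]
    return total

grads = [[0.5, -0.25, 0.125], [0.1, 0.2, -0.3]]
print([round(x, 2) for x in allreduce_sum(grads)])  # [0.6, -0.05, -0.17]
```

The round trip is lossy (int8 quantization), which is why such schemes are typically applied to gradients, where small noise is tolerable, rather than to exact data.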
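Item 37's dual-trace encoding stores each memory twice, as an atomic fact and as a narrative trace of the scene it came from, so a later query can match either representation. A toy keyword-overlap version of the idea (purely illustrative; the paper's encoding is learned, not lexical):

```python
class DualTraceMemory:
    """Store each event as (fact, narrative) and retrieve by scoring both traces."""

    def __init__(self):
        self.entries = []  # list of (fact, narrative) pairs

    def add(self, fact, narrative):
        self.entries.append((fact, narrative))

    @staticmethod
    def _overlap(query, text):
        """Count shared lowercase words between query and text."""
        return len(set(query.lower().split()) & set(text.lower().split()))

    def recall(self, query):
        """Return the best-matching fact, scored over both traces combined."""
        best = max(self.entries,
                   key=lambda e: self._overlap(query, e[0]) +
                                 self._overlap(query, e[1]))
        return best[0]

mem = DualTraceMemory()
mem.add("User's dog is named Rex",
        "During onboarding the user mentioned walking their dog Rex at the park")
mem.add("User prefers dark mode",
        "While configuring the editor the user switched the theme to dark")

# The fact alone never mentions 'park'; the narrative trace recovers it.
print(mem.recall("what happened at the park"))  # User's dog is named Rex
```

The toy shows the mechanism behind the reported gain: queries phrased episodically ("what happened at...") miss terse fact records but hit the narrative trace, so keeping both representations raises recall.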