Open Source

NVIDIA Open Sources Parakeet TDT 0.6B: Achieving a New Standard for...

NVIDIA has unveiled Parakeet TDT 0.6B, a state-of-the-art automatic speech recognition (ASR) model that is now fully open-sourced on Hugging Face. With 600 million...

Meta AI Releases Llama Prompt Ops: A Python Toolkit for Prompt...

Meta AI has released Llama Prompt Ops, a Python package designed to streamline the process of adapting prompts for Llama models. This open-source tool...

IBM AI Releases Granite 4.0 Tiny Preview: A Compact Open-Language Model...

IBM has introduced a preview of Granite 4.0 Tiny, the smallest member of its upcoming Granite 4.0 family of language models. Released under the...

JetBrains Open Sources Mellum: A Developer-Centric Language Model for Code-Related Tasks

JetBrains has officially open-sourced Mellum, a purpose-built 4-billion-parameter language model tailored for software development tasks. Developed from the ground up, Mellum reflects JetBrains’ engineering-first...

Meta and Booz Allen Deploy Space Llama: Open-Source AI Heads to...

In a significant step toward enabling autonomous AI systems in space, Meta and Booz Allen Hamilton have announced the deployment of Space Llama, a...

DeepSeek-AI Released DeepSeek-Prover-V2: An Open-Source Large Language Model Designed for Formal...

Formal mathematical reasoning has evolved into a specialized subfield of artificial intelligence that requires strict logical consistency. Unlike informal problem solving, which allows for...

Multimodal AI on Developer GPUs: Alibaba Releases Qwen2.5-Omni-3B with 50% Lower...

Multimodal foundation models have shown substantial promise in enabling systems that can reason across text, images, audio, and video. However, the practical deployment of...

Reinforcement Learning for Email Agents: OpenPipe’s ART·E Outperforms o3 in Accuracy,...

OpenPipe has introduced ART·E (Autonomous Retrieval Tool for Email), an open-source research agent designed to answer user questions based on inbox contents with a...

Meet Rowboat: An Open-Source IDE for Building Complex Multi-Agent Systems

As multi-agent systems gain traction in real-world applications—from customer support automation to AI-native infrastructure—the need for a streamlined development interface has never been greater....

AWS Introduces SWE-PolyBench: A New Open-Source Multilingual Benchmark for Evaluating AI...

Recent advancements in large language models (LLMs) have enabled the development of AI-based coding agents that can generate, modify, and understand software code. However,...

ByteDance Releases UI-TARS-1.5: An Open-Source Multimodal AI Agent Built upon a...

ByteDance has released UI-TARS-1.5, an updated version of its multimodal agent framework focused on graphical user interface (GUI) interaction and game environments. Designed as...

OpenAI Releases Codex CLI: An Open-Source Local Coding Agent that Turns...

Command-line interfaces (CLIs) are indispensable tools for developers, offering powerful capabilities for system management and automation. However, they require precise syntax and a thorough...

THUDM Releases GLM 4: A 32B Parameter Model Competing Head-to-Head with...

In the rapidly evolving landscape of large language models (LLMs), researchers and organizations face significant challenges. These include enhancing reasoning abilities, providing robust multilingual...

Together AI Released DeepCoder-14B-Preview: A Fully Open-Source Code Reasoning Model That...

The demand for intelligent code generation and automated programming solutions has intensified, fueled by a rapid rise in software complexity and developer productivity needs....

OpenAI Open Sources BrowseComp: A New Benchmark for Measuring the Ability...

Despite advances in large language models (LLMs), AI agents still face notable limitations when navigating the open web to retrieve complex information. While many...

Google Releases Agent Development Kit (ADK): An Open-Source AI Framework Integrated...

Google has released the Agent Development Kit (ADK), an open-source framework aimed at making it easier for developers to build, manage, and deploy multi-agent...

Huawei Noah’s Ark Lab Released Dream 7B: A Powerful Open Diffusion Reasoning Model with...

LLMs have revolutionized artificial intelligence, transforming various applications across industries. Autoregressive (AR) models dominate current text generation, with leading systems like GPT-4, DeepSeek, and...

Reducto AI Released RolmOCR: A SoTA OCR Model Built on Qwen...

Optical Character Recognition (OCR) has long been a cornerstone of document digitization, enabling the transformation of printed text into machine-readable formats. However, traditional OCR...

Meta AI Just Released Llama 4 Scout and Llama 4 Maverick:...

Today, Meta AI announced the release of its latest generation multimodal models, Llama 4, featuring two variants: Llama 4 Scout and Llama 4 Maverick....

NVIDIA AI Released AgentIQ: An Open-Source Library for Efficiently Connecting and...

Enterprises increasingly adopt agentic frameworks to build intelligent systems capable of performing complex tasks by chaining tools, models, and memory components. However, as organizations...

Meet Open-Qwen2VL: A Fully Open and Compute-Efficient Multimodal Large Language Model

Multimodal Large Language Models (MLLMs) have advanced the integration of visual and textual modalities, enabling progress in tasks such as image captioning, visual question...

Researchers from Dataocean AI and Tsinghua University Introduce Dolphin: A Multilingual...

Automatic speech recognition (ASR) technologies have advanced significantly, yet notable disparities remain in their ability to accurately recognize diverse languages. Prominent ASR systems, such...

Nomic Open Sources State-of-the-Art Multimodal Embedding Model

Nomic has announced the release of "Nomic Embed Multimodal," a groundbreaking embedding model that achieves state-of-the-art performance on visual document retrieval tasks. The new...

How to Build a Prototype X-ray Judgment Tool (Open Source Medical...

In this tutorial, we demonstrate how to build a prototype X-ray judgment tool using open-source libraries in Google Colab. By leveraging the power of...

Meet Open Deep Search (ODS): A Plug-and-Play Framework Democratizing Search with...

The rapid advancements in search engine technologies integrated with large language models (LLMs) have predominantly favored proprietary solutions such as OpenAI's GPT-4o Search Preview...

Kyutai Releases MoshiVis: The First Open-Source Real-Time Speech Model that can...

Artificial intelligence has made significant strides in recent years, yet integrating real-time speech interaction with visual content remains a complex challenge. Traditional systems often...

NVIDIA AI Open Sources Dynamo: An Open-Source Inference Library for Accelerating...

The rapid advancement of artificial intelligence (AI) has led to the development of complex models capable of understanding and generating human-like text. Deploying these...

NVIDIA AI Just Open Sourced Canary 1B and 180M Flash –...

In the realm of artificial intelligence, multilingual speech recognition and translation have become essential tools for facilitating global communication. However, developing models that can...

NVIDIA Open-Sources cuOpt: An AI-Powered Decision Optimization Engine–Unlocking Real-Time Optimization at...

Every day, organizations face complex logistical challenges—from optimizing delivery routes and managing supply chains to streamlining production schedules. These tasks typically involve massive datasets...

IBM and Hugging Face Researchers Release SmolDocling: A 256M Open-Source Vision...

Converting complex documents into structured data has long posed significant challenges in the field of computer science. Traditional approaches, involving ensemble systems or very...

ByteDance Research Releases DAPO: A Fully Open-Sourced LLM Reinforcement Learning System...

Reinforcement learning (RL) has become central to advancing Large Language Models (LLMs), empowering them with improved reasoning capabilities necessary for complex tasks. However, the...

Groundlight Research Team Released an Open-Source AI Framework that Makes It...

Modern VLMs struggle with tasks requiring complex visual reasoning, where understanding an image alone is insufficient, and deeper interpretation is needed. While recent advancements...

HPC-AI Tech Releases Open-Sora 2.0: An Open-Source SOTA-Level Video Generation Model...

AI-generated videos from text descriptions or images hold immense potential for content creation, media production, and entertainment. Recent advancements in deep learning, particularly in...

Allen Institute for AI (AI2) Releases OLMo 32B: A Fully Open...

The rapid evolution of artificial intelligence (AI) has ushered in a new era of large language models (LLMs) capable of understanding and generating human-like...

Simular Releases Agent S2: An Open, Modular, and Scalable AI Framework...

In today’s digital landscape, interacting with a wide variety of software and operating systems can often be a tedious and error-prone experience. Many users...

Hugging Face Releases OlympicCoder: A Series of Open Reasoning AI Models...

In the realm of competitive programming, both human participants and artificial intelligence systems encounter a set of unique challenges. Many existing code generation models...

Reka AI Open Sourced Reka Flash 3: A 21B General-Purpose Reasoning...

In today’s dynamic AI landscape, developers and organizations face several practical challenges. High computational demands, latency issues, and limited access to truly adaptable open-source...

Defog AI Open Sources Introspect: MIT-Licensed Deep-Research for Your Internal Data

Modern enterprises face a myriad of challenges when it comes to internal data research. Data today is scattered across various sources—spreadsheets, databases, PDFs, and...

DeepSeek AI Releases Smallpond: A Lightweight Data Processing Framework Built on...

Modern data workflows are increasingly burdened by growing dataset sizes and the complexity of distributed processing. Many organizations find that traditional systems struggle with...

DeepSeek’s Latest Inference Release: A Transparent Open-Source Mirage?

DeepSeek’s recent update on its DeepSeek-V3/R1 inference system is generating buzz, yet for those who value genuine transparency, the announcement leaves much to be...

IBM AI Releases Granite 3.2 8B Instruct and Granite 3.2 2B...

Large language models (LLMs) leverage deep learning techniques to understand and generate human-like text, making them invaluable for various applications such as text generation,...

SongGen: A Fully Open-Source Single-Stage Auto-Regressive Transformer Designed for Controllable Song...

Creating songs from text is difficult because it involves generating vocals and instrumental music together. Songs are unique as they combine lyrics and melodies...

Allen Institute for AI Released olmOCR: A High-Performance Open Source Toolkit...

Access to high-quality textual data is crucial for advancing language models in the digital age. Modern AI systems rely on vast datasets of token...

DeepSeek AI Releases DeepGEMM: An FP8 GEMM Library that Supports both...

Efficient matrix multiplications remain a critical component in modern deep learning and high-performance computing. As models become increasingly complex, conventional approaches to General Matrix...

DeepSeek AI Releases DeepEP: An Open-Source EP Communication Library for MoE...

Large language models that use the Mixture-of-Experts (MoE) architecture have enabled significant increases in model capacity without a corresponding rise in computation. However, this...

Building a Legal AI Chatbot: A Step-by-Step Guide Using bigscience/T0pp LLM,...

In this tutorial, we will build an efficient Legal AI Chatbot using open-source tools. It provides a step-by-step guide to creating a chatbot using...

Moonshot AI and UCLA Researchers Release Moonlight: A 3B/16B-Parameter Mixture-of-Expert (MoE) Model...

Training large language models (LLMs) has become central to advancing artificial intelligence, yet it is not without its challenges. As model sizes and datasets...

Stanford Researchers Introduce OctoTools: A Training-Free Open-Source Agentic AI Framework Designed...

Large language models (LLMs) struggle with complex reasoning tasks that require multiple steps, domain-specific knowledge, or external tool integration. To address these challenges,...

Google DeepMind Research Releases SigLIP2: A Family of New Multilingual Vision-Language...

Modern vision-language models have transformed how we process visual data, yet they often fall short when it comes to fine-grained localization and dense feature...

Meet Baichuan-M1: A New Series of Large Language Models Trained on...

While LLMs have shown remarkable advancements in general-purpose applications, their development for specialized fields like medicine remains limited. The complexity of medical knowledge and...

NuminaMath 1.5: Second Iteration of NuminaMath Advancing AI-Powered Mathematical Problem Solving...

Mathematical reasoning remains one of the most complex challenges in AI. While AI has advanced in NLP and pattern recognition, its ability to solve...

Zyphra Introduces the Beta Release of Zonos: A Highly Expressive TTS...

Text-to-speech (TTS) technology has made significant strides in recent years, but challenges remain in creating natural, expressive, and high-fidelity speech synthesis. Many TTS systems...

Kyutai Releases Hibiki: A 2.7B Real-Time Speech-to-Speech and Speech-to-Text Translation with...

Real-time speech translation presents a complex challenge, requiring seamless integration of speech recognition, machine translation, and text-to-speech synthesis. Traditional cascaded approaches often introduce compounding...

Prime Intellect Releases SYNTHETIC-1: An Open-Source Dataset Consisting of 1.4M Curated...

In artificial intelligence and machine learning, high-quality datasets play a crucial role in developing accurate and reliable models. However, collecting extensive, verified data—particularly in...

4 Open-Source Alternatives to OpenAI’s $200/Month Deep Research AI Agent

OpenAI’s Deep Research AI Agent offers a powerful research assistant at a premium price of $200 per month. However, the open-source community has stepped...

Deep Agent Released R1-V: Reinforcing Super Generalization in Vision-Language Models with...

Vision-language models (VLMs) face a critical challenge in achieving robust generalization beyond their training data while maintaining computational resources and cost efficiency. Approaches, such...

Mistral AI Releases the Mistral-Small-24B-Instruct-2501: A Latency-Optimized 24B-Parameter Model Released Under...

Developing compact yet high-performing language models remains a significant challenge in artificial intelligence. Large-scale models often require extensive computational resources, making them inaccessible for...

The Allen Institute for AI (AI2) Releases Tülu 3 405B: Scaling...

Post-training techniques, such as instruction tuning and reinforcement learning from human feedback, have become essential for refining language models. However, open-source approaches often fall...

Yandex Develops and Open-Sources Perforator: An Open-Source Tool that can Save...

Yandex, a global tech company, has developed and open-sourced Perforator, an innovative tool for continuous real-time monitoring and analysis of servers and applications. Perforator helps developers...

YuE: An Open-Source Music Generation AI Model Family Capable of Creating...

Significant progress has been made in short-form instrumental compositions in AI and music generation. However, creating full songs with lyrics, vocals, and instrumental accompaniment...

NVIDIA AI Releases Eagle2 Series Vision-Language Model: Achieving SOTA Results Across...

Vision-Language Models (VLMs) have significantly expanded AI’s ability to process multimodal information, yet they face persistent challenges. Proprietary models such as GPT-4V and Gemini-1.5-Pro...

Qwen AI Releases Qwen2.5-VL: A Powerful Vision-Language Model for Seamless Computer...

In the evolving landscape of artificial intelligence, integrating vision and language capabilities remains a complex challenge. Traditional models often struggle with tasks requiring a...

DeepSeek-AI Releases Janus-Pro 7B: An Open-Source Multimodal AI that Beats DALL-E...

Multimodal AI integrates diverse data formats, such as text and images, to create systems capable of accurately understanding and generating content. By bridging textual...

Qwen AI Releases Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M: Allowing Deployment with Context Length...

The advancements in large language models (LLMs) have significantly enhanced natural language processing (NLP), enabling capabilities like contextual understanding, code generation, and reasoning. However,...

Meet Open R1: The Full Open Reproduction of DeepSeek-R1, Challenging the...

Open-source LLM development is changing significantly with the effort to fully reproduce and open-source DeepSeek-R1, including training data, scripts, etc. Hosted on Hugging Face's...

DeepSeek-R1 vs. OpenAI’s o1: A New Step in Open Source and...

AI has entered an era of competitive, groundbreaking large language models and multimodal models. The development has two sides, one...

Meta AI Releases the First Stable Version of Llama Stack: A...

As the adoption of generative AI continues to expand, developers face mounting challenges in building and deploying robust applications. The complexity of managing diverse...

Berkeley Sky Computing Lab Introduces Sky-T1-32B-Flash: A New Reasoning Language Model...

Artificial intelligence models have advanced significantly in recent years, particularly in tasks requiring reasoning, such as mathematics, programming, and scientific problem-solving. However, these advancements...

LLaSA-3B: A Llama 3.2B Fine-Tuned Text-to-Speech Model with Ultra-Realistic Audio, Emotional...

Text-to-speech (TTS) technology has emerged as a critical tool for bridging the gap between human and machine interaction. The demand for lifelike, emotionally resonant,...

Plurai Introduces IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational...

Evaluating conversational AI systems powered by large language models (LLMs) presents a critical challenge in artificial intelligence. These systems must handle multi-turn dialogues, integrate...

Meet EvaByte: An Open-Source 6.5B State-of-the-Art Tokenizer-Free Language Model Powered by...

Tokenization, the process of breaking text into smaller units, has long been a fundamental step in natural language processing (NLP). However, it presents several...

Snowflake AI Research Open-Sources SwiftKV: A Novel AI Approach that Reduces...

Large Language Models (LLMs) have become pivotal in artificial intelligence, powering a variety of applications from chatbots to content generation tools. However, their deployment...

MiniMax-Text-01 and MiniMax-VL-01 Released: Scalable Models with Lightning Attention, 456B Parameters,...

Large Language Models (LLMs) and Vision-Language Models (VLMs) transform natural language understanding, multimodal integration, and complex reasoning tasks. Yet, one critical limitation remains: current...

UC Berkeley Researchers Released Sky-T1-32B-Preview: An Open-Source Reasoning LLM Trained for...

The rapid advancements in artificial intelligence have opened new possibilities, but the associated costs often limit who can benefit from these technologies. Large-scale models...

Good Fire AI Open-Sources Sparse Autoencoders (SAEs) for Llama 3.1 8B and Llama...

Large language models (LLMs) like OpenAI’s GPT and Meta’s LLaMA have significantly advanced natural language understanding and text generation. However, these advancements come with...

Meta AI Open-Sources LeanUniverse: A Machine Learning Library for Consistent and...

Managing datasets effectively has become a pressing challenge as machine learning (ML) continues to grow in scale and complexity. As datasets expand, researchers and...

Introducing Parlant: The Open-Source Framework for Reliable AI Agents

The Problem: Why Current AI Agent Approaches Fail. If you have ever designed and implemented an LLM-based chatbot in production, you have encountered the...

Meet KaLM-Embedding: A Series of Multilingual Embedding Models Built on Qwen2-0.5B...

Multilingual applications and cross-lingual tasks are central to natural language processing (NLP) today, making robust embedding models essential. These models underpin systems like retrieval-augmented...

Microsoft AI Just Released Phi-4: A Small Language Model Available on...

Microsoft has released Phi-4, a compact and efficient small language model, on Hugging Face under the MIT license. This decision highlights a shift towards...

Researchers from USC and Prime Intellect Released METAGENE-1: A 7B Parameter...

In a time when global health faces persistent threats from emerging pandemics, the need for advanced biosurveillance and pathogen detection systems is increasingly evident....

Dolphin 3.0 Released (Llama 3.1 + 3.2 + Qwen 2.5): A...

Artificial intelligence has come a long way, transforming the way we work, live, and interact. Yet, challenges remain. Many AI systems rely heavily on...

PRIME: An Open-Source Solution for Online Reinforcement Learning with Process Rewards...

Large Language Models (LLMs) face significant scalability limitations in improving their reasoning capabilities through data-driven imitation, as better performance demands exponentially more high-quality training...

Meet Agentarium: A Powerful Python Framework for Managing and Orchestrating AI...

AI agents have become an integral part of modern industries, automating tasks and simulating complex systems. Despite their potential, managing multiple AI agents, especially...

Hugging Face Just Released SmolAgents: A Smol Library that Enables to...

Creating intelligent agents has traditionally been a complex task, often requiring significant technical expertise and time. Developers encounter challenges like integrating APIs, configuring environments,...

Meet SemiKong: The World’s First Open-Source Semiconductor-Focused LLM

The semiconductor industry enables advancements in consumer electronics, automotive systems, and cutting-edge computing technologies. The production of semiconductors involves sophisticated processes that demand unparalleled...

DeepSeek-AI Just Released DeepSeek-V3: A Strong Mixture-of-Experts (MoE) Language Model with...

The field of Natural Language Processing (NLP) has made significant strides with the development of large-scale language models (LLMs). However, this progress has brought...

Qwen Team Releases QvQ: An Open-Weight Model for Multimodal Reasoning

Multimodal reasoning—the ability to process and integrate information from diverse data sources such as text, images, and video—remains a demanding area of research in...

Microsoft Researchers Release AIOpsLab: An Open-Source Comprehensive AI Framework for AIOps...

The increasing complexity of cloud computing has brought both opportunities and challenges. Enterprises now depend heavily on intricate cloud-based infrastructures to ensure their operations...

Meet FineFineWeb: An Open-Sourced Automatic Classification System for Fine-Grained Web Data

Multimodal Art Projection (M-A-P) researchers have introduced FineFineWeb, a large open-source automatic classification system for fine-grained web data. The project decomposes the deduplicated Fineweb...

LightOn and Answer.ai Release ModernBERT: A New Model Series that is...

Since the release of BERT in 2018, encoder-only transformer models have been widely used in natural language processing (NLP) applications due to their efficiency...

Meet Moxin LLM 7B: A Fully Open-Source Language Model Developed in...

The rapid development of Large Language Models (LLMs) has transformed natural language processing (NLP). Proprietary models like GPT-4 and Claude 3 have set high...

Patronus AI Open Sources Glider: A 3B State-of-the-Art Small Language Model (SLM) Judge

Large Language Models (LLMs) play a vital role in many AI applications, ranging from text summarization to conversational AI. However, evaluating these models effectively...

Hugging Face Releases Picotron: A Tiny Framework that Solves LLM Training...

The rise of large language models (LLMs) has transformed natural language processing, but training these models comes with significant challenges. Training state-of-the-art models like...

Microsoft AI Research Open-Sources PromptWizard: A Feedback-Driven AI Framework for Efficient...

One of the crucial factors in achieving high-quality outputs from large language models lies in the design of prompts, the carefully crafted input instructions that guide the...

Infinigence AI Releases Megrez-3B-Omni: A 3B On-Device Open-Source Multimodal Large Language...

The integration of artificial intelligence into everyday life faces notable hurdles, particularly in multimodal understanding—the ability to process and analyze inputs across text, audio,...

Technology Innovation Institute TII-UAE Just Released Falcon 3: A Family of...

The advancements in large language models (LLMs) have created opportunities across industries, from automating content creation to improving scientific research. However, significant challenges remain....

Meta AI Releases Apollo: A New Family of Video-LMMs Large Multimodal...

While multimodal models (LMMs) have advanced significantly for text and image tasks, video-based models remain underdeveloped. Videos are inherently complex, combining spatial and temporal...

Meet Maya: An 8B Open-Source Multilingual Multimodal Model with Toxicity-Free Datasets...

Vision-Language Models (VLMs) allow machines to understand and reason about the visual world through natural language. These models have applications in image captioning, visual...

LG AI Research Releases EXAONE 3.5: Three Open-Source Bilingual Frontier AI-level...

LG AI Research has released bilingual models specializing in English and Korean based on EXAONE 3.5 as open source following the success of its...

DeepSeek AI Just Released DeepSeek-V2.5-1210: The Updated Version of DeepSeek-V2.5 with...

DeepSeek AI has made significant progress in advancing artificial intelligence, particularly in areas like reasoning, mathematics, and coding. Earlier versions of its models achieved...
