Vineet Kumar, Author at MarkTechPost

Author: Vineet Kumar

115 POSTS0 COMMENTS

Vineet Kumar is a consulting intern at MarktechPost. He is currently pursuing his BS from the Indian Institute of Technology(IIT), Kanpur. He is a Machine Learning enthusiast. He is passionate about research and the latest advancements in Deep Learning, Computer Vision, and related fields.

Decoupling Tokenization: How Over-Tokenized Transformers Redefine Vocabulary Scaling in Language Models

AI Paper SummaryJanuary 30, 2025

Tokenization plays a fundamental role in the performance and scalability of Large Language Models (LLMs). Despite being a critical component, its influence on model...

Creating An AI Agent-Based System with LangGraph: A Beginner’s Guide

AI ShortsJanuary 29, 2025

What is an Agent? An agent is a Large Language Model (LLM)-powered system that can decide its own workflow. Unlike traditional chatbots, which operate on...

Unlocking Autonomous Planning in LLMs: How AoT+ Overcomes Hallucinations and Cognitive Load

AI Paper SummaryJanuary 27, 2025

Large language models (LLMs) have shown remarkable abilities in language tasks and reasoning, but their capacity for autonomous planning—especially in complex, multi-step scenarios—remains limited....

Plurai Introduces IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational AI System

AI Paper SummaryJanuary 23, 2025

Evaluating conversational AI systems powered by large language models (LLMs) presents a critical challenge in artificial intelligence. These systems must handle multi-turn dialogues, integrate...

Create Portrait Mode Effect with Segment Anything Model 2 (SAM2)

Artificial IntelligenceJanuary 21, 2025

Have you ever admired how smartphone cameras isolate the main subject from the background, adding a subtle blur to the background based on depth?...

Google AI Introduces ZeroBAS: A Neural Method to Synthesize Binaural Audio from Monaural Audio Recordings and Positional Information without Training on Any Binaural Data

Artificial IntelligenceJanuary 18, 2025

Humans possess an extraordinary ability to localize sound sources and interpret their environment using auditory cues, a phenomenon termed spatial hearing. This capability enables...

Chat with Your Documents Using Retrieval-Augmented Generation (RAG)

Editors PickJanuary 16, 2025

Imagine having a personal chatbot that can answer questions directly from your documents—be it PDFs, research papers, or books. With Retrieval-Augmented Generation (RAG), this...

Revolutionizing Vision-Language Tasks with Sparse Attention Vectors: A Lightweight Approach to Discriminative Classification

AI Paper SummaryJanuary 15, 2025

Generative Large Multimodal Models (LMMs), such as LLaVA and Qwen-VL, excel in vision-language (VL) tasks like image captioning and visual question answering (VQA). However,...

MIT Researchers Propose Cross-Layer Attention (CLA): A Modification to the Transformer Architecture that Reduces the Size of the Key-Value KV Cache by Sharing KV...

AI Paper SummaryMay 25, 2024

The memory footprint of the key-value (KV) cache can be a bottleneck when serving large language models (LLMs), as it scales proportionally with both...

123...10 Page 2 of 10