Author: Aswin Ak

205 posts, 0 comments
Aswin AK is a consulting intern at MarkTechPost. He is pursuing his Dual Degree at the Indian Institute of Technology, Kharagpur. He is passionate about data science and machine learning, bringing a strong academic background and hands-on experience in solving real-life cross-domain challenges.

Google DeepMind Research Releases SigLIP2: A Family of New Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Modern vision-language models have transformed how we process visual data, yet they often fall short when it comes to fine-grained localization and dense feature...

Boosting AI Math Skills: How Counterexample-Driven Reasoning is Transforming Large Language Models

Mathematical Large Language Models (LLMs) have demonstrated strong problem-solving capabilities, but their reasoning ability is often constrained by pattern recognition rather than true conceptual...

Google DeepMind Releases PaliGemma 2 Mix: New Instruction Vision Language Models Fine-Tuned on a Mix of Vision Language Tasks

Vision‐language models (VLMs) have long promised to bridge the gap between image understanding and natural language processing. Yet, practical challenges persist. Traditional VLMs often...

KGGen: Advancing Knowledge Graph Extraction with Language Models and Clustering Techniques

Knowledge graphs (KGs) are the foundation of artificial intelligence applications but are incomplete and sparse, affecting their effectiveness. Well-established KGs such as DBpedia and...

ViLa-MIL: Enhancing Whole Slide Image Classification with Dual-Scale Vision-Language Multiple Instance Learning

Whole Slide Image (WSI) classification in digital pathology presents several critical challenges due to the immense size and hierarchical nature of WSIs. WSIs contain...

Mistral AI Introduces Mistral Saba: A New Regional Language Model Designed to Excel in Arabic and South Indian-Origin Languages such as Tamil

As artificial intelligence (AI) continues to gain traction across industries, one persistent challenge remains: creating language models that truly understand the diversity of human...

Higher-Order Guided Diffusion for Graph Generation: A Coarse-to-Fine Approach to Preserving Topological Structures

Graph generation is a complex problem that involves constructing structured, non-Euclidean representations while maintaining meaningful relationships between entities. Most current methods fail to capture...

Can Users Fix AI Bias? Exploring User-Driven Value Alignment in AI Companions

Large language model (LLM)-based AI companions have evolved from simple chatbots into entities that users perceive as friends, partners, or even family members. Yet,...

Anthropic AI Launches the Anthropic Economic Index: A Data-Driven Look at AI’s Economic Role

Artificial Intelligence is increasingly integrated into various sectors, yet there is limited empirical evidence on its real-world application across industries. Traditional research methods—such as...

Meet Huginn-3.5B: A New AI Reasoning Model with Scalable Latent Computation

Artificial intelligence models face a fundamental challenge in efficiently scaling their reasoning capabilities at test time. While increasing model size often leads to performance...

LLMDet: How Large Language Models Enhance Open-Vocabulary Object Detection

Open-vocabulary object detection (OVD) aims to detect arbitrary objects with user-provided text labels. Although recent progress has enhanced zero-shot detection ability, current techniques handicap...

Sundial: A New Era for Time Series Foundation Models with Generative AI

Time series forecasting presents a fundamental challenge due to its intrinsic non-determinism, making it difficult to predict future values accurately. Traditional methods generally employ...