Large Language Models (LLMs)

What are Large Language Models (LLMs)?

A Large Language Model (LLM) is a deep-learning system, typically a neural network with billions of parameters, trained on enormous volumes of text so that it can predict and generate language. Modern LLMs are built on the transformer architecture, introduced in the landmark 2017 paper "Attention Is All You Need" by Vaswani and colleagues at Google. The transformer's core innovation is the self-attention mechanism, which weighs how relevant each token (a word or word fragment) is to every other token in the sequence, so the whole sequence can be processed in parallel rather than one token at a time. This made it feasible to scale models to the sizes seen today and seeded later families such as BERT, the GPT series and T5.

How LLMs Work — Key Features

Tokenisation: input text is broken into tokens, which are converted into numerical vectors.
Self-attention (Query-Key-Value): the model computes relationships between all tokens, capturing long-range context.
Parameters: learned weights (often tens to hundreds of billions) that encode the model's "knowledge".
Pre-training then fine-tuning: models learn general language patterns first, then are adapted (including instruction-tuning) for specific tasks.
Emergent abilities: at scale, models display capabilities such as reasoning, translation and summarisation that were not explicitly programmed.
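
To make the tokenisation and self-attention steps above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any production model is implemented: the whitespace tokeniser, the two-dimensional EMBEDDINGS table and the choice to reuse the same vectors as queries, keys and values are all hypothetical simplifications. Real LLMs use subword tokenisers, learned embeddings with thousands of dimensions, and separate learned projection matrices for Query, Key and Value.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention: each token mixes in every token's value."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # How relevant is every token (key) to this token (query)?
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted average of the value vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

def tokenise(text):
    """Toy tokeniser (assumption): lower-case and split on whitespace."""
    return text.lower().split()

# Hypothetical 2-dimensional embeddings, invented for this example only.
EMBEDDINGS = {
    "the": [0.1, 0.0],
    "river": [1.0, 0.2],
    "bank": [0.9, 0.4],
}

tokens = tokenise("The river bank")
vectors = [EMBEDDINGS[t] for t in tokens]

# Simplification: queries, keys and values are all the raw embeddings.
outputs, weights = self_attention(vectors, vectors, vectors)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sentence. With these made-up vectors, "bank" places more weight on "river" than on "the", which is the kind of context-sensitivity the Query-Key-Value mechanism is meant to capture.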

Significance for India

LLMs are now a question of technological sovereignty. Most dominant models are foreign-built and under-represent India's 22 scheduled languages and cultural context, raising concerns about bias and dependence. To build indigenous capability, the IndiaAI Mission (Union Cabinet approval, 7 March 2024; five-year outlay of about Rs 10,372 crore) funds compute infrastructure, deep-tech start-ups and foundational-model development through the IndiaAI division under Digital India Corporation.

Indigenous model	Developer	Reported details (as of 2026)
Sarvam (sovereign LLM)	Sarvam AI	Selected under IndiaAI; multilingual models including a 105B-parameter flagship (128K-token context)
Param-2 (17B MoE)	BharatGen	Mixture-of-Experts model trained across all 22 scheduled Indian languages; ~Rs 900 crore funding

Current Status & Governance

Governance is moving in parallel with capability. MeitY's India AI Governance Guidelines (released 5 November 2025) favour a "light-touch", innovation-friendly approach that relies on existing laws rather than a stand-alone AI statute, and propose an AI Governance Group, a Technology & Policy Expert Committee and an AI Safety Institute. On 22 October 2025, MeitY also issued draft amendments to the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021, requiring labelling of "synthetically generated information" such as deepfakes. Globally, the EU AI Act (in force 1 August 2024; most provisions applicable from 2 August 2026) is the first comprehensive, risk-based AI law and is a useful point of contrast in answers.

UPSC Angle

For Mains GS3, frame LLMs around four pillars: opportunity (governance delivery, healthcare, agriculture, language inclusion), risk (deepfakes, misinformation, job displacement, energy and data costs), sovereignty (indigenous models, compute, data) and regulation (India's light-touch model versus the EU's binding law). For Prelims, focus on the transformer, generative AI, and the IndiaAI Mission's figures and institutions.
