What is the difference between AI and Machine Learning?

AI (Artificial Intelligence) is the broad field of creating intelligent machines. Machine Learning (ML) is a subset of AI where systems learn from data to improve performance without explicit programming. Deep Learning is a further subset using multi-layered neural networks.

What is a Large Language Model (LLM)?

A Large Language Model is a neural network with billions of parameters trained on massive text datasets. LLMs like GPT-4, Claude, and Gemini understand and generate human language, powering chatbots, code generation, and creative writing.

What is RAG?

RAG (Retrieval-Augmented Generation) enhances LLM outputs by retrieving relevant information from external sources before generating a response. It reduces hallucinations and provides up-to-date, domain-specific answers.

What is the difference between LLM and SLM?

LLMs (Large Language Models) have billions of parameters and excel at diverse tasks. SLMs (Small Language Models) have fewer parameters (under 10B), are faster, cheaper, and designed for specific use cases or edge deployment.

AI Glossary — 100+ AI & ML Terms Explained

A 13 TERMS

Activation Function

A mathematical function applied to the output of a neural network node (neuron) that determines whether it should be activated. Common examples include ReLU, Sigmoid, and Tanh. Activation functions introduce non-linearity, enabling networks to learn complex patterns.
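
As a rough illustration of the three functions named above, the sketch below implements them for a single number using only Python's standard library. Real frameworks apply them element-wise to whole tensors; the function names here are chosen for the example.

```python
import math

def relu(x: float) -> float:
    # Passes positive inputs through unchanged and clamps negatives to zero.
    return max(0.0, x)

def sigmoid(x: float) -> float:
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x: float) -> float:
    # Squashes any real input into the open interval (-1, 1).
    return math.tanh(x)

for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), round(sigmoid(x), 4), round(tanh(x), 4))
```

None of these is a straight line, which is what lets stacked layers represent more than a single linear transformation.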

Agentic AI

AI systems designed to autonomously plan, execute, and iterate on complex multi-step tasks with minimal human guidance. Agentic AI combines LLMs with tool use, memory, and planning capabilities to accomplish goals like software development, research, and business automation.

AGI (Artificial General Intelligence)

A hypothetical form of AI that possesses the ability to understand, learn, and apply knowledge across any intellectual task that a human can perform. Unlike narrow AI, AGI would exhibit flexible reasoning, creativity, and adaptability without task-specific training.

AI Agent

An autonomous software entity that perceives its environment, makes decisions, and takes actions to achieve specific goals. Modern AI agents use large language models to plan multi-step tasks, call tools, browse the web, and execute code with minimal human intervention.

AI Alignment

The research discipline focused on ensuring that AI systems behave in ways consistent with human values, intentions, and ethics. Alignment addresses risks like reward hacking, goal misspecification, and emergent deceptive behavior in advanced AI models.

AI Ethics

The branch of ethics that examines the moral implications of artificial intelligence, including bias and fairness, privacy, accountability, transparency, and the societal impact of automation. AI ethics frameworks guide responsible development and deployment.

AI Hallucination

When an AI model generates plausible-sounding but factually incorrect, fabricated, or nonsensical information. Hallucinations are a known limitation of large language models caused by pattern completion rather than fact retrieval. Techniques like RAG and grounding help mitigate them.

AI Safety

An interdisciplinary field ensuring AI systems operate reliably, securely, and beneficially. AI safety research covers robustness to adversarial attacks, interpretability, containment, and long-term existential risk from increasingly capable systems.

Annotation

The process of labeling or tagging data (text, images, audio, video) with metadata that supervised machine learning models use for training. High-quality annotation is essential for model accuracy and is often performed by human annotators or semi-automated tools.

API (Application Programming Interface)

A set of protocols and tools that allows different software applications to communicate. In AI, APIs enable developers to integrate models like GPT-4, Claude, or Gemini into their own products via HTTP requests, handling input/output without managing model infrastructure.

Attention Mechanism

A technique in neural networks that allows a model to dynamically focus on relevant parts of the input data when producing output. Self-attention in Transformers computes weighted relationships between all positions in a sequence, enabling parallel processing and long-range dependency capture.
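
The weighted relationships described above can be sketched in plain Python as scaled dot-product attention. This is a simplified illustration: a real Transformer first projects the input through learned query, key, and value matrices, whereas this sketch assumes the raw vectors serve as all three.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy sequence of three 2-d vectors (assumption: no learned projections).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1, and every position is scored against every other position independently, which is why the computation parallelizes well.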

Autonomous AI

AI systems capable of operating independently with minimal human oversight. These systems perceive their environment, make decisions, and take actions to achieve goals—ranging from self-driving cars to AI coding agents that write, test, and deploy software autonomously.

Autoregressive Model

A model that generates output sequentially, predicting each token based on previously generated tokens. GPT-series models are autoregressive: they predict one token at a time and append it to the input before predicting the next, forming coherent text through iterative sampling.
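
A toy sketch of the generation loop follows. It assumes a hand-written lookup table (NEXT_TOKEN_PROBS, invented for this example) in place of a trained model, and it conditions only on the last token, whereas a real LLM conditions on the whole preceding sequence. It also picks the most likely token each step (greedy decoding) rather than sampling.

```python
# Hypothetical next-token probabilities; a real model computes these.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<end>": 0.3},
    "dog": {"sat": 0.8, "<end>": 0.2},
    "sat": {"<end>": 1.0},
}

def generate(max_tokens=10):
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        # Greedy decoding: take the single most likely next token.
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break
        # The new token becomes part of the context for the next step.
        tokens.append(next_token)
    return tokens[1:]

print(generate())  # ['the', 'cat', 'sat']
```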

B 4 TERMS

Backpropagation

The primary algorithm for training neural networks. It computes gradients of the loss function with respect to each weight by propagating errors backward through the network layers, then updates weights using gradient descent to minimize prediction errors.
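
To make the backward propagation of errors concrete, here is a minimal sketch for a hypothetical two-weight network, y = w2 * relu(w1 * x), with a squared-error loss. The network shape and variable names are assumptions for illustration; frameworks automate exactly this chain-rule bookkeeping.

```python
def forward_backward(w1, w2, x, target):
    # Forward pass: compute and keep the intermediate values.
    z = w1 * x
    h = max(0.0, z)          # ReLU activation
    y = w2 * h
    loss = (y - target) ** 2

    # Backward pass: apply the chain rule from the loss back to each weight.
    dloss_dy = 2.0 * (y - target)
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dz = dloss_dh * (1.0 if z > 0 else 0.0)  # derivative of ReLU
    dloss_dw1 = dloss_dz * x
    return loss, dloss_dw1, dloss_dw2

loss, grad_w1, grad_w2 = forward_backward(w1=0.5, w2=-1.5, x=2.0, target=1.0)
print(loss, grad_w1, grad_w2)
```

A gradient descent step would then subtract a small multiple of each gradient from the corresponding weight.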

Benchmark

A standardized test or dataset used to evaluate and compare AI model performance. Common benchmarks include MMLU (knowledge), HumanEval (coding), GSM8K (math), and ImageNet (vision). Benchmarks drive model development but can be gamed, so real-world evaluation remains important.

BERT (Bidirectional Encoder Representations from Transformers)

A pre-trained language model developed by Google that reads text bidirectionally: each token's representation is conditioned on the context to both its left and its right at once. BERT revolutionized NLP tasks like question answering, sentiment analysis, and named entity recognition through transfer learning.

Bias in AI

Systematic errors or prejudices in AI model outputs caused by imbalanced training data, flawed assumptions, or societal biases embedded in datasets. AI bias can lead to discriminatory outcomes in hiring, lending, criminal justice, and healthcare applications.

C 10 TERMS

Chain-of-Thought (CoT) Prompting

A prompting technique where the model is encouraged to explain its reasoning step by step before arriving at an answer. CoT prompting significantly improves performance on math, logic, and multi-step reasoning tasks by making the model's thought process explicit.

Chatbot

A software application that simulates human conversation through text or voice. Modern AI chatbots powered by LLMs (like ChatGPT, Claude, Gemini) can handle complex queries, maintain context across turns, and perform tasks like coding, analysis, and creative writing.

Classification

A supervised learning task where the model assigns input data to predefined categories. Examples include email spam detection, image recognition (cat vs. dog), sentiment analysis (positive/negative), and medical diagnosis (malignant/benign).

CLIP (Contrastive Language-Image Pre-training)

A model developed by OpenAI that learns visual concepts from natural language descriptions. CLIP connects text and images in a shared embedding space, enabling zero-shot image classification and powering image generation models like DALL-E and Stable Diffusion.

Clustering

An unsupervised learning technique that groups similar data points together based on their features. Common algorithms include K-means, hierarchical clustering, and DBSCAN. Used in customer segmentation, image grouping, document organization, and anomaly detection.
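
As an illustration of K-means, the sketch below clusters one-dimensional points. It assumes fixed starting centroids so the result is reproducible; practical implementations usually pick starting points randomly and work in many dimensions.

```python
def kmeans_1d(points, centroids, iterations=10):
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
print(clusters)
```

No labels are supplied at any point; the grouping comes entirely from distances between the points.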

CNN (Convolutional Neural Network)

A type of deep learning architecture designed for processing grid-structured data like images. CNNs use convolutional layers to detect features (edges, textures, shapes) hierarchically, excelling at image classification, object detection, and video analysis.

Computer Vision

A field of AI that trains machines to interpret and understand visual information from images and videos. Applications include object detection, facial recognition, autonomous driving, medical imaging, and quality control in manufacturing.

Constitutional AI

An AI alignment technique developed by Anthropic where models are trained to follow a set of principles (a "constitution") that defines acceptable behavior. The model critiques and revises its own outputs according to these principles, reducing the need for human feedback.

Context Window

The maximum amount of text (measured in tokens) that a language model can process in a single interaction. Larger context windows allow models to consider more information. GPT-4 Turbo supports 128K tokens, Claude 3.5 supports 200K tokens, and Gemini 1.5 Pro supports 1M+ tokens.

Conversational AI

Technology that enables machines to engage in natural, human-like dialogue. It combines NLP, speech recognition, dialogue management, and generation to power chatbots, virtual assistants (Siri, Alexa), and customer service automation.

D 7 TERMS

Data Augmentation

Techniques for artificially expanding training datasets by creating modified versions of existing data. For images, this includes rotation, flipping, cropping, and color jittering. For text, it includes paraphrasing, back-translation, and synonym replacement.

Data Labeling

The process of assigning meaningful tags, categories, or annotations to raw data so supervised machine learning models can learn from it. Accurate labeling is critical for model quality and is one of the most time-consuming and expensive parts of ML pipelines.

Deep Learning

A subset of machine learning that uses neural networks with many layers (hence "deep") to learn hierarchical representations of data. Deep learning powers breakthroughs in image recognition, speech processing, natural language understanding, and generative AI.

Deepfake

AI-generated synthetic media (typically video or audio) that realistically depicts people saying or doing things they never actually did. Deepfakes use deep learning techniques like GANs and autoencoders, raising concerns about misinformation, fraud, and consent.

Diffusion Model

A class of generative AI models that create data (usually images) by learning to reverse a gradual noising process. Starting from pure noise, the model iteratively denoises to produce high-quality outputs. DALL-E 3, Stable Diffusion, and Midjourney use diffusion models.

Discriminator

In a Generative Adversarial Network (GAN), the discriminator is the model that tries to distinguish between real data and fake data produced by the generator. The adversarial training between generator and discriminator drives both to improve.

Distillation

The process of training a smaller, efficient model to replicate the behavior of a larger teacher model. Distilled models like DistilBERT and Gemma run faster and cheaper while retaining most of the original model's capabilities. See Knowledge Distillation.

E 6 TERMS

Edge AI

The deployment of AI algorithms directly on local devices (smartphones, IoT sensors, cameras) rather than in the cloud. Edge AI reduces latency, enhances privacy, and enables offline inference. Examples include on-device voice assistants and real-time image recognition.

Embedding

A dense numerical vector representation of data (words, sentences, images, code) in a continuous vector space. Embeddings capture semantic meaning so similar items are close together. They power search engines, recommendation systems, RAG pipelines, and similarity comparisons.
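
"Close together" is commonly measured with cosine similarity. The sketch below uses hand-made three-dimensional vectors (the EMBEDDINGS table is invented for this example); real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-written for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

def most_similar(word):
    others = (w for w in EMBEDDINGS if w != word)
    return max(others, key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]))

print(most_similar("cat"))  # kitten
```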

Emergent Abilities

Capabilities that appear in large AI models at certain scale thresholds but are absent in smaller models. Examples include chain-of-thought reasoning, in-context learning, and code generation. Emergent abilities suggest that scaling up models can unlock qualitatively new behaviors.

Encoder-Decoder

A neural network architecture with two parts: an encoder that processes input into a compressed representation, and a decoder that generates output from that representation. Used in machine translation, text summarization, and image captioning.

Epoch

One complete pass through the entire training dataset during model training. Multiple epochs are typically needed for a model to converge. Too few epochs cause underfitting; too many cause overfitting. Learning rate schedulers often adjust across epochs.

Explainability (XAI)

The degree to which AI model decisions can be understood and interpreted by humans. Explainable AI (XAI) techniques like SHAP values, attention visualization, and feature importance help build trust, satisfy regulations, and debug model behavior.

F 7 TERMS

Fairness

In AI, fairness refers to ensuring models treat individuals and groups equitably regardless of protected attributes like race, gender, or age. Fairness metrics measure disparate impact, and bias mitigation techniques are applied during data collection, training, and post-processing.

Feature Engineering

The process of selecting, transforming, and creating input variables (features) from raw data to improve machine learning model performance. Good feature engineering requires domain expertise and can dramatically boost accuracy without changing the model architecture.

Federated Learning

A machine learning approach where models are trained across multiple decentralized devices or servers holding local data, without exchanging raw data. Only model updates (such as gradients or weight changes) are shared, which helps preserve privacy. Used in Google's Gboard keyboard, Apple Siri, and healthcare applications.

Few-Shot Learning

A machine learning paradigm where models learn to perform tasks from very few examples (typically 1-5). In the context of LLMs, few-shot learning means providing a handful of input-output examples in the prompt to guide the model's behavior without fine-tuning.
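
For LLMs, few-shot learning comes down to how the prompt text is assembled. The sketch below shows one possible layout; the task wording, the "Review:" and "Sentiment:" labels, and the helper name are all assumptions for this example, and the resulting string would be sent to whichever model API is in use.

```python
def build_few_shot_prompt(examples, query):
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    # Each worked example shows the model the input-output pattern to copy.
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unanswered for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("Loved it, would buy again.", "positive"),
    ("Broke after two days.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Fast shipping and works great.")
print(prompt)
```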

Fine-Tuning

The process of further training a pre-trained model on a smaller, task-specific dataset to adapt it for a particular use case. Fine-tuning adjusts the model's weights to improve performance on specific domains (legal, medical, coding) while leveraging general knowledge from pre-training.

Foundation Model

A large AI model trained on broad, diverse data that can be adapted to a wide range of downstream tasks. Examples include GPT-4, Claude, Gemini, and Llama. Foundation models represent a paradigm shift from task-specific models to general-purpose AI systems.

Function Calling

The ability of LLMs to generate structured tool/API calls in response to user queries. When a model determines it needs external data or actions, it outputs a function call specification (name, parameters) that the application executes. Enables LLMs to search the web, query databases, and control software.
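
The application side of function calling can be sketched as a small dispatcher. This assumes the model's output arrives as a JSON string with "name" and "arguments" fields; the exact wire format differs between providers, and get_weather is a hypothetical tool that returns canned text.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool: a real one would call a weather service.
    return f"Sunny in {city}"

# Registry of functions the application is willing to run.
TOOLS = {"get_weather": get_weather}

def execute_function_call(model_output: str):
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        # Never execute a function the application did not register.
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# Assumed shape of a model-generated call specification.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = execute_function_call(model_output)
print(result)  # Sunny in Paris
```

The model only describes the call; the application decides whether and how to execute it, then typically feeds the result back to the model.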

G 8 TERMS

GAN (Generative Adversarial Network)

A framework consisting of two neural networks—a generator and a discriminator—trained in competition. The generator creates synthetic data while the discriminator evaluates authenticity. GANs produce realistic images, video, and audio, though they've been largely superseded by diffusion models for image generation.

Generative AI

AI systems that create new content—text, images, music, video, code, or 3D models—rather than just analyzing existing data. Powered by LLMs, diffusion models, and other architectures, generative AI includes ChatGPT, DALL-E, Midjourney, Stable Diffusion, and Suno.

Generator

In a GAN, the generator is the network that produces synthetic data (images, text, audio) from random noise. Its goal is to create outputs realistic enough to fool the discriminator. Generators learn to map from a latent space to the data distribution.

GPT (Generative Pre-trained Transformer)

A family of autoregressive language models developed by OpenAI. GPT models are pre-trained on massive text corpora, and the versions behind ChatGPT are further fine-tuned with RLHF. GPT-4 and GPT-4o power ChatGPT, offering capabilities in writing, coding, reasoning, vision, and voice interaction.

GPU (Graphics Processing Unit)

A specialized processor originally designed for rendering graphics but now essential for AI/ML workloads. GPUs excel at parallel computation, making them ideal for training and running deep learning models. NVIDIA's A100 and H100 GPUs dominate the AI training market.

Gradient Descent

An optimization algorithm used to minimize the loss function during neural network training. It iteratively adjusts model weights in the direction that reduces error. Variants include stochastic gradient descent (SGD), Adam, and AdaGrad.
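
A minimal sketch of the update rule, assuming a one-parameter toy loss (w - 3)^2 whose minimum is known to be at w = 3. The learning rate and step count are arbitrary choices for the example.

```python
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w0, learning_rate=0.1, steps=100):
    w = w0
    for _ in range(steps):
        # Step against the gradient, i.e. in the direction that lowers the loss.
        w -= learning_rate * grad(w)
    return w

w_final = gradient_descent(w0=0.0)
print(round(w_final, 6), round(loss(w_final), 12))
```

Variants such as SGD and Adam keep the same basic step but change how the gradient is estimated or scaled.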

Grounding

The technique of anchoring AI model outputs to verified, factual sources of information. Grounding helps reduce hallucinations by connecting the model's responses to search results, databases, or documents. Google's Gemini uses grounding with Google Search.

Guardrails

Safety mechanisms and constraints implemented in AI systems to prevent harmful, inappropriate, or off-topic outputs. Guardrails include content filters, topic boundaries, output validators, and constitutional AI principles that keep models within acceptable behavior limits.

H 1 TERM

Hyperparameter

Configuration settings for model training that are set before learning begins (as opposed to parameters learned during training). Examples include learning rate, batch size, number of layers, and dropout rate. Hyperparameter tuning significantly impacts model performance.

I 5 TERMS

Image Generation

The creation of new images from text descriptions (text-to-image), sketches, or other images using AI models. Leading tools include DALL-E 3, Midjourney, Stable Diffusion, and Adobe Firefly. Powered primarily by diffusion models and transformer architectures.

Image Recognition

The ability of AI to identify objects, people, text, scenes, and activities in images. Powered by CNNs and vision transformers, image recognition is used in autonomous vehicles, medical diagnostics, security surveillance, and retail.

In-Context Learning

The ability of large language models to learn and perform tasks from examples provided directly in the prompt, without updating model weights. This emergent capability allows LLMs to adapt to new tasks on-the-fly through few-shot or zero-shot prompting.

Inference

The process of using a trained AI model to make predictions or generate outputs on new, unseen data. Unlike training (which adjusts weights), inference runs the model forward. Inference speed, cost, and efficiency are critical for production AI applications.

Instruction Tuning

A fine-tuning process where models are trained on instruction-response pairs to better follow human directions. Instruction-tuned models (like ChatGPT, Claude) are more helpful, safe, and aligned than base models, responding appropriately to diverse user queries.

J 2 TERMS

Jailbreak

Techniques used to bypass the safety guardrails and content restrictions of AI models to elicit prohibited outputs. Jailbreak methods include role-playing prompts, encoding tricks, and multi-turn manipulation. AI developers continuously patch vulnerabilities as new jailbreaks are discovered.

JSON Mode

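A model output configuration that constrains responses to syntactically valid JSON. On its own it does not guarantee that the output matches a particular schema, so applications should still validate the fields they expect. JSON mode is essential for building applications that parse model outputs programmatically: APIs, data extraction, structured content generation, and function calling all benefit from reliable JSON output.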

K 2 TERMS

Knowledge Distillation

A model compression technique where a smaller "student" model is trained to replicate the behavior of a larger "teacher" model. Distillation transfers knowledge by matching soft probability distributions, producing efficient models suitable for edge deployment without significant accuracy loss.
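
Matching soft probability distributions can be sketched as a KL divergence between temperature-softened teacher and student outputs. This is a simplified illustration with made-up logits; full training recipes usually combine this term with a standard loss on the true labels and rescale it, which is omitted here.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature flattens the distribution, exposing more of the
    # teacher's relative preferences among the non-top classes.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    p = softmax(teacher_logits, temperature)   # soft targets from the teacher
    q = softmax(student_logits, temperature)   # student's softened prediction
    # KL divergence KL(p || q): zero only when the two distributions match.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.5, 0.2]   # hypothetical logits
student = [2.0, 2.0, 1.0]
print(round(distillation_loss(teacher, student), 4))
```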

Knowledge Graph

A structured representation of real-world entities and the relationships between them, stored as a network of nodes and edges. Knowledge graphs power search engines (Google), recommendation systems, and AI assistants by providing organized, queryable factual knowledge.

L 5 TERMS

Latent Space

A compressed, abstract mathematical representation of data learned by AI models. In generative models, the latent space encodes key features of the data. Navigating latent space allows interpolation between concepts—e.g., smoothly transitioning between two image styles.

LLM (Large Language Model)

A neural network with billions of parameters trained on massive text datasets to understand and generate human language. LLMs like GPT-4, Claude 3.5, Gemini, and Llama 3 power chatbots, code generation, translation, summarization, and reasoning tasks. They form the backbone of modern generative AI.

LoRA (Low-Rank Adaptation)

An efficient fine-tuning technique that adds small trainable matrices to frozen pre-trained model weights instead of updating all parameters. LoRA dramatically reduces memory and compute requirements for fine-tuning LLMs, making customization accessible on consumer hardware.
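
The idea can be sketched with plain nested lists: the frozen weight matrix W (d x k) is left untouched, and the update is expressed as the product of two small matrices B (d x r) and A (r x k). The helper names and the `scale` argument (standing in for the scaling factor used in practice) are assumptions for this example.

```python
def matmul(a, b):
    # Plain-Python matrix product of a (n x m) and b (m x p).
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def lora_effective_weight(w, b, a, scale=1.0):
    # W stays frozen; only the low-rank factors B and A would be trained.
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]

def trainable_params(d, k, r):
    # (full fine-tuning, LoRA) parameter counts for one d x k weight matrix.
    return d * k, r * (d + k)

w = [[1.0, 0.0], [0.0, 1.0]]          # frozen pre-trained weights
b = [[1.0], [2.0]]                    # d x r with r = 1
a = [[0.5, 0.0]]                      # r x k
print(lora_effective_weight(w, b, a))
print(trainable_params(4096, 4096, 8))  # (16777216, 65536)
```

For a 4096 x 4096 matrix with rank 8, the trainable count drops from about 16.8 million to 65,536, which is where the memory savings come from.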

Loss Function

A mathematical function that measures how far a model's predictions deviate from the actual target values. The training process aims to minimize this loss. Common loss functions include cross-entropy (classification), mean squared error (regression), and contrastive loss.
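
Two of the loss functions named above can be written in a few lines. This sketch assumes plain Python lists as inputs and, for cross-entropy, a single example whose predicted class probabilities already sum to 1.

```python
import math

def mean_squared_error(predictions, targets):
    # Average of squared differences; used for regression.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def cross_entropy(predicted_probs, true_class):
    # Negative log of the probability assigned to the correct class.
    return -math.log(predicted_probs[true_class])

print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))       # 2.0
print(round(cross_entropy([0.9, 0.05, 0.05], 0), 4))    # confident and correct: small loss
print(round(cross_entropy([0.1, 0.6, 0.3], 0), 4))      # wrong: large loss
```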

LSTM (Long Short-Term Memory)

A type of recurrent neural network designed to learn long-term dependencies in sequential data. LSTMs use gating mechanisms (forget, input, output gates) to control information flow. While effective for time series and early NLP, they've been largely replaced by transformers.

M 5 TERMS

Machine Learning (ML)

A subset of artificial intelligence where systems learn patterns from data to make predictions or decisions without being explicitly programmed. ML encompasses supervised, unsupervised, and reinforcement learning, and underpins applications from recommendation engines to autonomous vehicles.

Model Compression

Techniques for reducing AI model size and computational requirements while preserving accuracy. Methods include pruning (removing unimportant weights), quantization (reducing numerical precision), and knowledge distillation. Essential for deploying models on mobile and edge devices.

MoE (Mixture of Experts)

A neural network architecture that routes inputs to specialized sub-networks (experts) rather than processing through the entire model. Only a subset of experts activates per input, enabling massive model capacity while keeping compute costs manageable. Mixtral uses MoE, and GPT-4 is widely reported, though not officially confirmed, to use it as well.
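
A toy sketch of top-k routing follows. It assumes the router scores are given directly and that each "expert" is a simple function of one number; in a real model the router is a learned layer and the experts are feed-forward networks.

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts, invented for this example.
EXPERTS = [
    lambda x: 2.0 * x,
    lambda x: x + 10.0,
    lambda x: -x,
    lambda x: x * x,
]

def moe_forward(x, router_logits, top_k=2):
    # Pick the k experts with the highest router scores.
    ranked = sorted(range(len(EXPERTS)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the scores of the chosen experts into gate weights.
    gates = softmax([router_logits[i] for i in chosen])
    # Only the chosen experts are evaluated; the rest cost nothing.
    output = sum(g * EXPERTS[i](x) for g, i in zip(gates, chosen))
    return output, chosen

output, chosen = moe_forward(3.0, router_logits=[2.0, 0.1, 1.0, -1.0])
print(chosen, round(output, 3))
```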

Multimodal Learning

Training AI models to process and relate information from multiple modalities (text, images, audio, video) simultaneously. Multimodal models like GPT-4V, Gemini, and LLaVA can understand images alongside text, enabling richer interactions and more capable AI applications.

N 3 TERMS

Narrow AI

AI designed and trained for a specific, well-defined task. Unlike AGI, narrow AI cannot generalize beyond its domain. Examples include chess engines, spam filters, recommendation algorithms, and image classifiers. All currently deployed AI systems are forms of narrow AI.

Natural Language Processing (NLP)

The field of AI focused on enabling machines to understand, interpret, generate, and respond to human language. NLP powers chatbots, translation systems, sentiment analysis, text summarization, and voice assistants. Modern NLP is dominated by transformer-based models.

Neural Network

A computing system inspired by the biological neural networks of the human brain. Composed of interconnected layers of nodes (neurons) that process information, neural networks learn to recognize patterns through training. They form the foundation of modern deep learning and AI.

O 3 TERMS

Object Detection

A computer vision task that identifies and localizes objects within images or video frames, drawing bounding boxes around detected items. Used in autonomous driving, surveillance, retail analytics, and robotics. Popular models include YOLO, Faster R-CNN, and DETR.

Open Source AI

AI models and tools released with open licenses allowing anyone to use, modify, and distribute them. Open source models like Llama 3, Mistral, Stable Diffusion, and Whisper democratize AI access, enable customization, and foster community innovation.

Overfitting

When a model learns the training data too well—including noise and outliers—resulting in excellent training performance but poor generalization to new data. Overfitting is combated with techniques like regularization, dropout, data augmentation, and early stopping.

P 7 TERMS

Parameter

The internal variables (weights and biases) of a neural network that are learned during training. Model size is often described by parameter count. OpenAI has not published a figure for GPT-4; unofficial estimates put it at around 1.8 trillion parameters. More parameters generally enable greater model capability but require more compute.

Parameter-Efficient Fine-Tuning (PEFT)

A family of techniques that fine-tune large models by updating only a small fraction of parameters. Methods include LoRA, adapters, prefix tuning, and prompt tuning. PEFT makes customization of billion-parameter models feasible on limited hardware.

Perplexity (Metric)

A measurement of how well a language model predicts a sample of text. Lower perplexity indicates better prediction quality. While useful for comparing models on the same dataset, perplexity alone doesn't capture practical qualities like helpfulness, safety, or instruction-following ability.
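
Perplexity is the exponential of the average negative log-probability the model assigned to each token that actually occurred. The sketch below assumes those per-token probabilities are already available as a list.

```python
import math

def perplexity(token_probs):
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A model that gives every actual token probability 0.25 is "as uncertain
# as" a uniform choice among 4 options.
print(round(perplexity([0.25, 0.25, 0.25, 0.25]), 6))
print(round(perplexity([0.9, 0.8, 0.95]), 4))
```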

Pre-Training

The initial phase of training a foundation model on a large, diverse dataset to learn general knowledge and language patterns. Pre-training is computationally expensive (millions of dollars for frontier models) and produces a base model that is then fine-tuned for specific tasks.

Privacy-Preserving AI

Techniques and approaches that enable AI training and inference while protecting sensitive personal data. Methods include federated learning, differential privacy, homomorphic encryption, and secure multi-party computation, addressing GDPR and privacy regulations.

Prompt Engineering

The art and science of crafting input instructions (prompts) to guide AI model outputs effectively. Techniques include zero-shot prompting, few-shot examples, chain-of-thought reasoning, role-playing, and structured output formats. Good prompts dramatically improve model performance.

Pruning

A model compression technique that removes less important weights or neurons from a trained neural network. Pruning reduces model size and inference time while maintaining most accuracy. Structured pruning removes entire filters/layers; unstructured pruning removes individual weights.
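
One common way to decide which weights are "less important" is by absolute magnitude. The sketch below shows unstructured magnitude pruning on a flat list of weights; the choice of magnitude as the criterion and the function name are assumptions for this example.

```python
def magnitude_prune(weights, sparsity):
    # Number of weights to zero out, e.g. sparsity=0.5 removes half.
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]

weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

In practice the pruned model is usually fine-tuned briefly afterwards to recover any lost accuracy.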

Q 1 TERM

Quantization

A model optimization technique that reduces the numerical precision of model weights (e.g., from 32-bit floating point to 8-bit or 4-bit integers). Quantization shrinks model size and speeds up inference, usually with only a small accuracy loss, enabling large models to run on consumer GPUs and mobile devices.
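
A minimal sketch of symmetric 8-bit quantization is shown below: one scale factor maps the float range onto the integers -127 to 127. Production schemes are more elaborate (per-channel scales, zero points, calibration data), so treat this as an illustration of the principle only.

```python
def quantize_int8(weights):
    # One shared scale so the largest-magnitude weight maps to +/-127.
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float values from the stored integers.
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33, 1.27]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized)
print([round(r, 4) for r in restored])
```

Each restored value differs from the original by at most half a quantization step, which is the source of the accuracy loss.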

R 10 TERMS

RAG (Retrieval-Augmented Generation)

A technique that enhances LLM outputs by retrieving relevant information from external knowledge sources (databases, documents, web) before generating a response. RAG reduces hallucinations, provides up-to-date information, and enables domain-specific answers without full model retraining.
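
The retrieve-then-generate flow can be sketched in a few lines. This example assumes a tiny in-memory document list and ranks by shared words as a stand-in for the embedding-based search a real pipeline would use; the final call to an LLM is omitted, and all names and sample documents are invented for illustration.

```python
# Hypothetical knowledge base.
DOCUMENTS = [
    "The return window for online orders is 30 days.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]

def tokenize(text):
    # Lowercase and strip basic punctuation so words compare cleanly.
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, documents, top_k=1):
    query_words = tokenize(query)
    # Rank documents by how many words they share with the query.
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents):
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    # The retrieved text is placed in the prompt ahead of the question.
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How long is the return window?", DOCUMENTS)
print(prompt)
```

Because the answer is drawn from retrieved text rather than the model's memory, updating the document store updates the answers without retraining.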

Reasoning (AI)

The ability of AI models to logically analyze problems, draw conclusions, and solve multi-step tasks. Advanced reasoning models like OpenAI o1/o3, Claude 3.5, and Gemini 2.0 use techniques like chain-of-thought and tree-of-thought to tackle math, science, and coding problems.

Recommendation System

An AI system that predicts and suggests items a user may be interested in based on their behavior, preferences, and similarities to other users. Powers personalization on Netflix, Spotify, Amazon, YouTube, and social media platforms.

Red Teaming

The practice of deliberately testing AI systems by attempting to elicit harmful, biased, or unsafe outputs. Red teams simulate adversarial attacks, jailbreaks, and edge cases to identify vulnerabilities before deployment, improving model safety and robustness.

Regression

A supervised learning task where the model predicts continuous numerical values (as opposed to categories). Examples include predicting house prices, stock returns, temperature, and customer lifetime value. Common algorithms include linear regression, random forests, and neural networks.

Reinforcement Learning (RL)

A machine learning paradigm where an agent learns to make sequential decisions by interacting with an environment and receiving rewards or penalties. RL powers game-playing AI (AlphaGo), robotics, autonomous driving, and is used to fine-tune LLMs via RLHF.

Reward Model

A model trained to predict human preferences between different AI outputs. In RLHF, the reward model scores model responses based on human feedback data, then guides the language model training toward outputs that humans rate more highly.

RLHF (Reinforcement Learning from Human Feedback)

A training technique that uses human preferences to fine-tune language models. Humans rank model outputs, and a reward model is trained on these rankings to guide the LLM toward more helpful, harmless, and honest responses. RLHF is central to ChatGPT and Claude's alignment.

RNN (Recurrent Neural Network)

A neural network architecture designed for sequential data where outputs from previous steps feed back as inputs. RNNs were historically used for language modeling and time series, but have been largely replaced by transformers due to limitations in handling long-range dependencies.

Robotics AI

The integration of AI with physical robots to enable autonomous perception, decision-making, and manipulation in the real world. AI-powered robots perform tasks in manufacturing, warehousing, surgery, agriculture, and domestic assistance.

S 13 TERMS

Scaling Laws

Empirical relationships showing how model performance improves predictably with increases in model size, training data, and compute. Discovered by researchers at OpenAI and DeepMind, scaling laws guide decisions about how to allocate resources for training frontier AI models.

Self-Attention

A mechanism within transformers that computes the relevance of each element in a sequence to every other element. Self-attention enables models to capture long-range dependencies and contextual relationships, forming the core computational block of GPT, BERT, and other transformer models.

Self-Supervised Learning

A training paradigm where models learn from unlabeled data by creating their own supervisory signals from the data structure itself. Examples include next-word prediction (GPT), masked word prediction (BERT), and contrastive learning (CLIP). Most modern foundation models use self-supervised learning.

Semantic Search

Search technology that understands the meaning and intent behind queries rather than just matching keywords. Powered by embeddings and vector databases, semantic search finds conceptually similar results even when exact terms differ. Used in AI-powered search engines like Perplexity.

Sentiment Analysis

An NLP technique that determines the emotional tone (positive, negative, neutral) of text. Used in social media monitoring, customer feedback analysis, brand reputation management, and market research. Modern sentiment analysis leverages transformer-based models for nuanced understanding.

Seq2Seq (Sequence-to-Sequence)

A model architecture that transforms one sequence into another, using an encoder to process the input and a decoder to generate the output. Originally developed for machine translation, seq2seq powers text summarization, chatbots, and code generation.

SLM (Small Language Model)

A language model with fewer parameters (typically under 10 billion) designed for efficiency, privacy, and on-device deployment. Examples include Phi-3, Gemma, and Mistral 7B. SLMs trade some capability for faster inference, lower cost, and better suitability for specific use cases.

Speech-to-Text (STT)

AI technology that converts spoken language into written text. Modern STT models like OpenAI Whisper, Google Speech-to-Text, and Deepgram achieve near-human accuracy across multiple languages. Used in transcription, voice assistants, captioning, and accessibility tools.

Stable Diffusion

An openly released text-to-image latent diffusion model, first published in 2022 by the CompVis group at LMU Munich in collaboration with Runway and Stability AI. It generates high-quality images from text prompts and can be run locally, fine-tuned, and extended. Stable Diffusion popularized AI art creation and spawned a vast ecosystem of models, tools, and communities.

Superintelligence

A theoretical level of AI that vastly surpasses human intelligence in virtually all cognitive domains, including scientific creativity, general wisdom, and social skills. Superintelligence is a central topic in AI safety research and long-term risk assessment.

Supervised Learning

A machine learning approach where models are trained on labeled datasets—input-output pairs where the correct answer is known. The model learns to map inputs to outputs and generalizes to new data. Common tasks include classification, regression, and object detection.

Synthetic Data

Artificially generated data that mimics real-world data patterns without containing actual personal or sensitive information. Synthetic data is used to augment training datasets, protect privacy, address data scarcity, and test AI systems in scenarios where real data is unavailable or restricted.

System Prompt

An instruction given to an AI model at the start of a conversation, usually not shown to the end user, that defines its behavior, personality, capabilities, and constraints. System prompts are used to customize AI assistants, set safety guidelines, and control how the model responds to users.

T 13 TERMS

Temperature

A parameter that controls the randomness of AI model outputs during text generation. Lower temperature (e.g., 0.1) produces more deterministic, focused responses; higher temperature (e.g., 1.0+) increases creativity and diversity but may reduce coherence. Temperature works by dividing the model's logits before the softmax step, so low values sharpen the sampling distribution and high values flatten it.
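
A small sketch of how temperature reshapes a sampling distribution, assuming the common formulation in which logits are divided by the temperature before a softmax. The logit values are invented for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(logits, 0.1)  # almost all mass on the top token
flat = softmax_with_temperature(logits, 2.0)   # mass spread across the candidates
```

At temperature 0.1 the top candidate takes nearly all of the probability mass, while at 2.0 the three candidates are much closer together.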

Text-to-Image

The generation of images from natural language descriptions using AI models. Text-to-image systems like DALL-E 3, Midjourney, Stable Diffusion, and Adobe Firefly interpret prompts to create photorealistic images, artwork, designs, and illustrations from textual descriptions.

Text-to-Speech (TTS)

AI technology that converts written text into natural-sounding spoken audio. Modern TTS systems from providers like ElevenLabs and PlayHT, and open models like XTTS, achieve human-like voice quality with emotional expression, voice cloning, and multilingual support. Used in audiobooks, accessibility, virtual assistants, and content creation.

Text-to-Video

AI technology that generates video content from text descriptions. Models like Sora, Runway Gen-3, Pika, and Kling create cinematic videos with complex scenes, camera movements, and character animations from natural language prompts, and are increasingly used in video production.

Token

The basic unit of text that language models process. A token can be a word, subword, or character depending on the tokenizer. A short common word like "the" is typically one token, while "unbelievable" could be split into "un", "believ", "able". Input limits, context window sizes, and API pricing are all measured in tokens.

Tokenizer

An algorithm that breaks text into tokens for model processing. Different models use different tokenization strategies, such as BPE (Byte Pair Encoding), WordPiece, or Unigram; SentencePiece is a widely used library that implements BPE and Unigram. The tokenizer determines how text maps to numerical IDs the model can understand.
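
To make the BPE idea concrete, here is a toy sketch of the training loop: start from characters and repeatedly merge the most frequent adjacent pair. It is a simplified illustration, not how any production tokenizer is implemented (those work on bytes, handle whitespace markers, and are heavily optimized). The function names and corpus are made up.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def train_bpe(corpus, num_merges):
    """Learn up to num_merges merge rules from a whitespace-separated corpus."""
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words

merges, vocab = train_bpe("low low low lower lowest", 2)
```

After two merges the frequent stem "low" has become a single symbol, while the rarer suffixes remain split into characters.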

Tokenomics (AI)

The pricing structure of AI API services based on token usage. Most LLM providers charge per input and output token, with prices varying by model capability. Understanding tokenomics is essential for budgeting AI application costs and optimizing prompt efficiency.
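
A quick cost estimate follows directly from the per-token pricing model. The prices below are placeholders chosen for the arithmetic, not any provider's actual rates.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_million, output_price_per_million):
    """Estimate the cost of one request in the same currency as the prices."""
    total = input_tokens * input_price_per_million + output_tokens * output_price_per_million
    return total / 1_000_000

# Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
cost = estimate_cost(
    input_tokens=1200,
    output_tokens=300,
    input_price_per_million=3.00,
    output_price_per_million=15.00,
)
```

Because output tokens are often priced higher than input tokens, the 300 output tokens in this example cost more than the 1,200 input tokens.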

Tool Use (AI)

The capability of AI models to interact with external tools and services—web browsers, calculators, code interpreters, databases, and APIs. Tool use transforms LLMs from passive text generators into active agents that can retrieve information, perform calculations, and execute actions.
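
One way to picture tool use is a dispatcher that receives a structured request from the model and routes it to a matching function. The JSON message shape, the `TOOLS` registry, and the `calculator` tool below are all hypothetical; each provider defines its own tool-calling format.

```python
import json
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def calculator(expression):
    """Evaluate a simple 'a op b' expression such as '12 * 7'."""
    left, symbol, right = expression.split()
    return OPERATORS[symbol](float(left), float(right))

# Registry of tools the application chooses to expose to the model.
TOOLS = {"calculator": calculator}

def run_tool_call(message):
    """Parse a JSON tool request and dispatch it to the registered function."""
    call = json.loads(message)
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return f"error: unknown tool {call['tool']!r}"
    return tool(**call["arguments"])

# Stand-in for text a model might emit when it decides to call a tool.
model_output = '{"tool": "calculator", "arguments": {"expression": "12 * 7"}}'
result = run_tool_call(model_output)
```

In a full agent loop the result would be passed back to the model as additional context so it can continue its response.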

TPU (Tensor Processing Unit)

Google's custom-designed AI accelerator chip optimized for machine learning workloads, particularly matrix operations used in neural networks. TPUs are used to train Google's largest models (Gemini, PaLM) and are available through Google Cloud for external developers.

Training

The process of teaching an AI model to perform tasks by exposing it to data and adjusting its internal parameters to minimize prediction errors. Training involves forward passes (predictions), loss computation, backward passes (gradient computation via backpropagation), and weight updates applied by an optimizer such as gradient descent.
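
The training loop can be sketched on the smallest possible model, a line y = w * x + b fitted with plain gradient descent on mean squared error. The data, learning rate, and epoch count are arbitrary choices for illustration; real training uses automatic differentiation and more sophisticated optimizers.

```python
def train(data, learning_rate=0.05, epochs=1000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            prediction = w * x + b       # forward pass
            error = prediction - y       # loss is the mean of error ** 2
            grad_w += 2 * error * x / n  # gradient of the loss with respect to w
            grad_b += 2 * error / n      # gradient of the loss with respect to b
        # Parameter update: step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Points generated from y = 2x + 1, so training should recover w near 2 and b near 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
```

The same forward, loss, gradient, update cycle applies to large neural networks, only with many more parameters.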

Training Data

The dataset used to teach machine learning models. Training data quality, quantity, diversity, and labeling accuracy directly impact model performance. For LLMs, training data includes web text, books, code, and academic papers, typically totaling hundreds of billions to trillions of tokens.

Transfer Learning

A technique where knowledge gained from training a model on one task is applied to a different but related task. Transfer learning enables fine-tuning pre-trained foundation models for specific applications with much less data and compute than training from scratch.

Transformer

The dominant neural network architecture powering modern AI, introduced in the 2017 "Attention Is All You Need" paper. Transformers use self-attention mechanisms to process entire sequences in parallel, enabling efficient training on massive datasets. GPT, BERT, Gemini, Claude, and Llama are all transformer-based.

U 1 TERM

Unsupervised Learning

A machine learning approach where models find patterns and structures in unlabeled data without predefined outputs. Techniques include clustering (K-means, DBSCAN), dimensionality reduction (PCA, t-SNE), and anomaly detection. Used for customer segmentation, fraud detection, and data exploration.
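
As an example of finding structure without labels, here is a minimal k-means sketch. It takes the starting centroids as an argument to keep the result deterministic; practical implementations usually pick them randomly or with a scheme such as k-means++. The points are invented for illustration.

```python
def kmeans(points, initial_centroids, iterations=10):
    """Cluster points by alternating assignment and centroid-update steps."""
    centroids = [list(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - c) ** 2 for a, c in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(coord) / len(members) for coord in zip(*members)]
    return centroids, clusters

# Two visually obvious groups, with no labels provided.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, clusters = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The algorithm separates the two groups purely from the distances between points.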

V 4 TERMS

Variational Autoencoder (VAE)

A generative model that learns to encode data into a continuous latent space and decode it back, enabling generation of new data samples. VAEs are used in image generation, drug discovery, anomaly detection, and as components within larger generative systems like Stable Diffusion.

Vector Database

A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings. Vector databases like Pinecone, Weaviate, Chroma, and Qdrant enable fast similarity search and are essential infrastructure for RAG pipelines, recommendation systems, and semantic search.

Vision Transformer (ViT)

A transformer architecture adapted for image processing that divides images into patches and processes them as sequences. ViT demonstrated that transformers can match or exceed CNNs for image classification, leading to unified architectures for text, image, and multimodal processing.
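
The patch-splitting step can be sketched as follows for a tiny grayscale image stored as a nested list. This covers only the division into flattened patches; an actual ViT also applies a learned linear projection and adds position embeddings, which are left out. The function name and the 4x4 image are illustrative.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image into non-overlapping square patches, each flattened to a list."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches

# A 4x4 image whose pixel values are 0..15 in row-major order.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patches(image, patch_size=2)
```

The 4x4 image becomes a sequence of four patches of four values each, which the transformer then treats like a sequence of tokens.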

Voice Cloning

AI technology that replicates a person's voice from audio samples, enabling the synthesis of new speech in that voice. Used in content creation, dubbing, accessibility, and entertainment. Tools like ElevenLabs and Resemble.AI can clone voices from just a few seconds of audio.

W 2 TERMS

Watermarking (AI)

Techniques for embedding invisible markers in AI-generated content (text, images, audio) to indicate it was created by AI. Watermarking helps combat misinformation, protects intellectual property, and enables content authentication. Companies such as Google (with SynthID) and OpenAI have built watermarking or provenance tools for their models.

Weights

The numerical values within a neural network that are adjusted during training to minimize prediction errors. Weights determine how input signals are transformed as they pass through network layers. The collection of all weights constitutes the model's learned knowledge and capabilities.

Z 1 TERM

Zero-Shot Learning

The ability of AI models to perform tasks they were not explicitly trained on, without any task-specific examples. Large language models exhibit strong zero-shot capabilities—for instance, translating between languages or classifying text without prior examples of those specific tasks.

