Speak AI

A Simple Glossary for Understanding Multilingual AI and Language Tools

Artificial Intelligence (AI)

Computer systems designed to perform tasks that typically require human intelligence, such as understanding language, recognising patterns, and making decisions.

Attention

A mechanism that helps a model decide which words in a sentence are most relevant to each other.
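
The idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular model is built: the three words and their 2-number vectors are made up for the example, and the function names are hypothetical. It shows scaled dot-product scoring followed by a softmax, one common way to turn "how related are these two words" into weights that sum to 1.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score one query vector against every key vector (scaled dot product)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Made-up 2-dimensional vectors for the words in "the cat sat" (assumption).
words = ["the", "cat", "sat"]
vectors = [[0.1, 0.0], [1.0, 0.8], [0.9, 1.0]]

# How much should "sat" attend to each word in the sentence?
weights = attention_weights(vectors[2], vectors)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

With these made-up numbers, "sat" gives the least weight to "the" and more weight to "cat" and to itself, because their vectors point in similar directions.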

Bias in AI

Systematic errors or unfairness in AI outputs, often reflecting prejudices present in the training data.

Decoder

The part of a model that generates output text by predicting the next token step by step.
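
A toy sketch of step-by-step generation is shown here. The probability table is invented for the example, and it looks only at the last token, whereas a real decoder computes probabilities with a neural network using all the text so far. The sketch always picks the most likely next token (often called greedy decoding), which is just one of several possible strategies.

```python
# Invented next-token probabilities (assumption). A real decoder computes
# these with a neural network and considers every earlier token.
NEXT_TOKEN_PROBS = {
    "<start>": {"hello": 0.7, "world": 0.3},
    "hello": {"world": 0.8, "<end>": 0.2},
    "world": {"<end>": 0.9, "hello": 0.1},
}

def greedy_decode(max_steps=10):
    """Build the output one token at a time, always taking the likeliest."""
    tokens = ["<start>"]
    for _ in range(max_steps):
        candidates = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break  # the model has signalled that the text is finished
        tokens.append(next_token)
    return tokens[1:]  # drop the "<start>" marker

generated = greedy_decode()
print(" ".join(generated))
```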

Embedding

A vector (a list of numbers across multiple dimensions) that represents a word and captures its meaning and its relationships with other words.

Encoder

The part of a model that processes input text and captures its context and relationships.

Fairness

The principle that AI systems should treat all users and languages equitably without discrimination.

Generative AI (GenAI)

A type of AI that creates new content such as text, images, or audio based on input prompts.

Large Language Model (LLM)

An AI model trained on massive amounts of text data to understand and generate human-like language. Examples include GPT-4 and BLOOM.

Low-Resource Languages

Languages with limited digital text data, which makes it harder for AI to learn them and perform well on them.

Machine Translation

The automatic translation of text or speech from one language to another by computers.

Multilingual AI

AI systems capable of understanding and generating text in multiple languages.

Multilingual Prompting

Writing prompts that combine two or more languages to test or utilize an AI’s multilingual capabilities.

Natural Language Processing (NLP)

A field of AI focused on enabling computers to understand, interpret, and generate human language.

Prompt Engineering

The craft of designing and refining input instructions (prompts) to guide AI models to produce better or more accurate outputs.

Tokens

Small pieces of text (words or word parts) that AI models process. The number of tokens affects how much computation and cost are required.
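
To make this concrete, here is a toy tokenizer sketch. The vocabulary, the greedy longest-match rule, and the price per 1,000 tokens are all assumptions chosen for illustration; real tokenizers learn their vocabularies from data, and real pricing varies by provider.

```python
# Tiny invented vocabulary of word pieces (assumption).
VOCAB = {"multi", "lingual", "token", "s", "un", "fair", "ness"}

def tokenize_word(word):
    """Split one word into the longest known pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces

def tokenize(text):
    """Lowercase the text, split on spaces, then split each word into pieces."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(tokenize_word(word))
    return tokens

tokens = tokenize("Multilingual tokens unfairness")
print(tokens)

PRICE_PER_1000_TOKENS = 0.002  # hypothetical price, for illustration only
cost = len(tokens) / 1000 * PRICE_PER_1000_TOKENS
print(f"{len(tokens)} tokens, estimated cost {cost:.6f}")
```

Three words become seven tokens here, which is why token counts, not word counts, drive computation and cost.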

Transformer

A neural network architecture that uses attention to process and generate sequences of text efficiently.

Vector

A list of numbers used by AI models to represent words, where the distance between two vectors reflects how similar the words are in meaning.
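
A small sketch of comparing vectors follows. The three-number vectors are invented for the example (real models use hundreds or thousands of dimensions), and cosine similarity is used here as one common way to measure closeness; other distance measures exist.

```python
import math

def cosine_similarity(a, b):
    """Return a value near 1 when two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented 3-dimensional word vectors (assumption).
VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}

sim_royal = cosine_similarity(VECTORS["king"], VECTORS["queen"])
sim_fruit = cosine_similarity(VECTORS["king"], VECTORS["banana"])
print(f"king vs queen:  {sim_royal:.2f}")
print(f"king vs banana: {sim_fruit:.2f}")
```

With these made-up numbers, "king" and "queen" score much closer to 1 than "king" and "banana", matching the intuition that the first pair is closer in meaning.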