[4] What is AI, anyway?

AI Tip of the Week

For our AI tip this week, we’re shouting out Patrick Zgambo, who uses ChatGPT Voice to write a song about something he’s trying to learn.  He wrote a post (3 Azure Certs in 8 Weeks with ChatGPT Voice & AI Songs!) about studying for the Azure AI exams – and using AI to make funny songs to help him memorize concepts. Try it out!  We did too – here is a link to a song ChatGPT wrote about this week’s newsletter.  Happy Rodeoing, friends!!

As you may know, both of us studied Computer Science back in the day.

You’re thinking – oh boy, here we go.   Break out the stats book with these two.  They’re going to talk about the good old days of Emacs and compilers, starting with binary and EECS.

Nope.  This ain’t that newsletter.  (Plenty of those already exist in the world.)

The cool thing about you, the readers of 3rd Rodeo AI, is that you live somewhere interesting in the world and come from many industries, geographies, and backgrounds.  Some among you are AI PhDs or work at the cutting edge of machine learning – while others are only reading this because you like us for some reason (we love you too)!  Most of you, however, are people with “normal jobs.”

Regardless of who you are or what you know, to become an AI Power User, you need to know JUST THE BASICS.

So – today, let’s cover only the stuff you need to know:

What are the FEWEST AI concepts you’ll need to know?

AI (aka Artificial Intelligence) tries to copy human intelligence, so computers can do things like make decisions, solve problems, or understand language.  It’s long been a dream of humans to make the boring stuff go away by shoving it onto machines.  That dream goes way back to textile looms.  Before, even.

ML (aka Machine Learning) is a part of AI. To get closer to being an artificial intelligence, ML systems “train” by looking at input data, finding patterns in this data, storing the patterns to use later, and then – when we ask them to – using these patterns to carry out tasks.  The “learning” part of ML seeks to get them to improve as they operate – instead of being programmed with rules for every single thing.  It’s like how kids learn a language by listening to people around them speak and picking up the “rules” of the language – not by listening to EVERY SINGLE SENTENCE POSSIBLE in that language (which would be impossible).

Predictive AI: Predictive AI is based on a specific set of data that we’ve hand-selected, tagged, and fed to it.  It will predict the next thing based on what you’ve tagged previously.  Say you give it lots of images tagged “blueberry” or “Chihuahua.”  Using computer vision, it should be able to guess which one a new image shows.  That’s predictive AI.  Other examples include forecasting customer churn, predicting equipment failure in manufacturing, or assessing credit risk in finance.
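If you like seeing things in code, here is a minimal sketch of the “tag it, then guess” idea.  It is not real computer vision: we assume each image has already been boiled down to two made-up numbers (roundness and furriness), and the simple “nearest average” method is just our pick for illustration.

```python
# Toy "predictive AI": learn from hand-tagged examples, then guess the tag of a new one.
# ASSUMPTION: the two features (roundness, furriness) and every number are invented.

def train(tagged_examples):
    """Average the feature values for each tag (a 'nearest centroid' model)."""
    sums, counts = {}, {}
    for features, tag in tagged_examples:
        total = sums.setdefault(tag, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[tag] = counts.get(tag, 0) + 1
    return {tag: [v / counts[tag] for v in total] for tag, total in sums.items()}


def predict(model, features):
    """Return the tag whose average example is closest to the new one."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda tag: squared_distance(model[tag]))


tagged = [
    ([0.9, 0.1], "blueberry"),
    ([0.8, 0.0], "blueberry"),
    ([0.3, 0.9], "chihuahua"),
    ([0.4, 0.8], "chihuahua"),
]
model = train(tagged)
print(predict(model, [0.85, 0.05]))  # round and not furry
print(predict(model, [0.35, 0.70]))  # less round, quite furry
```

The model can only ever answer with a tag it was given, which is the big difference from the generative stuff coming up next.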

GenAI (aka Generative AI) focuses on creating new content: things like words, pictures, music, and code.  GenAI learns patterns from existing data and generates (see what we did there?) outputs that fit those patterns.  You’re not training on a small, hand-picked subset of data – you’re training on a huge chunk of the internet.

LLM (aka “Large Language Model”) – a type of AI model trained on huge amounts of text data.  In order to understand and generate human-like language, an LLM uses patterns to predict the next word in a sequence.  This lets it answer questions, write text, and carry on conversations much like humans do.
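To make “predict the next word” a bit more concrete, here is a tiny sketch.  Big caveat: a real LLM uses a neural network trained on enormous amounts of text, while this toy just counts which word followed which in three made-up sentences.  The function names are ours, not from any library.

```python
from collections import Counter, defaultdict

# ASSUMPTION: a made-up, tiny "training set". Real models see vastly more text.
training_text = (
    "the cat sat on the mat . "
    "the cat chased the dog . "
    "the cat sat on the sofa ."
)


def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        follows[current][following] += 1
    return follows


def predict_next(follows, word):
    """Return the word seen most often after `word`, or None if never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


follows = train_bigrams(training_text)
print(predict_next(follows, "cat"))    # sat
print(predict_next(follows, "the"))    # cat
print(predict_next(follows, "zebra"))  # None (never appeared in training)
```

Notice the “zebra” case: the toy has nothing to say about words it never saw, which is a preview of why training data matters so much.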

Agentic AI: We also have something called Agentic AI.  What’s that, you may ask?  Hold your horses, Nelly!  That’s later in the rodeo!  Keep reading – we’ll get there!

Before we go TOO much further, let’s answer Dona’s sister Bon’s biggest question:  “What does GBT stand for?”

First of all – it’s GPT, Bon.

GPT.

Generative:  it will invent new data (like words, images, and video) that is similar to the data it was trained on.

Pre-trained:  it was trained previously, before it was put into use.  This has implications we’ll explore in a bit – like, if we don’t do anything else, a pre-trained model doesn’t know anything about what’s happened even 2 minutes after it was trained.  This is why many of these models will tell you the date their training data ends (often called a “knowledge cutoff”).

Transformer:  Probably the most “fun” part of this whole thing.  Transformers try to understand language by looking at ALL the words in a sentence at the same time. Instead of reading words one after another, a transformer uses something called “self-attention” to figure out which words are important to each other, and how they relate to each other.

For example, in the sentence “The feisty orange cat, chased by the kind of dopey-looking brown dog with a spiked collar, climbed a pine tree to get away as fast as possible and ponder its existence,” self-attention helps the model understand that “cat” is connected to “climbed a pine tree,” even though many other words are in between.

This allows the transformer to understand what the heck is going on (in the training phase) – and produce more accurate and coherent responses (in the generative phase).
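For the curious, here is a stripped-down sketch of self-attention.  We are assuming a lot: the word vectors are invented, and a real transformer also learns extra weight matrices (for “queries,” “keys,” and “values”) that we skip entirely.  What it does show is the core move: every word is scored against every other word at once, and the scores become weights.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Simplified self-attention with no learned weights (unlike a real transformer).

    Returns the attention weights and the blended output vector for each word.
    """
    dim = len(vectors[0])
    all_weights, outputs = [], []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # Blend all word vectors together, using the weights as proportions.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return all_weights, outputs


# ASSUMPTION: hand-made vectors, chosen so "cat" and "climbed" look alike.
words = ["cat", "dog", "climbed", "tree"]
vectors = [
    [1.0, 0.0, 1.0],  # cat
    [0.0, 1.0, 0.0],  # dog
    [1.0, 0.0, 0.8],  # climbed
    [0.2, 0.1, 0.9],  # tree
]

weights, outputs = self_attention(vectors)
for word, weight in zip(words, weights[0]):
    print(f"cat -> {word}: {weight:.2f}")
```

With these made-up numbers, “cat” pays more attention to “climbed” than to “dog,” no matter how many words sit between them in the sentence.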

Ok, still with us?

How does GenAI work, anyway?

Using GenAI in production has two main parts – the “pre-training” part, and the “using it” part.

To make a GenAI model like GPT, we feed massive amounts of data (like, most of the open internet) into its training.  By processing the data, the model seeks to learn the relationships, structures, and patterns of its training data.

To write text, LLMs must first translate words into a language they understand.

First, a block of words is broken down into tokens.  Tokens are basic units that can be encoded.  Tokens often represent fractions of words.

Tokens are not always full words, because tokenization breaks text into smaller units that are easier for language models to process.  For example:

Short words (like “cat”) and punctuation marks / symbols (like “!” or “?”) are usually one token. 

Complex words (like “unbelievable”) might be tokenized into several chunks (like “un-”, “believ-”, and “-able”).  This allows the AI model to understand each part by itself – and handle different combinations like “believable” or “unbelievably.”  This can help the model understand the “meaning” even of words it hasn’t seen before in that exact form, by breaking them down.
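Here is a rough sketch of that chunking idea.  The mini-vocabulary is hypothetical, and the “grab the longest piece you recognize” approach is a simplification we chose for readability.  Real tokenizers learn their pieces from data (byte pair encoding is one common method) and will often split words differently than this.

```python
# ASSUMPTION: a hypothetical mini-vocabulary. Real ones hold tens of thousands of pieces.
VOCAB = {"un", "believ", "able", "ably", "cat", "!", "?"}


def tokenize(word, vocab=VOCAB):
    """Greedily split `word` into the longest vocabulary pieces, left to right.

    Characters that match nothing in the vocabulary become one-character tokens.
    """
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest possible piece first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                tokens.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here, so fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens


print(tokenize("cat!"))          # ['cat', '!']
print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(tokenize("unbelievably"))  # ['un', 'believ', 'ably']
```

The word “unbelievably” never had to be in the vocabulary as a whole.  The pieces were enough.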

In order to understand a word’s meaning (like “cat,” in our earlier example), an LLM looks at all the places it saw the word “cat” in the billions of words in its training data.

Eventually, we end up with a bunch of words that typically show up next to (or very near) the word “cat” in the training data – as well as words that were NOT found near it.
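As a rough illustration of “words that show up near cat,” here is a small counting sketch.  The sentence and the two-word window are our own assumptions, and a real model does not literally keep a tally like this, but the intuition carries over.

```python
from collections import Counter


def neighbors(text, target, window=2):
    """Count the words appearing within `window` positions of `target`."""
    words = text.lower().split()
    counts = Counter()
    for i, word in enumerate(words):
        if word == target:
            before = words[max(0, i - window):i]
            after = words[i + 1:i + 1 + window]
            counts.update(before + after)
    return counts


# ASSUMPTION: a made-up one-sentence "training set".
corpus = (
    "the cat purred on the sofa while the cat licked its paw "
    "and the stock market fell"
)

cat_neighbors = neighbors(corpus, "cat")
print(cat_neighbors["purred"])  # 1
print(cat_neighbors["market"])  # 0 (never shows up near "cat")
```

Words like “purred” and “licked” land in the “near cat” pile, while “market” does not.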

As the model goes through and processes this set of words, it produces something called a “vector” (or “word embedding”).  A vector is just a way to represent words as numbers.  Now the model can “understand” words mathematically, building a list of numbers that captures the meaning and relationships of the word.  This lets us do things like finding similar words, answering questions, or completing sentences based on meaning.

For example, imagine the word “cat” can be represented as a simple vector like this: [0.8, 0.3, 0.5]. Each number could represent a feature of a “cat,” like “furry” (0.8), “small” (0.3), and “pouncy” (0.5).

The vector for the word “dog” might be [0.7, 0.5, 0.6].  That makes sense.  In many ways, a “dog” is similar to, but not the same as, a “cat.” [We’ll fight you on which is better later – and there IS a clear winner here, meow.]

Anyway, an LLM uses these word vectors to understand relationships.  “Cat” and “dog” are related but not identical.  “Cat,” “feline,” and “tabby” would have very similar embeddings.

The way these characteristics (like “furry”) are derived by the model means we humans don’t know EXACTLY what they represent.  However, words we expect to be used in similar ways should have similar-looking embeddings.  Things like “cat” and “feline,” “sea” and “ocean,” “light purple” and “lavender.”
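One common way to measure “similar-looking” is cosine similarity, which scores how closely two vectors point in the same direction (1.0 means identical direction).  Here is a small sketch using the “cat” and “dog” vectors from above.  The “feline” vector is a number we made up for this example, and real embeddings have hundreds or thousands of numbers, not three.

```python
import math


def cosine_similarity(a, b):
    """Score how closely two vectors point the same way (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    length_a = math.sqrt(sum(x * x for x in a))
    length_b = math.sqrt(sum(y * y for y in b))
    return dot / (length_a * length_b)


cat = [0.8, 0.3, 0.5]      # from the example above
dog = [0.7, 0.5, 0.6]      # from the example above
feline = [0.8, 0.3, 0.55]  # ASSUMPTION: invented for illustration

print(f"cat vs feline: {cosine_similarity(cat, feline):.3f}")
print(f"cat vs dog:    {cosine_similarity(cat, dog):.3f}")
```

With these numbers, “feline” scores closer to “cat” than “dog” does, which matches the intuition: related, but not identical.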

When you give an LLM a prompt, it first converts your words into vectors using its word embeddings, which are numerical representations capturing meaning.

These vectors allow the LLM to understand relationships between words, such as context or similarity, based on patterns it learned during training.

The model processes the prompt through layers of neural networks to predict what words should come next, considering the context from the prompt.

It uses probabilities to decide which words or phrases make the most sense and creates a sequence of output vectors.

Finally, these output vectors are converted back into words, forming the response you see on your screen.
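The “uses probabilities” step can be sketched like this.  We assume the model has already produced a raw score for a few candidate next words (the scores below are invented), then use a standard function called softmax to turn those scores into probabilities.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that add up to 1."""
    highest = max(scores.values())
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


# ASSUMPTION: invented scores for the next word after "The cat climbed the ..."
scores = {"tree": 3.0, "fence": 2.0, "spreadsheet": -1.0}

probabilities = softmax(scores)
next_word = max(probabilities, key=probabilities.get)

for word, p in probabilities.items():
    print(f"{word}: {p:.2f}")
print("picked:", next_word)  # picked: tree
```

This sketch always picks the single most likely word.  In practice, models often sample from the probabilities instead, which is one reason you can get a different answer to the same prompt.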

Again – if this was pretty dense, check out the song!!

Onwards!

✨Dona & Jeremiah ✨

If you need help with any of these concepts OR want to meet others who are also doing the thing, our “Become an AI Power User” workshop on December 13th might be useful to you.

You can find out more here: https://shop.3rdrodeoai.com/products/find-your-career-niche-in-the-ai-verse-dec-13-2024