What is ChatGPT? What are the technologies behind ChatGPT, and how do we calculate perplexity to identify whether content is written by AI or humans?

In a very short period of time, ChatGPT has become one of the most common tools for everyday work for many of us.

ChatGPT has become our new search engine, our Wikipedia, our dictionary, and even our colleague who helps us with our regular tasks.


I am thinking about the day when these types of tools will turn into machines, look similar to humans, and have the advantage of knowing everything 😐

The more we learn about AI, the better we understand it and the better we can use it. Let’s understand ChatGPT, the technology behind it, and how we calculate perplexity to identify whether content is written by AI or humans.

What is ChatGPT?

ChatGPT is an advanced language model that understands natural language and is trained on a wide range of text sources. It is developed by OpenAI and is built on the GPT family of models (GPT-3.5 at launch, with GPT-4 available later).

GPT stands for Generative Pre-trained Transformer, a powerful type of AI model that can generate human-like content.

ChatGPT is a conversational chatbot that was launched as a prototype on Nov 30, 2022. You can use ChatGPT for writing content, drafting emails, debugging your code, and writing a thesis.

Technologies behind ChatGPT

Most of us already know what ChatGPT is and how we can use it to be more productive, but only a few of us know how ChatGPT actually works and which technologies are behind it.

Let’s start with Artificial Intelligence

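Two of the main fields within Artificial Intelligence are Machine Learning and Deep Learning (Deep Learning is itself a branch of Machine Learning). Both are considered weak (narrow) AI.

Classical Machine Learning is typically applied to structured data like tables, while Deep Learning is typically applied to unstructured data like documents.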

You must be thinking: if these two are weak AI, then what is strong AI?

Artificial General Intelligence (AGI) and Artificial Super Intelligence (ASI) are considered Strong AI.

Deep Learning includes several types of models. These models have evolved over time, and ChatGPT is based on one of them, called the transformer.

Types of Deep Learning models

  • Artificial Neural Network (ANN)
  • Convolutional Neural Network (CNN)
  • Recurrent Neural Network (RNN)
  • Transformers (ChatGPT is based on this)

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns feature engineering by itself via filter (or kernel) optimization.

Difference between feed-forward neural network and Recurrent Neural Network

| Feed-Forward Neural Network | Recurrent Neural Network |
| --- | --- |
| Information moves only in the forward direction | Information goes through the cycle of a loop |
| Not good at predicting the future | Good at predicting the future |
| Checks only the current input | Checks the current input and learns from past inputs |
| Can’t remember the past | Can remember thanks to its internal memory: it creates a copy of the output and loops it back into the network |
feed-forward neural network vs. recurrent neural network
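To make the "memory" difference concrete, here is a minimal sketch in plain Python. It assumes single-number inputs and hand-picked weights (`w_x`, `w_h`, and `b` are illustrative values, not trained ones), so treat it as a toy rather than a real network.

```python
import math

def feed_forward_step(x, w, b):
    """Feed-forward unit: the output depends only on the current input."""
    return math.tanh(w * x + b)

def rnn_step(x, h_prev, w_x, w_h, b):
    """Recurrent unit: the output also depends on the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed the hidden state back into the network at every time step."""
    h = 0.0  # initial hidden state (the "memory" starts empty)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

sequence = [1.0, 0.0, 0.0]  # a single "signal" followed by silence

# The feed-forward unit forgets the signal as soon as the input is 0.
ff_outputs = [feed_forward_step(x, 0.5, 0.0) for x in sequence]

# The recurrent unit still carries a fading trace of the first input.
rnn_states = run_rnn(sequence)

print("feed-forward:", ff_outputs)
print("recurrent:   ", rnn_states)
```

On the second and third steps the feed-forward output drops to zero, while the recurrent state stays positive because the earlier input is looped back in.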

Now, let’s compare the Recurrent Neural Network and Transformers

Difference between Recurrent Neural Networks and Transformers

| Recurrent Neural Network | Transformers |
| --- | --- |
| Good for sequence tasks | Good for long sequences |
| Can lose information as more elements are added to the sequence | Keeps information from every position, because the encoder's hidden state for each step stays available |
| Maintains an internal state that is updated at each time step and used to process the current input together with the previous state | Uses a self-attention mechanism to weigh the importance of different parts of the input at each position |
| This allows it to capture temporal dependencies in sequential data | This allows it to effectively process sequential data of varying lengths without explicitly maintaining an internal state |
Recurrent Neural Network vs. Transformers
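The self-attention idea from the comparison above can be sketched in a few lines of plain Python. This is a simplified version that uses the input vectors directly as queries, keys, and values; real transformers add learned projection matrices, multiple heads, and positional information, which are left out here. The three toy vectors are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values."""
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this position against every position in the sequence.
        scores = [dot(query, key) / math.sqrt(d) for key in vectors]
        weights = softmax(scores)
        # The output is a weighted mix of all positions.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy 2-dimensional "token" vectors (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = self_attention(tokens)

for position, weights in enumerate(attn):
    print(position, [round(w, 3) for w in weights])
```

Every position looks at the whole sequence at once, so nothing has to be carried forward step by step as in an RNN.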

Transformers are well suited to language tasks such as text generation, translation, summarization, and text completion.

Since ChatGPT arrived, article writing has become very easy, so some people simply copy and paste ChatGPT content and call themselves professional bloggers, without knowing that Google and other search engines may be able to detect whether an article was written by a human or by an AI.

So, how do these search engines detect whether the content of a blog post was written by an AI or a human? Let’s find out.

What is Perplexity, and how is it calculated for a language model to check whether content is written by an AI or a human?

Perplexity is a measure of how well a probability model predicts a sample. It is a common way to evaluate a language model: it measures how well the model can predict a given sequence of words.

A lower perplexity means the model finds the text more predictable, which is treated as a sign that the content is AI-generated. A higher perplexity means the model is more uncertain about the text, which is more typical of human-written content.

Mathematically, perplexity is calculated as follows for a language model:

Perplexity(W) = 2^H(W)

Where:

  • W represents a sequence of words or tokens.
  • H(W) is the entropy of the sequence under the model: the average number of bits needed to represent each token, computed as the average negative log2 of the probabilities the model assigns to the tokens.
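As a rough sketch, the formula can be written as a small Python function. The function name `perplexity` and the sample probability lists are assumptions made for illustration. The input is simply the list of probabilities a model assigned to each token.

```python
import math

def perplexity(token_probs):
    """Perplexity = 2 ** H, where H is the average negative log2 probability."""
    n = len(token_probs)
    entropy = -sum(math.log2(p) for p in token_probs) / n
    return 2 ** entropy

# If the model always hesitates between 4 equally likely tokens,
# the perplexity comes out as 4.
uniform_pp = perplexity([0.25, 0.25, 0.25, 0.25])

# If the model is completely certain about every token,
# the perplexity is 1, the lowest possible value.
certain_pp = perplexity([1.0, 1.0, 1.0])

print(uniform_pp, certain_pp)
```

A handy way to read the number: a perplexity of k means the model is, on average, about as unsure as if it were choosing between k equally likely tokens.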

Suppose we have a very basic language model that predicts the next word in a sentence based on the previous word. Here’s our text corpus:

Corpus: "I am happy. I am excited."

We want to calculate the perplexity of the sentence “I am excited.” according to our language model. To do this, we’ll follow these steps:

Step 1: Tokenize the sentence:

Tokens: ["I", "am", "excited", "."]
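One simple way to get these tokens is a regular expression that treats words and punctuation marks as separate tokens. This is only a sketch: the `tokenize` helper is a hypothetical name, and real language models such as GPT use subword tokenizers instead.

```python
import re

def tokenize(sentence):
    """Split into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("I am excited.")
print(tokens)
```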

Step 2: Calculate the probability of each token given the previous token: Let’s say our simple language model estimates probabilities as follows:

  • P(“am” | “I”) = 0.8
  • P(“excited” | “am”) = 0.6
  • P(“.” | “excited”) = 0.9
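For comparison, here is a sketch of how a bigram model could estimate these probabilities by counting word pairs in the corpus. Note that the values used in this article (0.8, 0.6, 0.9) are illustrative: with such a tiny corpus, raw counting gives different numbers, and a real model would be trained on far more text. The helper name `bigram_prob` is an assumption.

```python
import re
from collections import Counter

corpus = "I am happy. I am excited."
tokens = re.findall(r"\w+|[^\w\s]", corpus)

# Count each (previous, current) pair and each token used as a "previous" word.
bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])

def bigram_prob(prev, word):
    """Estimate P(word | prev) as count(prev, word) / count(prev)."""
    if context_counts[prev] == 0:
        return 0.0  # unseen context: no estimate available
    return bigram_counts[(prev, word)] / context_counts[prev]

print(bigram_prob("I", "am"))
print(bigram_prob("am", "excited"))
print(bigram_prob("excited", "."))
```

In this corpus "am" is followed once by "happy" and once by "excited", so the counted estimate for "excited" after "am" is one half.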

Step 3: Calculate the entropy: Entropy (H) measures the average number of bits needed to represent each token in the sequence. It’s calculated using the formula:

H(W) = - (1/N) * Σ[log2(P(wi | wi-1))]

Where N is the number of predicted tokens (3 in our example, because the first token “I” has no previous token to condition on), and wi-1 and wi are the previous and current tokens, respectively.

For our example, the entropy calculation is:

H("I am excited.") = - (1/3) * [log2(0.8) + log2(0.6) + log2(0.9)]

Step 4: Calculate perplexity: Perplexity (PP) is calculated as 2 raised to the power of the entropy:

Perplexity(W) = 2^H(W)

For our example:

Perplexity("I am excited.") = 2^H("I am excited.")

Now, let’s plug in the values and calculate:

H("I am excited.") = - (1/3) * [log2(0.8) + log2(0.6) + log2(0.9)] ≈ - (1/3) * (-1.211) ≈ 0.404
Perplexity("I am excited.") ≈ 2^0.404 ≈ 1.323

So, the perplexity of the sentence “I am excited.” according to our simple language model is approximately 1.323.
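A short script is one way to check the arithmetic in the steps above. It uses the same assumed probabilities from Step 2, and the dictionary name `bigram_probs` is just an illustrative choice.

```python
import math

# Assumed model probabilities from Step 2: P(current | previous).
bigram_probs = {
    ("I", "am"): 0.8,
    ("am", "excited"): 0.6,
    ("excited", "."): 0.9,
}

tokens = ["I", "am", "excited", "."]

# Look up the probability of each token given the one before it.
probs = [bigram_probs[(prev, cur)] for prev, cur in zip(tokens, tokens[1:])]

# Step 3: entropy is the average negative log2 probability.
n = len(probs)
entropy = -sum(math.log2(p) for p in probs) / n

# Step 4: perplexity is 2 raised to the entropy.
pp = 2 ** entropy

print(f"H = {entropy:.3f} bits per token")
print(f"Perplexity = {pp:.3f}")
```

The same result can be cross-checked another way: perplexity is also the inverse of the geometric mean of the token probabilities, here (0.8 × 0.6 × 0.9) raised to the power of -1/3.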

The lower the value, the more likely the content is AI-generated; the higher the value, the more likely it is human-written.

Thank you for reading this. Happy Learning 🙂
