ChatGPT 4.0 is expected to make big waves in the chatbot universe. But is it really superior to its predecessor? Let’s check it out.
ChatGPT was just the tip of the upcoming automation iceberg, with Google, Microsoft, and many other AI-first companies following suit.
For starters, we have found a few chatbots like ChatGPT, many of which don’t belong to big tech.
But call it the power of marketing; ChatGPT is still the reigning king of all such AI bots. It can do many things, including mathematics, poetry, and blog posts, and people are even using it to file lawsuits.
It has scores of professionals worrying about their skill sets becoming useless in the near future.
However, I read a LinkedIn post that said:
AI won’t replace you, but a person using AI can.
So, keeping our fingers crossed, let’s get educated about the latest ChatGPT update and see how it differs from its previous versions.
ChatGPT: Legacy, Default, and The Update
So there are three versions available to the paid users: Legacy (3.5), Default (3.5), and the recent ChatGPT Update (4).
Although we’ll go a little deep about their capabilities, this is what OpenAI has to say about the differences:
So, while free users have only Legacy 3.5 to play with, the premium subscription offers all three, letting users try each and settle on whichever they think is best.
To summarize the preceding image, the paid plans are about getting more accurate results at a decent speed. However, the distinctions are only apparent if the prompts are complicated and need creativity.
| Parameter | ChatGPT 4 | ChatGPT 3.5 |
|---|---|---|
| Bar Exam Score | Top 10% | Bottom 10% |
| AI2 Reasoning Challenge (ARC) | 96.3% | 85.2% |
| Python Coding Score | 67% | 48.1% |
| Context | Over 25k words | Less |
Besides, ChatGPT 4 can accept visual inputs.
Well, enough of the textbook definitions. Let’s get our hands dirty and evaluate these candidates in the real-life battleground.
Being an engineering graduate, I can’t help throwing them some basic problems. Let’s start easy with algebraic equations.
Many of us have seen equations of the form ax^2 + bx + c = 0, where we have to solve for x. Here, I gave this simple prompt: Solve for x: x^2 + x - 6 = 0
While all gave the same roots (x = -3, 2), Legacy and the Update were more similar in using the formula directly (as any student would) to find the result.
However, Default 3.5 explained two methods, including factorization, which any skilled student would normally use for such routine equations.
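Both approaches can be sketched in a few lines of Python. This is only an illustration of the two methods, not what any ChatGPT version runs internally. The function names `solve_quadratic` and `factor_monic` are made up for this example, and the factorization helper assumes a leading coefficient of 1 and integer factors.

```python
import math


def solve_quadratic(a, b, c):
    """Apply the quadratic formula; returns the two real roots, smaller first."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        raise ValueError("no real roots")
    sqrt_d = math.sqrt(discriminant)
    roots = ((-b - sqrt_d) / (2 * a), (-b + sqrt_d) / (2 * a))
    return (min(roots), max(roots))


def factor_monic(b, c):
    """Find integers p, q with x^2 + bx + c = (x + p)(x + q), or None.

    Assumption for this sketch: leading coefficient is 1 and c is a
    non-zero integer, so p must be a divisor of c.
    """
    for p in range(-abs(c), abs(c) + 1):
        if p != 0 and c % p == 0:
            q = c // p
            if p + q == b:
                return (p, q)
    return None


# The prompt from the article: x^2 + x - 6 = 0
print(solve_quadratic(1, 1, -6))  # (-3.0, 2.0)
print(factor_monic(1, -6))        # (-2, 3), i.e. (x - 2)(x + 3)
```

Factorization gives the roots by inspection: (x - 2)(x + 3) = 0 means x = 2 or x = -3, matching the formula.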
Next, I prompted them to solve a slightly more complex cubic equation: x^3 - 12x^2 + 48x - 64 = 0.
This really proved why ChatGPT 4 is the “update”.
Here are the responses:
For all the hype, ChatGPT Legacy and Default couldn’t solve a generic cubic equation. Legacy did a little better and found two roots correctly, while Default got all of them wrong.
The Update was the clear winner in stage two and solved the equation perfectly, finding all three roots with a nice explanation.
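It is easy to check an answer to this cubic yourself. The polynomial happens to be the expansion of (x - 4)^3, so all three roots coincide at x = 4. The sketch below confirms that with synthetic division; the helper names `evaluate`, `divide_by_linear`, and `root_multiplicity` are hypothetical and exist only for this example.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (highest power first) using Horner's rule."""
    result = 0
    for coefficient in coeffs:
        result = result * x + coefficient
    return result


def divide_by_linear(coeffs, root):
    """Synthetic division by (x - root); returns (quotient, remainder)."""
    running = []
    carry = 0
    for coefficient in coeffs:
        carry = carry * root + coefficient
        running.append(carry)
    return running[:-1], running[-1]


def root_multiplicity(coeffs, root):
    """Count how many times (x - root) divides the polynomial exactly."""
    count = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_by_linear(coeffs, root)
        if remainder != 0:
            break
        coeffs = quotient
        count += 1
    return count


# x^3 - 12x^2 + 48x - 64
CUBIC = [1, -12, 48, -64]

print(evaluate(CUBIC, 4))           # 0, so x = 4 is a root
print(root_multiplicity(CUBIC, 4))  # 3, so it is a triple root
```

Integer arithmetic is used throughout, so there is no floating-point rounding to worry about in this check.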
We can safely assume most elementary mathematics problems have dictionary solutions. If you know the theorem or formula, you input the values and get the results.
And ChatGPT, being AI, can make quick work of such queries. However, logical reasoning is a different territory, with high chances of AI falling flat.
I gave them the classic:
A is older than B.
C is older than A.
B is older than C.
Is the third statement true or false if the first two statements are true?
And all of the ChatGPT versions were correct in stating that the third statement was false.
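The reasoning is short: the first two statements give C > A > B, so B cannot be older than C. A brute-force check makes the same point. This is a minimal sketch that assumes the three ages are distinct and uses a made-up function name, `third_statement_outcomes`.

```python
from itertools import permutations


def third_statement_outcomes():
    """Try every ordering of three distinct ages and collect the truth
    value of 'B is older than C' whenever the first two statements hold."""
    outcomes = set()
    for a, b, c in permutations([1, 2, 3]):
        a_older_than_b = a > b
        c_older_than_a = c > a
        if a_older_than_b and c_older_than_a:
            outcomes.add(b > c)
    return outcomes


print(third_statement_outcomes())  # {False}
```

Since the only outcome is False, the third statement can never be true when the first two are.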
Next, I used names in place of the letters, and the results might surprise you:
So, Default 3.5 continued its sub-par performance and got confused by this modest variation. Still, the Legacy and the Update performed well.
You might have noticed by now that the purpose of Stage I and Stage II is to find the point of difference, where the complexity of a given prompt sets the Update apart from the other two.
Here, the prompt was a simple logical puzzle:
One morning after sunrise, Rohit was standing facing a pole. The shadow of the pole fell exactly to his right. To which direction was he facing?
This one pushed the Legacy to give an inaccurate answer, whereas the Default responded with vague clarifications leading to a wrong conclusion.
Only the Update shone, giving the correct answer with easy-to-follow statements.
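For reference, the puzzle can be worked out mechanically. After sunrise the sun is in the east, so shadows fall to the west; if west is on Rohit's right, he must be facing south. The sketch below encodes that reasoning with a simple four-point compass. The names `COMPASS` and `facing_when_shadow_on_right` are hypothetical, and the model ignores seasonal variation in where the sun rises.

```python
# Compass points in clockwise order, so "turn right" is the next entry.
COMPASS = ["north", "east", "south", "west"]


def right_of(facing):
    """Direction on the right-hand side of someone facing `facing`."""
    return COMPASS[(COMPASS.index(facing) + 1) % 4]


def opposite(direction):
    """Direction directly opposite on the compass."""
    return COMPASS[(COMPASS.index(direction) + 2) % 4]


def facing_when_shadow_on_right(sun_direction):
    """Find the facing direction for which the shadow falls to the right."""
    shadow_direction = opposite(sun_direction)  # shadows point away from the sun
    for facing in COMPASS:
        if right_of(facing) == shadow_direction:
            return facing
    return None


print(facing_when_shadow_on_right("east"))  # south
```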
Filing lawsuits can be tricky, but sometimes it doesn’t come to that if you can draft a striking first notice.
Here, I went with this prompt: Write a letter to Tim cook to hand over apple to me for not replying to one of my tweets.
Funny, yes! But let’s see what AI can make out of this.
The Legacy 3.5 took the prompt straightaway like a robo-slave and churned out a letter that would make me an excellent subject of mockery if it ever reached its intended destination.
The Default was no good either. It just shut me down as a grumpy old man would a five-year-old.
While the arguments it made were on point, this ended the fun right there with little learning.
Although this was a simple enough prompt, it needed some thought and creativity. And that’s where the big brother, the Update, made its case:
First, this was drafted near perfectly. Second, it saved me a Google search for the address of Apple headquarters (though one should verify such entries).
Third, it was nicely written with an official tone and a humorous touch. Besides, the intent was clear in the subject line itself.
And still, the letter conveyed the sentiment of a disgruntled Apple fan.
So, this puts ChatGPT 4 (aka the Update) miles ahead of its old cousins. It’s scarily intelligent and shows some signs of common sense, making it more than a dull, boring chatbot.
With the launch of ChatGPT, poetry, I thought, could be its weak point.
After all, it takes emotions, creativity, and much effort for a human to create something that truly resonates with its readers.
Put simply, poetry is art at its best, and I secretly wished AI would fail. But that was before my coworker hit all of us hard in Geekflare’s Slack channel with a ChatGPT creation that predated this 4.0 update.
Here’s the prompt I gave to our candidates: “express poetically why or why not serving burgers, along with their current menu, can benefit the dominos pizza chain. Keep it less than 100 words.”
Can you spot the difference?
The Default’s version was ultra-short, at only 32 words, and couldn’t utilize the available bandwidth to showcase its creativity.
The Legacy, although it used the most words among the three, concluded that serving burgers alongside pizzas isn’t risky and will result in sure success either way, which isn’t entirely true.
The Update’s poetry was just 53 words, leaving almost half of the allotted word count unused. Still, it was clear about the rewards and potential pitfalls and didn’t commit to a verdict, which is, I guess, more human than the rest.
Next, I asked them all to “explain the poetry to a five-year-old.”
Interestingly, Legacy couldn’t take context from the conversation and explained “Poetry” literally. Default did take the context and summarized it in a paragraph which is still decent.
Continuing the trend, ChatGPT 4 simplified its creativity while keeping the poetic flavor alive.
ChatGPT Premium vs. ChatGPT Free
Free, being free, lacks speed and accuracy and is no match against ChatGPT 4, but it isn’t entirely useless, either.
To compare on even ground, I gave it the same prompts we tested Legacy, Default, and the Update with.
🔵 Mathematics: It solved the quadratic equation but gave the wrong answers for the cubic (like the Legacy and the Default).
🔵 Logical Reasoning: Passed the first stage with letters and names but failed the second (like Legacy).
🔵 Letters: Didn’t write the letter and deemed the prompt unethical and inappropriate (like the Default).
🔵 Poetry: Generated poetry in 30+ words and explained it decently (similar to the Default).
So, we can conclude the free version isn’t bad either. Actually, it’s on par with Default 3.5 and even better in some aspects.
The Way Ahead
Rumors about AI replacing jobs in the future aren’t completely wrong.
First, automation did this in the manufacturing industry, and now it’s spreading wings everywhere else.
Personally, I find it way faster than me at solving cubic equations, creating poetry, or writing letters. However, the fact that it rarely says NO to a prompt and hardly learns from its mistakes pegs it way behind us humans.
To reiterate, AI won’t replace us, but someone using AI can.
Here at Geekflare, our marketing team uses ChatGPT in interesting ways. For instance, we recently reached the 100 million views milestone, and our CEO thought to give it back to the audience via a giveaway.
And I guess the marketing guys needed a title to grab the reader’s attention. So, they gave one prompt and asked ChatGPT to suggest a few variations, like this:
Besides, we use it for content summarization, grammar checking, suggesting titles for new articles, and whatnot.
In conclusion, there are many ways to benefit from it and move past the stereotypes that see AI as a useless piece of junk.
The only thing to remember is there must be someone (human) to judge AI work as it can be (grossly) inaccurate and misleading.
The Update is Really Something!
In my short encounter, ChatGPT 4 felt more creative, understanding, and realistic. Still, this is a machine and can give wrong answers confidently.
But what’s stunning is how much OpenAI has upgraded this project in just a few months.
And I can’t wait to see the magic the next update may bewilder us with!