

Image by Author
# Introduction
It seems like almost every week, a new model claims to be state-of-the-art, beating existing AI models on all benchmarks.
I get free access to the latest AI models at my full-time job within weeks of release. I typically don’t pay much attention to the hype and simply use whichever model is auto-selected by the system.
However, I know developers and friends who want to build AI-assisted software that can be shipped to production. Since these projects are self-funded, their challenge lies in finding the best model for the job. They want to balance cost with reliability.
Because of this, after the release of GPT-5.2, I decided to run a practical test to understand whether this model was worth the hype, and whether it really was better than the competition.
Specifically, I chose to test flagship models from each provider: Claude Opus 4.5 (Anthropic’s most capable model), GPT-5.2 Pro (OpenAI’s latest extended reasoning model), and DeepSeek V3.2 (one of the latest open-source alternatives).
To put these models to the test, I had each of them build a playable Tetris game from a single prompt.
These were the metrics I used to evaluate each model:
| Criteria | Description |
|---|---|
| First Attempt Success | With just one prompt, did the model deliver working code? Multiple debugging iterations lead to higher cost over time, which is why this metric was chosen. |
| Feature Completeness | Were all the features mentioned in the prompt built by the model, or was anything missed? |
| Playability | Beyond the technical implementation, was the game actually smooth to play? Or were there issues that created friction in the user experience? |
| Cost-effectiveness | How much did it cost to get production-ready code? |
# The Prompt
Here is the prompt I entered into each AI model:
Build a fully functional Tetris game as a single HTML file that I can open directly in my browser.
Requirements:
GAME MECHANICS:
– All 7 Tetris piece types
– Smooth piece rotation with wall kick collision detection
– Pieces should fall automatically, increasing in speed gradually as the user’s score increases
– Line clearing with visual animation
– “Next piece” preview box
– Game over detection when pieces reach the top
CONTROLS:
– Arrow keys: Left/Right to move, Down to drop faster, Up to rotate
– Touch controls for mobile: Swipe left/right to move, swipe down to drop, tap to rotate
– Spacebar to pause/unpause
– Enter key to restart after game over
VISUAL DESIGN:
– Gradient colors for each piece type
– Smooth animations when pieces move and lines clear
– Clean UI with rounded corners
– Update scores in real time
– Level indicator
– Game over screen with final score and restart button
GAMEPLAY EXPERIENCE AND POLISH:
– Smooth 60fps gameplay
– Particle effects when lines are cleared (optional but impressive)
– Increase the score based on the number of lines cleared simultaneously
– Grid background
– Responsive design
Make it visually polished and feel satisfying to play. The code should be clean and well-organized.
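To give a sense of what the prompt is asking for technically, here is a minimal, purely illustrative sketch of the fixed-timestep loop a single-file Tetris implementation typically builds around. This is my own sketch, not output from any of the models, and `movePieceDown` and `draw` are hypothetical stubs:

```typescript
// Minimal fixed-timestep gravity loop for a browser Tetris (illustrative skeleton only).
let dropInterval = 1000; // ms per row; a real game shrinks this as the score rises
let dropCounter = 0;
let lastTime = 0;

function movePieceDown(): void { /* hypothetical: advance the active piece one row, lock if blocked */ }
function draw(): void { /* hypothetical: repaint the board, next-piece preview, score, level */ }

function loop(time: number): void {
  const delta = time - lastTime; // ms since the previous frame
  lastTime = time;
  dropCounter += delta;
  if (dropCounter >= dropInterval) { // gravity tick, independent of frame rate
    movePieceDown();
    dropCounter = 0;
  }
  draw();
  requestAnimationFrame(loop); // ~60fps render loop
}

requestAnimationFrame(loop);
```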
# The Results
// 1. Claude Opus 4.5
The Opus 4.5 model built exactly what I asked for.
The UI was clean and the instructions were displayed clearly on the screen. All of the controls were responsive and the game was fun to play.
The gameplay was so smooth that I actually ended up playing for quite a while and got sidetracked from testing the other models.
Opus 4.5 also took less than 2 minutes to produce this working game, leaving me impressed on the first try.


Tetris game built by Opus 4.5
// 2. GPT-5.2 Pro
GPT-5.2 Pro is OpenAI’s latest model with extended reasoning. For context, GPT-5.2 comes in three tiers: Instant, Thinking, and Pro. At the time of writing, GPT-5.2 Pro is their most intelligent model, providing extended thinking and reasoning capabilities.
It is also 4x more expensive than Opus 4.5.
There was a lot of hype around this model, so I went in with high expectations.
Unfortunately, I was underwhelmed by the game this model produced.
On the first try, GPT-5.2 Pro produced a Tetris game with a layout bug. The bottom rows of the game were outside the viewport, and I couldn’t see where the pieces were landing.
This made the game unplayable, as shown in the screenshot below:


Tetris game built by GPT-5.2 Pro
I was especially surprised by this bug since the model took around 6 minutes to produce the code.
I decided to try again with this follow-up prompt to fix the viewport problem:
The game works, but there is a bug. The bottom rows of the Tetris board are cut off at the bottom of the screen. I cannot see the pieces when they land, and the canvas extends beyond the visible viewport.
Please fix this by:
1. Making sure the entire game board fits in the viewport
2. Adding proper centering so the full board is visible
The game should fit on the screen with all rows visible.
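For reference, the usual way to avoid this kind of cut-off is to derive the canvas size from the viewport rather than hard-coding it. A rough sketch of that idea follows (my own hypothetical code, not GPT-5.2 Pro’s actual output; the `#board` selector and the 120px UI headroom are assumptions):

```typescript
// Size the board so all 20 rows fit inside the visible viewport.
const ROWS = 20;
const COLS = 10;

function fitBoardToViewport(canvas: HTMLCanvasElement): void {
  const maxHeight = window.innerHeight - 120; // assumed headroom for score/preview UI
  const maxWidth = window.innerWidth * 0.9;
  const cell = Math.floor(Math.min(maxHeight / ROWS, maxWidth / COLS)); // largest cell that fits
  canvas.height = cell * ROWS;
  canvas.width = cell * COLS;
}

const board = document.querySelector<HTMLCanvasElement>("#board"); // hypothetical canvas id
if (board) {
  fitBoardToViewport(board);
  window.addEventListener("resize", () => fitBoardToViewport(board));
}
```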
After the follow-up prompt, GPT-5.2 Pro produced a functional game, as seen in the screenshot below:


Tetris game, second attempt by GPT-5.2 Pro
However, the gameplay wasn’t as smooth as that of the Opus 4.5 version.
When I pressed the “down” arrow to drop a piece, the next piece would sometimes plummet instantly at high speed, not giving me enough time to think about where to place it.
The game ended up being playable only if I let each piece fall on its own, which wasn’t the best experience.
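For comparison, a conventional soft drop moves the active piece down a single row per key press (or per key-repeat tick) rather than slamming it to the floor. Here is a minimal sketch of that behaviour (my own hypothetical code, not the model’s output):

```typescript
// Soft drop: each ArrowDown event moves the piece exactly one row,
// so holding the key speeds up the fall without losing control of the piece.
function movePieceDown(): void { /* hypothetical: shift the active piece down one row */ }

let dropCounter = 0; // gravity timer shared with the main loop

document.addEventListener("keydown", (event: KeyboardEvent) => {
  if (event.key === "ArrowDown") {
    movePieceDown();
    dropCounter = 0; // restart the gravity timer so the piece doesn't double-step
  }
});
```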
(Note: I tried the GPT-5.2 Standard model too, and it produced similarly buggy code on the first try.)
// 3. DeepSeek V3.2
DeepSeek’s first attempt at building this game had two issues:
- Pieces started disappearing when they hit the bottom of the screen.
- The “down” arrow used to drop pieces faster ended up scrolling the entire webpage rather than just moving the game pieces (a fix for this is sketched below).


Tetris game built by DeepSeek V3.2
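The page-scrolling problem is a classic browser-game pitfall: arrow keys and the spacebar scroll the document by default unless the key handler suppresses that behaviour. Here is a minimal sketch of the usual fix (my own hypothetical code, not DeepSeek’s output; `handleGameInput` is a placeholder):

```typescript
// Stop arrow keys and the spacebar from scrolling the page while the game is active.
const GAME_KEYS = ["ArrowLeft", "ArrowRight", "ArrowDown", "ArrowUp", " "];

function handleGameInput(key: string): void { /* hypothetical: route to move/rotate/drop/pause logic */ }

document.addEventListener("keydown", (event: KeyboardEvent) => {
  if (GAME_KEYS.includes(event.key)) {
    event.preventDefault(); // keep the viewport still
    handleGameInput(event.key);
  }
});
```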
I re-prompted the model to fix this issue, and the gameplay controls ended up working correctly.
However, some pieces still disappeared before they landed. This made the game completely unplayable even after the second iteration.
I’m sure this issue could be fixed with 2–3 more prompts, and given DeepSeek’s low pricing, you could afford 10+ debugging rounds and still spend less than one successful Opus 4.5 attempt.
# Summary: GPT-5.2 vs Opus 4.5 vs DeepSeek V3.2
// Cost Breakdown
Here is a cost comparison across the models:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| DeepSeek V3.2 | $0.27 | $1.10 |
| GPT-5.2 | $1.75 | $14.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| GPT-5.2 Pro | $21.00 | $84.00 |
DeepSeek V3.2 is the cheapest alternative, and you can also download the model’s weights for free and run it on your own infrastructure.
GPT-5.2 is almost 7x more expensive than DeepSeek V3.2, followed by Opus 4.5 and GPT-5.2 Pro.
For this particular task (building a Tetris game), each run consumed roughly 1,000 input tokens and 3,500 output tokens.
For each additional iteration, we will estimate an extra 1,500 tokens per round. Here is the total cost incurred per model:
| Model | Total Cost | Result |
|---|---|---|
| DeepSeek V3.2 | ~$0.005 | Game is not playable |
| GPT-5.2 | ~$0.07 | Playable, but poor user experience |
| Claude Opus 4.5 | ~$0.09 | Playable and good user experience |
| GPT-5.2 Pro | ~$0.41 | Playable, but poor user experience |
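These totals come from straightforward per-token arithmetic: tokens divided by one million, multiplied by the listed rate, with roughly 1,500 extra tokens added per follow-up round. Here is a quick sketch of that calculation using the prices from the table above (my own code; counting the extra follow-up tokens as output tokens is an assumption):

```typescript
// Estimate a run's cost from token counts and per-million-token prices.
interface Pricing { inputPerM: number; outputPerM: number; }

function runCost(inputTokens: number, outputTokens: number, price: Pricing): number {
  return (inputTokens / 1_000_000) * price.inputPerM
       + (outputTokens / 1_000_000) * price.outputPerM;
}

// Claude Opus 4.5, single attempt: ~1,000 input + ~3,500 output tokens.
const opus: Pricing = { inputPerM: 5.0, outputPerM: 25.0 };
console.log(runCost(1_000, 3_500, opus)); // ≈ 0.0925, i.e. the ~$0.09 in the table

// GPT-5.2 with one follow-up round (~1,500 extra tokens, assumed to be output tokens):
const gpt52: Pricing = { inputPerM: 1.75, outputPerM: 14.0 };
console.log(runCost(1_000, 3_500 + 1_500, gpt52)); // ≈ 0.072, i.e. the ~$0.07 in the table
```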
# Takeaways
Based on my experience building this game, I would stick with the Opus 4.5 model for day-to-day coding tasks.
Although GPT-5.2 is cheaper than Opus 4.5, I personally wouldn’t use it for coding, since the iterations required to reach the same result would likely end up costing about the same amount of money.
DeepSeek V3.2, however, is far more affordable than the other models on this list.
If you’re a developer on a budget and have time to spare on debugging, you’ll still end up saving money even if it takes you over 10 tries to get working code.
I was surprised at GPT-5.2 Pro’s inability to produce a working game on the first try, given that it spent around 6 minutes thinking before coming up with flawed code. After all, this is OpenAI’s flagship model, and Tetris should be a relatively simple task.
However, GPT-5.2 Pro’s strengths lie in math and scientific research, and it is specifically designed for problems that don’t rely on pattern recognition from training data. Perhaps this model is over-engineered for simple day-to-day coding tasks and is better reserved for building something complex that requires novel architecture.
The practical takeaways from this experiment:
- Opus 4.5 performs best at day-to-day coding tasks.
- DeepSeek V3.2 is a budget alternative that delivers reasonable output, although it requires some debugging effort to reach the desired outcome.
- GPT-5.2 (Standard) didn’t perform as well as Opus 4.5, while GPT-5.2 (Pro) is probably better suited to complex reasoning than to quick coding tasks like this one.
Feel free to replicate this test with the prompt I’ve shared above, and happy coding!
 
 
Natassha Selvaraj is a self-taught data scientist with a passion for writing. Natassha writes on everything data science-related, a true master of all data topics. You can connect with her on LinkedIn or check out her YouTube channel.

