Tuesday, October 14, 2025
newsaiworld

This Puzzle Shows Just How Far LLMs Have Progressed in a Little Over a Year

Admin by Admin
October 8, 2025
in Artificial Intelligence

We all know that the capabilities of LLMs have progressed dramatically in the last couple of years, but it’s hard to quantify just how good they’ve become.

That got me thinking back to a geometrical problem I came across on a YouTube channel last year. This was in June 2024, and I tried to get the leading large language model at the time (GPT-4o) to solve the puzzle. It didn’t go that well and required a lot of effort to find a solution, and I wondered how the latest LLMs would fare with the same puzzle.

The puzzle

Here’s a quick reminder of what I was asking the LLM to solve back then. Assume we have the following grid of dots/nodes. In the x and y plane, each node is exactly one unit away from its adjacent neighbour. It looks like this: a cross-shaped arrangement of 20 dots, with two dots in each of the top two and bottom two rows, and six dots in each of the two middle rows.
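As a rough text rendering of that grid, here is a minimal Python sketch. The coordinates match the ones used by the program shown later in this article; the helper name `render_grid` is an assumption introduced purely for illustration.

```python
# Grid coordinates: columns 3-4 span rows 1-6, and rows 3-4 span columns 1-6.
points = {(x, y) for x in (3, 4) for y in range(1, 7)} | {
    (x, y) for x in range(1, 7) for y in (3, 4)
}


def render_grid(points):
    """Return a text picture of the grid, top row first ('o' marks a dot)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    rows = []
    for y in range(max(ys), min(ys) - 1, -1):
        cells = ["o" if (x, y) in points else " " for x in range(min(xs), max(xs) + 1)]
        rows.append(" ".join(cells).rstrip())
    return "\n".join(rows)


print(render_grid(points))
```

Printing the result gives six rows of text in which the top two and bottom two rows hold two dots each and the middle two rows hold six dots each.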

Now, the question I wanted to answer was this,

How many distinct squares can be drawn on this diagram?

It quickly became clear that GPT-4o didn’t know the answer, so I changed tack slightly and instead asked it this.

I would like a Python program that plots out all the squares we can 
draw on the attached diagram, assuming that the corners of any square 
must lie on one of the spots on the diagram. Assume each adjacent spot is 
1 unit apart in both the x and y directions. Also print out a summary of 
the number of squares of the same size and what their side lengths are

Long story short, I eventually got GPT-4o to come up with a correct Python-based solution. However, it took me around two hours and well over 40 iterations of going back and forth with the model to refine its answer until it came up with something that worked.

NB, have you worked out the answer yet? Even now, I still can’t quite believe that it’s 21 squares you can draw on this graph.

Interaction with Sonnet 4.5

This brings us back to the present day and Anthropic’s recent release of what’s considered the best coding model available today, namely Claude Sonnet 4.5. It’s been around 16 months since I last looked at this, so just how much have LLMs improved in the intervening period? Let’s see how Sonnet 4.5 copes with the puzzle.

To use Sonnet 4.5, I visited Anthropic’s homepage.

https://www.anthropic.com

You should see a page like this.

Image from the Anthropic website

Click the large black “Try Claude” button located near the top right of the screen. You’ll be asked to log in at this point. If you don’t already have an account with Anthropic, please set one up now. It’s completely free to do so.

Once you’re logged in, you’ll be taken to a screen displaying a familiar chat-based window that allows you to interact with and ask questions of Sonnet 4.5.

Before I asked the model to provide a Python program to solve the puzzle, I asked it a straight question to see if it already knew the answer.

It came back with,

Looking at this grid, I need to count all possible squares of different 
sizes that can be formed by connecting the dots.
Let me first identify the grid structure. The dots form:

2 dots in the top row
2 dots in the second row
6 dots in the third row
6 dots in the fourth row
2 dots in the fifth row
2 dots in the sixth row

This appears to be a cross or plus-sign shaped grid pattern.
Now I'll count squares by size:
1×1 squares: These are formed by any 4 dots that make a unit square.

In the horizontal middle section (rows 3-4): 5 squares per row × 1 = 5 squares
In the vertical middle section (columns where the cross intersects): counting 
the vertical segments gives us 5 more squares
Total: 10 small squares

2×2 squares: These need 4×4 arrangements of dots.

In the horizontal section: 4 squares
In the vertical section: 4 squares
Overlap in center: -1 (counted twice)
Total: 7 squares of size 2×2

3×3 squares and larger: The grid pattern is only 6 dots wide at its maximum, but the cross shape limits where larger squares can fit. Checking the geometry:

One 3×3 square can fit in the center where the arms of the cross overlap
Total: 1 square of size 3×3

Total: 10 + 7 + 1 = 18 squares

So, confidently wrong. Curious and intrigued about the model’s reference to a 3×3 square (which is impossible on this grid), I asked it to show me the square on the diagram. It replied with this.

Looking at the grid more carefully, I need to reconsider whether a 3×3 square 
actually exists.
For a 3×3 square to exist, I would need a 4×4 arrangement of dots 
(since a 3×3 square requires 4 dots on each side). Let me check the grid 
structure:
The grid appears to be:

Rows 1-2: only 2 dots each (centered)
Rows 3-4: 6 dots each (full width)
Rows 5-6: only 2 dots each (centered)

For a 3×3 square, I would need 4 consecutive rows and 4 consecutive columns all with dots at the intersection points. However, looking at the cross/plus pattern:

The top only has 2 columns of dots
The middle has 6 columns of dots
The bottom only has 2 columns of dots

Since rows 1-2 and rows 5-6 only have 2 dots each (not spanning the full width), 
there isn't actually a 4×4 block of dots anywhere in this grid.
I need to correct my answer: There is NO 3×3 square possible in this grid.
The total should be: 10 (1×1 squares) + 7 (2×2 squares) = 17 squares
I apologize for the error in my initial count. The cross-shaped pattern 
prevents any square larger than 2×2 from being formed.

Alright, it semi-redeemed itself with that answer, but fundamentally, it didn’t correctly say there were 21 squares in total.
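The model’s counts of axis-aligned squares can be checked directly. The sketch below is a minimal, hypothetical check (the function name `count_axis_aligned` is an assumption for illustration, not something either model produced). It counts squares whose sides are parallel to the axes, for each whole-number side length, using the same grid coordinates as the program later in this article.

```python
# Grid coordinates: columns 3-4 span rows 1-6, and rows 3-4 span columns 1-6.
points = {(x, y) for x in (3, 4) for y in range(1, 7)} | {
    (x, y) for x in range(1, 7) for y in (3, 4)
}


def count_axis_aligned(points, max_side=5):
    """Count axis-aligned squares of each integer side length 1..max_side."""
    counts = {}
    for k in range(1, max_side + 1):
        # Treat (x, y) as the bottom-left corner and test the other three corners.
        counts[k] = sum(
            1
            for (x, y) in points
            if (x + k, y) in points
            and (x, y + k) in points
            and (x + k, y + k) in points
        )
    return counts


axis_counts = count_axis_aligned(points)
print(axis_counts)  # {1: 9, 2: 0, 3: 0, 4: 0, 5: 0}
```

So only nine axis-aligned squares exist, all of them 1×1, and there are no 2×2 squares at all. The remaining 12 of the 21 squares are tilted relative to the axes, which is what a row-and-column style of counting misses.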

The final answer

To keep things fair with the model comparison over the 16 months, I asked Sonnet 4.5 the exact same question I had asked GPT-4o, which, as a reminder, was this.

I would like a Python program that plots out all the squares we can draw on 
the attached diagram, assuming that the corners of any square must lie on one 
of the spots on the diagram. Assume each adjacent spot is 1 unit apart in 
both the x and y directions. Also print out a summary of the number of 
squares of the same size and what their side lengths are

Sonnet 4.5 thought for all of 5 seconds before producing its Python code.

import matplotlib.pyplot as plt
import numpy as np
from itertools import combinations
from collections import defaultdict

# Define the grid points based on the image
points = [
    # Top two rows (2 dots each, centered)
    (3, 6), (4, 6),
    (3, 5), (4, 5),
    # Middle two rows (6 dots each, full width)
    (1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4),
    (1, 3), (2, 3), (3, 3), (4, 3), (5, 3), (6, 3),
    # Bottom two rows (2 dots each, centered)
    (3, 2), (4, 2),
    (3, 1), (4, 1),
]

def is_square(p1, p2, p3, p4):
    """Check if 4 points form a square and return side length if true."""
    points = [p1, p2, p3, p4]

    # Calculate all pairwise distances
    distances = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = np.sqrt((points[i][0] - points[j][0])**2 + 
                          (points[i][1] - points[j][1])**2)
            distances.append(dist)

    distances.sort()

    # For a square: 4 equal sides and 2 equal diagonals
    # Diagonals should be sqrt(2) times the side length
    if len(distances) == 6:
        sides = distances[:4]
        diagonals = distances[4:]

        # Check if all sides are equal
        if np.allclose(sides, [sides[0]] * 4, atol=1e-6):
            # Check if diagonals are equal
            if np.allclose(diagonals, [diagonals[0]] * 2, atol=1e-6):
                # Check if diagonal = side * sqrt(2)
                if np.allclose(diagonals[0], sides[0] * np.sqrt(2), atol=1e-6):
                    return True, sides[0]

    return False, 0

def order_square_points(p1, p2, p3, p4):
    """Order points of a square in counterclockwise order starting from bottom-left."""
    points = [p1, p2, p3, p4]
    # Find centroid
    cx = sum(p[0] for p in points) / 4
    cy = sum(p[1] for p in points) / 4

    # Sort by angle from centroid
    def angle(p):
        return np.arctan2(p[1] - cy, p[0] - cx)

    points.sort(key=angle)
    return points

# Find all squares
squares = []
squares_by_size = defaultdict(list)

for combo in combinations(points, 4):
    is_sq, side_length = is_square(*combo)
    if is_sq:
        ordered = order_square_points(*combo)
        squares.append(ordered)
        squares_by_size[round(side_length, 2)].append(ordered)

# Print summary
print("=" * 50)
print("SUMMARY OF SQUARES FOUND")
print("=" * 50)
total_squares = 0
for side_length in sorted(squares_by_size.keys()):
    count = len(squares_by_size[side_length])
    total_squares += count
    print(f"Side length {side_length}: {count} square(s)")

print(f"\nTotal squares: {total_squares}")
print("=" * 50)

# Create visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 14))
axes = axes.flatten()

# Plot 1: All dots
ax = axes[0]
x_coords = [p[0] for p in points]
y_coords = [p[1] for p in points]
ax.scatter(x_coords, y_coords, c='black', s=100, zorder=3)
ax.set_xlim(0, 7)
ax.set_ylim(0, 7)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.set_title('Original Grid', fontsize=14, fontweight='bold')
ax.set_xlabel('X')
ax.set_ylabel('Y')

# Plot 2: All squares overlaid
ax = axes[1]
ax.scatter(x_coords, y_coords, c='black', s=100, zorder=3)
colors = plt.cm.rainbow(np.linspace(0, 1, len(squares)))
for square, color in zip(squares, colors):
    square_closed = square + [square[0]]
    xs = [p[0] for p in square_closed]
    ys = [p[1] for p in square_closed]
    ax.plot(xs, ys, c=color, linewidth=2, alpha=0.6)
ax.set_xlim(0, 7)
ax.set_ylim(0, 7)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.set_title(f'All {total_squares} Squares Overlaid', fontsize=14, fontweight='bold')
ax.set_xlabel('X')
ax.set_ylabel('Y')

# Plot 3 & 4: Squares by size
plot_idx = 2
for side_length in sorted(squares_by_size.keys()):
    if plot_idx >= len(axes):
        break

    ax = axes[plot_idx]
    ax.scatter(x_coords, y_coords, c='lightgray', s=100, zorder=2)

    squares_of_size = squares_by_size[side_length]
    colors = plt.cm.viridis(np.linspace(0, 1, len(squares_of_size)))

    for square, color in zip(squares_of_size, colors):
        square_closed = square + [square[0]]
        xs = [p[0] for p in square_closed]
        ys = [p[1] for p in square_closed]
        ax.plot(xs, ys, c=color, linewidth=2.5, marker='o', markersize=8)

    ax.set_xlim(0, 7)
    ax.set_ylim(0, 7)
    ax.set_aspect('equal')
    ax.grid(True, alpha=0.3)
    ax.set_title(f'Squares with side length {side_length} ({len(squares_of_size)} total)', 
                 fontsize=14, fontweight='bold')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    plot_idx += 1

plt.tight_layout()
plt.show()

At first sight, I thought there was no way Sonnet could have come up with a correct solution so quickly. GPT-4o had taken hours. However, I installed all the libraries the code needed, pasted it into a Jupyter Notebook cell, ran it, and obtained the following output. I’ve omitted the graphs for now, which were also output.

==================================================
SUMMARY OF SQUARES FOUND
==================================================
Side length 1.0: 9 square(s)
Side length 1.41: 4 square(s)
Side length 2.24: 2 square(s)
Side length 2.83: 4 square(s)
Side length 3.61: 2 square(s)

Total squares: 21
==================================================

#
# Plus some graphs that I'm not showing here
#

That shocked me. The answer was absolutely spot on.
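For an independent cross-check of that total, here is a minimal sketch of a different technique from the brute-force search in Sonnet’s code. Instead of testing every combination of four dots, it takes each ordered pair of dots as one edge, rotates that edge by 90 degrees, and checks whether the two resulting corners are also dots. The function names `find_squares` and `side_squared` are assumptions introduced for illustration. Working with squared side lengths keeps the arithmetic in exact integers.

```python
from collections import Counter
from math import sqrt

# Grid coordinates: columns 3-4 span rows 1-6, and rows 3-4 span columns 1-6.
points = {(x, y) for x in (3, 4) for y in range(1, 7)} | {
    (x, y) for x in range(1, 7) for y in (3, 4)
}


def find_squares(points):
    """Return every square (as a frozenset of 4 corners) with corners on the grid."""
    found = set()
    for (px, py) in points:
        for (qx, qy) in points:
            if (px, py) == (qx, qy):
                continue
            dx, dy = qx - px, qy - py
            # Rotate the edge p->q by 90 degrees counterclockwise: (dx, dy) -> (-dy, dx).
            r = (qx - dy, qy + dx)
            s = (px - dy, py + dx)
            if r in points and s in points:
                # A frozenset removes the duplicates found from each of the 4 edges.
                found.add(frozenset([(px, py), (qx, qy), r, s]))
    return found


def side_squared(square):
    """Squared side length: the smallest squared distance between two corners."""
    corners = list(square)
    return min(
        (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
        for i, a in enumerate(corners)
        for b in corners[i + 1:]
    )


squares = find_squares(points)
by_side_squared = Counter(side_squared(sq) for sq in squares)

for s2 in sorted(by_side_squared):
    print(f"side = sqrt({s2}) ~ {round(sqrt(s2), 2)}: {by_side_squared[s2]} square(s)")
print(f"Total squares: {len(squares)}")
```

The squared side lengths come out as 1, 2, 5, 8 and 13, so the rounded values 1.41, 2.24, 2.83 and 3.61 in the output above are √2, √5, √8 and √13, and the total again comes to 21.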

The only slight thing the model didn’t quite get right was that it didn’t output a plot of each set of differently sized squares. It just did the nine 1×1s and the four √2×√2 ones. I solved that by asking Sonnet to include those, too.

Can you print the graphs in square side order. Also can you have two graphs  
side by side on each "line"

This is what it produced.

Beautiful.
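The revised plotting code is not reproduced here. As a rough sketch of the layout that follow-up prompt asks for, and not of what Sonnet actually generated, the helper below (a hypothetical name, `subplot_layout`) works out how many rows are needed when placing two plots per row and which row and column each plot lands in, with plots ordered by side length.

```python
import math


def subplot_layout(n_plots, per_row=2):
    """Return (rows, positions), where positions[i] is the (row, column) of plot i."""
    rows = math.ceil(n_plots / per_row)
    positions = [divmod(i, per_row) for i in range(n_plots)]
    return rows, positions


# Side lengths as reported in the program's summary, sorted into ascending order.
side_lengths = sorted([2.83, 1.0, 3.61, 1.41, 2.24])
rows, positions = subplot_layout(len(side_lengths))

for side, (row, col) in zip(side_lengths, positions):
    print(f"side length {side}: row {row}, column {col}")
```

With five distinct side lengths this gives three rows, with the last row holding a single plot. In matplotlib, a grid of that shape can be created with `plt.subplots(rows, 2)`, and the unused sixth panel can be hidden with `set_visible(False)`.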

Summary

To demonstrate just how dramatically LLMs have advanced in about a year, I decided to revisit a challenging geometric puzzle I first tried to solve with GPT-4o back in June 2024. The puzzle was to write a Python program that finds and plots all possible squares on a particular cross-shaped grid of dots.

My experience a little over a year ago was a struggle; it took me roughly two hours and over 40 prompts to guide GPT-4o to a correct Python solution.

Fast forward to today, and I tested the new Claude Sonnet 4.5. When I first asked the model the question directly, it failed to calculate the correct number of squares. Not a great start; however, the real test was giving it the exact same prompt I used on GPT-4o.

To my surprise, it produced a complete, correct Python solution in one shot. The code it generated not only found all 21 squares but also correctly categorised them by their distinct side lengths and generated detailed plots to visualise them. While I needed one quick follow-up prompt to perfect the plots, the core problem was solved instantly.

Could it be that the very act of my trying to solve this puzzle last year and publishing my findings introduced it to the web-o-sphere, meaning Anthropic have simply crawled it and included it in their model’s knowledge base? Yes, I suppose that could be it, but then why couldn’t the model correctly answer the first direct question I asked it about the total number of squares?

To me, this experiment starkly illustrates the incredible leap in LLM capability. What was once a two-hour iterative struggle with the leading model of its time 16 months ago is now a five-second, one-shot success with the leading model today.
