The other day, I came across a library I’d never heard of before. It was called NumExpr.
I was immediately interested because of some claims made about the library. In particular, it stated that for some complex numerical calculations, it was up to 15 times faster than NumPy.
I was intrigued because, up until now, NumPy has remained unchallenged in its dominance in the numerical computation space in Python. In Data Science in particular, NumPy is a cornerstone for machine learning, exploratory data analysis and model training. Anything we can use to squeeze out every last bit of performance in our systems will be welcomed. So, I decided to put the claims to the test myself.
You can find a link to the NumExpr repository at the end of this article.
What is NumExpr?
According to its GitHub page, NumExpr is a fast numerical expression evaluator for NumPy. Using it, expressions that operate on arrays are accelerated and use less memory than performing the same calculations in Python with other numerical libraries, such as NumPy.
In addition, as it is multithreaded, NumExpr can use all your CPU cores, which generally results in substantial performance scaling compared to NumPy.
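To make the memory claim a little more concrete, here is a rough NumPy-only sketch of the general idea of evaluating an expression block by block instead of building full-size temporary arrays. It is an illustration only, not NumExpr's actual implementation; the function name chunked_weighted_sum and the chunk_size value are hypothetical.

```python
import numpy as np

def chunked_weighted_sum(a, b, chunk_size=4096):
    """Compute 2*a + 3*b one block at a time (illustrative only)."""
    out = np.empty_like(a)
    for start in range(0, a.size, chunk_size):
        stop = start + chunk_size
        # Temporaries created here are at most chunk_size elements long,
        # instead of being as large as the full input arrays.
        out[start:stop] = 2 * a[start:stop] + 3 * b[start:stop]
    return out

rng = np.random.default_rng(0)
a = rng.random(100_000)
b = rng.random(100_000)

full = 2 * a + 3 * b                  # NumPy builds full-size temporaries for 2*a and 3*b
chunked = chunked_weighted_sum(a, b)  # same arithmetic, small temporaries
print(np.array_equal(full, chunked))
```

Written as a Python loop, this version pays interpreter overhead on every block, so it is not a speed-up in itself. The point is only that small blocks keep the intermediate results small enough to stay in the CPU cache.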
Setting up a development environment
Before we start coding, let’s set up our development environment. The best practice is to create a separate Python environment where you can install any necessary software and experiment with coding, knowing that anything you do in this environment won’t affect the rest of your system. I use conda for this, but you can use whatever method you know best that suits you.
If you want to go down the Miniconda route and don’t already have it, you must install Miniconda first. Get it using this link:
https://www.anaconda.com/docs/fundamental
1/ Create our new dev environment and install the required libraries
(base) $ conda create -n numexpr_test python=3.12 -y
(base) $ conda activate numexpr_test
(numexpr_test) $ pip install numexpr
(numexpr_test) $ pip install jupyter
(numexpr_test) $ pip install scipy pillow matplotlib
2/ Start Jupyter
Now type jupyter notebook into your command prompt. You should see a Jupyter Notebook open in your browser. If that doesn’t happen automatically, you’ll likely see a screenful of information after the jupyter notebook command. Near the bottom, you will find a URL that you should copy and paste into your browser to launch the Jupyter Notebook.
Your URL will be different to mine, but it should look something like this:
http://127.0.0.1:8888/tree?token=3b9f7bd07b6966b41b68e2350721b2d0b6f388d248cc69
Comparing NumExpr and NumPy performance
To compare the performance, we’ll run a series of numerical computations using NumPy and NumExpr, and time both systems.
Example 1 — A simple array addition calculation
In this example, we run a vectorised addition of two large arrays 5000 times.
import numpy as np
import numexpr as ne
import timeit
a = np.random.rand(1000000)
b = np.random.rand(1000000)
# Using timeit with lambda functions
time_np_expr = timeit.timeit(lambda: 2*a + 3*b, number=5000)
time_ne_expr = timeit.timeit(lambda: ne.evaluate("2*a + 3*b"), number=5000)
print(f"Execution time (NumPy): {time_np_expr} seconds")
print(f"Execution time (NumExpr): {time_ne_expr} seconds")
>>>>>>>>>>>
Execution time (NumPy): 12.03680682599952 seconds
Execution time (NumExpr): 1.8075962659931974 seconds
I have to say, that’s a pretty impressive start from the NumExpr library already. I make that roughly a 6.7 times improvement over the NumPy runtime.
Let’s double-check that both operations return the same result set.
# Arrays to store the results
result_np = 2*a + 3*b
result_ne = ne.evaluate("2*a + 3*b")
# Ensure the two new arrays are equal
arrays_equal = np.array_equal(result_np, result_ne)
print(f"Arrays equal: {arrays_equal}")
>>>>>>>>>>>>
Arrays equal: True
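A note on the check itself: np.array_equal demands exact, bit-for-bit equality, which happened to hold here. Two libraries can legitimately differ in the last bits of a floating-point result, so a tolerance-based comparison such as np.allclose is a common, more forgiving alternative. The small sketch below uses made-up values purely to show the difference between the two checks.

```python
import numpy as np

# Two values that are mathematically equal but differ in floating point.
x = np.array([0.1 + 0.2])
y = np.array([0.3])

print(np.array_equal(x, y))  # exact comparison
print(np.allclose(x, y))     # comparison within a small tolerance
```

If a NumExpr result ever fails the exact check against NumPy, it may be worth trying the tolerance-based one before assuming something is wrong.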
Example 2 — Calculate Pi using a Monte Carlo simulation
Our second example will examine a more complicated use case with more real-world applications.
Monte Carlo simulations involve running many iterations of a random process to estimate a system’s properties, which can be computationally intensive.
In this case, we’ll use Monte Carlo to calculate the value of Pi. This is a well-known example where we take a square with a side length of 1 unit and inscribe a quarter circle inside it with a radius of 1 unit. The ratio of the quarter circle’s area to the square’s area is (π/4)/1, and we can multiply this expression by 4 to get π by itself.
So, if we consider numerous random (x,y) points that all lie within or on the bounds of the square, as the total number of these points tends to infinity, the ratio of points that lie on or inside the quarter circle to the total number of points tends towards π/4, and four times that ratio tends towards Pi.
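Written out, the argument looks like this. The symbols are notation introduced here for illustration: N is the total number of points, N_inside is the number that land on or inside the quarter circle, and the last expression is the usual binomial standard error of the estimate.

```latex
\frac{N_{\text{inside}}}{N} \;\longrightarrow\; \frac{\pi}{4}
\quad \text{as } N \to \infty,
\qquad
\hat{\pi} = 4\,\frac{N_{\text{inside}}}{N},
\qquad
\operatorname{SE}(\hat{\pi}) = 4\sqrt{\frac{p\,(1-p)}{N}},
\quad p = \frac{\pi}{4}
```

For N = 1,000,000 the standard error works out to roughly 0.0016, so estimates that are only good to two or three decimal places are expected at this sample size.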
First, the NumPy implementation.
import numpy as np
import timeit
def monte_carlo_pi_numpy(num_samples):
    x = np.random.rand(num_samples)
    y = np.random.rand(num_samples)
    inside_circle = (x**2 + y**2) <= 1.0
    pi_estimate = (np.sum(inside_circle) / num_samples) * 4
    return pi_estimate
# Benchmark the NumPy version
num_samples = 1000000
time_np_expr = timeit.timeit(lambda: monte_carlo_pi_numpy(num_samples), number=1000)
pi_estimate = monte_carlo_pi_numpy(num_samples)
print(f"Estimated Pi (NumPy): {pi_estimate}")
print(f"Execution Time (NumPy): {time_np_expr} seconds")
>>>>>>>>
Estimated Pi (NumPy): 3.144832
Execution Time (NumPy): 10.642843848007033 seconds
Now, using NumExpr.
import numpy as np
import numexpr as ne
import timeit
def monte_carlo_pi_numexpr(num_samples):
    x = np.random.rand(num_samples)
    y = np.random.rand(num_samples)
    inside_circle = ne.evaluate("(x**2 + y**2) <= 1.0")
    pi_estimate = (np.sum(inside_circle) / num_samples) * 4  # Use NumPy for summation
    return pi_estimate
# Benchmark the NumExpr version
num_samples = 1000000
time_ne_expr = timeit.timeit(lambda: monte_carlo_pi_numexpr(num_samples), number=1000)
pi_estimate = monte_carlo_pi_numexpr(num_samples)
print(f"Estimated Pi (NumExpr): {pi_estimate}")
print(f"Execution Time (NumExpr): {time_ne_expr} seconds")
>>>>>>>>>>>>>>>
Estimated Pi (NumExpr): 3.141684
Execution Time (NumExpr): 8.077501275009126 seconds
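OK, so the speed-up was not as impressive that time, but a roughly 24% reduction in run time isn’t terrible either. Part of the reason is that NumExpr doesn’t have an optimised SUM() function, so we had to default back to NumPy for that operation. The random numbers are also generated by NumPy in both versions, so only the comparison itself is handed to NumExpr.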
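Only the expression passed to NumExpr can benefit from it; the random-number generation and the final sum run in NumPy in both functions. A rough way to see how much each stage contributes is to time the stages separately. The sketch below does this for the NumPy version only, and the repeat count of 20 is an arbitrary choice to keep it quick.

```python
import timeit
import numpy as np

num_samples = 1_000_000
x = np.random.rand(num_samples)
y = np.random.rand(num_samples)
inside = (x**2 + y**2) <= 1.0

# Time the three stages of the Monte Carlo function separately.
t_random = timeit.timeit(
    lambda: (np.random.rand(num_samples), np.random.rand(num_samples)),
    number=20,
)
t_compare = timeit.timeit(lambda: (x**2 + y**2) <= 1.0, number=20)
t_sum = timeit.timeit(lambda: np.sum(inside), number=20)

total = t_random + t_compare + t_sum
for name, t in [("random numbers", t_random),
                ("comparison", t_compare),
                ("sum", t_sum)]:
    print(f"{name:>15}: {t:.4f} s ({100 * t / total:.0f}% of total)")
```

The exact split will vary by machine, so no figures are quoted here. Whatever share the comparison takes is the upper limit on what NumExpr can save in this example.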
Example 3 — Implementing a Sobel image filter
In this example, we’ll implement a Sobel filter for images. The Sobel filter is commonly used in image processing for edge detection. It calculates the image intensity gradient at each pixel, highlighting edges and intensity transitions. Our input image is of the Taj Mahal in India.
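Before running the filter on a full image, it can help to see what the two Sobel kernels do on a single pixel neighbourhood. The sketch below uses a hypothetical 3x3 patch with a vertical edge (dark on the left, bright on the right) and computes the kernel response at the centre pixel by multiplying element-wise and summing. Strictly speaking this is a correlation; a true convolution flips the kernel, which only changes the sign of the response and not the magnitude.

```python
import numpy as np

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])

# Hypothetical 3x3 grayscale patch: dark on the left, bright on the right.
patch = np.array([[0, 0, 255],
                  [0, 0, 255],
                  [0, 0, 255]], dtype=np.float64)

# Element-wise product and sum gives the kernel response at the centre pixel.
gx = np.sum(patch * sobel_x)   # responds to horizontal intensity change
gy = np.sum(patch * sobel_y)   # responds to vertical intensity change
magnitude = np.sqrt(gx**2 + gy**2)
print(gx, gy, magnitude)
```

For this patch gx comes out at 1020 and gy at 0, so the gradient magnitude is 1020: a strong response, as expected for a sharp vertical edge.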

Let’s see the NumPy code running first and time it.
import numpy as np
from scipy.ndimage import convolve
from PIL import Image
import timeit
# Sobel kernels
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
def sobel_filter_numpy(image):
    """Apply Sobel filter using NumPy."""
    img_array = np.array(image.convert('L'))  # Convert to grayscale
    gradient_x = convolve(img_array, sobel_x)
    gradient_y = convolve(img_array, sobel_y)
    gradient_magnitude = np.sqrt(gradient_x**2 + gradient_y**2)
    gradient_magnitude *= 255.0 / gradient_magnitude.max()  # Normalize to 0-255
    return Image.fromarray(gradient_magnitude.astype(np.uint8))
# Load an example image
image = Image.open("/mnt/d/test/taj_mahal.png")
# Benchmark the NumPy version
time_np_sobel = timeit.timeit(lambda: sobel_filter_numpy(image), number=100)
sobel_image_np = sobel_filter_numpy(image)
sobel_image_np.save("/mnt/d/test/sobel_taj_mahal_numpy.png")
print(f"Execution Time (NumPy): {time_np_sobel} seconds")
>>>>>>>>>
Execution Time (NumPy): 8.093792188999942 seconds
And now the NumExpr code.
import numpy as np
import numexpr as ne
from scipy.ndimage import convolve
from PIL import Image
import timeit
# Sobel kernels
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
def sobel_filter_numexpr(image):
    """Apply Sobel filter using NumExpr for gradient magnitude computation."""
    img_array = np.array(image.convert('L'))  # Convert to grayscale
    gradient_x = convolve(img_array, sobel_x)
    gradient_y = convolve(img_array, sobel_y)
    gradient_magnitude = ne.evaluate("sqrt(gradient_x**2 + gradient_y**2)")
    gradient_magnitude *= 255.0 / gradient_magnitude.max()  # Normalize to 0-255
    return Image.fromarray(gradient_magnitude.astype(np.uint8))
# Load an example image
image = Image.open("/mnt/d/test/taj_mahal.png")
# Benchmark the NumExpr version
time_ne_sobel = timeit.timeit(lambda: sobel_filter_numexpr(image), number=100)
sobel_image_ne = sobel_filter_numexpr(image)
sobel_image_ne.save("/mnt/d/test/sobel_taj_mahal_numexpr.png")
print(f"Execution Time (NumExpr): {time_ne_sobel} seconds")
>>>>>>>>>>>>>
Execution Time (NumExpr): 4.938702256011311 seconds
On this occasion, using NumExpr led to a great result, running about 1.6 times faster than NumPy.
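One detail that may be worth checking in both Sobel functions: image.convert('L') yields 8-bit unsigned integers, and NumPy keeps that type when squaring, so large values wrap around instead of growing. The snippet below shows the effect on a single made-up value. It is a sketch of the pitfall only and does not reproduce the filter.

```python
import numpy as np

pixels = np.array([200], dtype=np.uint8)   # hypothetical 8-bit value

wrapped = pixels**2                        # stays uint8, so 40000 wraps modulo 256
safe = pixels.astype(np.float64)**2        # cast to float first, then square

print(wrapped, wrapped.dtype)
print(safe, safe.dtype)
```

Casting img_array to a float type before the convolution is one possible guard. Whether doing so changes the timings reported in this example has not been tested here.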
Here is what the edge-detected image looks like.

Example 4 — Fourier series approximation
It’s well known that complex periodic functions can be simulated by applying a series of sine waves superimposed on each other. At the extreme, even a square wave can be easily modelled in this way. The method is called the Fourier series approximation. Although an approximation, we can get as close to the target wave shape as memory and computational capacity allow.
The maths behind all this isn’t the primary focus. Just be aware that when we increase the number of iterations, the run-time of the solution rises markedly.
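For reference, the partial sum that the code in this example builds up can be written as follows. The symbols are notation chosen here: s(t) is the square wave, f = 5 is its frequency and N is the value of n_terms, with only odd n contributing.

```latex
s(t) \;\approx\; \sum_{\substack{n = 1 \\ n\ \text{odd}}}^{N}
\frac{4}{\pi n}\,\sin\!\left(2\pi n f t\right),
\qquad f = 5
```

Each extra term is one more full pass over the time vector, which is why the run time grows with the number of terms.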
import numpy as np
import numexpr as ne
import time
import matplotlib.pyplot as plt
# Define the constant pi explicitly
pi = np.pi
# Generate a time vector and a square wave signal
t = np.linspace(0, 1, 1000000)  # Reduced size for better visualization
signal = np.sign(np.sin(2 * np.pi * 5 * t))
# Number of terms in the Fourier series
n_terms = 10000
# Fourier series approximation using NumPy
start_time = time.time()
approx_np = np.zeros_like(t)
for n in range(1, n_terms + 1, 2):
    approx_np += (4 / (np.pi * n)) * np.sin(2 * np.pi * n * 5 * t)
numpy_time = time.time() - start_time
# Fourier series approximation using NumExpr
start_time = time.time()
approx_ne = np.zeros_like(t)
for n in range(1, n_terms + 1, 2):
    approx_ne = ne.evaluate("approx_ne + (4 / (pi * n)) * sin(2 * pi * n * 5 * t)", local_dict={"pi": pi, "n": n, "approx_ne": approx_ne, "t": t})
numexpr_time = time.time() - start_time
print(f"NumPy Fourier series time: {numpy_time:.6f} seconds")
print(f"NumExpr Fourier series time: {numexpr_time:.6f} seconds")
# Plotting the results
plt.figure(figsize=(10, 6))
plt.plot(t, signal, label='Original Signal (Square Wave)', color='black', linestyle='--')
plt.plot(t, approx_np, label='Fourier Approximation (NumPy)', color='blue')
plt.plot(t, approx_ne, label='Fourier Approximation (NumExpr)', color='red', linestyle='dotted')
plt.title('Fourier Series Approximation of a Square Wave')
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.legend()
plt.grid(True)
plt.show()
And the output?

That’s another pretty good result. NumExpr shows a 5 times improvement over NumPy on this occasion.
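The remark that more terms bring the approximation closer to the target wave can be checked numerically. The sketch below is NumPy-only and deliberately small: the helper names square_wave_partial_sum and rms_error, the 10,000-point grid and the term counts 9 and 99 are all choices made for illustration.

```python
import numpy as np

def square_wave_partial_sum(t, n_terms, freq=5):
    """Sum the odd harmonics up to n_terms, as in the loops in this example."""
    approx = np.zeros_like(t)
    for n in range(1, n_terms + 1, 2):
        approx += (4 / (np.pi * n)) * np.sin(2 * np.pi * n * freq * t)
    return approx

t = np.linspace(0, 1, 10_000)
signal = np.sign(np.sin(2 * np.pi * 5 * t))

def rms_error(n_terms):
    """Root-mean-square difference between the partial sum and the square wave."""
    return np.sqrt(np.mean((square_wave_partial_sum(t, n_terms) - signal) ** 2))

err_few = rms_error(9)
err_many = rms_error(99)
print(f"RMS error with n_terms=9:  {err_few:.4f}")
print(f"RMS error with n_terms=99: {err_many:.4f}")
```

The error with 99 terms should come out clearly smaller than with 9, at the cost of roughly ten times as many passes over the array.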
Summary
NumPy and NumExpr are both powerful libraries used for Python numerical computations. They each have unique strengths and use cases, making them suitable for different types of tasks. Here, we compared their performance and suitability for specific computational tasks, focusing on examples ranging from simple array addition to more complex applications, like using a Sobel filter for image edge detection.
While I didn’t quite see the claimed 15x speed increase over NumPy in my tests, there’s no doubt that NumExpr can be significantly faster than NumPy in many cases.
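For a side-by-side view, the speed-ups can be recomputed directly from the timings printed in the first three examples. The snippet below only does that arithmetic; the dictionary labels are chosen here for readability, and the Fourier example is left out because its timings are not printed in the text.

```python
# Timings in seconds, copied from the outputs printed earlier in this article.
timings = {
    "Array addition": (12.03680682599952, 1.8075962659931974),
    "Monte Carlo Pi": (10.642843848007033, 8.077501275009126),
    "Sobel filter": (8.093792188999942, 4.938702256011311),
}

speedups = {}
for name, (t_numpy, t_numexpr) in timings.items():
    speedups[name] = t_numpy / t_numexpr
    saved = 100 * (1 - t_numexpr / t_numpy)
    print(f"{name}: {speedups[name]:.2f}x faster, {saved:.0f}% less run time")
```

That works out to roughly 6.66x, 1.32x and 1.64x respectively, all measured on a single machine.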
If you’re a heavy user of NumPy and need to extract every bit of performance from your code, I recommend trying the NumExpr library. Besides the fact that not all NumPy code can be replicated using NumExpr, there’s practically no downside, and the upside might surprise you.
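One low-risk way to try it is to treat NumExpr as an optional accelerator and keep the plain NumPy expression as a fallback. The sketch below shows that pattern for the expression from Example 1; the function name weighted_sum is hypothetical, and the pattern is a suggestion, not something the NumExpr project prescribes.

```python
import numpy as np

try:
    import numexpr as ne  # optional accelerator
except ImportError:
    ne = None

def weighted_sum(a, b):
    """Return 2*a + 3*b, using NumExpr when it is installed."""
    if ne is not None:
        # NumExpr looks up a and b by name in the calling scope.
        return ne.evaluate("2*a + 3*b")
    return 2 * a + 3 * b

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)
result = weighted_sum(a, b)
print(np.allclose(result, 2 * a + 3 * b))
```

Keeping the NumPy expression next to the NumExpr string also makes it easy to compare the two results whenever the expression changes.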
For more details on the NumExpr library, check out the GitHub page here.