
What PyTorch Actually Means by a Leaf Tensor and Its Grad

This isn’t yet another explanation of the chain rule. It’s a tour through the weird side of autograd, where gradients serve physics, not just weights.

I originally wrote this tutorial for myself during the first year of my PhD, while navigating the intricacies of gradient calculations in PyTorch. Most of it is clearly designed with standard backpropagation in mind, and that’s fine, since that’s what most people need.

But a Physics-Informed Neural Network (PINN) is a moody beast and it needs a different kind of gradient logic. I spent some time feeding it, and I figured it might be worth sharing the findings with the community, especially with fellow PINN practitioners; maybe it’ll save someone a few headaches. But if you have never heard of PINNs, don’t worry! This post is still for you, especially if you’re into things like gradients of gradients and all that fun stuff.

Basic terms

A tensor in the computer world simply means a multidimensional array, i.e. a bunch of numbers indexed by one or more integers. To be precise, there also exist zero-dimensional tensors, which are just single numbers. Some people say that tensors are a generalization of matrices to more than two dimensions.
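For illustration, here is a minimal sketch (not from the original walkthrough) showing how the number of dimensions can be checked via the ndim attribute:

import torch

scalar = torch.tensor(3.14)          # zero-dimensional tensor: a single number
vector = torch.tensor([1., 2., 3.])  # one-dimensional tensor
matrix = torch.zeros(2, 3)           # two-dimensional tensor

assert scalar.ndim == 0 and vector.ndim == 1 and matrix.ndim == 2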

If you have studied general relativity before, you may have heard that mathematical tensors have things like covariant and contravariant indices. But forget about that: in PyTorch, tensors are just multidimensional arrays. No finesse here.

A leaf tensor is a tensor that is a leaf (in the graph-theory sense) of the computation graph. We will look at these below, so this definition will make a bit more sense.

The requires_grad property of a tensor tells PyTorch whether it should remember how this tensor is used in further computations. For now, think of tensors with requires_grad=True as variables, and of tensors with requires_grad=False as constants.

Leaf tensors

Let’s start by creating a few tensors and checking their properties requires_grad and is_leaf.

import torch

a = torch.tensor([3.], requires_grad=True)
b = a * a

c = torch.tensor([5.])
d = c * c

assert a.requires_grad is True and a.is_leaf is True
assert b.requires_grad is True and b.is_leaf is False
assert c.requires_grad is False and c.is_leaf is True
assert d.requires_grad is False and d.is_leaf is True  # sic!
del a, b, c, d

a is a leaf as expected, and b is not, because it is the result of a multiplication. a is set to require grad, so naturally b inherits this property.

c is obviously a leaf, but why is d a leaf? The reason d.is_leaf is True stems from a specific convention: all tensors with requires_grad set to False are considered leaf tensors, as per PyTorch’s documentation:

All Tensors that have requires_grad which is False will be leaf Tensors by convention.

While mathematically d is not a leaf (since it results from another operation, c * c), gradient computation will never extend beyond it. In other words, there won’t be any derivative with respect to c. This allows d to be treated as a leaf.

In a nutshell, in PyTorch, leaf tensors are either (a short sketch illustrating both cases follows this list):

  • Directly inputted (i.e. not calculated from other tensors) and with requires_grad=True. Example: neural network weights that are randomly initialized.
  • Tensors that do not require gradients at all, regardless of whether they are directly inputted or computed. In the eyes of autograd, these are just constants. Examples:
    • any neural network input data,
    • an input image after mean removal or other operations that involve only non-gradient-requiring tensors.
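Here is a minimal sketch illustrating both cases (the tensor names are purely illustrative, not taken from the examples above):

import torch

# Case 1: directly created tensors that require grad, e.g. trainable weights.
weights = torch.randn(3, 3, requires_grad=True)
assert weights.is_leaf is True

# Case 2: tensors that do not require grad at all, even if they were computed.
image = torch.rand(1, 3, 8, 8)        # e.g. input data
centered = image - image.mean()       # computed, but only from non-grad-requiring tensors
assert image.is_leaf is True and centered.is_leaf is True

# For contrast: an intermediate result that depends on a grad-requiring tensor is not a leaf.
hidden = weights @ torch.ones(3)
assert hidden.is_leaf is False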

A small remark for those who want to know more. The requires_grad property is inherited as illustrated here:

a = torch.tensor([5.], requires_grad=True)
b = torch.tensor([5.], requires_grad=True)
c = torch.tensor([5.], requires_grad=False)

d = torch.sin(a * b * c)

assert d.requires_grad == any((x.requires_grad for x in (a, b, c)))

Code remark: all code snippets are meant to be self-contained, apart from imports, which I include only when they appear for the first time. I drop them in order to minimize boilerplate code. I trust that the reader will be able to manage them easily.

Grad retention

A separate issue is gradient retention. All nodes in the computation graph, meaning all tensors used, have gradients computed if they require grad. However, only leaf tensors retain these gradients. This makes sense because gradients are typically used to update tensors, and only leaf tensors are subject to updates during training. Non-leaf tensors, like b in the first example, are not directly updated; they change as a result of changes in a, so their gradients can be discarded. However, there are scenarios, especially in Physics-Informed Neural Networks (PINNs), where you might want to retain the gradients of these intermediate tensors. In such cases, you will need to explicitly mark non-leaf tensors to retain their gradients. Let’s see:

a = torch.tensor([3.], requires_grad=True)
b = a * a
b.backward()

assert a.grad is not None
assert b.grad is None  # generates a warning

You have probably just seen a warning:

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being 
accessed. Its .grad attribute won't be populated during autograd.backward(). 
If you indeed want the .grad field to be populated for a non-leaf Tensor, use 
.retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by 
mistake, make sure you access the leaf Tensor instead. 
See github.com/pytorch/pytorch/pull/30531 for more informations. 
(Triggered internally at aten/src/ATen/core/TensorBody.h:491.)

So let’s fix it by forcing b to retain its gradient:

a = torch.tensor([3.], requires_grad=True)
b = a * a
b.retain_grad()  # <- the difference
b.backward()

assert a.grad is not None
assert b.grad is not None

Mysteries of grad

Now let’s take a look at the famous grad itself. What is it? Is it a tensor? If so, is it a leaf tensor? Does it require or retain grad?

a = torch.tensor([3.], requires_grad=True)
b = a * a
b.retain_grad()
b.backward()

assert isinstance(a.grad, torch.Tensor)
assert a.grad.requires_grad is False and a.grad.retains_grad is False and a.grad.is_leaf is True
assert b.grad.requires_grad is False and b.grad.retains_grad is False and b.grad.is_leaf is True

Apparently:

– grad itself is a tensor,
– grad is a leaf tensor,
– grad doesn’t require grad.

Does it retain grad? This question doesn’t make sense, because it doesn’t require grad in the first place. We will come back to the question of grad being a leaf tensor in a moment, but first we will test a few other things.

Multiple backwards and retain_graph

What will happen when we calculate the same grad twice?

a = torch.tensor([3.], requires_grad=True)
b = a * a
b.retain_grad()
b.backward()
try:
    b.backward()
except RuntimeError:
    """
    RuntimeError: Trying to backward through the graph a second time (or 
    directly access saved tensors after they have already been freed). Saved 
    intermediate values of the graph are freed when you call .backward() or 
    autograd.grad(). Specify retain_graph=True if you need to backward through 
    the graph a second time or if you need to access saved tensors after 
    calling backward.
    """

The error message explains it all. This should work:

a = torch.tensor([3.], requires_grad=True)
b = a * a
b.retain_grad()

b.backward(retain_graph=True)
print(a.grad)  # prints tensor([6.])

b.backward(retain_graph=True)
print(a.grad)  # prints tensor([12.])

b.backward(retain_graph=False)
print(a.grad)  # prints tensor([18.])

# b.backward(retain_graph=False)  # <- here we would get an error, because in
# the previous call we didn't retain the graph.

Side (but important) note: you can also observe how the gradient accumulates in a: with every call it is added.
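If you want each backward pass to start from a clean slate instead, reset the accumulated gradient in between. A minimal sketch (resetting .grad to None is a standard PyTorch idiom, not part of the snippet above):

a = torch.tensor([3.], requires_grad=True)
b = a * a

b.backward(retain_graph=True)
print(a.grad)  # prints tensor([6.])

a.grad = None  # drop the accumulated gradient before the next pass
b.backward()
print(a.grad)  # prints tensor([6.]) again, instead of tensor([12.])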

Powerful create_graph argument

How to make grad require grad?

a = torch.tensor([5.], requires_grad=True)
b = a * a
b.retain_grad()
b.backward(create_graph=True)

# Here an interesting thing happens: now a.grad will require grad! 
assert a.grad.requires_grad is True
assert a.grad.is_leaf is False

# On the other hand, the grad of b does not require grad, as before. 
assert b.grad.requires_grad is False
assert b.grad.is_leaf is True

The above is very useful: a.grad, which mathematically is \(\frac{\partial b}{\partial a}\), is not a constant (leaf) anymore, but a regular member of the computation graph that can be used further. We will use that fact in Part 2.

Why doesn’t b.grad require grad? Because the derivative of b with respect to b is simply 1.
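You can check this directly on the snippet above (an extra assertion, not part of the original code):

assert b.grad.item() == 1.0  # db/db = 1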

If backward feels counterintuitive to you now, don’t worry. We will soon switch to another method called, nomen omen, grad, which allows us to precisely select pieces of the derivatives. Before that, two side notes:

Side note 1: If you set create_graph to True, it also sets retain_graph to True (if not explicitly set). In the PyTorch code it looks exactly like this:

    if retain_graph is None:
        retain_graph = create_graph

Side note 2: You probably saw a warning like this:

    UserWarning: Using backward() with create_graph=True will create a reference 
    cycle between the parameter and its gradient which can cause a memory leak. 
    We recommend using autograd.grad when creating the graph to avoid this. If 
    you have to use this function, make sure to reset the .grad fields of your 
    parameters to None after use to break the cycle and avoid the leak. 
    (Triggered internally at C:\cb\pytorch_1000000000000\work\torch\csrc\autograd\engine.cpp:1156.)
      Variable._execution_engine.run_backward(  # Calls into the C++ engine to 
    run the backward pass

And we will follow the advice and use autograd.grad now.

Taking derivatives with the autograd.grad function

Now let’s move from the somewhat high-level .backward() method to the lower-level grad function that explicitly calculates the derivative of one tensor with respect to another.

from torch.autograd import grad

a = torch.tensor([3.], requires_grad=True)
b = a * a * a
db_da = grad(b, a, create_graph=True)[0]
assert db_da.requires_grad is True

Similarly, as with backward, the derivative of b with respect to a can be treated as a function and differentiated further. In other words, the create_graph flag can be understood as: when calculating gradients, keep the history of how they were calculated, so that we can treat them as non-leaf tensors that require grad, and use them further.

In particular, we can calculate the second-order derivative:

d2b_da2 = grad(db_da, a, create_graph=True)[0]
# Side note: the grad function returns a tuple, and its first element is what we need.
assert d2b_da2.item() == 18
assert d2b_da2.requires_grad is True

As said before: this is really the key property that allows us to do PINNs with PyTorch.
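To show what this looks like in practice, here is a minimal PINN-flavored sketch (the tiny network and the toy equation u'' + u = 0 are illustrative assumptions, not taken from this article):

from torch import nn

# Collocation points in [0, 1); they must require grad so that u can be differentiated with respect to x.
x = torch.rand(32, 1, requires_grad=True)

# A small network approximating u(x).
model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
u = model(x)

# First and second derivatives of u with respect to x, kept inside the graph.
du_dx = grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
d2u_dx2 = grad(du_dx, x, grad_outputs=torch.ones_like(du_dx), create_graph=True)[0]

# Residual of the toy equation u'' + u = 0; its mean square serves as the physics loss.
loss = ((d2u_dx2 + u) ** 2).mean()
loss.backward()  # gradients of the physics loss now flow into the network weights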

Wrapping up

Most tutorials about PyTorch gradients focus on backpropagation in classical supervised learning. This one explored a different perspective: one shaped by the needs of PINNs and other gradient-hungry beasts.

We learned what leaves are in the PyTorch jungle, why gradients are retained by default only for leaf nodes, and how to retain them for other tensors when needed. We saw how create_graph turns gradients into differentiable citizens of the autograd world.

But there are still many things to uncover: in particular, why gradients of non-scalar functions require extra care, how to compute second-order derivatives without using up all of your RAM, and why slicing your input tensor is a bad idea when you need an elementwise gradient.

So let’s meet in Part 2, where we’ll take a closer look at grad 👋
