Tell me and I forget. Teach me and I remember. Involve me and I learn.
This saying holds true, and learning by doing is one of the most instructive ways to acquire a new skill. In the field of data science and machine learning, participating in competitions is one of the most effective ways to gain hands-on experience and sharpen your skills.
Kaggle is the world's largest data science community, and its competitions are highly respected in the industry. Many of the world's leading ML conferences (e.g., NeurIPS), organizations (e.g., Google), and universities (e.g., Stanford) host competitions on Kaggle.
Featured Kaggle competitions award medals to top performers on the private leaderboard. Recently, I participated in my very first medal-awarding Kaggle competition, and I was fortunate enough to earn a Silver Medal. It was the NeurIPS – Ariel Data Challenge 2025. I don't intend to explain my solution in this article. If you're interested, you can check out my solution here.
What I didn't realize before participating is how much Kaggle tests beyond just ML skills.
Kaggle tests one's coding and software engineering skills. It stressed my ability to properly organize a codebase in order to iterate quickly and try new ideas. It also tested the ability to track experiments and results in a clear manner.
As part of the NeurIPS 2025 Competition Track, a research conference, the competition also tested the ability to research and learn a new domain quickly and effectively.
All in all, this competition humbled me quite a bit and taught me many lessons beyond ML.
The goal of this article is to share some of these non-ML lessons with you. They all revolve around one principle: organization, organization, organization.
First, I'll try to convince you that clean code structure and process organization are not a waste of time or a nice-to-have, but rather essential for competing on Kaggle specifically and for any successful data science project in general. Then, I'll share some of the techniques I used and lessons I learned regarding code structure and the experimentation process.
I want to start with a note of humility. I am by no means an expert in this field; I'm still at the outset of my journey. All I hope is that some readers will find some of these lessons useful and learn from my pitfalls. If you have any other tips or suggestions, I urge you to share them so that we can all learn together.
1 Science's Golden Tip: Organize
It's no secret that natural scientists like to keep detailed records of their work and research process. Unclear steps may (and will) lead to incorrect conclusions and understanding. Irreproducible work is the bane of science. Why should it be any different for us data scientists?
1.1 But Speed Is Important!
The common counterargument is that data science is fast-paced and iterative by nature. Generally speaking, experimentation is cheap and quick; besides, who in the world prefers writing documentation over coding and building models?
As much as I sympathize with this thought and love quick results, I fear that this mindset is short-sighted. Remember that the final goal of any data science project is either to deliver accurate, data-supported, and reproducible insights or to build reliable and reproducible models. If fast work compromises the end goal, then it isn't worth anything.
My solution to this dilemma is to make the mundane parts of organization as simple, quick, and painless as possible. We shouldn't seek to eliminate the organizational process entirely, but rather fix its faults to make it as efficient and productive as possible.
1.2 The Costs of Disorganization
Picture this scenario with me. For each of your experiments, you have a single notebook on Kaggle that does everything from loading and preprocessing the data to training the model, evaluating it, and finally submitting it. By now, you have run dozens of experiments. Then you discover a small bug in the data loading function that you used in all of your experiments. Fixing it will be a nightmare, because you will have to go through each of your notebooks, fix the bug, make sure no new bugs were introduced, and then re-run all your experiments to get the updated results. All of this could have been avoided if you had a clear code structure and your code were reusable and modular.
Drivendata (2022) gives a great example of the costs of a disorganized data science project: the story of a failed project that took months to complete and cost millions of dollars. The failure came down to an incorrect conclusion reached early in the project. A bug in the data cleaning code polluted the data and led to wrong insights. If the team had tracked the data sources and transformations better, they would have caught the bug earlier and saved the money.
If there is one lesson to take away from this section, it's that organization is not a nice-to-have, but rather an essential part of any data science project. Without a clear code structure and process organization, we are bound to make mistakes, waste time, and produce irreproducible work.
1.3 What to Track and Organize?
There are three main aspects that I consider worth the effort to track:
- Codebase
- Experiment results and configurations
- Research and learning
2 The Codebase
After all, code is the backbone of any data science project, so there's a lesson or two to learn from software engineers here.
2.1 Repo Structure
As long as you give some real thought to the structure of your codebase, you're doing great.
There is no single universally agreed-upon structure (nor will there ever be), so this section is highly subjective and opinionated. I'll discuss the general structure I like and use.
I like to initialize my work with the widely popular Cookiecutter Data Science (ccds) template. When you initialize a project with ccds, it creates a folder with the following structure. 1
├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         {{ cookiecutter.module_name }} and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── {{ cookiecutter.module_name }}  <- Source code for use in this project.
    │
    ├── __init__.py    <- Makes {{ cookiecutter.module_name }} a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    ├── dataset.py     <- Scripts to download or generate data
    │
    ├── features.py    <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code to run model inference with trained models
    │   └── train.py   <- Code to train models
    │
    └── plots.py       <- Code to create visualizations
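If it helps, scaffolding a new project with ccds takes only a couple of commands. This is a rough sketch; the exact interactive prompts depend on the version of ccds you install:

pip install cookiecutter-data-science   # provides the `ccds` command-line tool
ccds                                    # answer the prompts to scaffold a new project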
2.1.1 Environment Management
When you use ccds, you're prompted to select an environment manager. I personally prefer uv by Astral. It records all the packages you use in the pyproject.toml file and lets you recreate the same environment by simply running uv sync.
Under the hood, uv uses venv. I find uv much simpler than managing virtual environments directly, because managing and reading a pyproject.toml is much simpler than a requirements.txt.
Moreover, I find uv much simpler than conda: uv is built specifically for Python, while conda is far more generic.
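To give a feel for the day-to-day workflow, here is a minimal sketch (the package names and script path are just examples):

uv add pandas scikit-learn        # add dependencies; they are recorded in pyproject.toml
uv sync                           # recreate the exact environment from pyproject.toml / uv.lock
uv run python scripts/train.py    # run a script inside the project's environment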
2.1.2 The Generated Module
A great part of this template is the {{ cookiecutter.module_name }} directory. In this directory, you define a Python package that will contain all the essential parts of your code (e.g., preprocessing functions, model definitions, the inference function, etc.).
I find using this package quite helpful, and in Section 2.3, I'll discuss what to put there and what to put in Jupyter notebooks.
2.1.3 Staying Flexible
Don't regard this structure as perfect or complete. You don't have to use everything ccds provides, and you may (and should) adjust it if the project requires it. ccds gives you a good starting point to tune to your project's exact needs and demands.
2.2 Version Control
Git has become an absolute necessity for any project involving code. It allows us to track changes, revert to previous versions, and, with GitHub, collaborate with team members.
When you use Git, you essentially gain access to a time machine that can undo any faults you introduce into your code. Today, using Git is non-negotiable.
2.3 The Three Code Types
Choosing when to use Python scripts and when to use Jupyter notebooks is a long-debated topic in the data science community. Here I present my stance on the subject.
I like to separate all of my code into one of three directories:
- The Module
- Scripts
- Notebooks
2.3.1 The Module
The module should contain all the essential functions and classes you create.
Using it helps us minimize redundancy and creates a single source of truth for all the essential operations happening on the data.
In data science projects, some operations are repeated in all your training and inference workflows, such as reading the data from files, transforming data, and defining models. Repeating all these functions in every notebook or script is tedious and extremely boring. Using a module lets us write the code once and then import it everywhere.
Moreover, this helps reduce errors and mistakes. When a bug is discovered in the module, you fix it once in the module, and it's automatically fixed in every script and notebook that imports it.
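As a small illustration, here is a sketch of what such a shared function might look like; the package name ariel_2025 and the function load_observations are hypothetical stand-ins for your own module:

# ariel_2025/dataset.py (hypothetical module file)
from pathlib import Path

import pandas as pd


def load_observations(data_dir: Path) -> pd.DataFrame:
    """Read all raw observation files into a single DataFrame."""
    files = sorted(data_dir.glob("*.parquet"))
    return pd.concat([pd.read_parquet(f) for f in files], ignore_index=True)

Every notebook and script then imports load_observations from the module instead of re-implementing it.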
2.3.2 Scripts
The scripts directory contains .py files. These files are the only source for generating outputs from the project. They are the interface for interacting with our module and code.
The two main uses for these files are training and inference. Every model should be created by running one of the scripts, and every submission on Kaggle should be produced by such a file.
Using these scripts helps make our results reproducible. To reproduce an older result (train the same model, for example), one only has to clone the same version of the repo and run the script used to produce the old results 2.
Since the scripts are run from the CLI, using a library to manage CLI arguments simplifies the code. I like using typer for simple scripts that don't have many config options and hydra for complex ones (I'll discuss hydra in more depth later).
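For instance, a minimal training script with typer might look like the following sketch (the commented-out module imports and function names are hypothetical):

# scripts/train.py — a minimal typer-based training script
from pathlib import Path

import typer

app = typer.Typer()


@app.command()
def main(
    data_dir: Path = Path("data/processed"),  # paths as arguments are easy to override on Kaggle
    lr: float = 1e-3,
    epochs: int = 10,
) -> None:
    # from ariel_2025.dataset import load_observations   # hypothetical module import
    # df = load_observations(data_dir)
    print(f"Training for {epochs} epochs with lr={lr} on data from {data_dir}")


if __name__ == "__main__":
    app()

It can then be run from the CLI as, e.g., python train.py --lr 0.001 --epochs 20 --data-dir /path/to/data.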
2.3.3 Notebooks
Jupyter notebooks are wonderful for exploration and prototyping because of the fast feedback loop they provide.
On many occasions, I start writing code in a notebook to test it quickly and iron out all the errors. Only then do I move it into the module.
However, notebooks should not be used to create final results. They are hard to reproduce and hard to track changes in. Therefore, always use the scripts to create final outputs.
3 Running the Codebase on Kaggle
Using the structure discussed in the previous section, we need to follow these steps to run our code on Kaggle:
- Clone the repo
- Install the required packages
- Run one of the scripts
Because Kaggle gives us a Jupyter notebook interface to run our code, and most Kaggle competitions restrict internet access, submissions aren't as simple as running a script on our local machine. In what follows, I'll discuss how to perform each of the above steps on Kaggle.
3.1 Cloning The Repo
First of all, we can't directly clone our repo from GitHub in the submission notebook because of the internet restrictions. However, Kaggle allows us to import the outputs of other Kaggle notebooks into our current notebook. The solution, therefore, is to create a separate Kaggle notebook that clones our repo and installs the required packages. This notebook's output is then imported into the submission notebook.
Most likely, you'll be using a private repo. The simplest way to clone a private repo on Kaggle is to use a personal access token (PAT). You can create a PAT on GitHub by following this guide. A good practice is to create a PAT specifically for Kaggle with only the minimal required permissions.
In the cloning notebook, you can use the following code to clone your repo:
from kaggle_secrets import UserSecretsClient

# read the PAT stored as a Kaggle secret named GITHUB_TOKEN
user_secrets = UserSecretsClient()
github_token = user_secrets.get_secret("GITHUB_TOKEN")

user = "YOUR_GITHUB_USERNAME"
CLONE_URL = f"https://oauth2:{github_token}@github.com/{user}/YOUR_REPO_NAME.git"

# clone the repo into the notebook's working directory
get_ipython().system(f"git clone {CLONE_URL}")
This code downloads your repo into the working directory of the current notebook. It assumes you have stored your PAT in a Kaggle secret named GITHUB_TOKEN. Make sure to enable the secret in the notebook settings before running it.
3.2 Installing the Required Packages
In the cloning notebook, you can also install the required packages. If you're using uv, you can build your custom module, install it, and install its dependencies by running the following commands: 3.
cd ariel-2025 && uv build
This creates a wheel file in the dist/ directory for your module. You can then install it and all its dependencies into a custom directory by running: 4.
pip install /path/to/wheel/file --target /path/to/custom/dir
Make sure to replace /path/to/wheel/file and /path/to/custom/dir with the actual paths. /path/to/wheel/file will be the path to the .whl file inside the REPO_NAME/dist/ directory. /path/to/custom/dir can be any directory you like. Remember the custom directory path, because subsequent notebooks will rely on it to import your module and your project's dependencies.
I like to both download the repo and install the packages in a single notebook. I give this notebook the same name as the repo to simplify importing it later.
3.3 Running One of the Scripts
The first thing to do in any subsequent notebook is to import the notebook containing the cloned repo and installed packages. When you do this, Kaggle places the contents of /kaggle/working/ from the imported notebook into a directory named /kaggle/input/REPO_NAME/, where REPO_NAME is the name of the repo 5.
Often, your scripts will create outputs (e.g., submission files) relative to their locations. By default, your code will live in /kaggle/input/REPO_NAME/, which is read-only. Therefore, you should copy the contents of the repo to /kaggle/working/, which is the current working directory and is read-write. While this may be unnecessary, it's a good practice that causes no harm and prevents silly issues.
cp -r /kaggle/input/REPO_NAME/REPO_NAME/ /kaggle/working/
If you run your scripts directly from /kaggle/working/REPO_NAME/scripts/, you will get import errors because Python can't find the installed packages or your module. This is easily solved by updating the PYTHONPATH environment variable. I use the following command to update it and then run my scripts:
! export PYTHONPATH=/kaggle/input/REPO_NAME/custom_dir:$PYTHONPATH && cd /kaggle/working/REPO_NAME/scripts && python your_script.py --arg1 val1 --arg2 val2
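Alternatively, if you want to import your module directly in a notebook cell (rather than launching a script), a roughly equivalent sketch, assuming the same custom_dir as above, is to extend sys.path before importing:

import sys

# make the packages installed with `pip --target` (including your own module) importable
sys.path.insert(0, "/kaggle/input/REPO_NAME/custom_dir")

# import your_module  # your project package now resolves as usual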
I usually give any notebook that runs a script the same name as the script for simplicity. Moreover, when I re-run the notebook on Kaggle, I name the version with the hash of the current Git commit to keep track of which version of the code was used to generate the results. 6
3.4 Putting It All Together
In the end, two notebooks are necessary:
- The cloning notebook: clones the repo and installs the required packages.
- The script notebook: runs one of the scripts.
You may need more script notebooks in the pipeline. For example, you might have one notebook for training and another for inference. Each of these notebooks follows the same structure as the script notebook discussed above.
Separating each step of the pipeline (e.g., data preprocessing, training, inference) into its own notebook is helpful when one step takes a long time to run and rarely changes. For example, in the Ariel Data Challenge, my preprocessing step took more than seven hours to run. If I had everything in a single notebook, I would have had to wait seven hours every time I tried a new idea. Moreover, the time limits on Kaggle kernels would have made it impossible to run the entire pipeline in a single notebook.
Each notebook then imports the previous notebook's output, runs its own step, and builds from there. A good piece of advice is to make the paths of any data files or models arguments to the scripts, so that you can easily change them when running on Kaggle or in any other environment.
When you update your code, re-run the cloning notebook to update the code on Kaggle. Then, re-run only the necessary script notebooks to generate the new results.
3.5 Is All This Effort Worth It?
Absolutely yes!
I know that the described pipeline adds some overhead when starting your project. However, it will save you much more time and effort in the long run. You'll be able to write all your code locally and run the same code on Kaggle.
When you create a new model, all you have to do is copy one of the script notebooks and change the script it runs. No conflicts will arise between your local and Kaggle code. You'll be able to track all your changes using Git. You'll be able to reproduce any old result by simply checking out the corresponding Git commit and re-running the necessary notebooks on Kaggle.
Moreover, you'll be able to develop on any machine you like. Everything is centralized on GitHub. You can work from your local machine. If you need more power, you can work from a cloud VM. If you want to train on Kaggle, you can do that too. Your code and environment are the same everywhere.
That's a small price to pay for such great convenience. Once the pipeline is set up, you can forget about it and focus on what matters: researching and building models!
4 Recording Learnings and Research
When diving into a new domain, a huge part of your time will be spent researching, studying, and reading papers. It's easy to get lost in all the information you read and to forget where you encountered a certain idea or concept. To that end, it's important to manage and organize your learning.
4.1 Tracking Your Readings
Rajpurkar (2023) suggests keeping a list of all the papers and articles you read. This allows you to quickly review what you have read and refer back to it when needed.
Professor Rajpurkar also suggests annotating each paper with one, two, or three stars. One-star papers are irrelevant, but you didn't know that before reading them. Two-star papers are relevant. Three-star papers are highly relevant. This lets you quickly filter your readings later on.
You should also take notes on each paper you read. These notes should focus on how the paper relates to your project. They should be short enough to review easily, but detailed enough to capture the main ideas. In the papers list, link the reading notes to each paper for easy access.
I also like keeping notes on the papers themselves, such as highlights. If you're using a PDF reader or an e-ink device, store the annotated version of the paper for future reference and link it in your notes. If you prefer reading on paper, you can scan the annotated version and store it digitally.
4.2 Tools
For most documents, I like using Google Docs because it lets me access my notes from anywhere. Moreover, you can write in Google Docs in Markdown, which is my preferred writing format (I'm using it to write this article).
Zotero is a great tool for managing research papers. It excels at storing and organizing them: you can create a collection for each project and keep all the relevant papers there. Importing papers is very easy with the browser extension, and exporting citations in BibTeX format is straightforward.
5 Experiment Tracking
In data science projects, you'll often run many experiments and try many ideas. Once again, it's easy to get lost in all this mess.
We have already made a good step forward by structuring our codebase properly and using scripts to run our experiments. Still, I want to discuss two software tools that let us do even better.
5.1 Wandb
Weights and Biases (wandb), pronounced "w-and-b" (for weights and biases), "wand-b" (for being magical like a wand), or "wan-db" (for being a database), is a great tool for tracking experiments. It lets us run multiple experiments and save all their configurations and results in one central place.

Wandb provides a dashboard to compare the results of different experiments, the hyperparameters used, and the training curves. It also tracks system metrics such as GPU and CPU utilization.
Wandb also integrates with Hugging Face libraries, making it easy to track experiments when using transformers.
Once you start running multiple experiments, wandb becomes an indispensable tool.
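A minimal sketch of what logging looks like (the project name, config values, and metric below are placeholders):

import wandb

# start a run and record the experiment's configuration
run = wandb.init(project="ariel-2025", config={"lr": 1e-3, "epochs": 10})

for epoch in range(run.config.epochs):
    train_loss = 1.0 / (epoch + 1)  # placeholder for your real training loss
    wandb.log({"epoch": epoch, "train_loss": train_loss})

run.finish()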
5.2 Hydra
Hydra is a tool built by Meta that simplifies configuration management. It lets you define all of your configuration in YAML files and easily override it from the CLI.
It's a very flexible tool and fits many use cases. This guide discusses how to use Hydra for experiment configuration.
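To give a flavor, here is a minimal sketch of a hydra-powered script (the conf/config.yaml file and its keys are hypothetical):

# scripts/train_hydra.py — minimal hydra sketch
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    # cfg is loaded from conf/config.yaml and merged with CLI overrides,
    # e.g. `python train_hydra.py model.lr=0.01 data.dir=/kaggle/input/...`
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()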
6 The End-to-End Process

Figure 2 summarizes the process discussed in this article. First, we research ideas and record our learnings. Then, we experiment with those ideas on our local machines in Jupyter notebooks. Once we have a working idea, we refactor the code into our module and create scripts to run the experiments. We run the new experiment(s) on Kaggle. Finally, we track the results of the new experiments.
Because everything is carefully tracked, we can identify our shortcomings and quickly head back to the research or development stages to fix them.
7 Conclusion
Disorder is the source of all evil in data science projects. If we are to produce reliable and reproducible work, we must strive for organization and clarity in our processes. Kaggle competitions are no exception.
In this article, we discussed a way to organize our codebase, tips for tracking research and learnings, and tools for tracking experiments. Figure 2 summarizes the proposed approach.
I hope this article was helpful to you. If you have any other tips or suggestions, please share them in the comments section below.
Best of luck in your next competition!
7.1 References
Drivendata. (2022). The 10 Rules of Reliable Data Science.
Rajpurkar, P. (2023). Harvard CS197: AI Research Experiences. https://www.cs197.seas.harvard.edu
















