
4 Tools Kaggle Grandmasters use to win $100,000s in competitions

Published: 2/3/2022
Categories: competition, kaggle, tips, tricks
Author: jesperdramsch
Expertise is figuring out what works and what doesn't.

Why not let the experts tell you, rather than experimenting from the ground up for a decade?

  1. Pseudolabelling
  2. Negative Mining
  3. Augmentation Tricks
  4. Test-time augmentation

🎨 Pseudolabelling

Some competitions don't have a lot of data.

Pseudo-labels are created by first building a good model on the training data.

Then you predict on the public test data.

Finally, you use the predictions with high confidence as additional training data!
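The steps above can be sketched with a toy nearest-centroid classifier standing in for a real competition model (the data, the 0.9 confidence threshold, and the helper names here are illustrative assumptions, not a fixed recipe):

```python
import numpy as np

def fit_centroids(X, y):
    """Toy 'model': one mean vector (centroid) per class."""
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict_proba(centroids, X):
    """Softmax over negative distances as a stand-in for class confidence."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

# Toy data: two well-separated blobs (stand-in for a real dataset)
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(5, 0.5, (30, 2))])

# 1. Build a good model on the labelled training data
centroids = fit_centroids(X_train, y_train)

# 2. Predict on the (public) test data
proba = predict_proba(centroids, X_test)
confidence = proba.max(axis=1)
pseudo_labels = proba.argmax(axis=1)

# 3. Keep only high-confidence predictions as extra training data
keep = confidence > 0.9  # threshold is an assumption; tune per competition
X_aug = np.vstack([X_train, X_test[keep]])
y_aug = np.concatenate([y_train, pseudo_labels[keep]])

# 4. Retrain on the enlarged dataset
centroids = fit_centroids(X_aug, y_aug)
```

In a real competition you would swap the centroid model for your actual classifier and be careful that low-quality pseudo-labels don't reinforce the model's own mistakes.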

πŸ“‰ Hard Negative Mining

This works best on classifiers with a binary outcome.

The core idea:

  1. Take misclassified samples from your training data.
  2. Retrain the model on this data specifically.

Sometimes this is applied specifically to false positives, retraining on them as negatives.
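A minimal sketch of the two steps, using a tiny logistic-regression trainer as a stand-in for any binary classifier (the data, learning rates, and epoch counts are illustrative assumptions):

```python
import numpy as np

def train(X, y, epochs=200, lr=0.1, w=None):
    """Tiny logistic-regression trainer via gradient descent."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

# Toy binary problem with some label noise near the boundary
rng = np.random.default_rng(1)
X = rng.normal(0, 1, (200, 3))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(float)

# Initial training pass
w = train(X, y)
pred = (1 / (1 + np.exp(-X @ w)) > 0.5).astype(float)

# 1. Take the misclassified ("hard") samples from the training data
hard = pred != y

# 2. Retrain the model on those samples specifically (gentler settings)
if hard.any():
    w = train(X[hard], y[hard], epochs=50, lr=0.01, w=w)
```

The gentler learning rate in the second pass is a common precaution so the hard samples refine the decision boundary rather than overwrite it.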

🏁 Finish training unaugmented

Data augmentation is a way to artificially create more data by slightly altering the existing data.

This trains the ML model to recognize more variance in the data.

Finishing the last training epochs unaugmented usually increases accuracy.
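A compact sketch of that schedule, with Gaussian jitter standing in for real augmentations and a tiny logistic-regression step as the model (the split of 45 augmented epochs plus 5 clean epochs is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

def augment(X):
    """Jitter inputs with noise -- a stand-in for real augmentations."""
    return X + rng.normal(0, 0.1, X.shape)

def train_epoch(w, X, y, lr=0.1):
    """One gradient-descent step of logistic regression."""
    p = 1 / (1 + np.exp(-X @ w))
    return w - lr * X.T @ (p - y) / len(y)

# Toy binary problem
X = rng.normal(0, 1, (100, 2))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(2)

total_epochs, clean_epochs = 50, 5
for epoch in range(total_epochs):
    if epoch < total_epochs - clean_epochs:
        w = train_epoch(w, augment(X), y)  # augmented phase
    else:
        w = train_epoch(w, X, y)           # final epochs on original data
```

The idea is that the final clean epochs let the model settle on the true data distribution after the augmented phase has taught it to tolerate variance.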

πŸ”ƒ Test-Time Augmentation (TTA)

Augmentation during training? Classic.

How about augmenting your data during testing though?

You can create an ensemble of samples through augmentation.

Predict on the ensemble and then average your model's predictions!
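TTA in miniature: a fixed linear "model" stands in for a trained network, and small Gaussian jitter stands in for real test-time augmentations (the ensemble size and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def predict(x):
    """Toy model: a fixed linear score through a sigmoid."""
    return 1 / (1 + np.exp(-(2 * x[0] - x[1])))

def tta_predict(x, n_aug=16, noise=0.05):
    """Average predictions over an ensemble of augmented copies of one sample."""
    copies = x + rng.normal(0, noise, (n_aug, x.size))  # augmented ensemble
    return np.mean([predict(c) for c in copies])

x = np.array([1.0, 0.5])
plain = predict(x)       # single prediction on the raw sample
averaged = tta_predict(x)  # averaged prediction over the augmented ensemble
```

With image models the augmentations would be flips, crops, or rotations instead of noise, but the averaging step is the same.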

Conclusion

Kaggle can teach you some sweet tricks for your machine learning endeavours.

This article was about these four:

  • Create extra training data
  • Train on bad samples
  • Top off training with original data
  • Test on an ensemble of your data