Kaggle SETI 59th Solution

Published at: 7/25/2023
Category: kaggle
Author: tmyoda

This article is translated from my Japanese tech blog.
https://tmyoda.hatenablog.com/entry/20210819/1629384283

About the SETI Competition

https://www.kaggle.com/competitions/seti-breakthrough-listen

In this competition, you are given the spectrogram of a signal and must predict whether it contains an anomaly.
(The data used in this competition was artificially generated with a simulator.)

Pipeline

[Figure: pipeline diagram]

Augmentation

I didn't have enough time to investigate augmentation thoroughly. For now, I used the following four, plus mixup. I don't know which of them is actually effective.

  • vflip
  • shift_scale_rotate
  • motion_blur
  • spec_augment

I wanted to use SpecAugment with albumentations, so I wrote the following class.

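import numpy as np
from albumentations.core.transforms_interface import ImageOnlyTransform


class SpecAugment(ImageOnlyTransform):
    """Zero out one random band of rows and one random band of columns."""

    def __init__(self, alpha=0.1, **kwargs):
        super().__init__(**kwargs)
        self.spec_alpha = alpha  # upper bound on band width, as a fraction of each axis

    def apply(self, img, **params):
        x = img.copy()  # do not modify the caller's array in place
        # Mask a random band of rows (axis 0).
        # max(1, ...) keeps randint's upper bound valid for very small images.
        t0 = np.random.randint(0, x.shape[0])
        delta = np.random.randint(0, max(1, int(x.shape[0] * self.spec_alpha)))
        x[t0:min(t0 + delta, x.shape[0])] = 0
        # Mask a random band of columns (axis 1).
        t0 = np.random.randint(0, x.shape[1])
        delta = np.random.randint(0, max(1, int(x.shape[1] * self.spec_alpha)))
        x[:, t0:min(t0 + delta, x.shape[1])] = 0
        return x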
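
The mixup step mentioned above is not shown in the article. As a rough sketch, assuming the standard formulation (each sample is blended with a randomly chosen partner from the same batch, using a weight drawn from a Beta distribution), it could look like this. The function name `mixup_batch` and the value `alpha=0.4` are illustrative assumptions, not the settings used in this solution.

```python
import numpy as np


def mixup_batch(images, labels, alpha=0.4, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    `alpha` is a hypothetical default; the article does not give the value used.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(images))   # partner index for every sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels, lam
```

The labels are mixed with the same weight as the images, so the loss has to accept soft targets (binary cross-entropy does).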


Test Time Augmentation (TTA)

Since I applied four augmentations this time, I decided to run TTA 16 times. I chose 16 because I wanted every augmentation to be applied at least once to each image during TTA.
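
As a minimal sketch of this kind of TTA, assuming the final prediction is the mean over the augmented views (the article does not say how the 16 predictions are combined). `predict_fn` and `augment_fn` are hypothetical stand-ins for the trained model and the random augmentation pipeline.

```python
import numpy as np


def tta_predict(predict_fn, augment_fn, image, n_tta=16):
    """Average predictions over `n_tta` randomly augmented copies of `image`."""
    preds = [predict_fn(augment_fn(image)) for _ in range(n_tta)]
    return float(np.mean(preds))


# Toy usage: a dummy "model" that returns the mean pixel value, and a
# random vertical flip standing in for the real augmentation pipeline.
rng = np.random.default_rng(0)
image = rng.random((6, 8))


def random_vflip(img):
    return img[::-1] if rng.random() < 0.5 else img


score = tta_predict(lambda img: img.mean(), random_vflip, image)
```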

For example, with 16 TTA runs, 4 types of augmentation, and each augmentation applied with probability p=0.5, the probability that every augmentation is applied at least once can be calculated with the following formula.

TTA: 16, Augmentation: 4
[Formula image]

TTA: 4, Augmentation: 4
[Formula image]
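
Under the assumption that each augmentation is applied independently with probability p in every TTA run, the probability that one given augmentation appears at least once in n runs is 1 - (1 - p)^n, and the probability that all k augmentations do is (1 - (1 - p)^n)^k. The short script below evaluates this for the two cases above. The helper name is illustrative.

```python
def prob_all_applied(n_tta, n_aug, p=0.5):
    """Probability that each of `n_aug` augmentations fires at least once in `n_tta` runs.

    Assumes every augmentation is applied independently with probability `p`.
    """
    at_least_once = 1.0 - (1.0 - p) ** n_tta
    return at_least_once ** n_aug


print(f"TTA 16, 4 augmentations: {prob_all_applied(16, 4):.4f}")  # about 0.9999
print(f"TTA 4, 4 augmentations:  {prob_all_applied(4, 4):.4f}")   # about 0.7725
```

With only 4 runs, roughly one image in four would miss at least one augmentation, while with 16 runs this almost never happens.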

Resizing Network

This model gave my best score so far.
I believe it would be better to input the image without resizing, but my GPU does not have enough memory.
To input the image without resizing, I would need to reduce the batch size.

However, with imbalanced data like in this competition (9:1), a small batch size leads to batches that contain only one class.
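
To get a feel for the effect, here is a back-of-the-envelope calculation, assuming each sample in a batch is drawn independently with a 10% chance of being the minority class (an idealization of the 9:1 ratio). The batch sizes are arbitrary examples, not the ones used in this solution.

```python
def prob_single_class_batch(batch_size, minority_ratio=0.1):
    """Probability that a randomly drawn batch contains only one class."""
    all_majority = (1.0 - minority_ratio) ** batch_size
    all_minority = minority_ratio ** batch_size
    return all_majority + all_minority


for batch_size in (4, 8, 16, 32):
    print(batch_size, round(prob_single_class_batch(batch_size), 3))
```

Under these assumptions, roughly two thirds of batches of size 4 contain a single class, while at size 32 this drops to about 3%.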

So, I decided to train with the largest possible image size using this model.

Training

During this competition, the dataset was reset once and completely replaced. So, I decided to use the old data for pre-training. This slightly improved both the LB and CV scores.

Also, pre-training uses hold-out validation, and fine-tuning uses 4-fold CV.
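
A minimal sketch of a 4-fold split follows. It assumes stratified folds so that each fold keeps roughly the 9:1 class ratio. The article does not say whether the folds were stratified, and the function name, seed, and toy labels are illustrative.

```python
import numpy as np


def stratified_fold_ids(labels, n_folds=4, seed=0):
    """Assign a fold id to every sample, keeping class ratios similar per fold."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_ids = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Deal the shuffled indices of this class out to the folds in turn.
        fold_ids[idx] = np.arange(len(idx)) % n_folds
    return fold_ids


labels = np.array([0] * 36 + [1] * 4)  # toy data with a 9:1 ratio
fold_ids = stratified_fold_ids(labels)
for fold in range(4):
    valid_idx = np.flatnonzero(fold_ids == fold)
    train_idx = np.flatnonzero(fold_ids != fold)
    # A real run would train on train_idx and validate on valid_idx here.
    print(fold, len(train_idx), len(valid_idx))
```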

Model

I ran into a problem where models would not learn when scaled up (probably due to a poor learning rate and scheduler), even though I tried various models (nfnet, volo, swin, ...).

So, I decided to use efficientnetv2_s and efficientnetv2_m, which scored well.

What I tried

1st Place Solution

I was surprised by the first place solution.
I think its idea of removing the background can be used in other competitions that deal with spectrograms.

https://www.kaggle.com/c/seti-breakthrough-listen/discussion/266385
