dev-resources.site

Tweets from heads of governments and states

Published at
9/19/2022
Categories
datascience
twitterbot
nlp
kaggle
Author
rohitfarmer

Since October 2018, I have been maintaining a bot, written in Python and running on a Raspberry Pi 3B+, that collects tweets from the heads of governments and offices (worldwide) followed by https://twitter.com/headoffice. It was an excellent exercise in learning Python, the Twitter API, and SQLite, and in using a Raspberry Pi for hobby projects. I have now released the data on Kaggle at https://doi.org/10.34740/KAGGLE/DSV/4208877 for the community to use.

The dataset contains one Excel workbook per year, with data points in rows and features in columns. Features include the timestamp (UTC), the language in which the tweet is written, user id, user name, tweet id, and tweet text. The first version covers October 2018 through September 15, 2022; subsequent releases will be quarterly. It is a textual dataset, primarily useful for analyses related to natural language processing.

In the Kaggle submission, I have also included a notebook (https://www.kaggle.com/code/rohitfarmer/dont-run-tweet-collection-and-preprocessing) with the Python code that collected the tweets and the additional code that I used to pre-process the data before submission. After releasing the first dataset, I updated the code and moved the bot from Python to R, using the rtweet library instead of tweepy. I found rtweet to perform better, especially in filtering out duplicate tweets.

In the current setup (https://github.com/rohitfarmer/government-tweets), which is still running on my Raspberry Pi 3B+, the main bot script runs every fifteen minutes via crontab and fetches only the tweets that are more recent than the latest tweet collected in the previous run. The data is stored in an SQLite database, which is backed up to MEGA cloud storage via Rclone every day at midnight ET.
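The incremental fetch described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from the repository (the current bot is written in R): the table layout, the column names, and the `fetch_since` callable that stands in for the Twitter API request are all assumptions. It also assumes that tweet ids grow over time, so the largest stored id marks the most recent tweet.

```python
import sqlite3
from typing import Callable, Iterable, Optional

# Hypothetical schema; the column names are illustrative, not the bot's own.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    tweet_id   INTEGER PRIMARY KEY,  -- primary key rejects duplicate tweets
    created_at TEXT NOT NULL,        -- timestamp in UTC
    lang       TEXT,
    user_id    INTEGER,
    user_name  TEXT,
    text       TEXT
)
"""


def latest_tweet_id(conn: sqlite3.Connection) -> Optional[int]:
    """Return the largest stored tweet id, or None if the table is empty."""
    return conn.execute("SELECT MAX(tweet_id) FROM tweets").fetchone()[0]


def collect_once(
    conn: sqlite3.Connection,
    fetch_since: Callable[[Optional[int]], Iterable[tuple]],
) -> int:
    """Run one collection cycle and return the number of new rows stored.

    fetch_since is a stand-in (an assumption) for the API request; it
    receives the latest stored id and should yield rows newer than it.
    """
    conn.execute(SCHEMA)
    since_id = latest_tweet_id(conn)
    rows = list(fetch_since(since_id))
    before = conn.total_changes
    with conn:  # commit on success, roll back on error
        # INSERT OR IGNORE skips any row whose tweet_id is already stored.
        conn.executemany(
            "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?, ?, ?)", rows
        )
    return conn.total_changes - before


if __name__ == "__main__":
    # Made-up rows that play the role of the timeline being followed.
    timeline = [
        (101, "2022-09-01T10:00:00Z", "en", 1, "example_a", "first tweet"),
        (102, "2022-09-01T10:05:00Z", "fr", 2, "example_b", "second tweet"),
    ]

    def fake_fetch(since_id: Optional[int]) -> list:
        return [t for t in timeline if since_id is None or t[0] > since_id]

    db = sqlite3.connect(":memory:")
    print(collect_once(db, fake_fetch))  # first run stores both rows
    print(collect_once(db, fake_fetch))  # second run finds nothing new
```

Keeping the tweet id as the primary key means that a run which overlaps with the previous one cannot store the same tweet twice. A crontab schedule of `*/15 * * * *` would invoke a script like this every fifteen minutes.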

I enjoyed creating the bot and being able to run it for a couple of years, and I hope I will soon find some time to look into the data and draw out some exciting insights. Until then, the data is available for the data science community to use as they please. Please open a discussion on the Kaggle page for questions, comments, or collaborations.

Day 13 of #100DaysToOffLoad
