Crawling dev.to with Colly While Learning Golang

Published: 11/1/2022
Categories: go, nginx, crawler, colly
Author: chieund

I would like to share a website I built while learning Golang.
The website, http://techdaily.info, started as a project for learning the Go language.
Besides dev.to, it also crawls other websites such as freecodecamp.com, medium.com, hashnode.com, logrocket.com, and infoq.com.
In short, it is a website that specializes in crawling other sites.
Here are the technologies I used:

  • Golang
  • Colly
  • Nginx
  • systemd service
  • Docker
  • MySQL
  • A CI action that deploys to the server
  • Cron job for the daily crawl
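
To make the crawling step concrete, here is a minimal sketch of the basic idea: download a listing page and collect the article links on it. It deliberately uses only the Go standard library instead of Colly so that it runs offline without extra modules. With Colly, the same step is usually written as a callback registered on a CSS selector. The names `Article`, `extractArticles`, and `crawl`, the sample HTML, and the local test server are all assumptions made for illustration and are not taken from the project.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strings"
)

// Article is a hypothetical record for one crawled link.
type Article struct {
	Title string
	URL   string
}

// articleLink captures the href and the text of simple <a href="...">text</a> tags.
// A regular expression is only good enough for a sketch; a real crawler
// should use an HTML parser or Colly's selectors.
var articleLink = regexp.MustCompile(`<a\s[^>]*href="([^"]+)"[^>]*>([^<]+)</a>`)

// extractArticles pulls every matching link out of an HTML document.
func extractArticles(html string) []Article {
	var articles []Article
	for _, m := range articleLink.FindAllStringSubmatch(html, -1) {
		articles = append(articles, Article{Title: strings.TrimSpace(m[2]), URL: m[1]})
	}
	return articles
}

// crawl downloads one listing page and returns the links found on it.
func crawl(client *http.Client, pageURL string) ([]Article, error) {
	resp, err := client.Get(pageURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s for %s", resp.Status, pageURL)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return extractArticles(string(body)), nil
}

func main() {
	// A local test server stands in for a real site so the sketch runs offline.
	page := `<html><body>
<a class="title" href="/posts/learning-go">Learning Go</a>
<a class="title" href="/posts/colly-basics"> Colly basics </a>
</body></html>`
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, page)
	}))
	defer srv.Close()

	articles, err := crawl(srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("crawl failed:", err)
		return
	}
	for _, a := range articles {
		fmt.Printf("%s -> %s\n", a.Title, a.URL)
	}
}
```

Keeping the extraction in its own function, separate from the download, makes it easy to check against saved HTML without touching the network.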

Build and Run Locally

Copy the file app_example.yaml to app.yaml

cp app_example.yaml app.yaml

Build and start the Docker containers

docker-compose up --build

Install the Go packages

docker-compose exec crawl go mod tidy

Create the vendor folder

docker-compose exec crawl go mod vendor

Run the crawler

docker-compose exec crawl go run cmd/main.go

Use air for live reload

docker-compose exec crawl air -c .air.conf

Deploy

Run the Makefile targets to build the project into the bin folder

make copy_template build_app_web build_app_crawl

Create services to run in the background

Create the service for the web app and run it

sudo nano /lib/systemd/system/app_web.service

Paste the following content

[Unit]
Description=App Web

[Service]
Type=simple
Restart=always
RestartSec=5s
WorkingDirectory=/root/actions-runner/crawl/crawl/crawl/bin
ExecStart=/root/actions-runner/crawl/crawl/crawl/bin/app_web

[Install]
WantedBy=multi-user.target

Enable and start the service, then check its status

sudo systemctl enable app_web
sudo systemctl start app_web
sudo systemctl status app_web

Run the crawl app

./app_crawl
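
The crontab entries later in this article call the same binary with either `crawl-article` or `crawl-article-detail` as its first argument. A minimal sketch of how one binary might dispatch on such a subcommand is shown below. Only the two subcommand names come from the article; the `commands` map, the `run` function, the placeholder bodies, and the behavior when no argument is given are assumptions, and the real project may be organized quite differently.

```go
package main

import (
	"fmt"
	"os"
)

// commands maps each subcommand name to the function that runs it.
// The two names match the crontab entries; the functions are placeholders.
var commands = map[string]func() error{
	"crawl-article":        crawlArticles,
	"crawl-article-detail": crawlArticleDetails,
}

// crawlArticles stands in for the job that collects article lists.
func crawlArticles() error {
	fmt.Println("crawling article lists")
	return nil
}

// crawlArticleDetails stands in for the job that fetches each article's detail page.
func crawlArticleDetails() error {
	fmt.Println("crawling article details")
	return nil
}

// run picks the subcommand from the argument list and executes it.
// With no arguments it only prints a usage line (an assumed behavior).
func run(args []string) error {
	if len(args) == 0 {
		fmt.Println("usage: app_crawl <crawl-article|crawl-article-detail>")
		return nil
	}
	cmd, ok := commands[args[0]]
	if !ok {
		return fmt.Errorf("unknown command %q", args[0])
	}
	return cmd()
}

func main() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Returning an error and exiting with a non-zero status lets cron, or whatever else starts the binary, notice a failed run.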

Add the crontab entries

crontab -e

Add the cron schedule (crawl-article every hour, crawl-article-detail every 20 minutes)

0 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article
*/20 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article-detail
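
The first column of each crontab line is the minute field, and it decides when within the hour a job fires. The sketch below expands a minute field into the minutes it matches, which can help when checking a schedule by hand. The function `expandMinuteField` is a hypothetical helper written for this illustration; it handles only `*`, `*/N`, and a single number, not the full cron syntax.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandMinuteField lists the minutes (0-59) matched by a cron minute field
// of the form "*", "*/N", or a single number. Lists and ranges are out of scope.
func expandMinuteField(field string) ([]int, error) {
	if field == "*" {
		field = "*/1"
	}
	if strings.HasPrefix(field, "*/") {
		step, err := strconv.Atoi(strings.TrimPrefix(field, "*/"))
		if err != nil || step <= 0 {
			return nil, fmt.Errorf("invalid step in %q", field)
		}
		var minutes []int
		// Start at minute 0 and advance by the step until the hour ends.
		for m := 0; m < 60; m += step {
			minutes = append(minutes, m)
		}
		return minutes, nil
	}
	m, err := strconv.Atoi(field)
	if err != nil || m < 0 || m > 59 {
		return nil, fmt.Errorf("invalid minute %q", field)
	}
	return []int{m}, nil
}

func main() {
	for _, field := range []string{"*/20", "0"} {
		minutes, err := expandMinuteField(field)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-5s -> %v\n", field, minutes)
	}
}
```

A step of 20 matches minutes 0, 20, and 40, so that job runs three times an hour. A step of 60 could only ever match minute 0, which is why a plain `0` is the clearer way to write an hourly job.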

Reload cron

sudo service cron reload

Website

http://techdaily.info/

https://github.com/chieund/crawl
