
Releases: andreaskoch/gargantua

v0.5.0-alpha

14 Jan 15:31
  • Ignore invalid SSL certificates
  • Log response headers

v0.4.1-alpha Fix duplicate visit issue

16 Nov 14:52
Pre-release

With the last release I introduced a bug that caused gargantua to visit the same URL more than once. This release fixes it.
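For readers wondering how a crawler typically avoids duplicate visits, here is a minimal sketch in Python. It is not gargantua's actual code: the function name `unvisited` is hypothetical, and treating URLs that differ only in their `#fragment` as the same page is an assumption made for the example.

```python
from urllib.parse import urldefrag


def unvisited(urls, visited):
    """Yield each URL at most once, recording it in the `visited` set."""
    for url in urls:
        # Assumption: page.html and page.html#top are the same page,
        # so the fragment is dropped before the lookup.
        key, _fragment = urldefrag(url)
        if key in visited:
            continue
        visited.add(key)
        yield key


visited = set()
queue = [
    "https://www.sitemaps.org/faq.html",
    "https://www.sitemaps.org/faq.html#top",
    "https://www.sitemaps.org/protocol.html",
]
for url in unvisited(queue, visited):
    print(url)
```

With several workers running in parallel, the check and the insert into the set would additionally need to happen under a lock so that two workers cannot claim the same URL.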

v0.4.0-alpha Logging

05 Nov 12:15
Pre-release

You can specify a log file with the --log argument:

gargantua crawl --url https://www.sitemaps.org/sitemap.xml --workers 5 --log "gargantua.log"
Date and time       #worker   Status Code     Bytes   Response Time   URL                                                          Parent URL
2020/11/05 09:23:14 #001:     200             4403    148.759000ms    https://www.sitemaps.org                                     https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #002:     200             4403    290.536000ms    http://www.sitemaps.org/                                     https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #003:     200            45077    283.243000ms    https://www.sitemaps.org/protocol.html                       https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #004:     404             1245    155.376000ms    https://www.sitemaps.org/protocol.htm                        https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #005:     200             4403    155.577000ms    https://www.sitemaps.org/index.html                          https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #001:     200             2591    286.451000ms    http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd    https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #003:     200            10839    143.738000ms    https://www.sitemaps.org/terms.html                          https://www.sitemaps.org/ko/faq.html
2020/11/05 09:23:14 #005:     200            15681    141.580000ms    https://www.sitemaps.org/faq.html                            https://www.sitemaps.org/ko/protocol.html
2020/11/05 09:23:14 #002:     404             1245    286.175000ms    http://www.sitemaps.org/protocol.htm                         https://www.sitemaps.org/ko/faq.html
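
The log is easy to post-process. The following Python sketch summarises status codes from such a file. It assumes that every entry has exactly the eight whitespace-separated fields shown in the sample above and that URLs contain no spaces; the names `LogEntry` and `parse_line` are hypothetical and not part of gargantua.

```python
from collections import Counter
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    worker: str
    status: int
    size_bytes: int
    response_ms: float
    url: str
    parent_url: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None for the header or malformed lines."""
    parts = line.split()
    if len(parts) != 8 or not (parts[3].isdigit() and parts[4].isdigit()):
        return None
    date, time, worker, status, size, duration, url, parent = parts
    if not duration.endswith("ms"):
        return None
    return LogEntry(
        timestamp=f"{date} {time}",
        worker=worker.rstrip(":"),
        status=int(status),
        size_bytes=int(size),
        response_ms=float(duration[:-2]),  # strip the "ms" suffix
        url=url,
        parent_url=parent,
    )


# Two lines copied from the sample log above.
sample = [
    "2020/11/05 09:23:14 #003:     200            45077    283.243000ms    https://www.sitemaps.org/protocol.html                       https://www.sitemaps.org/ko/faq.html",
    "2020/11/05 09:23:14 #004:     404             1245    155.376000ms    https://www.sitemaps.org/protocol.htm                        https://www.sitemaps.org/ko/faq.html",
]
entries = [e for e in map(parse_line, sample) if e is not None]
print(Counter(e.status for e in entries))
```

To run it against a real log, replace `sample` with the lines read from `gargantua.log`.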

v0.3.0-alpha Customizable User-Agent

02 May 12:01
Pre-release

You can now customize the user-agent that is used by the crawler:

gargantua crawl \
                --url https://www.sitemaps.org/sitemap.xml \
                --workers 5 \
                --user-agent "gargantua bot / iPhone"
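
At the HTTP level, a custom user-agent is simply the value of the `User-Agent` request header. The Python sketch below illustrates the idea with the standard library; it does not show how gargantua builds its requests, and `build_request` is a hypothetical helper.

```python
from urllib.request import Request


def build_request(url: str, user_agent: str) -> Request:
    """Create a request object that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(
    "https://www.sitemaps.org/sitemap.xml",
    "gargantua bot / iPhone",
)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

No request is sent here; the object would only go over the network when passed to `urllib.request.urlopen`.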

Minor improvements and documentation

16 Feb 20:01
Pre-release

I fixed the bug that caused the UI to not exit after the crawler was done, and made a quick YouTube video showing how gargantua works:

gargantua-in-action-crawling-a-website

Blog post: https://andykdocs.de/!gargantua-prototype

The prototype

07 Feb 22:04
Pre-release

「 gargantua 」crawls websites from your command line and displays the results and statistics live via a text-based UI:

gargantua crawl --url https://www.sitemaps.org/sitemap.xml --workers 5

Screenshot of gargantua v0.1.0-alpha crawling sitemaps.org