Golang Release Archive Size

File name                       Kind     OS     Arch    Size
go1.14.5.src.tar.gz             Source                  21MB
go1.14.5.darwin-amd64.tar.gz    Archive  macOS  x86-64  119MB
go1.13.13.darwin-amd64.tar.gz   Archive  macOS  x86-64  116MB

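Background: curiosity

Each new Go release seems to ship a larger archive than the last. I was curious whether that is really the case, and whether the trend could be plotted as a chart.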

Requirements

Collect every historical release together with its archive size, then plot how the archive size has grown over time.

Design (feature breakdown)

  1. Fetch every historical release and its archive size;
  2. Render a chart of the growth trend;
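As a rough sketch of step 2, assuming the sizes have already been collected as (version, size in MB) pairs, a text bar chart is enough to eyeball the trend. The `point` type and the `renderBars` function are hypothetical names introduced here for illustration, and a real chart would more likely come from a spreadsheet or a plotting library fed with the CSV file.

```go
package main

import (
	"fmt"
	"strings"
)

// point is one release and its archive size in MB (hypothetical type).
type point struct {
	version string
	sizeMB  int
}

// renderBars draws one text bar per release, scaled so that the
// largest size spans width characters.
func renderBars(points []point, width int) string {
	largest := 0
	for _, p := range points {
		if p.sizeMB > largest {
			largest = p.sizeMB
		}
	}
	if largest == 0 || width <= 0 {
		return ""
	}
	var b strings.Builder
	for _, p := range points {
		n := p.sizeMB * width / largest
		fmt.Fprintf(&b, "%-10s %s %dMB\n", p.version, strings.Repeat("#", n), p.sizeMB)
	}
	return b.String()
}

func main() {
	// The two macOS archive sizes quoted at the top of this post.
	points := []point{
		{"go1.13.13", 116},
		{"go1.14.5", 119},
	}
	fmt.Print(renderBars(points, 40))
}
```

Because the bars are scaled against the largest value, small differences between neighbouring releases are hard to see; plotting the raw numbers on a real axis would show the slope more clearly.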

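How do we fetch every historical release and its archive size?

  1. Find the source page (https://golang.org/dl/)
  2. Copy the request as a cURL command from the Chrome Network panel
  3. Use https://mholt.github.io/curl-to-go to convert the cURL command into Go HTTP fetch code
  4. Parse the page with goquery
  5. Work out the "table structure" that holds the releases and their archive sizes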

Without further ado, here is the code.

Code snippet

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	resp := fetch()
	defer func() {
		err := resp.Body.Close()
		if err != nil {
			log.Fatal(err)
		}
	}()
	doc, err := goquery.NewDocumentFromReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	type releaseSize struct {
		version string
		size    string
	}
	var results []releaseSize
	doc.Find(".codetable").Each(func(i int, s *goquery.Selection) {
		s.Find("tbody>tr").Each(func(i int, s *goquery.Selection) {
			var row []string
			s.Find("td").Each(func(indexth int, tablecell *goquery.Selection) {
				row = append(row, tablecell.Text())
			})
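			// Skip header rows and any row too short to hold the
			// five columns (name, kind, OS, arch, size) indexed below.
			if len(row) < 5 {
				return
			}
			// Adjust this filter to choose which builds to record;
			// here only the macOS archives are kept.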
			if row[1] == "Archive" && row[2] == "macOS" {
				results = append(results, releaseSize{
					version: row[0],
					size:    row[4],
				})
			}
		})
	})
	ret := ""
	for j := len(results) - 1; j >= 0; j-- {
		ret += fmt.Sprintf("%s,%s\n", results[j].version, results[j].size[:len(results[j].size)-2])
	}
	if err := ioutil.WriteFile("go-release-history-archive-size.csv", []byte(ret), 0644); err != nil {
		log.Fatal(err)
	}
}

func fetch() *http.Response {
	// Generated by curl-to-Go: https://mholt.github.io/curl-to-go

	req, err := http.NewRequest("GET", "https://golang.org/dl/", nil)
	if err != nil {
		// handle err
		panic(err)
	}
	req.Header.Set("Authority", "golang.org")
	req.Header.Set("Cache-Control", "max-age=0")
	req.Header.Set("Upgrade-Insecure-Requests", "1")
	req.Header.Set("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36")
	req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9")
	req.Header.Set("Sec-Fetch-Site", "same-origin")
	req.Header.Set("Sec-Fetch-Mode", "navigate")
	req.Header.Set("Sec-Fetch-User", "?1")
	req.Header.Set("Sec-Fetch-Dest", "document")
	req.Header.Set("Referer", "https://golang.org/")
	req.Header.Set("Accept-Language", "zh-CN,zh;q=0.9,en;q=0.8,zh-TW;q=0.7,ja;q=0.6")
	// Fill in your own cookie here
	req.Header.Set("Cookie", "...")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// handle err
		panic(err)
	}
	return resp
}
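The program above strips the unit by slicing the last two characters off the size cell, which assumes every cell ends in "MB". A slightly more defensive variant, sketched here with a hypothetical helper named `parseSizeMB` that is not part of the original program, checks the suffix and returns a number that can be charted directly.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSizeMB converts a size cell such as "119MB" into 119.
// It returns an error instead of panicking on unexpected input.
func parseSizeMB(cell string) (int, error) {
	s := strings.TrimSpace(cell)
	if !strings.HasSuffix(s, "MB") {
		return 0, fmt.Errorf("unexpected size format: %q", cell)
	}
	return strconv.Atoi(strings.TrimSuffix(s, "MB"))
}

func main() {
	// Two sizes from the table at the top of this post, plus an empty cell.
	for _, cell := range []string{"119MB", "116MB", ""} {
		n, err := parseSizeMB(cell)
		fmt.Println(n, err)
	}
}
```

This only handles whole-megabyte values in the form shown on the download page; other units or fractional sizes would need extra cases.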

Thoughts 🤔

  • What exactly accounts for the extra size?
  • Has the Go team noticed this?
  • Is it worth trying to bring the size back down?

Follow-up

The Go 1 compatibility guarantee (https://golang.org/doc/go1compat) pretty much guarantees that every new release will be larger. It's hard for us to get rid of existing code.

Although the size is increasing the increase does not seem all that big, and the file size doesn't seem particularly large by modern standards.

The above is a reply from Ian of the Go team.
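To put a number on "not all that big", the two macOS archives listed at the top of this post can be compared directly. This is only a two-point illustration, with `growthPercent` as a hypothetical helper, and it assumes the older size is non-zero.

```go
package main

import "fmt"

// growthPercent returns the relative change from oldMB to newMB in percent.
// It assumes oldMB is non-zero.
func growthPercent(oldMB, newMB float64) float64 {
	return (newMB - oldMB) / oldMB * 100
}

func main() {
	// go1.13.13 (116MB) to go1.14.5 (119MB), macOS x86-64 archives.
	fmt.Printf("%.1f%%\n", growthPercent(116, 119)) // prints "2.6%"
}
```

That is 3MB, or roughly 2.6%, between those two releases.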

