
Original: http://www.alloyteam.com/2013/12/js-calculate-the-number-of-bytes-occupied-by-a-string/

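A recent project required using JavaScript to calculate how many bytes a string occupies once it is written to localStorage. As is well known, JavaScript strings use Unicode. Unicode has several encoding forms, of which UTF-8 and UTF-16 are the most widely used, so this article discusses only those two.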

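The following definition is taken from Wikipedia (http://zh.wikipedia.org/zh-cn/UTF-8), with some parts trimmed.

UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode. It can represent any character in the Unicode standard, its one-byte encodings remain compatible with ASCII, and it uses one to four bytes per character.

The encoding rules are as follows: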

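  1. Code points from 000000 to 00007F are encoded in one byte;
  2. Code points from 000080 to 0007FF use two bytes;
  3. Code points from 000800 to 00D7FF and from 00E000 to 00FFFF use three bytes (note: Unicode assigns no characters in the range D800-DFFF);
  4. Code points from 010000 to 10FFFF use four bytes.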

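UTF-16 is also a variable-length encoding, though it has only two widths: most characters are encoded in two bytes, and code points above 65535 use four bytes (a surrogate pair), as follows:

  1. 000000 to 00FFFF: two bytes;
  2. 010000 to 10FFFF: four bytes.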
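The two rule sets above can be checked with a short sketch. It is written in Go rather than JavaScript only so that it can lean on the standard unicode/utf8 and unicode/utf16 packages; the function names utf8Size and utf16Size are made up for this illustration, and the input is assumed to be valid UTF-8 text.

```go
package main

import (
	"fmt"
	"unicode/utf16"
	"unicode/utf8"
)

// utf8Size returns the number of bytes s needs in UTF-8 by applying the
// one-to-four-byte rule to each code point.
func utf8Size(s string) int {
	total := 0
	for _, r := range s {
		total += utf8.RuneLen(r)
	}
	return total
}

// utf16Size returns the number of bytes s needs in UTF-16: two bytes per
// code unit, so a code point above 0xFFFF (a surrogate pair) costs four.
func utf16Size(s string) int {
	return 2 * len(utf16.Encode([]rune(s)))
}

func main() {
	fmt.Println(utf8Size("a"), utf16Size("a"))                   // 1 2
	fmt.Println(utf8Size("\u4e2d"), utf16Size("\u4e2d"))         // 3 2
	fmt.Println(utf8Size("\U0001F600"), utf16Size("\U0001F600")) // 4 4
}
```

A CJK character is the case where the two encodings differ most: three bytes in UTF-8 but two in UTF-16.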

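At first I assumed that since the page is UTF-8 encoded, strings stored in localStorage would be UTF-8 encoded too. But testing showed that a string whose computed size was under 5 MB still threw an exception when written to localStorage. On reflection, the page encoding can be changed; if localStorage stored strings in the page's encoding, things would get messy. Browsers presumably all store strings as UTF-16. Sure enough, a string measuring 5 MB under UTF-16 was written successfully, and anything larger failed.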
Here is the implementation. It follows the rules above and handles each encoding in its own branch.

/**
 * Calculate the number of bytes a string occupies. UTF-8 is used by default;
 * UTF-16 can be specified instead.
 * UTF-8 is a variable-length Unicode encoding that uses one to four bytes per character.
 *
 * 000000 - 00007F (128 code points)      0zzzzzzz(00-7F)                             one byte
 * 000080 - 0007FF (1920 code points)     110yyyyy(C0-DF) 10zzzzzz(80-BF)             two bytes
 * 000800 - 00D7FF
 * 00E000 - 00FFFF (61440 code points)    1110xxxx(E0-EF) 10yyyyyy 10zzzzzz           three bytes
 * 010000 - 10FFFF (1048576 code points)  11110www(F0-F7) 10xxxxxx 10yyyyyy 10zzzzzz  four bytes
 *
 * Note: Unicode assigns no characters in the range D800-DFFF.
 * {@link http://zh.wikipedia.org/wiki/UTF-8}
 *
 * UTF-16 uses two bytes for most characters and four bytes for code points above 65535.
 * 000000 - 00FFFF  two bytes
 * 010000 - 10FFFF  four bytes
 *
 * {@link http://zh.wikipedia.org/wiki/UTF-16}
 * @param  {String} str
 * @param  {String} charset utf-8, utf-16
 * @return {Number}
 */
var sizeof = function(str, charset){
    var total = 0,
        charCode,
        next,
        i,
        len;
    charset = charset ? charset.toLowerCase() : '';
    if(charset === 'utf-16' || charset === 'utf16'){
        // charCodeAt returns UTF-16 code units, never values above 0xFFFF, so each
        // index is exactly two bytes; a code point above 0xFFFF already spans two indexes.
        total = str.length * 2;
    }else{
        for(i = 0, len = str.length; i < len; i++){
            charCode = str.charCodeAt(i);
            if(charCode <= 0x007f){
                total += 1;
            }else if(charCode <= 0x07ff){
                total += 2;
            }else if(charCode >= 0xd800 && charCode <= 0xdbff && i + 1 < len &&
                    (next = str.charCodeAt(i + 1)) >= 0xdc00 && next <= 0xdfff){
                // A surrogate pair is one code point above 0xFFFF: four bytes in UTF-8.
                total += 4;
                i++; // skip the low surrogate
            }else{
                total += 3;
            }
        }
    }
    return total;
}

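Prerequisite:

All related consumer services must be stopped.

Then run the following commands.

First, check the topic offsets currently recorded for the consumer group: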

bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group you_consumer_group_name --describe

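[Screenshot: output of the --describe command]

Pay attention to current-offset, log-end-offset, and lag: the current offset, the end offset, and how far the consumer is behind, respectively.

In this case all 90 messages have been consumed; a lag of 0 tells you that. If some messages are still unconsumed, the output looks like this:

[Screenshot: --describe output with unconsumed messages]

Here lag is 12, current-offset = 91, and log-end-offset = 103, which tells us that 12 messages have not been consumed (103 - 91 = 12).

current-offset: consumption has reached offset 91, which can be read as 91 messages consumed.

log-end-offset can be read as 103 records in total.

lag can be read as the number of records not yet consumed.

If you want to set the current offset yourself, keep in mind that consumed messages may already have been deleted once they exceed the retention period set in the configuration file (server.properties) by log.retention.hours = 168. This property is the number of hours messages are retained; the default is 168 hours, i.e. one week. So you can only move the offset to messages that are still retained. In this example I reset the offset to 80 with the following command: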

bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group you_consumer_group_name --topic you_topic_name --execute --reset-offsets --to-offset 80

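[Screenshot: output of the --reset-offsets command]

If running the command above produces output like this, the offset was reset successfully. Let's run the first command again to verify: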

bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group you_consumer_group_name --describe

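[Screenshot: --describe output after the reset]

OK, the offset has clearly been adjusted manually.

Author: ZouChengli
Source: CSDN
Original: https://blog.csdn.net/ZouChengli/article/details/82587404
Copyright notice: this is an original article by the blogger; please include a link to the post when reposting.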

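Original: https://juejin.im/post/5c0bc191e51d456f206b01cb

On Linux, everyone is surely familiar with the ls command. Liang Xu believes every Linux engineer relies on it daily, often a few hundred times a day. But beyond ls -l, which advanced uses of ls do you know? Today Liang Xu introduces 8 advanced uses of the ls command.

Suppose we have a folder like this, and we use the tree command to view its directory structure:

[Image: tree output for /home/alvin/test_dir]

Usage 1: List the details of all files and directories under /home/alvin/test_dir

Command:

ls -lR /home/alvin/test_dir/

Result: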

[alvin@VM_0_16_centos test_dir]$ ls -lR /home/alvin/test_dir/
/home/alvin/test_dir/:
total 28
-rw-rw-r-- 1 alvin alvin   37 Nov 18 09:12 atb_aux.c
-rw-rw-r-- 1 alvin alvin    8 Nov 18 09:12 atb_can.c
-rw-rw-r-- 1 alvin alvin   24 Nov 18 09:12 atb_orch.c
-rw-rw-r-- 1 alvin alvin    5 Nov 18 09:12 atb_ota.c
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 include
-rw-rw-r-- 1 alvin alvin    0 Nov 18 09:12 Makefile
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 output
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 src

/home/alvin/test_dir/include:
total 0
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 a.h
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 b.h
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 c.h

/home/alvin/test_dir/output:
total 0
-rwxrwxr-x 1 alvin alvin 0 Nov 18 09:12 app

/home/alvin/test_dir/src:
total 0
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 a.c
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 b.c
-rw-rw-r-- 1 alvin alvin 0 Nov 18 09:12 c.c

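Here, the -l option is probably familiar: it shows the results in long listing format. The -R option means recursive: all files and subdirectories under the given directory are processed as well.

Usage 2: List the details of all files starting with atb under /home/alvin/test_dir

Command:

ls -l atb*

Result: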

[alvin@VM_0_16_centos test_dir]$ ls -l atb*
-rw-rw-r-- 1 alvin alvin 37 Nov 18 09:12 atb_aux.c
-rw-rw-r-- 1 alvin alvin  8 Nov 18 09:12 atb_can.c
-rw-rw-r-- 1 alvin alvin 24 Nov 18 09:12 atb_orch.c
-rw-rw-r-- 1 alvin alvin  5 Nov 18 09:12 atb_ota.c

Usage 3: List only the subdirectories of a directory

Method 1:

Command:

ls -F /home/alvin/test_dir | grep /$

Result:

[alvin@VM_0_16_centos test_dir]$ ls -F /home/alvin/test_dir | grep /$
include/
output/
src/

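Here, the -F option appends a character to each file name to indicate the file's type. *: executable regular file; /: directory; @: symbolic link; |: FIFO; =: socket.

/$ is a regular expression meaning "ends with /". grep /$ keeps only the results that end with /, i.e. the subdirectories.

Method 2:

Command:

ls -p /home/alvin/test_dir | grep /$

Result: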

[alvin@VM_0_16_centos test_dir]$ ls -p | grep /$
include/
output/
src/

Here, the -p option is similar to -F, except that it only appends / to directory names and does not mark other file types.

Method 3:

Command:

ls -l /home/alvin/test_dir | grep "^d"

Result:

[alvin@VM_0_16_centos test_dir]$ ls -l /home/alvin/test_dir | grep "^d"
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 include
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 output
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 src

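Here, ^d is also a regular expression, meaning "starts with d". grep "^d" keeps the lines that start with d, and in ls -l output a leading d means the entry is a directory, so this filters out the subdirectories.

Method 4:

Command:

ls -d */

Result: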

[alvin@VM_0_16_centos test_dir]$ ls -d */
include/  output/  src/

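Here, the -d option lists directories themselves, like files, instead of listing their contents.

Usage 4: List the files in a directory in time order, newest last.

Command:

ls -ltr

Result: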

[alvin@VM_0_16_centos test_dir]$ ls -lrt
total 28
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 src
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 output
-rw-rw-r-- 1 alvin alvin    0 Nov 18 09:12 Makefile
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 include
-rw-rw-r-- 1 alvin alvin    5 Nov 18 09:12 atb_ota.c
-rw-rw-r-- 1 alvin alvin   24 Nov 18 09:12 atb_orch.c
-rw-rw-r-- 1 alvin alvin    8 Nov 18 09:12 atb_can.c
-rw-rw-r-- 1 alvin alvin   37 Nov 18 09:12 atb_aux.c

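Here, the -t option sorts by modification time, newest first. The -r option reverses the order. Combined, they sort by modification time with the newest entries last.

Usage 5: Sort by file size

Command:

ls -lhS

Result: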

[alvin@VM_0_16_centos test_dir]$ ls -lhS
total 28K
drwxrwxr-x 2 alvin alvin 4.0K Nov 18 09:12 include
drwxrwxr-x 2 alvin alvin 4.0K Nov 18 09:12 output
drwxrwxr-x 2 alvin alvin 4.0K Nov 18 09:12 src
-rw-rw-r-- 1 alvin alvin   37 Nov 18 09:12 atb_aux.c
-rw-rw-r-- 1 alvin alvin   24 Nov 18 09:12 atb_orch.c
-rw-rw-r-- 1 alvin alvin    8 Nov 18 09:12 atb_can.c
-rw-rw-r-- 1 alvin alvin    5 Nov 18 09:12 atb_ota.c
-rw-rw-r-- 1 alvin alvin    0 Nov 18 09:12 Makefile

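Here, the -h option prints sizes in human-readable form; otherwise sizes are shown in bytes by default. Take 4873 bytes: can you tell at a glance how much that is? With -h the system converts it to K or another suitable unit.

The -S option sorts by file size, largest first. To put the smallest files first, add the -r option.

Usage 6: Count the files and directories in the current directory

Count the files:
Command:

ls -l | grep "^-" | wc -l

Result: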

[alvin@VM_0_16_centos test_dir]$ ls -l | grep "^-" | wc -l
5

Here, ^- means "starts with -", i.e. a regular file. ls -l | grep "^-" keeps the regular files, and wc -l counts them.

Count the directories:
Command:

ls -l | grep "^d" | wc -l

Result:

[alvin@VM_0_16_centos test_dir]$ ls -l | grep "^d" | wc -l
3

Here, ^d means "starts with d", i.e. a directory. ls -l | grep "^d" keeps the directories, and wc -l counts them.
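When the same counts are needed inside a program rather than a shell pipeline, they can be computed without parsing ls output. The Go sketch below is one way to do that, not part of the original article: countEntries and demo are hypothetical names, and the files created in demo are throwaway examples. Hidden names are skipped to match plain ls, and only regular files are counted to match grep "^-".

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// countEntries counts the regular files and the directories directly inside
// dir, ignoring names that start with "." as plain ls does.
func countEntries(dir string) (int, int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, 0, err
	}
	files, dirs := 0, 0
	for _, e := range entries {
		if strings.HasPrefix(e.Name(), ".") {
			continue
		}
		switch {
		case e.IsDir():
			dirs++
		case e.Type().IsRegular():
			files++
		}
	}
	return files, dirs, nil
}

// demo builds a temporary directory with two visible files, one hidden file,
// and one subdirectory, then counts its entries.
func demo() (int, int, error) {
	root, err := os.MkdirTemp("", "ls-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(root)
	for _, name := range []string{"a.c", "Makefile", ".hidden"} {
		if err := os.WriteFile(filepath.Join(root, name), nil, 0o644); err != nil {
			return 0, 0, err
		}
	}
	if err := os.Mkdir(filepath.Join(root, "src"), 0o755); err != nil {
		return 0, 0, err
	}
	return countEntries(root)
}

func main() {
	files, dirs, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(files, dirs) // 2 1
}
```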

Usage 7: List the absolute paths of all files

Command:

ls | sed "s:^:`pwd`/:"

Result:

[alvin@VM_0_16_centos test_dir]$ ls | sed "s:^:`pwd`/:"
/home/alvin/test_dir/atb_aux.c
/home/alvin/test_dir/atb_can.c
/home/alvin/test_dir/atb_orch.c
/home/alvin/test_dir/atb_ota.c
/home/alvin/test_dir/include
/home/alvin/test_dir/Makefile
/home/alvin/test_dir/output
/home/alvin/test_dir/src

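Here, sed "s:^:`pwd`/:" inserts the output of pwd (the directory containing the files) at the start of each line, which combines with the file name to form an absolute path.

Usage 8: List the absolute paths of all files in the current directory (including hidden files), without recursing into directories

The previous usage does not handle hidden files (those whose names start with .). To include hidden files as well, use the following command:

find $PWD -maxdepth 1 | xargs ls -ld

Result: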

[alvin@VM_0_16_centos test_dir]$ find $PWD -maxdepth 1 | xargs ls -ld
drwxrwxr-x 5 alvin alvin 4096 Nov 18 17:30 /home/alvin/test_dir
-rw-rw-r-- 1 alvin alvin   37 Nov 18 09:12 /home/alvin/test_dir/atb_aux.c
-rw-rw-r-- 1 alvin alvin    8 Nov 18 09:12 /home/alvin/test_dir/atb_can.c
-rw-rw-r-- 1 alvin alvin   24 Nov 18 09:12 /home/alvin/test_dir/atb_orch.c
-rw-rw-r-- 1 alvin alvin    5 Nov 18 09:12 /home/alvin/test_dir/atb_ota.c
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 /home/alvin/test_dir/include
-rw-rw-r-- 1 alvin alvin    0 Nov 18 09:12 /home/alvin/test_dir/Makefile
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 /home/alvin/test_dir/output
drwxrwxr-x 2 alvin alvin 4096 Nov 18 09:12 /home/alvin/test_dir/src

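Here, find $PWD -maxdepth 1 restricts the search to the current level (no recursion), and its results are passed to ls -ld as arguments, so every file in the current directory is listed with its full path.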

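When a CSV file exported from PHP or Go is opened in Excel, the text shows up garbled.
This happens because Excel does not assume UTF-8 by default when it opens a CSV file.

A quick-and-dirty fix:

Add a BOM header:

chr(0xEF).chr(0xBB).chr(0xBF)

PHP example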

$fp = fopen($filename, "w+");
fwrite($fp, chr(0xEF).chr(0xBB).chr(0xBF));

// write something ....
fclose($fp);

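Go example

dstf, err := os.Create(fileName)
if err != nil {
    return err
}
defer dstf.Close() // defer only after the error check, once dstf is known to be valid

dstf.WriteString("\xEF\xBB\xBF") // write the UTF-8 BOM so Chinese text is not garbled
// write data to the file
w := csv.NewWriter(dstf)
// ... write records with w.Write, then call w.Flush()

In theory other languages should work the same way, though I have not tested them.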
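For reference, here is a self-contained version of the Go idea that can be run as is. It is a sketch under assumptions: writeCSVWithBOM and csvWithBOM are hypothetical helper names, and the output goes to an in-memory buffer instead of a file so that the result can be inspected.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
)

// utf8BOM is the three-byte UTF-8 byte order mark.
const utf8BOM = "\xEF\xBB\xBF"

// writeCSVWithBOM writes the BOM first and then the rows as CSV.
func writeCSVWithBOM(w io.Writer, rows [][]string) error {
	if _, err := io.WriteString(w, utf8BOM); err != nil {
		return err
	}
	// WriteAll writes every record and flushes the writer before returning.
	return csv.NewWriter(w).WriteAll(rows)
}

// csvWithBOM renders rows into a string, which is convenient for checking.
func csvWithBOM(rows [][]string) (string, error) {
	var buf bytes.Buffer
	if err := writeCSVWithBOM(&buf, rows); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := csvWithBOM([][]string{{"name", "city"}, {"张三", "北京"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes, starts with BOM: %v\n", len(out), out[:3] == utf8BOM)
}
```

To write to a real file, pass the *os.File returned by os.Create to writeCSVWithBOM in place of the buffer.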

Original: https://research.swtch.com/interfaces

Go's interfaces—static, checked at compile time, dynamic when asked for—are, for me, the most exciting part of Go from a language design point of view. If I could export one feature of Go into other languages, it would be interfaces.

This post is my take on the implementation of interface values in the “gc” compilers: 6g, 8g, and 5g. Over at Airs, Ian Lance Taylor has written two posts about the implementation of interface values in gccgo. The implementations are more alike than different: the biggest difference is that this post has pictures.

Before looking at the implementation, let's get a sense of what it must support.

Usage

Go's interfaces let you use duck typing like you would in a purely dynamic language like Python but still have the compiler catch obvious mistakes like passing an int where an object with a Read method was expected, or like calling the Read method with the wrong number of arguments. To use interfaces, first define the interface type (say, ReadCloser):

type ReadCloser interface {
    Read(b []byte) (n int, err os.Error)
    Close()
}

and then define your new function as taking a ReadCloser. For example, this function calls Read repeatedly to get all the data that was requested and then calls Close:

func ReadAndClose(r ReadCloser, buf []byte) (n int, err os.Error) {
    for len(buf) > 0 && err == nil {
        var nr int
        nr, err = r.Read(buf)
        n += nr
        buf = buf[nr:]
    }
    r.Close()
    return
}

The code that calls ReadAndClose can pass a value of any type as long as it has Read and Close methods with the right signatures. And, unlike in languages like Python, if you pass a value with the wrong type, you get an error at compile time, not run time.
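The listings in this post predate Go 1 (os.Error has since been replaced by the built-in error type). As a rough modern equivalent, the sketch below compiles with current Go; stringSource and newStringSource are hypothetical names added only to have something to pass in, and stringSource never mentions ReadCloser.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

type ReadCloser interface {
	Read(b []byte) (n int, err error)
	Close()
}

// stringSource satisfies ReadCloser purely by having the right methods.
type stringSource struct {
	r      *strings.Reader
	closed bool
}

func newStringSource(s string) *stringSource {
	return &stringSource{r: strings.NewReader(s)}
}

func (s *stringSource) Read(b []byte) (int, error) { return s.r.Read(b) }
func (s *stringSource) Close()                     { s.closed = true }

// ReadAndClose reads until buf is full or an error occurs, then closes r.
func ReadAndClose(r ReadCloser, buf []byte) (n int, err error) {
	for len(buf) > 0 && err == nil {
		var nr int
		nr, err = r.Read(buf)
		n += nr
		buf = buf[nr:]
	}
	r.Close()
	return
}

func main() {
	src := newStringSource("hello")
	n, err := ReadAndClose(src, make([]byte, 8))
	fmt.Println(n, err == io.EOF, src.closed) // 5 true true
	// ReadAndClose(42, nil) would be rejected at compile time:
	// int has no Read or Close method.
}
```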

Interfaces aren't restricted to static checking, though. You can check dynamically whether a particular interface value has an additional method. For example:

type Stringer interface {
    String() string
}

func ToString(any interface{}) string {
    if v, ok := any.(Stringer); ok {
        return v.String()
    }
    switch v := any.(type) {
    case int:
        return strconv.Itoa(v)
    case float:
        return strconv.Ftoa(v, 'g', -1)
    }
    return "???"
}

The value any has static type interface{}, meaning no guarantee of any methods at all: it could contain any type. The “comma ok” assignment inside the if statement asks whether it is possible to convert any to an interface value of type Stringer, which has the method String. If so, the body of that statement calls the method to obtain a string to return. Otherwise, the switch picks off a few basic types before giving up. This is basically a stripped down version of what the fmt package does. (The if could be replaced by adding case Stringer: at the top of the switch, but I used a separate statement to draw attention to the check.)

As a simple example, let's consider a 64-bit integer type with a String method that prints the value in binary and a trivial Get method:

type Binary uint64

func (i Binary) String() string {
    return strconv.Uitob64(i.Get(), 2)
}

func (i Binary) Get() uint64 {
    return uint64(i)
}

A value of type Binary can be passed to ToString, which will format it using the String method, even though the program never says that Binary intends to implement Stringer. There's no need: the runtime can see that Binary has a String method, so it implements Stringer, even if the author of Binary has never heard of Stringer.
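The ToString and Binary listings above also use pre-Go 1 names: the float type, strconv.Ftoa, and strconv.Uitob64 no longer exist. A sketch of the same example in current Go follows, with float64, strconv.FormatFloat, and strconv.FormatUint swapped in; the parameter is renamed to val only because any is now a predeclared identifier.

```go
package main

import (
	"fmt"
	"strconv"
)

type Stringer interface {
	String() string
}

// ToString first asks at run time whether val has a String method, then
// falls back to a few basic types.
func ToString(val interface{}) string {
	if v, ok := val.(Stringer); ok {
		return v.String()
	}
	switch v := val.(type) {
	case int:
		return strconv.Itoa(v)
	case float64:
		return strconv.FormatFloat(v, 'g', -1, 64)
	}
	return "???"
}

// Binary never declares that it implements Stringer; having String is enough.
type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(i.Get(), 2) }

func (i Binary) Get() uint64 { return uint64(i) }

func main() {
	fmt.Println(ToString(Binary(200))) // 11001000
	fmt.Println(ToString(42))          // 42
	fmt.Println(ToString(1.5))         // 1.5
	fmt.Println(ToString([]int{1}))    // ???
}
```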

These examples show that even though all the implicit conversions are checked at compile time, explicit interface-to-interface conversions can inquire about method sets at run time. “Effective Go” has more details about and examples of how interface values can be used.

Interface Values

Languages with methods typically fall into one of two camps: prepare tables for all the method calls statically (as in C++ and Java), or do a method lookup at each call (as in Smalltalk and its many imitators, JavaScript and Python included) and add fancy caching to make that call efficient. Go sits halfway between the two: it has method tables but computes them at run time. I don't know whether Go is the first language to use this technique, but it's certainly not a common one. (I'd be interested to hear about earlier examples; leave a comment below.)

As a warmup, a value of type Binary is just a 64-bit integer made up of two 32-bit words (like in the last post, we'll assume a 32-bit machine; this time memory grows down instead of to the right):

[Figure: a Binary value laid out as two 32-bit words]

Interface values are represented as a two-word pair giving a pointer to information about the type stored in the interface and a pointer to the associated data. Assigning b to an interface value of type Stringer sets both words of the interface value.
[Figure: the interface value s, with its itable pointer and its data pointer]

(The pointers contained in the interface value are gray to emphasize that they are implicit, not directly exposed to Go programs.)

The first word in the interface value points at what I call an interface table or itable (pronounced i-table; in the runtime sources, the C implementation name is Itab). The itable begins with some metadata about the types involved and then becomes a list of function pointers. Note that the itable corresponds to the interface type, not the dynamic type. In terms of our example, the itable for Stringer holding type Binary lists the methods used to satisfy Stringer, which is just String: Binary's other methods (Get) make no appearance in the itable.

The second word in the interface value points at the actual data, in this case a copy of b. The assignment var s Stringer = b makes a copy of b rather than point at b for the same reason that var c uint64 = b makes a copy: if b later changes, s and c are supposed to have the original value, not the new one. Values stored in interfaces might be arbitrarily large, but only one word is dedicated to holding the value in the interface structure, so the assignment allocates a chunk of memory on the heap and records the pointer in the one-word slot. (There's an obvious optimization when the value does fit in the slot; we'll get to that later.)
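The copy semantics are observable from ordinary code. A small sketch in current Go (snapshot is a made-up helper name): changing b after the assignment does not affect the value held in s.

```go
package main

import "fmt"

type Stringer interface {
	String() string
}

type Binary uint64

func (i Binary) String() string { return fmt.Sprintf("%b", uint64(i)) }

// snapshot stores b in an interface value, then changes b, and reports what
// each of them prints afterwards.
func snapshot() (string, string) {
	b := Binary(200)
	var s Stringer = b // the interface value gets its own copy of b
	b = 5              // later changes to b do not reach s
	return s.String(), b.String()
}

func main() {
	fromInterface, fromVariable := snapshot()
	fmt.Println(fromInterface) // 11001000
	fmt.Println(fromVariable)  // 101
}
```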

To check whether an interface value holds a particular type, as in the type switch above, the Go compiler generates code equivalent to the C expression s.tab->type to obtain the type pointer and check it against the desired type. If the types match, the value can be copied by dereferencing s.data.

To call s.String(), the Go compiler generates code that does the equivalent of the C expression s.tab->fun[0](s.data): it calls the appropriate function pointer from the itable, passing the interface value's data word as the function's first (in this example, only) argument. You can see this code if you run 8g -S x.go (details at the bottom of this post). Note that the function in the itable is being passed the 32-bit pointer from the second word of the interface value, not the 64-bit value it points at. In general, the interface call site doesn't know the meaning of this word nor how much data it points at. Instead, the interface code arranges that the function pointers in the itable expect the 32-bit representation stored in the interface values. Thus the function pointer in this example is (*Binary).String not Binary.String.
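The dispatch just described can be imitated in ordinary Go. This is a hand-written model, not the runtime's actual data structures: itab, iface, toStringer, and callString are names made up for the sketch, and the type metadata is reduced to a string.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(uint64(i), 2) }

// itab models an itable: some type metadata followed by function pointers.
type itab struct {
	typeName string
	fun      []func(data unsafe.Pointer) string
}

// iface models the two-word interface value.
type iface struct {
	tab  *itab
	data unsafe.Pointer
}

// binaryStringerItab stands in for the itable of (Stringer, Binary). Its only
// entry takes the data pointer, in the role of (*Binary).String.
var binaryStringerItab = &itab{
	typeName: "Binary",
	fun: []func(unsafe.Pointer) string{
		func(data unsafe.Pointer) string { return (*Binary)(data).String() },
	},
}

// toStringer models "var s Stringer = b": copy b and record a pointer to the copy.
func toStringer(b Binary) iface {
	c := b
	return iface{tab: binaryStringerItab, data: unsafe.Pointer(&c)}
}

// callString models s.String(), i.e. s.tab->fun[0](s.data).
func callString(s iface) string {
	return s.tab.fun[0](s.data)
}

func main() {
	s := toStringer(Binary(200))
	fmt.Println(s.tab.typeName, callString(s)) // Binary 11001000
}
```

Note that callString never learns what data points at; only the function stored in the table does.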

The example we're considering is an interface with just one method. An interface with more methods would have more entries in the fun list at the bottom of the itable.

Computing the Itable

Now we know what the itables look like, but where do they come from? Go's dynamic type conversions mean that it isn't reasonable for the compiler or linker to precompute all possible itables: there are too many (interface type, concrete type) pairs, and most won't be needed. Instead, the compiler generates a type description structure for each concrete type like Binary or int or func(map[int]string). Among other metadata, the type description structure contains a list of the methods implemented by that type. Similarly, the compiler generates a (different) type description structure for each interface type like Stringer; it too contains a method list. The interface runtime computes the itable by looking for each method listed in the interface type's method table in the concrete type's method table. The runtime caches the itable after generating it, so that this correspondence need only be computed once.

In our simple example, the method table for Stringer has one method, while the table for Binary has two methods. In general there might be ni methods for the interface type and nt methods for the concrete type. The obvious search to find the mapping from interface methods to concrete methods would take O(ni × nt) time, but we can do better. By sorting the two method tables and walking them simultaneously, we can build the mapping in O(ni + nt) time instead.
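The merge walk can be sketched as follows. It assumes method tables are plain slices of method names that are already sorted; buildItable and its return shape are illustrative only, since the real runtime compares richer method descriptors.

```go
package main

import (
	"fmt"
	"sort"
)

// buildItable walks two sorted method-name lists in step. For each interface
// method it records the index of the matching concrete method. It reports
// false if some interface method is missing from the concrete type.
// The walk itself takes O(ni + nt) steps because neither index moves backwards.
func buildItable(ifaceMethods, typeMethods []string) ([]int, bool) {
	table := make([]int, 0, len(ifaceMethods))
	j := 0
	for _, name := range ifaceMethods {
		// Skip concrete methods that sort before the one we need.
		for j < len(typeMethods) && typeMethods[j] < name {
			j++
		}
		if j == len(typeMethods) || typeMethods[j] != name {
			return nil, false
		}
		table = append(table, j)
		j++
	}
	return table, true
}

func main() {
	stringer := []string{"String"}
	binary := []string{"String", "Get"}
	sort.Strings(stringer)
	sort.Strings(binary) // becomes [Get String]

	table, ok := buildItable(stringer, binary)
	fmt.Println(table, ok) // [1] true
}
```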

Memory Optimizations

The space used by the implementation described above can be optimized in two complementary ways.

First, if the interface type involved is empty—it has no methods—then the itable serves no purpose except to hold the pointer to the original type. In this case, the itable can be dropped and the value can point at the type directly:

[Figure: an empty interface value whose first word points directly at the type]

Whether an interface type has methods is a static property (either the type in the source code says interface{} or it says interface{ methods... }), so the compiler knows which representation is in use at each point in the program.

Second, if the value associated with the interface value can fit in a single machine word, there's no need to introduce the indirection or the heap allocation. If we define Binary32 to be like Binary but implemented as a uint32, it could be stored in an interface value by keeping the actual value in the second word:
[Figure: an interface value holding a Binary32 directly in its second word]

Whether the actual value is being pointed at or inlined depends on the size of the type. The compiler arranges for the functions listed in the type's method table (which get copied into the itables) to do the right thing with the word that gets passed in. If the receiver type fits in a word, it is used directly; if not, it is dereferenced. The diagrams show this: in the Binary version far above, the method in the itable is (*Binary).String, while in the Binary32 example, the method in the itable is Binary32.String not (*Binary32).String.

Of course, empty interfaces holding word-sized (or smaller) values can take advantage of both optimizations:
[Figure: an empty interface value holding a word-sized value inline]

Method Lookup Performance

Smalltalk and the many dynamic systems that have followed it perform a method lookup every time a method gets called. For speed, many implementations use a simple one-entry cache at each call site, often in the instruction stream itself. In a multithreaded program, these caches must be managed carefully, since multiple threads could be at the same call site simultaneously. Even once the races have been avoided, the caches would end up being a source of memory contention.

Because Go has the hint of static typing to go along with the dynamic method lookups, it can move the lookups back from the call sites to the point when the value is stored in the interface. For example, consider this code snippet:

1   var any interface{}  // initialized elsewhere
2   s := any.(Stringer)  // dynamic conversion
3   for i := 0; i < 100; i++ {
4       fmt.Println(s.String())
5   }

In Go, the itable gets computed (or found in a cache) during the assignment on line 2; the dispatch for the s.String() call executed on line 4 is a couple of memory fetches and a single indirect call instruction.

In contrast, the implementation of this program in a dynamic language like Smalltalk (or JavaScript, or Python, or ...) would do the method lookup at line 4, which in a loop repeats needless work. The cache mentioned earlier makes this less expensive than it might be, but it's still more expensive than a single indirect call instruction.

Of course, this being a blog post, I don't have any numbers to back up this discussion, but it certainly seems like the lack of memory contention would be a big win in a heavily parallel program, as is being able to move the method lookup out of tight loops. Also, I'm talking about the general architecture, not the specifics of the implementation: the latter probably has a few constant factor optimizations still available.

More Information

The interface runtime support is in $GOROOT/src/pkg/runtime/iface.c. There's much more to say about interfaces (we haven't even seen an example of a pointer receiver yet) and the type descriptors (they power reflection in addition to the interface runtime) but those will have to wait for future posts.

Code

Supporting code (x.go):

package main

import (
 "fmt"
 "strconv"
)

type Stringer interface {
 String() string
}

type Binary uint64

func (i Binary) String() string {
 return strconv.Uitob64(i.Get(), 2)
}

func (i Binary) Get() uint64 {
 return uint64(i)
}

func main() {
 b := Binary(200)
 s := Stringer(b)
 fmt.Println(s.String())
}

Selected output of 8g -S x.go:

0045 (x.go:25) LEAL    s+-24(SP),BX
0046 (x.go:25) MOVL    4(BX),BP
0047 (x.go:25) MOVL    BP,(SP)
0048 (x.go:25) MOVL    (BX),BX
0049 (x.go:25) MOVL    20(BX),BX
0050 (x.go:25) CALL    ,BX

The LEAL loads the address of s into the register BX. (The notation n(SP) describes the word in memory at SP+n. 0(SP) can be shortened to (SP).) The next two MOVL instructions fetch the value from the second word in the interface and store it as the first function call argument, 0(SP). The final two MOVL instructions fetch the itable and then the function pointer from the itable, in preparation for calling that function.