Welcome To Golang By Example

Understanding Rune in Golang

Overview

rune in Go is an alias for int32 meaning it is an integer value. This integer value is meant to represent a Unicode Code Point. To understand rune you have to know what Unicode is. Below is short description but you can refer to the famous blog post about it –

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

What is Unicode

Unicode is a superset of ASCII characters which assigns a unique number to every character that exists. This unique number is called Unicode Code Point.For eg

Visit https://en.wikipedia.org/wiki/List_of_Unicode_characters to know about Unicode Point of other characters. But Unicode doesn’t talk about how these code points will be saved in memory. This is where utf-8 comes into picture

UTF-8

utf-8 saves every Unicode Point either using 1, 2, 3 or 4 bytes. ASCII points are stored using 1 byte. That is why rune is an alias for int32 because a Unicode Point can be of max 4 bytes in Go as source code in GO is encoded using utf-8, hence every string is also encoded in utf-8

Every rune is intended to refer to one Unicode Point. For eg if you print a string after typecasting it to a rune array then it will print the Unicode Point for each of character. For for below string “0b£” output will be – [U+0030 U+0062 U+00A3]

fmt.Printf("%U\n", []rune("0b£"))

When to Use

Declare Rune

A rune is declared using a character between single quotes like below declaring a variable named ‘rPound’

rPound := '£'

After declaring Rune you can perform below things as well

fmt.Printf("Type: %s\n", reflect.TypeOf(rPound))
fmt.Printf("Unicode CodePoint: %U\n", rPound)
fmt.Printf("Character: %c\n", r)

Code:

Below is the code illustrating each point we discussed

package main

import (
    "fmt"
    "reflect"
    "unsafe"
)

func main() {
    r := 'a'
    
    //Print Size
    fmt.Printf("Size: %d\n", unsafe.Sizeof(r))
    
    //Print Type
    fmt.Printf("Type: %s\n", reflect.TypeOf(r))
    
    //Print Code Point
    fmt.Printf("Unicode CodePoint: %U\n", r)
    
    //Print Character
    fmt.Printf("Character: %c\n", r)
    s := "0b£"
    
    //This will print the Unicode Points
    fmt.Printf("%U\n", []rune(s))
    
    //This will the decimal value of Unicode Code Point
    fmt.Println([]rune(s))
}

Output:

Size: 4
Type: int32
Unicode CodePoint: U+0061
Character: a
[U+0030 U+0062 U+00A3]
[48 98 163]

Rune array to string and vice versa

Rune array to string

package main

import "fmt"

func main() {
    runeArray := []rune{'a', 'b', '£'}
    s := string(runeArray)
    fmt.Println(s)
}

Output:

ab£

String to Rune Array

package main

import "fmt"

func main() {
    s := "ab£"
    r := []rune(s)
    fmt.Printf("%U\n", r)
}

Output:

[U+0061 U+0062 U+00A3]