Archive?

An archive file bundles one or more files, along with their metadata, into a single file that is easier to move around and store.

Tar?

According to the man pages:

GNU tar is an archiving program designed to store multiple files in a single file (an archive), and to manipulate such archives. The archive can be either a regular file or a device (e.g. a tape drive, hence the name of the program, which stands for (t)ape (ar)chiver), which can be located either on the local or on a remote machine.

The following Go program creates a tarball containing two text files:

package main

import (
    "archive/tar"
    "io"
    "os"
)

// check panics on any non-nil error.
func check(e error) {
    if e != nil {
        panic(e)
    }
}

func main() {
    // Destination file: created if it does not exist yet.
    dest, err := os.OpenFile("data.tar", os.O_RDWR|os.O_CREATE, 0666)
    check(err)
    defer dest.Close()

    // Closing the tar writer writes the archive footer.
    tw := tar.NewWriter(dest)
    defer tw.Close()

    paths := []string{
        "install.txt",
        "readme.txt",
    }

    for i := range paths {
        f, err := os.Open(paths[i])
        check(err)
        defer f.Close()

        // Stop on a failed Stat rather than silently skipping the file.
        stat, err := f.Stat()
        check(err)

        // Describe the entry: name, size, permissions, modification time.
        header := new(tar.Header)
        header.Name = stat.Name()
        header.Size = stat.Size()
        header.Mode = int64(stat.Mode())
        header.ModTime = stat.ModTime()

        err = tw.WriteHeader(header)
        check(err)

        // The file content follows its header in the archive.
        _, err = io.Copy(tw, f)
        check(err)
    }
}


Error?

func check(e error) {
    if e != nil {
        panic(e)
    }
}

The idiomatic way to handle errors in Go is to compare the returned error to nil. A nil value means that no error occurred; a non-nil value means that something went wrong. The check function wraps that comparison: if an error occurred, it calls panic, which prints the error message and the stack trace, then stops the program.


Where do we want to create our tarball?

     dest, err := os.OpenFile("data.tar", os.O_RDWR|os.O_CREATE, 0666)
     check(err)
     defer dest.Close()

First, we open a file in read-write mode; it is created if it doesn't exist yet. This is the destination file, where the Writer will write everything. The defer statement makes sure dest is closed when main returns.

Who will be writing to the destination?

     tw := tar.NewWriter(dest)
     defer tw.Close()

io.Writer is an interface with a single method, Write. The tar.NewWriter function takes an argument that must implement io.Writer. Nice, because os.OpenFile returns a *os.File, and *os.File implements io.Writer. The deferred tw.Close() matters too: it flushes any pending data and writes the archive footer.

What are the files we want inside our tarball?

     paths := []string{
         "install.txt",
         "readme.txt",
     }

paths is a slice of strings. Each string names a file we want inside our tarball; we assume those files exist in the current working directory. The expression on the right-hand side of the statement above is called a slice literal.

How do we add the files to the tarball?

Do we really need to comment what's going on?

     for i := range paths {
         f, err := os.Open(paths[i])
         check(err)
         defer f.Close()

         // Stop on a failed Stat rather than silently skipping the file.
         stat, err := f.Stat()
         check(err)

         header := new(tar.Header)
         header.Name = stat.Name()
         header.Size = stat.Size()
         header.Mode = int64(stat.Mode())
         header.ModTime = stat.ModTime()

         err = tw.WriteHeader(header)
         check(err)

         _, err = io.Copy(tw, f)
         check(err)
    }


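For each file we want to add to the tarball:

  • open the file and defer its closing;
  • ask for its metadata with Stat;
  • build a tar.Header from its name, size, mode and modification time;
  • write that header to the archive with WriteHeader;
  • copy the file content into the archive with io.Copy.

To understand the for loop above, I hope the following picture is worth a thousand words:

[Figure: Tar format]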

Going further?

The tar file format has no native data compression, so tar archives are often compressed with an external utility such as gzip, bzip2 or xz to reduce their size.

The following slightly modified program creates the archive and compresses it on the fly with gzip (GNU zip):

package main

import (
    "archive/tar"
    "compress/gzip"
    "io"
    "os"
)

// check panics on any non-nil error.
func check(e error) {
    if e != nil {
        panic(e)
    }
}

func main() {
    dest, err := os.OpenFile("data.tar.gz", os.O_RDWR|os.O_CREATE, 0666)
    check(err)
    defer dest.Close()

    // Everything written to gw is compressed, then written to dest.
    gw := gzip.NewWriter(dest)
    defer gw.Close()

    // Deferred calls run in reverse order: tw is closed first, then gw, then dest.
    tw := tar.NewWriter(gw)
    defer tw.Close()

    paths := []string{
        "install.txt",
        "readme.txt",
    }

    for i := range paths {
        f, err := os.Open(paths[i])
        check(err)
        defer f.Close()

        // Stop on a failed Stat rather than silently skipping the file.
        stat, err := f.Stat()
        check(err)

        header := new(tar.Header)
        header.Name = stat.Name()
        header.Size = stat.Size()
        header.Mode = int64(stat.Mode())
        header.ModTime = stat.ModTime()

        err = tw.WriteHeader(header)
        check(err)

        _, err = io.Copy(tw, f)
        check(err)
    }
}


Double check?

We can see that the uncompressed archive is larger than the compressed one:

-rw-rw-r-- 1 nsukami nsukami 1,2K août  11 06:28 create_tar.go
-rw-rw-r-- 1 nsukami nsukami 3,0K août  11 06:51 data.tar
-rw-rw-r-- 1 nsukami nsukami  168 août  11 12:38 data.tar.gz
-rw-rw-r-- 1 nsukami nsukami   26 août  10 12:32 install.txt
-rw-rw-r-- 1 nsukami nsukami   24 août  11 06:25 readme.txt

We can open a terminal and type the following commands to list the contents of our archives:

~/go/foo/bar
 tar -tf data.tar
install.txt
readme.txt

~/go/foo/bar
 tar -tf data.tar.gz
install.txt
readme.txt
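The same check can be done from Go with tar.NewReader. The sketch below is a rough equivalent of tar -tf for a gzip-compressed archive; listTarGz and sampleTarGz are hypothetical helpers, and the sample archive is built in memory with empty entries so that the code runs without any file on disk.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// listTarGz returns the entry names found in a gzip-compressed tar stream.
func listTarGz(r io.Reader) ([]string, error) {
	gr, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gr.Close()

	var names []string
	tr := tar.NewReader(gr)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return names, nil // end of archive
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// sampleTarGz builds a small in-memory archive whose entries are empty.
func sampleTarGz(names ...string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	gw := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gw)
	for _, name := range names {
		if err := tw.WriteHeader(&tar.Header{Name: name, Mode: 0644}); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	buf, err := sampleTarGz("install.txt", "readme.txt")
	if err != nil {
		panic(err)
	}
	names, err := listTarGz(buf)
	if err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

To list a real archive, pass the *os.File returned by os.Open("data.tar.gz") to listTarGz instead of the in-memory buffer.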


godoc vs go doc? That is not the question:

According to the godoc documentation:

Godoc extracts and generates documentation for Go programs.

It has two modes.

Without the -http flag, it runs in command-line mode and prints plain text documentation to standard output and exits. If both a library package and a command with the same name exists, using the prefix cmd/ will force documentation on the command rather than the library package.

According to the go doc documentation:

go doc prints the documentation comments associated with the item identified by its arguments (a package, const, func, type, var, method, or struct field) followed by a one-line summary of each of the first-level items "under" that item (package-level declarations for a package, methods for a type, etc.).

The difference between the two is clearly and nicely summarized here

My first feelings:

  • $ godoc -http=:6060 to browse the documentation locally, nice! \o/
  • The recommended directory layout is strange, I won't deny!


Conclusion?

This is my first serious experiment with Golang. I think I want to learn a little bit more.


Unexpected Quote:

"One is never afraid of the unknown; one is afraid of the known coming to an end." ― Jiddu Krishnamurti