Shuhei Kagawa

2017 in Review

Dec 25, 2017


I moved to Berlin from Tokyo at the end of September 2016, so 2017 was more or less my first year in Berlin.

I like the city so far. It is more relaxed than Tokyo and other big cities in Europe. Summer is especially nice, and BBQ makes it even better. Since my office moved to a building in front of the Spree River, I have enjoyed my commute, crossing the Oberbaum Bridge and walking along the river.


I traveled more than ever. The destinations were Germany (Dresden, Heidelberg, Frankfurt, Köln), Italy (Venice, Florence, Bologna), France (Paris), the UK (London), Portugal (Lisbon) and Japan (Tokyo). I had fun in each of them, but if I had to choose one, I would pick Lisbon. The city is full of what I miss in Berlin: fresh and inexpensive seafood, views from hills, cute ceramic tiles, and beautiful weather. The sky was clear every single day while I was there, and the highest temperature was 18 degrees in December!


I am glad to have found Fuerst Wiacek. Their German Movies is my No.1 beer so far. Biererei is a gem in Berlin, where I can buy fresh craft beers from Europe in growlers.

British ale was a discovery to me. I liked pubs in London a lot. I also attended the first craft beer hackathon in the world and won 12 crates of craft beer...!


I bought an ergonomic keyboard and a neck support pillow. Both of them improved my quality of life. My body is getting older.

Language Learning

I learned a bit of German. I finished A1 in May and started A2 after a pause of 5 months. Although progress is slow, German now feels less cryptic to me.


I was lucky to join an awesome team. We work together and hang out together. Research at Google shows that psychological safety is a key to team effectiveness. I feel it on my team.

On the technical side, my team joined a relatively large project and completed it on time. I worked mostly in architecture, performance optimization, type checking with Flow, SRE, etc. for apps with React and Node.js. I also helped my colleagues to start building an internal tool with Elm.

Side Projects

I enjoyed working with Elm. I wrote a mobile weather app, flew to Paris for Elm Europe 2017, built a mobile-friendly pixel editor and talked about it at Elm Berlin Meetup. I also helped with an experiment on the compiler side, in Haskell, although it is still pending.

I didn't do much with JavaScript for side projects but wrote a tiny library for server-side rendering with tagged template literals while hanging out with friends at a cafe. It's used in the choo ecosystem now.

Aside from building things, I learned monad transformers, etc. from the Haskell Book and machine learning with neural networks from the Deep Learning Specialization on Coursera.


All in all, I lived a year in a new country and enjoyed it. I have settled down, and now I feel prepared for new challenges next year. Let's see what is going to happen!

Getting Memory Usage in Linux and Docker

May 28, 2017 - Linux, Docker

Recently I started monitoring a Node.js app that we have been developing at work. After a while, I found that its memory usage percentage was growing slowly, by about 20% in 3 days. The memory usage was measured with the following Node.js code.

const os = require('os');

const total = os.totalmem();
const free = os.freemem();
// Used memory as a percentage of total memory.
const usage = (total - free) / total * 100;

So the numbers basically come from the OS, which was Alpine Linux on Docker in this case. Luckily I had also recorded the memory usage of the application processes, but it was not increasing. Then why was the OS memory usage increasing?

Buffers and Cached Memory

I used the top command with Shift+m (sort by memory usage) and compared processes on a long-running server with ones on a newly deployed server. Processes on each side were almost the same. The only difference was that buffers and cached Mem were high on the long-running one.

After some research, or googling, I concluded that it was not a problem. Most of the buffers and cached Mem is released when application processes claim more memory.

In fact, the free -m command has a row (-/+ buffers/cache) that shows used and free with buffers and cached taken into account.

$ free -m
             total  used  free  shared  buffers cached
Mem:          3950   285  3665     183       12    188
-/+ buffers/cache:    84  3866
Swap:         1896     0  1896

So, what are they actually? According to the manual of /proc/meminfo, which is a pseudo file and the data source of free, top and friends:

Buffers %lu
       Relatively temporary storage for raw disk blocks that
       shouldn't get tremendously large (20MB or so).

Cached %lu
       In-memory cache for files read from the disk (the page
       cache).  Doesn't include SwapCached.

I am still not sure what exactly Buffers contains, but it contains metadata of files, etc. and it's relatively trivial in size. Cached contains cached file contents, which are called page cache. OS keeps page cache while RAM has enough free space. That was why the memory usage was increasing even when processes were not leaking memory.

If you are interested, What is the difference between Buffers and Cached columns in /proc/meminfo output? on Quora has more details about Buffers and Cached.


So, should we use free + buffers + cached? /proc/meminfo has an even better metric called MemAvailable.

MemAvailable %lu (since Linux 3.14)
       An estimate of how much memory is available for
       starting new applications, without swapping.

$ cat /proc/meminfo
MemTotal:        4045572 kB
MemFree:         3753648 kB
MemAvailable:    3684028 kB
Buffers:           13048 kB
Cached:           193336 kB

Its background is explained well in the Linux kernel commit that introduced it. Essentially, it excludes the part of the page cache that cannot be freed and includes reclaimable slab memory. The current implementation in Linux v4.12-rc2 still looks almost the same.
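As a rough sketch of how this could replace the os.freemem() calculation in Node.js, the code below parses the text of /proc/meminfo and prefers MemAvailable, falling back to MemFree + Buffers + Cached on kernels that don't report it. The function names (parseMeminfo, availableKb, memoryUsagePercent, readMemoryUsagePercent) are made up for this illustration.

```javascript
const fs = require('fs');

// Parse the text of /proc/meminfo into an object of kB values,
// e.g. { MemTotal: 4045572, MemFree: 3753648, ... }.
// Lines whose key is not a plain word, such as "Active(anon)", are skipped.
function parseMeminfo(text) {
  const info = {};
  for (const line of text.split('\n')) {
    const match = /^(\w+):\s+(\d+)/.exec(line);
    if (match) {
      info[match[1]] = Number(match[2]);
    }
  }
  return info;
}

// Prefer MemAvailable. Without it (Linux older than 3.14), fall back to the
// rougher MemFree + Buffers + Cached estimate.
function availableKb(info) {
  if (typeof info.MemAvailable === 'number') {
    return info.MemAvailable;
  }
  return info.MemFree + info.Buffers + info.Cached;
}

// Hypothetical helper: percentage of total memory that is not available.
function memoryUsagePercent(info) {
  return (info.MemTotal - availableKb(info)) / info.MemTotal * 100;
}

// Linux only: reads the real pseudo file.
function readMemoryUsagePercent() {
  const text = fs.readFileSync('/proc/meminfo', 'utf8');
  return memoryUsagePercent(parseMeminfo(text));
}
```

Calling readMemoryUsagePercent() on the machine from the sample output above would report a bit under 9%, while the os.freemem() based calculation counts the page cache as used.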

Some implementations of free -m have an available column. For example, on Boot2Docker:

$ free -m
       total  used  free  shared  buff/cache  available
Mem:    3950    59  3665     183         226       3597
Swap:   1896     0  1896

It is also available as an AWS CloudWatch metric via the --mem-avail flag.

Some background about Docker

Another question I had was "Are those metrics the same in Docker?". Before diving into this question, let's check how Docker works.

According to Docker Overview: The Underlying Technology, processes in a Docker container directly run in their host OS without any virtualization, but they are isolated from the host OS and other containers in effect thanks to these Linux kernel features:

  • namespaces: Isolate PIDs, hostnames, user IDs, network accesses, IPC, etc.
  • cgroups: Limit resource usage
  • UnionFS: Isolate file system

Because of the namespaces, the ps command on the host OS lists processes of Docker containers in addition to the host's own processes, while ps inside a Docker container cannot list processes of the host OS or other containers.

By default, Docker containers have no resource constraints. So, if you run one container on a host and don't limit the container's resource usage, which is my case, the container's "free memory" is the same as the host OS's "free memory".

Memory Metrics on Docker Container

If you want to monitor a Docker container's memory usage from outside of the container, it's easy. You can use docker stats.

$ docker stats
CONTAINER     CPU %  MEM USAGE / LIMIT  MEM %  NET I/O     BLOCK I/O  PIDS
fc015f31d9d1  0.00%  220KiB / 3.858GiB  0.01%  1.3kB / 0B  0B / 0B    2

But if you want to get the memory usage from inside the container or get more detailed metrics, it gets complicated. Memory inside Linux containers describes the difficulties in detail.

/proc/meminfo and sysinfo, which is used by os.totalmem() and os.freemem() of Node.js, are not isolated, so you get metrics of the host OS if you use normal utilities like top and free in a Docker container.

To get metrics specific to your Docker container, you can check pseudo files in /sys/fs/cgroup/memory/. According to Memory inside Linux containers, they are not standardized though.

$ cat /sys/fs/cgroup/memory/memory.usage_in_bytes
$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes

memory.limit_in_bytes returns a very big number if there is no limit. In that case, you can find the host OS's total memory with /proc/meminfo or commands that use it.
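The following is a minimal sketch of reading those two pseudo files from Node.js, under the assumption that the container exposes the /sys/fs/cgroup/memory/ layout shown above. The function names are hypothetical. When the limit is larger than the host's total memory, the sketch treats the container as unlimited and falls back to os.totalmem().

```javascript
const fs = require('fs');
const os = require('os');

// Directory used in this article. Other setups may expose different paths.
const CGROUP_DIR = '/sys/fs/cgroup/memory';

// Pure calculation. If the cgroup limit is not below the host's total memory,
// treat the container as unlimited and measure against the host total.
function containerMemoryUsage(usageBytes, limitBytes, hostTotalBytes) {
  const effectiveLimit = Math.min(limitBytes, hostTotalBytes);
  return {
    usageBytes,
    limitBytes: effectiveLimit,
    limited: limitBytes < hostTotalBytes,
    percent: usageBytes / effectiveLimit * 100,
  };
}

// The "no limit" value is bigger than Number.MAX_SAFE_INTEGER, so it loses
// precision here, but it is only compared against the host total.
function readNumber(file) {
  return Number(fs.readFileSync(file, 'utf8').trim());
}

// Linux only, and only where the pseudo files above exist.
function readContainerMemoryUsage() {
  return containerMemoryUsage(
    readNumber(`${CGROUP_DIR}/memory.usage_in_bytes`),
    readNumber(`${CGROUP_DIR}/memory.limit_in_bytes`),
    os.totalmem()
  );
}
```

Keeping the calculation separate from the file reads makes it easy to check the "very big number" case without running inside a container.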


It was a longer journey than I initially thought. My takeaways are:

  • Available Memory > Free Memory
  • Use MemAvailable if available (pun intended)
  • Processes in a Docker container run directly in host OS
  • Understand what you are measuring exactly, especially in a Docker container

HTTP request timeouts in JavaScript

May 14, 2017 - JavaScript

These days I have been working on a Node.js front-end server that calls back-end APIs and renders HTML with React components. In this microservices setup, I am making sure that the server doesn't become too slow even when its dependencies have problems. So I need to set timeouts on the API calls so that the server can give up on non-essential dependencies quickly and fail fast when essential dependencies are out of order.

As I started looking at timeout options carefully, I quickly found that there were many different kinds of timeouts even within the narrow field of HTTP requests in JavaScript.

Node.js "http"/"https"

Let's start with the standard library of Node.js. http and https provide request() function, which makes HTTP requests.

Timeouts on http.request()

http.request() takes a timeout option. Its documentation says:

timeout <number>: A number specifying the socket timeout in milliseconds. This will set the timeout before the socket is connected.

So what does it actually do? It internally calls net.createConnection() with its timeout option, which eventually calls socket.setTimeout() before the socket starts connecting.

There is also http.ClientRequest.setTimeout(). Its documentation says:

Once a socket is assigned to this request and is connected socket.setTimeout() will be called.

So this also calls socket.setTimeout().

Neither of them closes the connection when the socket times out. They only emit a timeout event.
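Since the timeout event is only a notification, the caller has to abort the request. Here is a minimal sketch, assuming a Node.js version where http.get() accepts (url, options, callback) and ClientRequest has destroy(). The helper name getWithIdleTimeout and the error message are made up for this example.

```javascript
const http = require('http');

// Hypothetical helper: GET a URL and give up when the socket has been idle
// for idleMs. Calls callback(err, body) exactly once.
function getWithIdleTimeout(url, idleMs, callback) {
  let finished = false;
  const done = (err, body) => {
    if (finished) return;
    finished = true;
    callback(err, body);
  };

  const req = http.get(url, { timeout: idleMs }, (res) => {
    const chunks = [];
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('end', () => done(null, Buffer.concat(chunks).toString()));
    res.on('error', done);
  });

  // The 'timeout' event does not close anything by itself,
  // so destroy the request explicitly.
  req.on('timeout', () => {
    req.destroy(new Error('socket idle timeout'));
  });
  req.on('error', done);
  return req;
}
```

Destroying the request with an error makes it emit an error event, so a timeout reaches the callback through the same path as a network failure.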

So, what does socket.setTimeout() do? Let's check.


The documentation says:

Sets the socket to timeout after timeout milliseconds of inactivity on the socket. By default net.Socket does not have a timeout.

OK, but what exactly does "inactivity on the socket" mean? On the happy path, a TCP socket goes through the following steps:

  1. Start connecting
  2. DNS lookup is done: lookup event (Doesn't happen in HTTP Keep-Alive)
  3. Connection is made: connect event (Doesn't happen in HTTP Keep-Alive)
  4. Read data or write data

When you call socket.setTimeout(), a timeout timer is created, and it is restarted before connecting, after lookup, after connect and after each data read and write. So the timeout event is emitted in one of the following cases:

  • DNS lookup doesn't finish within the given timeout
  • TCP connection is not made within the given timeout after DNS lookup
  • No data is read or written within the given timeout after the connection or the previous read or write

This might be a bit counter-intuitive. Let's say you called socket.setTimeout(300) to set the timeout to 300 ms, and it took 100 ms for the DNS lookup, 100 ms to connect to the remote server, 200 ms for the remote server to send response headers, 50 ms to transfer the first half of the response body and another 50 ms for the rest. Although the entire request and response took 500 ms in total, well over the 300 ms timeout, the timeout event is not emitted at all.

Because the timeout timer is restarted in each step, timeout happens only when a step is not completed in the given time.
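The arithmetic of that example can be checked with a small simulation. This is a simplified model, not how Node.js implements the timer. It assumes each step is one uninterrupted idle period, and firstTimedOutStep is a hypothetical name.

```javascript
// Returns the index of the first step that stays idle longer than the
// timeout, or -1 if no single step does.
function firstTimedOutStep(timeoutMs, stepDurationsMs) {
  return stepDurationsMs.findIndex((duration) => duration > timeoutMs);
}

// DNS lookup, connect, response headers, first half of body, rest of body
const steps = [100, 100, 200, 50, 50];
const total = steps.reduce((sum, ms) => sum + ms, 0);

console.log(total);                         // 500
console.log(firstTimedOutStep(300, steps)); // -1: no step is idle for over 300 ms
console.log(firstTimedOutStep(150, steps)); // 2: waiting for the response headers
```

With a 300 ms timeout nothing fires even though the total is 500 ms. Only a timeout shorter than the slowest single step, such as 150 ms, would trigger the event.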

Then what happens if timeouts happen in all of the steps? As far as I tried, timeout event is triggered only once.

Another concern is HTTP Keep-Alive, which reuses a socket for multiple HTTP requests. What happens if you set a timeout for a socket and the socket is reused for another HTTP request? Don't worry. A timeout set in one HTTP request does not affect subsequent HTTP requests because the timeout is cleaned up when the socket is kept alive.

HTTP Keep-Alive & TCP Keep-Alive

This is not directly related to timeouts, but I found the Keep-Alive options in http/https a bit confusing. They mix HTTP Keep-Alive and TCP Keep-Alive, which are completely different things but coincidentally have the same name. For example, the options of the http.Agent constructor have keepAlive for HTTP Keep-Alive and keepAliveMsecs for TCP Keep-Alive.

So, how are they different?

  • HTTP Keep-Alive reuses a TCP connection for multiple HTTP requests. It saves the TCP connection overhead such as DNS lookup and TCP slow start.
  • TCP Keep-Alive periodically sends probe packets to detect and close dead connections, and it is normally handled by the OS.
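For reference, this is how the two options sit side by side on an agent. It is a small sketch, and the host, port and path in the commented usage are placeholders.

```javascript
const http = require('http');

const agent = new http.Agent({
  keepAlive: true,      // HTTP Keep-Alive: keep sockets around for reuse
  keepAliveMsecs: 1000, // TCP Keep-Alive: initial delay before probe packets
});

// Hypothetical usage: pass the agent in the request options.
// http.get({ host: 'localhost', port: 8080, path: '/', agent }, (res) => {});
```

keepAliveMsecs only takes effect when keepAlive is true, which is part of why the two are easy to confuse.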


http/https use socket.setTimeout(), whose timer is restarted at each stage of the socket lifecycle. It doesn't ensure a timeout for the overall request and response. If you want to make sure that a request either completes within a specific time or fails, you need to prepare your own timeout solution.

Third-party modules

"request" module

request is a very popular HTTP request library that supports many convenient features on top of http/https module. Its README says:

timeout - Integer containing the number of milliseconds to wait for a server to send response headers (and start the response body) before aborting the request.

However, as far as I checked the implementation, timeout is not applied to the timing of response headers as of v2.81.1.

Currently this module emits two types of timeout errors:

  • ESOCKETTIMEDOUT: Emitted from http.ClientRequest.setTimeout() described above, which uses socket.setTimeout().
  • ETIMEDOUT: Emitted when a connection is not established in the given timeout. It was applied to the timing of response headers before v2.76.0.

There is a GitHub issue for it, but I'm not sure if it's intended and the README is outdated, or it's a bug.

By the way, request provides a useful timing measurement feature that you can enable with the time option. It will help you to choose a proper timeout value.

"axios" module

axios is another popular library that uses Promise. Its timeout option behaves the way request's README describes: it times out if the response status code and headers don't arrive within the given time.

Browser APIs

While my initial interest was server-side HTTP requests, I became curious about browser APIs as I was investigating Node.js options.


XMLHttpRequest.timeout aborts a request after the given timeout and calls ontimeout event listeners. The documentation does not explain the exact timing, but I guess that it is until readyState === 4, which means that the entire response body has arrived.


As far as I can tell from fetch()'s documentation on MDN, it does not have any way to specify a timeout. So we need to handle it ourselves. We can do that easily using Promise.race().

function withTimeout(msecs, promise) {
  // Rejects after msecs. It never resolves on its own.
  const timeout = new Promise((resolve, reject) => {
    setTimeout(() => {
      reject(new Error('timeout'));
    }, msecs);
  });
  // Settles with whichever promise settles first.
  return Promise.race([timeout, promise]);
}

withTimeout(1000, fetch(''))

This kind of external approach works with any HTTP client and times out for the overall request and response. However, it does not abort the underlying HTTP request, while the timeouts described earlier actually abort HTTP requests and save some resources.
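In environments that provide AbortController and a fetch() that accepts a signal option (an assumption here, since both arrived after this article was written), the external timer can also abort the request. Below is a minimal sketch. fetchWithTimeout is a hypothetical name, and the fetchImpl parameter exists only so that the function can be tried with a stub.

```javascript
// Assumes global fetch and AbortController (modern browsers, recent Node.js).
function fetchWithTimeout(url, msecs, fetchImpl = fetch) {
  const controller = new AbortController();
  // Abort the underlying request when the timer fires.
  const timer = setTimeout(() => controller.abort(), msecs);
  return fetchImpl(url, { signal: controller.signal })
    // Clear the timer whether the fetch promise resolves or rejects.
    .finally(() => clearTimeout(timer));
}
```

Note that the timer is cleared as soon as the fetch() promise settles, which happens when the response headers arrive. Reading a slow response body is not covered by this sketch.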


Most of the HTTP request APIs in JavaScript don't offer a timeout mechanism for the overall request and response. If you want to limit the maximum processing time for your piece of code, you have to prepare your own timeout solution. However, if your solution relies on a high-level abstraction like Promise and cannot abort the underlying TCP socket and HTTP request on timeout, it is a good idea to also use an existing low-level timeout mechanism like socket.setTimeout() to save some resources.