---
layout: post
title: "Jare.io, an Instant and Free CDN"
date: 2016-03-30
place: Palo Alto, CA
tags: pets
description: |
If you want your images, script, or CSS files
be available via a CDN, use jare.io, an instant
and free service.
keywords:
- CDN
- simple CDN
- quick CDN
- fast CDN
- free CDN
social:
- hackernews: https://news.ycombinator.com/item?id=11394981
---
{% badge https://www.jare.io/images/logo.svg 92 https://www.jare.io %}
CDN stands for a Content Delivery Network. Technically, it is a bunch
of servers located in different countries and continents. You give them
your `logo.gif` and they give you a URL, which resolves to a different
server depending on who is trying to resolve it. As a result, the
file is always close to the end-user and your website loads much faster
than without a CDN. Sounds good, but all CDN providers want money
for their service and usually a rather complex setup and registration
procedure. My pet project [jare.io](https://www.jare.io) is a free
CDN that is simple to configure. It utilizes AWS CloudFront resources.
First, let me show how it works and then, if you're interested in
the details, I will explain how it's done internally. Say you have
this HTML:
```html
```
I want this `logo.svg` to be delivered via a CDN. There are two steps.
First, I register my domain at [jare.io](https://www.jare.io):
{% figure /images/2016/03/jare-1.png 600 %}
Second, I change my HTML:
```html
```
That's it.
Try it with your own resources and you will see how
much faster they will be loaded.
It's absolutely free, but I ask you to be reasonable. If your traffic
is huge, you need your own account in CloudFront or somewhere else. My
service is for small projects.
Now for more technical details, if you want to know how technically
this solution works. First, let's discuss what CDN is and how it works.
## URL, DNS, TCP, HTTP
When your browser wants to load an image, it has a [URL](https://en.wikipedia.org/wiki/Uniform_Resource_Locator)
for that, like in the example above. This is the URL: `https://www.teamed.io/image/logo.svg`.
There are three important parts in this address. First is `http`, the
[protocol](https://en.wikipedia.org/wiki/Communications_protocol).
Second is `www.teamed.io`, the [host](https://en.wikipedia.org/wiki/Host_%28network%29) name,
and the tail `/images/logo.svg`, which is the
[path](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax).
To load the image, the browser has to open a
[socket](https://en.wikipedia.org/wiki/Network_socket),
connecting your computer and the server, which has the image. To open a
socket, the browser needs to know the
[IP address](https://en.wikipedia.org/wiki/IP_address) of the server.
There is no such address in that URL. In order to find the IP address,
the browser is doing what is called a lookup. It connects to the
nearest [name server](https://en.wikipedia.org/wiki/Name_server)
and asks "what is the IP address of www.teamed.io?"
The answer usually contains a single IP address:
```text
$ nslookup www.teamed.io
Server: 172.16.0.1
Address: 172.16.0.1#53
Non-authoritative answer:
www.teamed.io canonical name = teamed.github.io.
teamed.github.io canonical name = github.map.fastly.net.
Name: github.map.fastly.net
Address: 199.27.79.133
```
IP address of `www.teamed.io` is `199.27.79.133`, at the time of writing.
When the address is known, the browser opens a new socket and sends
an [HTTP request](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_message)
through it:
```text
GET /images/logo.svg HTTP/1.1
Host: www.teamed.io
Accept: image/*
```
The server responds with an
[HTTP response](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Response_message):
```text
HTTP/1.1 200 OK
Content-Type: image/svg+xml
[SVG image content goes here, over 1000 bytes]
```
That is the [SVG](https://en.wikipedia.org/wiki/Scalable_Vector_Graphics)
image we're looking for. The browser renders it on the web page
and that's it.
## The Network of Edge Servers
So far so good, but if the distance between your browser and that IP address
is rather large, loading the image will take a lot of time. Well, hundreds
of milliseconds. Try to load this image, which is located on a server
that is hosted in Prague, Czech Republic (I'm using `curl` as suggested
[here](https://josephscott.org/archives/2011/10/timing-details-with-curl/)):
```text
$ curl -w "@f.txt" -o /dev/null -s \
https://www.vlada.cz/images/vlada/vlada-ceske-republiky_en.gif
time_namelookup: 0.005
time_connect: 0.376
time_pretransfer: 0.377
time_starttransfer: 0.566
----------
time_total: 0.567
```
I'm trying to do it from Palo Alto, California, which is about half a globe
away from Prague. As you can see, it takes over 500ms. That's too much, especially
if a web page contains many images. Overall, page loading may take seconds,
just because the server is too far away from me. Well, it will inevitably be
too far away from some users, no matter where we host it. If we host it here
in California, it will be close enough to me and the image will be loaded
instantly (less than 50ms). But then it will be too slow for users in Prague.
This problem has no solutions if the server generates images or pages
on the fly in some unique way and if we can't install a number of servers
in different countries and continents. But in most cases, such as our logo example, this is not a problem.
This logo doesn't need to be unique for
each user. It is a very _static_ resource, which needs to be created only once
and be delivered to everybody, without any changes.
So, how about we install a server somewhere here in California and let
Californian users connect to it. When a request for `logo.gif` comes
to one of the _edge_ servers, it will connect to the central server
in Prague and load the file. This will happen only once. After that, the
edge server will not request the file
from the central server. It will return it immediately, from its internal cache.
We need to have many edge servers, preferably in all countries where
our users may be located. The first request will take longer, but all others
will be much faster because they will be served from the closest edge server.
Now, the question is how the browser will know which edge server is the closest, right?
We simply trick the domain name resolution process. Depending on who is asking,
the DNS will give different answers. Let's take `cf.jare.io`, for example (it
is the name of all edge servers responsible for delivering our content
in AWS CloudFront, a CNAME for `djk1be5eatcae.cloudfront.net`).
If I'm looking it up from California, I'm getting the following answer:
```text
$ nslookup cf.jare.io
Server: 192.168.1.1
Address: 192.168.1.1#53
Non-authoritative answer:
cf.jare.io canonical name = djk1be5eatcae.cloudfront.net.
Name: djk1be5eatcae.cloudfront.net
Address: 54.230.141.211
```
An edge server with IP address `54.230.141.211` is located in
[San Francisco](https://db-ip.com/54.230.141.211). This
is rather close to me, less than fifty miles. If I do the same operation
from a server in Virginia, I get a different response:
```text
$ nslookup cf.jare.io
Server: 172.16.0.23
Address: 172.16.0.23#53
Non-authoritative answer:
cf.jare.io canonical name = djk1be5eatcae.cloudfront.net.
Name: djk1be5eatcae.cloudfront.net
Address: 52.85.131.217
```
An edge server with IP address `52.85.131.217` is located in
[Washington](https://db-ip.com/52.85.131.217), which is far away from
me, but very close to the server I was making the lookup from.
There are thousands of name servers around the world and all of them
have different information about where that edge server `cf.jare.io`
is physically located. Depending on who is asking, the answer will be different.
## AWS CloudFront
[CloudFront](https://aws.amazon.com/cloudfront/) is one of the simplest
CDN solutions. All you have to do to start delivering your content
through their edge nodes is to create a "distribution" and configure it.
A distribution is basically a connector between content origin and
edge servers:
{% plantuml style="width:75%" %}
skinparam componentStyle uml2
Browser -right-> [Edge]
[Edge] -right-> [Central]
[Central] -right-> [Origin]
{% endplantuml %}
One of edge servers receives an HTTP request. If it already has that
`logo.svg` in its cache, it immediately returns an HTTP response with its
content inside. If its cache is empty, the edge server makes an HTTP
request to the central server. This server knows about the "distribution"
and its configuration. It makes an HTTP connection to the origin server,
which is `www.teamed.io` and asks it to return `logo.svg`. When done,
the image is returned to the edge server, where it is cached.
This looks rather simple, but it's not free and it's not that quick to configure.
You have to create an account with CloudFront, register your credit card
there, and get an approval. Then you have to create a distribution and configure it.
You should then create that CNAME in your name server. If you're doing it
for a single website, it's not a big deal. If you have a dozen websites,
it's a time consuming operation.
## Jare.io, a Middle Man
Jare.io is an extra component in that diagram, which makes your
life easier:
{% plantuml style="width:75%" %}
skinparam componentStyle uml2
Browser -right-> [Edge]
[Edge] -right-> [Central]
[Central] -down-> [Relay]
[Relay] -right-> [Origin]
{% endplantuml %}
Jare.io has a "relay," which acts as an origin server for CloudFront. All
requests that arrive to `cf.jare.io` are dispatched to the relay. The relay
decides what to do with them. The decision is based on the information
from the HTTP request URI. For example, the request from the browser
has this URI path:
```text
/?u=https://www.teamed.io/images/logo.svg
```
Remember, the request is made to `cf.jare.io`, which is the address of
the edge server. This exact URI arrives at `relay.jare.io`. The URI contains
enough information to make a decision about which file has to be returned.
The relay makes a new HTTP request to `www.teamed.io` and retrieves the image.
The beauty of this solution is that it's easy. For small websites,
it is a free and quick CDN.
By the way, when we query the same image through jare.io (and CloudFront),
it comes back much faster:
```text
$ curl -w "@f.txt" -o /dev/null -s \
http://cf.jare.io/?u=www.vlada.cz/images/vlada/vlada-ceske-republiky_en.gif
time_namelookup: 0.005
time_connect: 0.021
time_pretransfer: 0.021
time_starttransfer: 0.041
----------
time_total: 0.041
```
Most of the work is done by AWS CloudFront, while jare.io is just a
relay that makes its configuration more convenient. Besides, it makes
it free, because jare.io is sponsored by
[Zerocracy](https://www.zerocracy.com). In other words, my company will
pay for your usage of CloudFront. I would appreciate if you kept that in
mind and didn't use jare.io for traffic-intensive resources.