ep-tcp-file-system

Example Project for a Go File System server that broadcasts to Android clients.

View the Project on GitHub tobiasbriones/ep-tcp-file-system

TCP/IP Protocol

This page gives a general overview of TCP networking applications to refresh the basic networking concepts used in this project.

Introduction

Computer applications need to send data to other computers so that the information can be turned into something useful at the other end of the communication. For this to be possible, it’s key to understand the standards that govern the internet, a.k.a. the biggest network, or the network of networks.

The relevant standard here is TCP/IP, which allows data streams to be sent from one computer or device to another over the internet.

Many considerations must be taken into account to carry out these designs in real life, such as the 7-layer network architecture [1] and error detection and correction [2].

We should also understand how data is packetized, encoded, and transmitted. For example, we can design a protocol that appends a line break \n to each message, so the receiver can split the incoming data into tokens delimited by the line feed character.

Protocol Definition

The TCP/IP protocols are fundamental for transferring data over the internet. These are detailed below.

TCP

TCP stands for Transmission Control Protocol, and it is the standard that makes it possible to send large amounts of data reliably over the internet. Its specification is open, so any programmer can implement it, and it underlies much of the data sent over the network.

TCP works together with IP to transport data over the network.

Data is split into fragments called packets. These packets are transmitted over the network and reassembled at the receiver to recreate the original information.

Packets travel over different media and routes, some faster or shorter than others, and they can be followed with a technique called packet tracing. Packet tracing is commonly practiced in networking courses, for example with the Cisco Packet Tracer software.

Be careful: TCP works with byte streams, so one write by the sender does not necessarily correspond to one read by the receiver.

Handshaking

A handshake is a signal exchanged between two computers to establish a contract. For example, we can start the communication with one handshake and, once the data transaction is finished, end it with another.

Handshakes are commonly used in WebSocket connections. They matter there because they avoid the overhead of sending a separate HTTP request for each message: the overhead is paid only at certain points of the communication (the beginning and the end).

IP

IP stands for Internet Protocol, and it is the standard for delivering those packets to the correct destination address. IPv4 is still widely used, but IPv6 is the future, as it allows a practically unlimited number of addresses.

If we use IPv6, we won’t have to give private addresses to local devices with DHCP; each device can have its own IP address instead. An IPv4 address, by contrast, consists of only four octets (one byte each), written in the form xxx.xxx.xxx.xxx. This is the receiver’s address, which IP uses to forward the data to its destination.
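The difference in address size can be checked with Go's standard `net/netip` package. This is a small sketch in which the helper name `describe` and the sample addresses are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/netip"
)

// describe reports the IP version, the size in bits and, for IPv4,
// the four octets of the address.
func describe(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	if addr.Is4() {
		return fmt.Sprintf("IPv4, %d bits, octets %v", addr.BitLen(), addr.As4()), nil
	}
	return fmt.Sprintf("IPv6, %d bits", addr.BitLen()), nil
}

func main() {
	// Sample addresses: a private IPv4 address and an IPv6 documentation address.
	for _, s := range []string{"192.168.1.10", "2001:db8::1"} {
		info, err := describe(s)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", s, info)
	}
}
```

An IPv4 address occupies 32 bits, while an IPv6 address occupies 128, which is where the much larger address space comes from.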

TCP vs UDP

The other established transport protocol, UDP (User Datagram Protocol), sends datagrams, which allows faster real-time communication. Datagrams can be lost without being retransmitted, which keeps latency low, so common applications are video meetings or live streaming, where we don’t care about a few lost pixels.

TCP, in contrast, has to reassemble the packets sent over the network in order and recover from errors by detecting corrupted or lost packets and retransmitting them. Another reliable transport protocol is SCTP (Stream Control Transmission Protocol).

FTP

FTP (File Transfer Protocol) is used to send files easily via TCP. It supports login with a user and password, but it is not secure because credentials and data travel unencrypted. SFTP (SSH File Transfer Protocol) should be used for secure applications instead.

WebSocket

WebSocket is a technology for real-time, two-way client-server communication. It performs a handshake and then keeps the connection open so both sides can exchange messages.

WebSocket applications vary. They’re commonly used for chats, social media, collaborative tools, multiplayer games, etc.

TCP-Based Protocol

TCP is a strong choice for transferring data because it is standardized and preserves data integrity: it verifies that the data arrives complete, in order, and uncorrupted.

To make use of this protocol, we need to design a contract so that machines can communicate in the same language.

Using TCP directly to send bytes or data (check the bibliography) can be too low-level, for example filling buffer arrays by hand and sending them.

Now we can create an abstraction over the above idea, that is, our protocol.

A system with well-defined data types, payloads, errors, message types, etc. is the abstraction that allows the computers to understand the communication adequately.
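Such an abstraction could start with a typed message, as in the sketch below. Everything in it is an assumption for illustration and not the project's actual protocol: the `Message` struct, the three message types, and the choice of one JSON object per line as the encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MessageType names the kinds of messages this hypothetical protocol allows.
type MessageType string

const (
	TypeRequest  MessageType = "request"
	TypeResponse MessageType = "response"
	TypeError    MessageType = "error"
)

// Message is the unit both sides agree on: a type plus a payload or an error.
type Message struct {
	Type    MessageType `json:"type"`
	Payload string      `json:"payload,omitempty"`
	Error   string      `json:"error,omitempty"`
}

// encode serializes a message as one JSON object terminated by '\n'.
func encode(m Message) ([]byte, error) {
	data, err := json.Marshal(m)
	if err != nil {
		return nil, err
	}
	return append(data, '\n'), nil
}

// decode parses one line and rejects message types outside the contract.
func decode(line []byte) (Message, error) {
	var m Message
	if err := json.Unmarshal(line, &m); err != nil {
		return Message{}, err
	}
	switch m.Type {
	case TypeRequest, TypeResponse, TypeError:
		return m, nil
	default:
		return Message{}, fmt.Errorf("unknown message type %q", m.Type)
	}
}

func main() {
	line, err := encode(Message{Type: TypeRequest, Payload: "list /docs"})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Print(string(line))

	msg, err := decode(line)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", msg)
}
```

Validating the type on decode means a peer that speaks a different contract is rejected early instead of being misinterpreted.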

One important part of this implementation is blocking I/O. Concurrent programming is key for network applications, as we don’t want to block the application thread while reading from disk or a database, or while serving several users at once.

Sending and receiving messages are almost symmetric operations: one side encodes and writes what the other side reads and decodes.

There are many built-in abstractions, such as buffered readers (e.g. Java’s BufferedReader), and third-party libraries for implementing concurrent networking applications.

With this insight, a TCP-based protocol can be designed.

Bibliography

  1. The 7-layer architecture is more academic than practical; real implementations take a more pragmatic approach.

  2. Read my course project Reed-Muller Codes