The Micro Data Center is intended mainly for development purposes. Users can test their software on different architectures, optimize it, and finally evaluate its execution in terms of energy efficiency. The whole testbed is equipped with multiple monitoring tools that provide feedback on resource consumption. This page briefly describes the computational resources available to users as well as the monitoring tools, from both software and hardware perspectives.

Micro Data Center

The description of our computational infrastructure is divided into two sections. The first section describes the high performance nodes, while the second focuses on low power microservers. Users may select either type of node for their experiments.

High performance Xeon-based nodes

The high performance infrastructure consists of two types of servers with different types of cooling. The first type uses Direct Liquid Cooling (DLC), while the second uses more traditional air-based cooling (AC). The DLC setup uses Ebullient equipment supporting a two-phase cooling process based on liquid and vapor heat removal. At the server level, DLC is applied to the processors, GPUs and memory modules. The second type of servers (AC) uses a standard, fan-based scheme.

Image - dlc.jpg

This setup enables users to test various research scenarios and answer the corresponding research questions, e.g.:

  • How does the cooling system affect the temperature of the processing elements? How much energy needs to be consumed to cool down the servers in each scenario (including the air conditioning system)?

  • What is the efficiency of computations at different temperature levels? Is it possible to increase the processor frequency?

  • What is the energy efficiency of the computations? Does the temperature impact the actual performance? Which frequencies result in the optimal energy efficiency? (A frequency-scaling sketch follows this list.)
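
As a starting point for the frequency-related questions above, the sketch below reads the current CPU frequency and requests a lower maximum frequency through the standard Linux cpufreq sysfs interface. This is a minimal, illustrative example: the paths are the stock kernel ones, writing the limit requires root privileges, and the exact policy layout on a given node may differ.

    # Minimal sketch: inspect and cap the CPU frequency via the Linux cpufreq
    # sysfs interface (standard kernel paths; writing requires root).
    from pathlib import Path

    CPU0 = Path("/sys/devices/system/cpu/cpu0/cpufreq")

    def read(name: str) -> str:
        return (CPU0 / name).read_text().strip()

    print("governor:          ", read("scaling_governor"))
    print("current frequency: ", int(read("scaling_cur_freq")), "kHz")
    print("hardware range:    ", read("cpuinfo_min_freq"), "-", read("cpuinfo_max_freq"), "kHz")

    # Cap the maximum frequency of cpu0 at 2.6 GHz (value in kHz);
    # repeat for the remaining logical CPUs to affect the whole node.
    (CPU0 / "scaling_max_freq").write_text("2600000")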

These and many other problems can be investigated by scientists using our infrastructure. In particular, the following hardware is available. As described above, half of the system is cooled with air and the other half with liquid.

There are 16 blade-like nodes with the following configuration:

  • processors: 2x Intel Xeon E5-2600 v3 (Haswell architecture), 8 cores, 2.6-3.4 GHz

  • memory: 64 GB DDR4, 2133 MHz

  • network: 2x10GbE

There are 2 server nodes with the following configuration:

  • processors: 2x Intel Xeon E5-2600 v3 (Haswell architecture), 8 cores, 2.6-3.4 GHz

  • memory: 64 GB DDR4, 2133 MHz

  • network: 2x10GbE

  • accelerator: 2x NVIDIA Tesla K80

 

RECS®|Box - high density microservers

The second part of the system is based on the RECS®|Box servers from the Christmann company. RECS®|Box servers are high density multi-node servers that host up to 72 Apalis-based (or up to 18 COM-Express) microservers within one rack unit. To give the user a fine-grained monitoring and control system, each RECS®|Box has a dedicated master-slave system of integrated micro-controllers that gathers the various metrics directly, without the need to poll every single node or its operating system. This allows many metrics, such as power usage, status and temperature, to be gathered for every node with a single request. The main advantage of this integrated monitoring system is that it avoids the potentially significant overhead that would be caused by measuring and transferring the data at the operating system level. Importantly, a RECS®|Box can be equipped with a diverse set of computing nodes, ranging from high performance Intel i7 or Xeon processors down to Intel Atom CPUs or even embedded ARM processors.
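
To illustrate the "single request" idea, the sketch below polls the management controller of one RECS®|Box over its REST API and prints a few values per node. The controller address, the resource path and the attribute names are placeholders rather than the documented API; the actual paths and the response format are described in the vendor documentation linked in the Monitoring tools section below.

    # Sketch: fetch monitoring data for all microservers of one RECS(R)|Box with
    # a single HTTP request to its management controller.  The address, resource
    # path and attribute names below are placeholders; consult the vendor's
    # REST API documentation for the real ones.
    import requests
    import xml.etree.ElementTree as ET

    RECS_MGMT = "http://recs-box.example.local:8080"           # hypothetical address
    resp = requests.get(RECS_MGMT + "/REST/node", timeout=5)   # hypothetical path
    resp.raise_for_status()

    for node in ET.fromstring(resp.text).iter("node"):
        # Attribute names are illustrative only.
        print(node.get("id"), node.get("cpuTemperature"), node.get("currentPowerUsage"))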

Image - IMG_1585a.jpg

The LABEE testbed consists of 8 RECS®|Box systems equipped with diverse kinds of CPUs: Intel i7/i5, AMD Fusion, Intel Atom and ARMv7, as well as various GPU and FPGA accelerators. In addition to the standard power and temperature sensors, more advanced measurements are also provided, e.g. the inlet and outlet air temperature of each microserver. This testbed therefore provides very detailed information about the energy and thermal impact of workloads and of the applied management techniques.

In particular it consists of:

  1. RECS®|Box 3.0 Antares, 6 nodes with AMD R-464L

  2. RECS®|Box 3.0 Antares, 6 nodes with Intel(R) Core(TM) i5-4400E + one NVIDIA Tesla M2070

  3. RECS®|Box 3.0 Antares, 12 Toradex Apalis T30 nodes (with NVIDIA Tegra 3, quad-core ARM Cortex-A9) and 2 nodes with Apalis Exynos 5250 (dual-core ARM Cortex-A15)

  4. RECS®|Box 3.0 Arneb, 4 nodes with Intel(R) Core(TM) i7-4700EQ and NVIDIA Tesla K40

  5. RECS®|Box 3.0 Arneb, 24 Toradex Apalis T30 nodes (with NVIDIA Tegra 3, quad-core ARM Cortex-A9)

  6. RECS®|Box 2.0 Arneb, 18 nodes with Intel(R) Atom N2600

  7. RECS®|Box 2.0 Arneb, 18 nodes with AMD G-T40N

  8. RECS®|Box 2.0 Arneb, 18 nodes with Intel(R) Core(TM) i7-3615QE (x14) and Intel(R) Core(TM) i7-2715QE (x4)

Networking

The whole testbed is interconnected with 4 switches. The backbone network is based on 10Gb links, and individual nodes are usually connected to the switches with 1Gb links.

The main features of the network:

  • there are two networks within the system: an administration network and a computational one

  • the networks are separated at the second layer (VLANs) and at the third layer; this helps to protect sensitive parts of the system in case one of the servers is compromised

  • users can be granted access to the desired systems within the administrative network

  • there are 4 switches within the network: three 1Gb switches with a 10Gb backbone interconnect and one 10Gb switch

  • the high performance Xeon servers are interconnected within the 10Gb network and connected to the other switches with 10Gb links

  • the 1Gb switches are interconnected with 10Gb links, except for one connection which had to be limited to 4Gb (a trunk of 1Gb links) due to compatibility issues

  • the headnode, which serves the filesystem for all the servers, is connected to one of the 1Gb switches with a trunk of 2x 10Gb links

  • all the nodes (except the high performance Xeon nodes) are connected to the network with 1Gb links

  • the general architecture of the network can be described as a chain-like structure

 

Monitoring tools

The presented infrastructure is equipped with tools that are capable of gathering information from different sensors and storing it in a unified way in a single database. To achieve this goal, a number of scripts have been developed that collect data from the different sources.

The data samples are collected from:

  1. RECS®|Box 3.0 REST API: https://recswiki.christmann.info/wiki/doku.php?id=documentation:rest_api

  2. NVIDIA driver (NVIDIA Management Library - NVML) - GPU load, memory, temperature

  3. Intel Running Average Power Limit (RAPL) - estimates of the power consumption of the Intel-based systems (a minimal read-out sketch follows this list)

  4. collectd/Linux - a collection of statistics gathered from the operating system

  5. a collection of various sensors monitoring the temperature, humidity, airflow, vibrations, air parameters, pressure of the fluid within the liquid cooling system, and more
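
As an illustration of point 3 above, the following sketch samples the package-level energy counter exposed by the Linux powercap driver and converts the difference into an average power value. The sysfs paths are the standard kernel ones; the one-second sampling interval and the package index are only examples.

    # Sketch: estimate average package power from the Intel RAPL energy counter
    # exposed by the Linux powercap driver (root may be required to read it).
    import time
    from pathlib import Path

    PKG = Path("/sys/class/powercap/intel-rapl:0")          # CPU package 0
    max_range = int((PKG / "max_energy_range_uj").read_text())

    e0 = int((PKG / "energy_uj").read_text())
    time.sleep(1.0)
    e1 = int((PKG / "energy_uj").read_text())

    delta_uj = (e1 - e0) % (max_range + 1)                  # handle counter wrap-around
    print("average package power: %.2f W" % (delta_uj / 1e6 / 1.0))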

The data is gathered from all the computational nodes and stored on a dedicated server using the Graphite software (http://graphite.wikidot.com). This tool was designed for storing time-series data, but it can also render graphs on demand. In order to provide both a high sampling rate and long-term storage, the database is organized so that the metrics are retained:

  1. with 1 second resolution for 8 hours
  2. with 5 second resolution for 1 day
  3. with 1 minute resolution for 30 days
  4. with 30 minute resolution for 1 year

Most of the statistics are gathered with a 5 s interval, but the most important ones (like power and temperature) are stored with a 1 s interval.
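
Graphite's Carbon daemon accepts samples over a simple plaintext protocol, i.e. one "metric value timestamp" line per sample sent to TCP port 2003 by default; the sketch below shows how a collector script could push a single sample this way. The host and metric name are only examples, and the retention levels listed above are configured on the server side in Carbon's storage-schemas.conf.

    # Sketch: send one sample to Graphite/Carbon over the plaintext protocol
    # ("<metric.path> <value> <unix_timestamp>\n" on TCP port 2003 by default).
    # Host and metric name are illustrative; the retention levels listed above
    # (1s:8h, 5s:1d, 1m:30d, 30m:1y) are configured in storage-schemas.conf.
    import socket
    import time

    CARBON_HOST, CARBON_PORT = "graphite.example.local", 2003   # hypothetical host
    metric, value = "labee.node01.power_w", 87.5                 # example sample

    with socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5) as sock:
        sock.sendall(f"{metric} {value} {int(time.time())}\n".encode("ascii"))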

All the nodes have their time synchronized with a local NTP server to provide accurate measurements.

Most of the interesting statistics are presented on a designated webpage (see: Tools->Viewer). The system was designed with clarity and usability as the main objectives. It can be accessed on request by registered users. Authorized users can also access even more detailed parameters of the systems and design graphs on their own (via the Grafana web service).