Sistemas Informáticos Europeos (SIE) has recently installed a new cluster for the Center for Advanced Studies in ICT (CEATIC) at the University of Jaén. The funding comes from the latest competitive call of the State Subprogram for Research Infrastructures and Scientific-Technical Equipment of the Ministry of Science, Innovation and Universities, co-financed with FEDER funds.
The report was presented by Drs. Francisco R. Feito Higueruela and Luis A. Ureña López, with the collaboration of Juan M. Jurado, who holds a Master's in Computer Engineering. The system will support progress across the various lines of research at CEATIC.
The contract for the cluster was awarded to Sistemas Informáticos Europeos in a competitive process, its bid having been judged the best offer.
The computing system has low-latency InfiniBand connectivity, with Mellanox FDR cards and a switch running at 56 Gb/s, which allows computation to span different nodes as if they were a single system. The Ladon OS 7 v12 tools are adapted to the new Intel Xeon Cascade Lake processors on CentOS 7.6 and help get the most out of the system. They also monitor all the control parameters, such as temperature, disk capacity and fan speed. Monitoring is implemented over the IPMI 3.3 protocol with the new-generation Gigabyte platforms and the GSM (Gigabyte Server Management) control system, which is compatible with the new Redfish standard.
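As an illustration of the kind of check such monitoring enables, the following minimal Python sketch scans a Redfish-style thermal report for overheated sensors and slow fans. The field names follow the Redfish Thermal resource; the sample payload, the sensor names and the `min_fan_rpm` threshold are hypothetical and are not taken from the Ladon or GSM tools.

```python
# Hypothetical Redfish "Thermal" payload: sensor names and readings are
# invented for illustration and do not come from the CEATIC cluster.
SAMPLE_THERMAL = {
    "Temperatures": [
        {"Name": "CPU0", "ReadingCelsius": 54, "UpperThresholdCritical": 95},
        {"Name": "GPU3", "ReadingCelsius": 91, "UpperThresholdCritical": 90},
    ],
    "Fans": [
        {"Name": "FAN1", "Reading": 8400, "ReadingUnits": "RPM"},
        {"Name": "FAN2", "Reading": 0, "ReadingUnits": "RPM"},
    ],
}


def thermal_alerts(thermal, min_fan_rpm=1000):
    """Return alert strings for hot sensors and slow or stopped fans.

    min_fan_rpm is an assumed threshold, not a vendor default.
    """
    alerts = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        # Skip sensors that report no reading or no critical threshold.
        if reading is not None and limit is not None and reading >= limit:
            alerts.append(
                f"{sensor['Name']}: {reading} C at or above critical {limit} C"
            )
    for fan in thermal.get("Fans", []):
        rpm = fan.get("Reading")
        if rpm is not None and rpm < min_fan_rpm:
            alerts.append(f"{fan['Name']}: {rpm} RPM below {min_fan_rpm} RPM")
    return alerts


if __name__ == "__main__":
    for line in thermal_alerts(SAMPLE_THERMAL):
        print(line)
```

In a real deployment the dictionary would come from the management controller rather than from a constant; how the Ladon tools obtain it is not described here.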
The complete cluster provides 133,632 CUDA cores, supplied by Tesla Volta and RTX Turing cards, which are managed with the tools available in CUDA 10.1, compatible with both generations.
From an HPC point of view, since the system is focused on GPU computation, it is worth highlighting that the GPUs deliver a total computing power of more than 400 Tflops. For machine learning and deep learning workloads, the cards also include Tensor Cores, which contribute a total of more than 3.4 Exaflops to the system.
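The relationship between the core count and the quoted throughput can be checked with simple arithmetic. The sketch below assumes the 400 Tflops figure refers to single-precision (FP32) peak performance, which the text does not state, and uses the usual rule that each CUDA core can complete one fused multiply-add (2 floating-point operations) per clock cycle. The function names are illustrative.

```python
def peak_fp32_tflops(cuda_cores, clock_ghz):
    """Theoretical peak in Tflops: cores x 2 FLOPs (one FMA) x clock."""
    return cuda_cores * 2 * clock_ghz * 1e9 / 1e12


def implied_clock_ghz(cuda_cores, tflops):
    """Average clock (GHz) needed for the given cores to reach `tflops`."""
    return tflops * 1e12 / (cuda_cores * 2) / 1e9


CUDA_CORES = 133_632   # total CUDA cores quoted for the cluster
QUOTED_TFLOPS = 400    # lower bound quoted for GPU computing power

if __name__ == "__main__":
    clock = implied_clock_ghz(CUDA_CORES, QUOTED_TFLOPS)
    print(f"Implied average clock: {clock:.2f} GHz")
```

Under these assumptions the quoted figure corresponds to an average GPU clock of roughly 1.5 GHz, which is in the range of typical boost clocks for cards of these generations.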
This makes it possible to tackle both CEATIC's own problems at the University of Jaén and research in the areas of Artificial Intelligence, Big Data and Analytics, providing new knowledge to companies and society in general.
Thanks to the RDMA protocol, supported by Mellanox cards since the ConnectX-3 generation, a single computation can even combine GPU cores located in different machines.
The cluster has 80 TB of storage in the home area to safeguard results, protected by RAID 6 (which tolerates the failure of up to 2 hard drives without data loss) and redundant power supplies.
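RAID 6 reserves the equivalent of two drives for parity, so usable capacity is the raw capacity minus two drives. The sketch below shows that arithmetic; the drive counts and sizes are hypothetical examples that happen to yield 80 TB, since the actual disk layout of the cluster is not given.

```python
def raid6_usable_tb(n_drives, drive_tb):
    """Usable capacity of a RAID 6 array: two drives' worth goes to parity."""
    if n_drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (n_drives - 2) * drive_tb


def raid6_survives(failed_drives):
    """RAID 6 keeps data intact with at most two simultaneous drive failures."""
    return failed_drives <= 2


if __name__ == "__main__":
    # Hypothetical layouts, not the cluster's documented configuration.
    for n_drives, drive_tb in [(10, 10), (12, 8)]:
        usable = raid6_usable_tb(n_drives, drive_tb)
        print(f"{n_drives} x {drive_tb} TB -> {usable} TB usable")
```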
Another advantage of the SIE Ladon system is the choice of Gigabyte G291-281 platforms, assembled and tested by our company as a system integrator in Spain. This guarantees the quality of all the components and allows us to offer a standard 3-year on-site warranty (at no additional cost), with an optional extension to 5 years. The choice of high-quality components from companies such as Intel, PNY and Micron/Crucial allows us to offer these quality standards, with a very low incidence of failures.
In addition, the chosen platforms provide three fundamental advantages:
This project has been possible thanks to the collaboration between the teams of professionals from the University of Jaén and Sistemas Informáticos Europeos, who jointly defined the requirements to be covered, the deployment schedule, the training (which will be carried out this month), the communication and computing needs, and so on.
All of this makes it possible to offer the customer turnkey solutions that maximize the performance of the computer systems and reduce the time needed to start up and optimize them, so that the department can work with the system from the first day.
SIE delivers successful projects that are fully operational and in production in a record time of 2 months, and is a reference brand in scientific computing, both in Spain and in other parts of the world.