Welcome address by MICIU, FCT, CSIC, EC and CESGA
This presentation provides an overview of the central role that distributed data processing has played in supporting scientific excellence and international collaboration over the past decade.
We present the architecture and governance model of EGI, the European infrastructure for exabyte-scale computing, and we demonstrate how open science has benefited from the power delivered by the EGI Federation,...
In recent years, Big Data technologies, in particular Hadoop and HBase, have enabled us to greatly expand the information that we collect and store from all our servers and infrastructures. We no longer need to discard old data using round-robin databases or restrict the number of active Nagios-style checks.
Now we can take full advantage of a metric collection infrastructure...
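As a minimal sketch of the kind of metric collection pipeline described above, assuming an HBase Thrift gateway and the happybase Python client (the host, table and column names are purely hypothetical):

```python
import time
import happybase  # thin Python client for the HBase Thrift gateway

# Hypothetical connection details; a real deployment would point to the
# site's own HBase Thrift service and pre-created metrics table.
connection = happybase.Connection('hbase-thrift.example.org')
metrics = connection.table('host_metrics')

def store_metric(host, name, value, ts=None):
    """Store one metric sample, keyed by host and reversed timestamp."""
    ts = ts or int(time.time())
    # Reversed timestamp keeps the newest samples for a host first in a scan.
    row_key = f'{host}:{2**32 - ts:010d}'.encode()
    metrics.put(row_key, {b'm:name': name.encode(),
                          b'm:value': str(value).encode()})

store_metric('worker-node-042', 'load1', 3.7)
```

Because HBase keeps every row, old samples never need to be aggregated away as they would be in a round-robin database.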
Our institution, Port d'Informació Científica (PIC), is an innovative research support centre that provides support to scientific groups working on projects which require large amounts of computing resources for the analysis of massive sets of distributed data. PIC is the Spanish Tier-1 center for the Large Hadron Collider, the main (Tier-0) data center for the MAGIC telescopes and the...
GÉANT has carried out a Europe-wide Framework Procurement for an Infrastructure as a Service (IaaS) cloud portfolio for the European research and education sector. The result was a multi-supplier framework whereby a number of IaaS cloud vendors were awarded framework contracts. Under this framework, academic and research organizations from European Union countries can directly contract cloud...
Serverless computing is evolving from the initial Functions as a Service (FaaS) approach to also embrace the execution of containerised applications without the user managing the underlying computing infrastructure. Indeed, the main public cloud providers, such as Amazon Web Services and Google Cloud, have already started to offer services in this regard. This is the case of AWS Fargate or Google...
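As an illustration of launching a containerised task without managing any servers, a sketch using boto3 against AWS Fargate; the cluster, task definition and subnet identifiers are hypothetical placeholders:

```python
import boto3

# Hypothetical identifiers; a real deployment would use its own cluster,
# registered task definition and VPC subnets.
ecs = boto3.client('ecs', region_name='eu-west-1')

response = ecs.run_task(
    cluster='research-cluster',
    launchType='FARGATE',          # no EC2 instances to provision or manage
    taskDefinition='image-processing:1',
    count=1,
    networkConfiguration={
        'awsvpcConfiguration': {
            'subnets': ['subnet-0123456789abcdef0'],
            'assignPublicIp': 'ENABLED',
        }
    },
)
print(response['tasks'][0]['taskArn'])
```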
udocker (https://github.com/indigo-dc/udocker) is a tool that addresses the problem of executing Linux containers in user space, i.e. without installing additional system software, without requiring administrative privileges, and while respecting resource usage policies, accounting and process controls. udocker empowers users to execute applications encapsulated in containers easily across a wide...
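A minimal usage sketch, driving udocker from Python via subprocess; the pull, create and run subcommands are documented in the repository above, and the image name is just an example:

```python
import subprocess

def udocker(*args):
    """Run a udocker subcommand entirely in user space (no root needed)."""
    subprocess.run(['udocker', *args], check=True)

# Pull an image, create a container from it and execute a command,
# all without administrative privileges or extra system software.
udocker('pull', 'ubuntu:20.04')
udocker('create', '--name=ubuntu_test', 'ubuntu:20.04')
udocker('run', 'ubuntu_test', 'cat', '/etc/os-release')
```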
Virtualization technologies are a fundamental element of cloud computing. Docker is the best-known and most widely used container platform worldwide. It is designed for microservices virtualization and application delivery, but its model does not fit well with High-Performance Computing (HPC) platforms. HPC environments are multi-user systems where users should only have access to their own data and...
Instituto Hidrografico is a Portuguese State Laboratory founded in 1960 whose main mission is the monitoring and study of the marine environment in order to support the Portuguese Navy and to contribute to national development in the areas of Marine Sciences and Marine Technologies. The activity of Instituto Hidrografico covers domains such as hydrography/cartography, physical...
SOCIB (Balearic Islands Coastal Observing and Forecasting System, www.socib.es) is a coastal ocean observing and forecasting infrastructure located in the Western Mediterranean Sea. SOCIB collects and distributes data from near-shore to the open ocean through the operation of multi-platform observing systems: fixed moorings, drifting buoys, a research vessel, gliders, HF radar, animal tracking...
The computational resources necessary to address major environmental scientific questions are seldom available in-house, making shared e-infrastructures a well-suited medium for performing complex model simulations, analyzing large datasets and applying decision support tools. Despite this potential, the technical expertise required to use these computational resources and to build products on...
Following ITER, the DEMO reactor is expected to demonstrate the feasibility of safe, environmentally friendly and economically viable fusion power generation. During operation of DEMO, the materials will be exposed to a particularly hostile environment as a consequence of the energetic neutrons created by fusion reactions in the plasma. The level of damage expected in fusion conditions is such that...
The use of Artificial Intelligence (AI) over medical data allows the extraction of features associated with the disease from medical images using data-characterisation and modelling algorithms. The use of advanced machine learning algorithms is changing the way image processing is performed, evolving from analytic solutions to models built up with supervised training techniques working in...
The DEEP-Hybrid-DataCloud project researches intensive computing techniques, such as deep learning, that require specialized GPU hardware to explore very large datasets, through a hybrid-cloud approach that enables access to such resources. DEEP is built on a user-centric policy, i.e. we understand the needs of our user communities and help them to combine their services in a way that...
INCD - National Distributed Computing Infrastructure is a Portuguese digital infrastructure designed to support the national scientific and academic community, providing computing and storage services in all areas of knowledge. LNEC – National Laboratory for Civil Engineering is one of the partners collaborating in this initiative, developing use cases...
Coastal systems are among the most productive ecosystems in the world, providing multiple resources and guaranteeing the resilience of coastal communities. Climate change (e.g., sea level rise) represents a major threat to the world’s coastal systems, via potential increases in salinity, acceleration of nutrient cycling and disruption of aquatic ecosystems. Also, recent and predicted...
Climate change (CC) adaptation plays an important role in city and services management and in resilience building, targeting the mitigation of, and adaptation to, potential hazards in urban areas. Information technologies can play a leading role in promoting fast adoption of the most relevant measures towards CC preparedness. In this paper, a web application is presented with the objective of empowering...
We present a technology transfer project where Cloud Computing and Open Data play a crucial role. Our aim is to accurately and efficiently model data from the Spanish car insurance sector. Due to the vast amount of data and the complexity of the models, the use of Cloud Computing is needed to ensure not only an efficient but also a feasible implementation of the model. The system was deployed...
Road traffic is among the main sources of air pollution. Considering that air pollution causes 400 000 deaths per year, making it the leading environmental cause of premature death in Europe, the environmental impacts of traffic are of major concern throughout many European metropolitan areas.
In February 2017, the European Commission warned five countries, among them Spain and Italy, of...
The release of the document “A set of Common Software Quality Assurance Baseline Criteria for Research Projects” (hereafter referred to as the “SQA baseline criteria”) resulted from the need to fill a gap in the European research software engineering ecosystem. This document sets out a Software Quality Assurance (SQA) plan that maintains a pragmatic set of requirements, best practices and...
Presentation of the goals of the session and its contributors
This presentation will provide an overview of the ongoing and new developments coming to three of the EGI computing services: Cloud Compute, which offers a federated multi-cloud IaaS; Cloud Container Compute, which offers a Kubernetes-based platform for running Docker applications; and Notebooks, a fully managed interactive computing service based on Jupyter.
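As a hedged illustration of what a Kubernetes-based platform for Docker applications means in practice, a short sketch using the official Kubernetes Python client against whatever cluster the local kubeconfig points to; the deployment name, namespace and image are placeholders, not part of the EGI service itself:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (e.g. one issued for the
# user's project on the container platform).
config.load_kube_config()

apps = client.AppsV1Api()

# A minimal Deployment running a single container image.
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="demo-app"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "demo-app"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "demo-app"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="demo-app",
                                               image="nginx:stable")]
            ),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)
```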
With the latest missions launched by ESA and NASA, such as Sentinel and Landsat, equipped with the latest technologies in multispectral sensors, we face an unprecedented amount of satellite data. Exploring the potential of these data with state-of-the-art Artificial Intelligence techniques such as “Deep Learning” could change the way we understand the Earth system...
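A minimal sketch of the kind of model involved, assuming tf.keras and hypothetical 64x64 patches with 13 spectral bands (as in Sentinel-2) classified into a handful of land-cover classes; data loading is omitted and all sizes are illustrative:

```python
import tensorflow as tf

NUM_BANDS = 13      # e.g. the Sentinel-2 multispectral bands (assumption)
NUM_CLASSES = 5     # hypothetical number of land-cover classes

# Small convolutional classifier for multispectral image patches.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, NUM_BANDS)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_patches, train_labels, epochs=10)  # patches: (N, 64, 64, 13)
```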
The presentation provides an overview of the requirements gathered during the Data Management Workshop, where XDC, ESCAPE and EGI met three important user communities to design with them Research Infrastructure-specific solutions and pilot activities. After this, the EGI data-related services and their status are presented, and the presentation eventually looks ahead to some scouting activities...
The CERN analysis preservation portal (CAP) comprises a set of tools and services aiming to assist researchers in describing and preserving all the components of a physics analysis such as data, software and computing environment. Together with the associated documentation, all these assets are kept in one place so that the analysis can be fully or partially reused even several years after the...
The EGI Check-in service is an Identity and Access Management solution that makes it easy to secure access to services and resources. Check-in is one of the enabling services for the EOSC-hub AAI following the architectural and policy recommendations defined in the AARC project. Through Check-in, users are able to authenticate with the credentials provided by the IdP of their Home Organisation...
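Check-in exposes standard OpenID Connect endpoints, so once a user or client holds an access token, calling a protected service is ordinary bearer-token HTTP. A minimal sketch with the requests library; the token is a placeholder and the userinfo URL shown here is an assumption that may differ between Check-in deployments:

```python
import os
import requests

# Access token obtained from Check-in via a standard OIDC flow
# (e.g. device code or authorization code); placeholder here.
access_token = os.environ['CHECKIN_ACCESS_TOKEN']

# Query the OIDC userinfo endpoint to inspect the identity and group
# claims that relying services receive.
resp = requests.get(
    'https://aai.egi.eu/oidc/userinfo',  # assumed endpoint path
    headers={'Authorization': f'Bearer {access_token}'},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```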
The DEEP-Hybrid-DataCloud project is providing a set of comprehensive services for machine learning and deep learning, allowing scientists to train, test, evaluate, share and exploit their models over distributed e-Infrastructures. New advancements will be presented and described, as well as the future exploitation of the proposed solutions.
High Energy Physics is a big data endeavour that requires modern data science tools for storage, processing and analysis. In this contribution we aim to review the applications of machine learning, namely the modern deep learning approach, to aid research in collider physics and related topics. More specifically, we will show how Convolutional Neural Networks can help us learn about new...
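To make the CNN idea concrete, a toy signal-versus-background classifier over "jet images" (2D grids of calorimeter energy deposits); the input shape and the random data are purely illustrative, and tf.keras is assumed only for the sake of the sketch:

```python
import numpy as np
import tensorflow as tf

# Toy "jet images": 32x32 calorimeter grids, one channel, labelled
# signal (1) or background (0). Shapes and data are purely illustrative.
x = np.random.rand(1000, 32, 32, 1).astype('float32')
y = np.random.randint(0, 2, size=1000)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=2, batch_size=64, verbose=0)
```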
The eXtreme DataCloud (XDC) project aims at developing data management services capable of coping with very large data resources, allowing future e-infrastructures to address the needs of the next-generation extreme-scale scientific experiments. Started in November 2017, XDC combines the expertise of 8 large European research organisations; the project aims at developing scalable...
As one of the largest international scientific collaborations, CMS faces many challenges. To serve the computational needs of every researcher working around the world within the Collaboration, CMS relies on distributed computing technology for both computing power and data storage. The Large Hadron Collider (LHC) schedule alternates between data-taking periods and long shutdowns for...
Overview of the work done by the EOSC-hub Technology Committee (TCOM) in the Software Quality Assurance area, with a special focus on the EOSC-hub technical workshop that served as input for the EOSC architecture and service roadmap.
Climate data analysis has traditionally been done in two different environments: local workstations and HPC infrastructures. Local workstations provide a non-scalable environment in which data analysis is restricted to small datasets that have previously been downloaded. On the other hand, HPC infrastructures provide high computation capabilities by making use of parallel file systems and...
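One common way to move beyond the download-then-analyse pattern on a workstation is lazy, chunked access, for example with xarray and dask; the file name and variable below are hypothetical:

```python
import xarray as xr

# Open a (possibly large) NetCDF dataset lazily, in chunks, so only the
# pieces needed for the computation are read and processed.
ds = xr.open_dataset('tas_day_historical.nc', chunks={'time': 365})

# Example analysis: annual means of near-surface air temperature,
# evaluated out-of-core by dask rather than loading everything in memory.
annual_mean = ds['tas'].groupby('time.year').mean('time')
result = annual_mean.compute()
print(result)
```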