LAGO and EOSC SYNERGY integrate cosmic ray data for the whole community

Antonio J. Rubio-Montero and Rafael Mayo-García tell us how the project will integrate the knowledge provided by this collaboration within EOSC.

What is LAGO?

The Latin American Giant Observatory (LAGO) is an extended cosmic ray observatory, currently composed of a network of ten water-Cherenkov detectors (WCDs) at sites spanning significantly different altitudes (from sea level up to more than 5,000 m) and latitudes across Latin America. Because of these extreme locations, data coming from the WCDs must be safely stored in repositories. However, the raw data are not directly usable: they must first be pre-processed to clean noise from the measurements and then analysed, and they become publicly available to the scientific community after a short waiting period. Furthermore, the complete LAGO dataset comprises not only these direct detector measurements but also simulations of different cosmic ray phenomena in energy ranges of interest.
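The data life cycle described above can be pictured with a minimal Python sketch. It is an illustration only, not LAGO's actual software: the stage names, the `Dataset` class, the `is_public` helper and the one-year embargo are all hypothetical, since the length of the waiting period is not stated here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Stage(Enum):
    """Hypothetical processing stages, mirroring the description above."""
    RAW = "raw"                    # direct detector output, still noisy
    PREPROCESSED = "preprocessed"  # noise cleaned from the measurements
    ANALYSED = "analysed"          # ready for scientific use


# Assumed value for illustration only; the real waiting period is not given.
EMBARGO = timedelta(days=365)


@dataclass(frozen=True)
class Dataset:
    name: str
    stage: Stage
    acquired: date


def is_public(dataset: Dataset, today: date, embargo: timedelta = EMBARGO) -> bool:
    """A dataset is public only once it is analysed and the embargo has passed."""
    return dataset.stage is Stage.ANALYSED and today >= dataset.acquired + embargo


if __name__ == "__main__":
    run = Dataset("example-run", Stage.ANALYSED, date(2020, 1, 1))
    print(is_public(run, date(2020, 6, 1)))  # still inside the waiting period
    print(is_public(run, date(2021, 6, 1)))  # waiting period is over
```

Keeping the embargo as a parameter means the same check works whatever waiting period the collaboration actually applies.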

How is EOSC Synergy involved?

The ultimate goal is to enable the long-term curation and re-use of data within and outside LAGO through a Virtual Observatory.

Until now, scientists have handled data storage and run simulations ad hoc on their own computing resources. The EOSC SYNERGY service of LAGO will be based on exposing the repositories, CORSIKA (COsmic Ray SImulations for KAscade), the ROOT data analysis framework, and other custom codes to the EOSC paradigm.

LAGO is also a good example of the advanced use of a modern AAI (Authentication and Authorisation Infrastructure), based on the AARC Blueprint Architecture (https://aarc-community.org/architecture). As such, the LAGO AAI combines services from different organisations whose offerings partially compete.

● User Membership Management: provided by GEANT's eduTEAMS service (https://eduteams.org). Here eduTEAMS plays the role of the so-called “Community-AAI”.

● Infrastructure Access: In the context of EOSC-Synergy, the EGI Federated Cloud and DataHub are used. This infrastructure is accessible exclusively via the EGI AAI, provided by the EGI Check-in service, which acts as an “Infrastructure Proxy”.

As a result, communities that organise themselves in eduTEAMS can access services that are available as EGI infrastructure. This possibility is based on trust models and an architecture that were refined and specified within the AARC project (https://aarc-community.org).
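The division of roles between a community AAI and an infrastructure proxy can be illustrated with a toy Python model. This is a deliberately simplified sketch, not the AARC protocol flow or any eduTEAMS or EGI Check-in API: real deployments rely on standard federated identity protocols, and every class, method and identifier below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Assertion:
    """What the community AAI states about a user (toy stand-in for a token)."""
    subject: str
    issuer: str
    groups: FrozenSet[str]


@dataclass
class CommunityAAI:
    """Manages who belongs to which community group."""
    name: str
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def enrol(self, user: str, group: str) -> None:
        self.members.setdefault(user, set()).add(group)

    def assert_membership(self, user: str) -> Assertion:
        if user not in self.members:
            raise PermissionError(f"{user} is not enrolled in {self.name}")
        return Assertion(user, self.name, frozenset(self.members[user]))


@dataclass
class InfrastructureProxy:
    """Grants infrastructure access based on assertions from trusted issuers."""
    trusted_issuers: Set[str]
    required_group: str

    def authorise(self, assertion: Assertion) -> bool:
        return (
            assertion.issuer in self.trusted_issuers
            and self.required_group in assertion.groups
        )


if __name__ == "__main__":
    community = CommunityAAI("community-aai")
    community.enrol("alice", "lago")
    proxy = InfrastructureProxy({"community-aai"}, "lago")
    print(proxy.authorise(community.assert_membership("alice")))
```

The point of the split is that the infrastructure never manages community membership itself: it only needs to trust the issuer and read the group information in the assertion.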

LAGO’s thematic service focuses on providing a standardised way to curate and reuse measurements, analyses and simulations. To achieve this, it follows the basic design recommended by EGI/EOSC for the cloud: the core intelligence is packed in Docker images that can automatically check, store and publish their results in DataHub, with enough metadata to be used by official harvesters (B2FIND), which will act as virtual observatories.

As the whole computation is self-contained in the image, production can easily be performed by services such as EC3/IM or even manually on private clusters.
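The "check, store and publish with metadata" pattern can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the code shipped in the LAGO images: `run_and_publish` is a hypothetical helper, `out_dir` stands in for whatever storage space the container writes to, and the metadata fields are invented for the example rather than taken from any harvester's schema.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def run_and_publish(name: str, compute: Callable[[], bytes], out_dir: Path) -> Path:
    """Run one computation, check its output, and store it with a metadata sidecar."""
    payload = compute()

    # Check step: refuse to store an empty result.
    if not payload:
        raise ValueError(f"{name}: computation produced no output")

    # Store step: write the result where the storage service exposes it.
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{name}.dat").write_bytes(payload)

    # Publish step: describe the result so that a harvester could index it.
    # These fields are hypothetical; a real catalogue defines its own schema.
    metadata = {
        "name": name,
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "created": datetime.now(timezone.utc).isoformat(),
    }
    meta_path = out_dir / f"{name}.json"
    meta_path.write_text(json.dumps(metadata, indent=2))
    return meta_path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        sidecar = run_and_publish("demo", lambda: b"simulated showers", Path(tmp))
        print(sidecar.read_text())
```

Because the function only needs a callable and a directory, the same wrapper works whether the container is launched by an orchestration service or run by hand on a private cluster.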

The LAGO thematic service includes, or will include, the following services listed in the EOSC Marketplace:

● EGI Check-in (through eduTEAMS Perun at GEANT): needed for accessing any EOSC service, in particular for obtaining a OneData token. Managing the VO with Perun at GEANT was considered because of its flexibility and GEANT's long-term support for Latin American users.

● EGI DataHub: OneData gives researchers several ways to access the data and metadata of interest to them. Collaboration members can directly explore the directory tree or mount it on their PCs, while the general public will get published data through B2FIND. OneData also makes it easy to store results without modifying the simulation or processing codes, and to maintain usable replicas around the world.

● The EOSC Cloud services (IM and EC3) will be explored in the coming months to validate the deployment of batch or Kubernetes clusters.

Additionally, LAGO plans to explore other services such as B2FIND, B2HANDLE and DIRAC4EGI.

Keen to learn more about this? Follow us via our Twitter account, where we’ll keep you up to date with the latest news.