7. Sloop run: run preprocessing, model and postprocessing

The next step is to run the different tasks of preprocessing, model and postprocessing.

To do so, use sloop run.

Description

Available tasks for Croco

The different tasks are listed in the following table.

Note

All tasks can be launched in standalone mode (as a Python binary).

Available tasks for Croco

Generic name of the task | Description                                                                    | Python binary name
-------------------------|--------------------------------------------------------------------------------|------------------------------------------
ibc-collect              | Collect and format data from the CDOCO PSY4 model                              | sloop-croco-run-ibc-0-collect.py
init                     | Build an initial condition from PSY4 model output                              | sloop-croco-run-ibc-1-init.py
obc                      | Build boundary conditions from PSY4 model output                               | sloop-croco-run-ibc-2-obc.py
atmfrc                   | Collect and format data from CDOCO to build the atmospheric forcing file       | sloop-croco-run-atmfrc.py
nudging                  | Build the forcing file for a nudging run (extracted over the whole OGCM file)  | sloop-croco-nudging
rivers                   | Format the river data from CDOCO format to CROCO netCDF format                 | sloop-croco-run-rivers.py
model                    | Launch CROCO on a PBS cluster                                                  | sloop-croco-run-model.py
postprod                 | Format outputs and store them                                                  | sloop-croco-run-postprod-0-format-data.py

Warning

sloop run has to be executed in the directory of the experiment created by sloop init.
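
For example (the experiment directory path is illustrative; it is the one created by sloop init):

    cd /path/to/your/experiment
    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,00:00 -e 2017-06-05,00:00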

Syntax

$ sloop run --help
usage: sloop run [-h] [--rundate RUNDATE] [-b BEGINDATE] [-e ENDDATE]
                 [--ncycle NCYCLE] [--freq FREQ] [-j JOB] [-w WORKFLOW]
                 [--exp-dir EXP_DIR] [--nodependency NODEPENDENCY]
                 [--res-xpid RES_XPID]
                 [--res-type {hindcast_free,hindcast_spnudge,nrt,None}]

optional arguments:
  -h, --help            show this help message and exit
  --rundate RUNDATE     rundate of the experiment
  -b BEGINDATE, --begindate BEGINDATE
                        begindate of the experiment
  -e ENDDATE, --enddate ENDDATE
                        enddate of the experiment
  --ncycle NCYCLE       number of cycle
  --freq FREQ           number of days between two cycles
  -j JOB, --job JOB     name of the job to be submitted
  -w WORKFLOW, --workflow WORKFLOW
                        name of the workflow to be submitted
  --exp-dir EXP_DIR     experiment directory
  --nodependency NODEPENDENCY
                        True or False
  --res-xpid RES_XPID   `path@location` or `xpid`
  --res-type {hindcast_free,hindcast_spnudge,nrt,None}
                        hindcast or near real time chaine

Run one task

There are two ways to launch the tasks listed above:

  • Launch one task with sloop, which will submit a PBS job:

    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,00:00 -e 2017-06-05,00:00
    

Note

The --novortex option is mandatory for CROCO and for use on Datarmor.

  • Launch the task directly with the Python routine in interactive mode:

    sloop-croco-run-ibc-0-collect.py exp_dir  -b 2017-06-01,00:00 -e 2017-06-05,00:00 --model glorys12v1 --grid croco_grd.nc
    

Note

For the second method, you MUST be connected to a compute node (via the qsub -I command).

Chain your tasks with a workflow

You can chain your tasks thanks to the workflow capability.

  • First, define your workflow like this one in a config file:

    data=ibc-collect,atmfrc
    preproc=init,obc
    mod=model,
    postprod=postprod,

  • The tasks in this file are those listed in the table at the top of the page: the names must be exactly the same as in the table

  • Within a given line of this file, the tasks have no dependencies on each other

  • Each line (and all the tasks it contains) depends on the previous one and will be launched after it

  • The workflow can be launched in several ways (full command examples are sketched after this list):

    • With a begin date and end date

      -b 2017-06-01,12:00 -e 2017-06-05,00:00
      
    • With a number of cycles and the cycle frequency

      --ncycle 4 --freq 7  (4 cycles of 7 days; the frequency can be omitted and defaults to 7)
      
    • With the rundate of the experiment

      --rundate  (equivalent to one cycle of the default frequency)
      
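As an illustration, here are the three variants applied to a full workflow launch (the workflow name workflow_med is the one used in the Datarmor example below):

    sloop --novortex -c croco_med.cfg run -w workflow_med -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -w workflow_med --ncycle 4 --freq 7
    sloop --novortex -c croco_med.cfg run -w workflow_med --rundate 2017-06-01,12:00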

Custom options

  • Template editing

    The user might need to change the croco.in file before running: the template file is TPL/croco.in.tpl

  • Change the queues of tasks on Datarmor

    If the user wants to change, for example, the number of nodes on which CROCO should run, the information is provided in the file conf/croco_med.ini (where med is the name of your application).

    The section names match those given in the task table above.
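
    As a sketch only (the key names below are hypothetical; keep the names already present in your conf/croco_med.ini), a model section could look like:

    [model]
    # hypothetical keys, for illustration; check your generated .ini for the real ones
    queue = mpi_1
    nb_nodes = 4
    walltime = 02:00:00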

What to do on Datarmor

  1. Be sure to be logged in on an interactive node and not on the login node

    qsub -X -I -l mem=20g -l walltime=02:00:00
    
  2. Load your sloop environment (see env)
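
    For example, with a conda-based installation (the environment name sloop is illustrative; follow the env section for the actual procedure):

    conda activate sloop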

  3. Launch your workflow

    sloop --novortex -c croco_med.cfg run -w workflow_med -b 2017-06-01,12:00 -e 2017-06-05,00:00
    
  4. Or launch the tasks one by one

    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j init -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j nudging -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j obc -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j atmfrc -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j model -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j postprod -b 2017-06-01,12:00 -e 2017-06-05,00:00
    

Warning

Be careful with the date format. It must be YYYY-MM-DD,HH:MM.

TIPS

  1. Be careful with the start hour of your IBC forcing model. It must match the start hour of your model. For example, glorys12v1 starts at 12:00, so you must start your model at 12:00.

  2. If you want to use your own initial file generated from another script, just link or copy it into the FORCING directory with this name (adjusting the date):

    ln -s my_ic_file.nc FORCING/croco_ic_20220503T1200.nc

  3. To change the number of MPI processes:

    • edit your config file croco_config.cfg

    • Compile your model with sloop

    • In your experiment directory, edit conf/croco_conf.ini and adjust the queue name for Datarmor

  4. The available models for IBC and atmospheric forcing are described in the sloop/sloop/data directory. You will find one config file per model.

  5. Format data in postprocessing

    If the user wants to split the output files by date and/or by area, this is possible with the sloop-croco-run-postprod-0-format-data routine.

    The options are given in the croco_conf.cfg file:

    • Either write the whole domain, split into one file per date

    • Or split the output into sub-areas of interest, one file per date. If the user wants a subdomain, he must fill in a subsection in the cfg like this:

    [postprod]
    postprod_data_dir="./DATA_OUTPUTS"
    write_whole_domain=False
    [[zone1]]
        activate=True
        [[[bounds_lon]]]
            min = -4.2
            max = 2.0
        [[[bounds_lat]]]
            min = 35.4
            max = 40.6
    [[zone2]]
        activate=True
        [[[bounds_lon]]]
            min = 14.2
            max = 20.0
        [[[bounds_lat]]]
            min = 32.4
            max = 42.6

Where to see the results

  • Check the sloop.log file to see if something went wrong

  • Check the logs directory, which contains one log per task

  • Preprocessing: see the FORCING directory, which should contain all IBC/RIVERS/ATM/NUDGING files

  • Model run: for each run, a directory named with the simulation start and end dates is created

  • Model outputs after postprod: check DATA_OUTPUTS

Other features

MPI NO LAND

If the user wants to optimize the number of cores used on the computer, he should use the MPI_NOLAND cpp key:

  1. Get the program in the CROCO src directory (it is called MPI_NOLAND)

  2. Compile it

  3. Fill in the namelist with the maximum number of processes you want to use and the name of your CROCO grid file

  4. Run mpp_optimiz and get the optimized layout

  5. Pick a result whose total core count is a multiple of the number of cores per node on your HPC (28 for Datarmor)

  6. From the output file title, get the NP_XI, NP_ETA and N_NODES parameters and copy them into your config file croco_med.cfg

  7. Recompile the model with SLOOP after adding the MPI_NOLAND cpp key to your TPL/cppdefs.h.tpl file

  8. Change the required number of nodes and MPI queue in the model section of conf/croco_med.ini
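
As a sketch for step 6 (the values are purely illustrative, and the variable names must match those already used in your croco_med.cfg):

    # croco_med.cfg -- layout copied from a hypothetical mpp_optimiz output
    NP_XI=16
    NP_ETA=10
    N_NODES=4

Step 7 then amounts to adding the line #define MPI_NOLAND to TPL/cppdefs.h.tpl before recompiling with SLOOP.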

AGRIF

What to do in the case of an AGRIF configuration?

  1. Compile the model with AGRIF cpp keys

  2. Define the number of child grids via the variable nb_agrif in croco_config.cfg, as shown below

  3. Give the path of each grid in croco_config.cfg, with the grid number at the end of the name

  4. Put the paths of the needed (though not mandatory) input files as shown in the full example below

  5. Don’t forget to prepare one croco.in.tpl per grid and put it in the TPL directory if it is not in your config directory (see sloop constants)

  6. When launching sloop run with the init task, an initial file per grid will be built

[run_model]
[[croco_inputs]]
start_date="01-01-2016"
end_date="03-01-2016"
freq_rst=24
freq_avg=24
freq_output=72
nb_agrif=2
croco_forcing_dir= /home1/datawork/mcaillau/CROCO/GIBRALTAR_3GRIDS/INPUT/

croco_frc_grid= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc
croco_frc_grid1= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc.1
croco_frc_grid2= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc.2
croco_frc_tides=${croco_forcing_dir}/croco_frc_GFS_282.nc
croco_frc_tides1=${croco_forcing_dir}/croco_frc_GFS_282.nc.1
croco_frc_tides2=${croco_forcing_dir}/croco_frc_GFS_282.nc.2
croco_frc_blk=${croco_forcing_dir}/croco_blk_GFS_282.nc
croco_frc_blk1=${croco_forcing_dir}/croco_blk_GFS_282.nc.1
croco_frc_blk2=${croco_forcing_dir}/croco_blk_GFS_282.nc.2
croco_frc_clm=${croco_forcing_dir}/croco_clm_cmems-global_282.nc
croco_frc_clm1=${croco_forcing_dir}/croco_clm_cmems-global_282.nc.1
croco_frc_clm2=${croco_forcing_dir}/croco_clm_cmems-global_282.nc.2
croco_frc_ini=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc
croco_frc_ini1=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc.1
croco_frc_ini2=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc.2
croco_frc_obc=${croco_forcing_dir}/croco_bry_cmems-medsea_282.nc

croco_child_grids=${croco_frc_grid1} ${croco_frc_grid2}

[[tpl_files]]
croco_tpl_dir= /home1/datawork/mcaillau/CROCO/GIBRALTAR_3GRIDS/
croco_input_file=${croco_tpl_dir}/croco.in.tpl
croco_input_file1=${croco_tpl_dir}/croco.in.1.tpl
croco_input_file2=${croco_tpl_dir}/croco.in.2.tpl