7. Sloop run: run preprocessing, model and postprocessing

The next step is to run the different tasks of preprocessing, model and postprocessing.

To do so, use sloop run.

Description

Available tasks for Croco

The different tasks are listed in the following table.

Note

All tasks can be launched in standalone mode (as a Python binary).

Available tasks for Croco

Generic name of the task | Description                                                                    | Python binary name
-------------------------|--------------------------------------------------------------------------------|------------------------------------------
ibc-collect              | Collect and format data from the CDOCO PSY4 model                              | sloop-croco-run-ibc-0-collect.py
init                     | Build an initial condition from PSY4 model output                              | sloop-croco-run-ibc-1-init.py
obc                      | Build boundary conditions from PSY4 model output                               | sloop-croco-run-ibc-2-obc.py
atmfrc                   | Collect and format data from CDOCO to build the atmospheric forcing file       | sloop-croco-run-atmfrc.py
nudging                  | Build the forcing file for a nudging run (extracted over the whole OGCM file)  | sloop-croco-nudging
rivers                   | Format the river data from CDOCO format to CROCO netCDF format                 | sloop-croco-run-rivers.py
model                    | Launch CROCO on a PBS cluster                                                  | sloop-croco-run-model.py
postprod                 | Format outputs and store them                                                  | sloop-croco-run-postprod-0-format-data.py

Warning

sloop run has to be executed in the directory of the experiment created by sloop init.
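
For example (the experiment directory path is illustrative; it is the one created by sloop init):

    cd /path/to/your/experiment
    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,00:00 -e 2017-06-05,00:00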

Syntax

$ sloop run --help
usage: sloop run [-h] [--rundate RUNDATE] [-b BEGINDATE] [-e ENDDATE]
                 [--ncycle NCYCLE] [--freq FREQ] [-j JOB] [-w WORKFLOW]
                 [--exp-dir EXP_DIR] [--nodependency NODEPENDENCY]
                 [--res-xpid RES_XPID]
                 [--res-type {hindcast_free,hindcast_spnudge,nrt,None}]

optional arguments:
  -h, --help            show this help message and exit
  --rundate RUNDATE     rundate of the experiment
  -b BEGINDATE, --begindate BEGINDATE
                        begindate of the experiment
  -e ENDDATE, --enddate ENDDATE
                        enddate of the experiment
  --ncycle NCYCLE       number of cycle
  --freq FREQ           number of days between two cycles
  -j JOB, --job JOB     name of the job to be submitted
  -w WORKFLOW, --workflow WORKFLOW
                        name of the workflow to be submitted
  --exp-dir EXP_DIR     experiment directory
  --nodependency NODEPENDENCY
                        True or False
  --res-xpid RES_XPID   `path@location` or `xpid`
  --res-type {hindcast_free,hindcast_spnudge,nrt,None}
                        hindcast or near real time chaine

Run one task

There are two ways to launch the tasks listed above:

  • Launch one task with sloop, which will submit a PBS job:

    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,00:00 -e 2017-06-05,00:00
    

Note

The --novortex option is mandatory for CROCO and for use on Datarmor.

  • Launch the task directly with the Python routine in interactive mode:

    sloop-croco-run-ibc-0-collect.py exp_dir  -b 2017-06-01,00:00 -e 2017-06-05,00:00 --model glorys12v1 --grid croco_grd.nc
    

Note

For the second method, you MUST be connected to a compute node (via the qsub -I command).

Chain your tasks with a workflow

You can chain your tasks thanks to the workflow capability.

  • First, define your workflow like this one in a config file:

    data=ibc-collect,atmfrc
    preproc=init,obc
    mod=model,
    postprod=postprod,

  • The tasks in this file are those listed in the table at the top of the page: the names must be exactly the same as in the table

  • Within a given line of this file, the tasks have no dependencies on each other

  • Each line (and all the tasks it contains) depends on the previous one and will be launched after it

  • The workflow can be launched in several ways (full command examples are sketched after this list):

    • With a begin date and end date

      -b 2017-06-01,12:00 -e 2017-06-05,00:00
      
    • With a number of cycles and the cycle frequency

      --ncycle 4 --freq 7  (4 cycles of 7 days; the frequency can be omitted and defaults to 7)
      
    • With the rundate of the experiment

      --rundate  (equivalent to one cycle of the default frequency)
      
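As an illustration, here are the three variants applied to a full workflow launch (the workflow name workflow_med is the one used in the Datarmor example below):

    sloop --novortex -c croco_med.cfg run -w workflow_med -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -w workflow_med --ncycle 4 --freq 7
    sloop --novortex -c croco_med.cfg run -w workflow_med --rundate 2017-06-01,12:00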

Custom options

  • Template editing

    The user might need to change the croco.in file before running: the template file is TPL/croco.in.tpl

  • Change the queues of tasks on Datarmor

    If the user wants to change, for example, the number of nodes on which CROCO should run, the information is provided in the file conf/croco_med.ini (where med is the name of your application).

    The section names match those given in the task table above.
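
    As a sketch only (the key names below are hypothetical; keep the names already present in your conf/croco_med.ini), a model section could look like:

    [model]
    # hypothetical keys, for illustration; check your generated .ini for the real ones
    queue = mpi_1
    nb_nodes = 4
    walltime = 02:00:00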

What to do on Datarmor

  1. Be sure to be logged in on an interactive node and not on the login node

    qsub -X -I -l mem=20g -l walltime=02:00:00
    
  2. Load your sloop environment (see env)
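
    For example, with a conda-based installation (the environment name sloop is illustrative; follow the env section for the actual procedure):

    conda activate sloop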

  3. Launch your workflow

    sloop --novortex -c croco_med.cfg run -w workflow_med -b 2017-06-01,12:00 -e 2017-06-05,00:00
    
  4. Or launch the tasks one by one

    sloop --novortex -c croco_med.cfg run -j ibc-collect -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j init -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j nudging -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j obc -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j atmfrc -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j model -b 2017-06-01,12:00 -e 2017-06-05,00:00
    sloop --novortex -c croco_med.cfg run -j postprod -b 2017-06-01,12:00 -e 2017-06-05,00:00
    

Warning

Be careful with the date format. It must be YYYY-MM-DD,HH:MM.

TIPS

  1. Be careful with the start hour of your IBC forcing model. It must match the start hour of your model. For example, glorys12v1 starts at 12:00, so you must start your model at 12:00.

  2. If you want to use your own initial file generated from another script, just link or copy it into the FORCING directory with this name (adjusting the date):

    ln -s my_ic_file.nc FORCING/croco_ic_20220503T1200.nc

  3. To change the number of MPI processes:

    • edit your config file croco_config.cfg

    • Compile your model with sloop

    • In your experiment directory, edit conf/croco_conf.ini and adjust the queue name for Datarmor

  4. The available models for IBC and atmospheric forcing are described in the sloop/sloop/data directory. You will find one config file per model.

  5. Format data in postprocessing

    If the user wants to split the output files by date and/or by area, this is possible with the sloop-croco-run-postprod-0-format-data routine.

    The options are given in the croco_conf.cfg file:

    • Either write the whole domain, split into one file per date

    • Or split the output into sub-areas of interest, one file per date. If the user wants a subdomain, he must fill in a subsection in the cfg like this:

    [postprod]
    postprod_data_dir="./DATA_OUTPUTS"
    write_whole_domain=False
    [[zone1]]
        activate=True
        [[[bounds_lon]]]
            min = -4.2
            max = 2.0
        [[[bounds_lat]]]
            min = 35.4
            max = 40.6
    [[zone2]]
        activate=True
        [[[bounds_lon]]]
            min = 14.2
            max = 20.0
        [[[bounds_lat]]]
            min = 32.4
            max = 42.6

Where to see the results

  • Check the sloop.log file to see if something went wrong

  • Check the logs directory, which contains one log per task

  • Preprocessing: see the FORCING directory, which should contain all IBC/RIVERS/ATM/NUDGING files

  • Model run: for each run, a directory named with the simulation start and end dates is created

  • Model outputs after postprod: check DATA_OUTPUTS

Other features

MPI NO LAND

If the user wants to optimize the number of cores used on the computer, he should use the MPI_NOLAND cpp key:

  1. Get the program in the CROCO src directory (it is called MPI_NOLAND)

  2. Compile it

  3. Fill in the namelist with the maximum number of processes you want to use and the name of your CROCO grid file

  4. Run mpp_optimiz and get the optimized layout

  5. Pick a result whose total core count is a multiple of the number of cores per node on your HPC (28 for Datarmor)

  6. From the output file title, get the NP_XI, NP_ETA and N_NODES parameters and copy them into your config file croco_med.cfg

  7. Recompile the model with SLOOP after adding the MPI_NOLAND cpp key to your TPL/cppdefs.h.tpl file

  8. Change the required number of nodes and MPI queue in the model section of conf/croco_med.ini
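
As a sketch for step 6 (the values are purely illustrative, and the variable names must match those already used in your croco_med.cfg):

    # croco_med.cfg -- layout copied from a hypothetical mpp_optimiz output
    NP_XI=16
    NP_ETA=10
    N_NODES=4

Step 7 then amounts to adding the line #define MPI_NOLAND to TPL/cppdefs.h.tpl before recompiling with SLOOP.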

AGRIF

What to do in the case of an AGRIF configuration?

  1. Compile the model with AGRIF cpp keys

  2. Define the number of child grids via the variable nb_agrif in croco_config.cfg, as shown below

  3. Give the path of each grid in croco_config.cfg, with the grid number at the end of the name

  4. Put the paths of the needed (though not mandatory) input files as shown in the full example below

  5. Don’t forget to prepare one croco.in.tpl per grid and put it in the TPL directory if it is not in your config directory (see sloop constants)

  6. When launching sloop run with the init task, an initial file per grid will be built

[run_model]
[[croco_inputs]]
start_date="01-01-2016"
end_date="03-01-2016"
freq_rst=24
freq_avg=24
freq_output=72
nb_agrif=2
croco_forcing_dir= /home1/datawork/mcaillau/CROCO/GIBRALTAR_3GRIDS/INPUT/

croco_frc_grid= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc
croco_frc_grid1= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc.1
croco_frc_grid2= ${croco_forcing_dir}/croco_GIB100_BR4_N40_grd.nc.2
croco_frc_tides=${croco_forcing_dir}/croco_frc_GFS_282.nc
croco_frc_tides1=${croco_forcing_dir}/croco_frc_GFS_282.nc.1
croco_frc_tides2=${croco_forcing_dir}/croco_frc_GFS_282.nc.2
croco_frc_blk=${croco_forcing_dir}/croco_blk_GFS_282.nc
croco_frc_blk1=${croco_forcing_dir}/croco_blk_GFS_282.nc.1
croco_frc_blk2=${croco_forcing_dir}/croco_blk_GFS_282.nc.2
croco_frc_clm=${croco_forcing_dir}/croco_clm_cmems-global_282.nc
croco_frc_clm1=${croco_forcing_dir}/croco_clm_cmems-global_282.nc.1
croco_frc_clm2=${croco_forcing_dir}/croco_clm_cmems-global_282.nc.2
croco_frc_ini=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc
croco_frc_ini1=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc.1
croco_frc_ini2=${croco_forcing_dir}/croco_ini_cmems-medsea_282.nc.2
croco_frc_obc=${croco_forcing_dir}/croco_bry_cmems-medsea_282.nc

croco_child_grids=${croco_frc_grid1} ${croco_frc_grid2}

[[tpl_files]]
croco_tpl_dir= /home1/datawork/mcaillau/CROCO/GIBRALTAR_3GRIDS/
croco_input_file=${croco_tpl_dir}/croco.in.tpl
croco_input_file1=${croco_tpl_dir}/croco.in.1.tpl
croco_input_file2=${croco_tpl_dir}/croco.in.2.tpl