Technical Working Group Meeting, October 2018

Minutes

Date: 16th October 2018
Attendees:

  • Marshall Ward (MW) (Chair), Rui Yang (RY), NCI
  • Aidan Heerdegen (AH) and Andrew Kiss (AK), CLEX ANU
  • Russ Fiedler (RF), Matt Chamberlain (MC), CSIRO Hobart
  • Nic Hannah (NH), Double Precision
  • Peter Dobrohotoff (PD), CSIRO Aspendale

TWG Organisation

MW: Taken position at GFDL. Starting in 3-6 months. Need new TWG Chairman. Need to organise meetings. Not much communication with other working groups. AH: Anyone who is interested should think about it; we can decide at a subsequent meeting.

MW: As I am leaving, no one left at NCI following ocean model development. NCI will appoint a new person, but RY is attending for some knowledge transfer.

OM2/CM2 MOM5 Harmonisation

AH: there is a cm2_release_candidate branch on MOM5 repository. Contains all substantive code changes from Hailin’s fork on Peter’s repo.

AH: Need a rose suite to support MOM5 compile script. Might get Scott Wales to help make the suite. MW: I might be able to help. AH: Used original MOM_compile script? MW: Not sure. AH: Currently pulls in a build script from a totally different svn branch.

PD: Yes MOM5 in git repo. One of the directories (exp) has the same build script as you’re using AH. PD: I cloned your repository, copied over compile script and environment file, pressed go and it compiled. Problem at link time. Don’t have an opinion about build script being in repo. Rose suites do “blossom”. Ok to compile from command line at the moment. Can consult with AH offline.

MW: Are AH and RF happy with the code changes themselves? AH: Last set of changes are that crucial. Steve Griffies would have liked more atomic changes. Need to run, see if it is different. If it is, figure out how different and if it is important.

AH: Next harmonisation target is ACCESS-ESM-1.5, adding WOMBAT BGC. This will go into the main MOM5 repo. In theory will also be in the CM2 version of MOM5. It won’t be turned on, but we should check that it doesn’t make a difference to CM2 results.

AH: Seems straightforward, as MC had already put WOMBAT BGC into MOM5, but there have been some changes since then. MC: Pull 3 years of MOM5 changes into my own branch. RF: ocean_sbc is what hooks into WOMBAT. The components we’ve added in, like 10m winds and sea ice coverage, are what WOMBAT wants. What we’ve got there now is compatible, except WOMBAT assumes 10m winds aren’t masked, and uses sea ice coverage to do masking. MC: Yep. RF: The way we do it, it is already masked. So might need a change to WOMBAT, or a flag. MC: Does multiple masking matter? RF: If it’s multiplying by ice fraction, don’t want to multiply a second time. MC: Around the fringes? RF: No difference to open ocean or full ice coverage. RF: Pretty close to correct. Changed the interfaces. A lot of things in ocean_model can be kept in ocean_sbc. I can go through it with Aidan.
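
The double-masking concern raised above can be illustrated with a small sketch. It assumes, purely for illustration, that masking means weighting the 10m wind by the open-water fraction (1 - ice fraction). The function names and sample values are hypothetical and are not taken from MOM5 or WOMBAT.

```python
def mask_by_open_water(wind, ice_fraction):
    """Weight a 10m wind value by the open-water fraction (assumed form of the mask)."""
    return wind * (1.0 - ice_fraction)


def masked_once_and_twice(wind, ice_fraction):
    """Return the wind after one masking pass and after an accidental second pass."""
    once = mask_by_open_water(wind, ice_fraction)
    twice = mask_by_open_water(once, ice_fraction)
    return once, twice


# Hypothetical 10 m/s wind over open ocean, partial ice cover and full ice cover.
for ice_fraction in (0.0, 0.5, 1.0):
    once, twice = masked_once_and_twice(10.0, ice_fraction)
    print(ice_fraction, once, twice)
```

Under this assumption the two results agree for ice fractions of 0 and 1 and differ only for partial cover, which is consistent with RF's point that a second masking would matter only around the ice fringes.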

MW: Only time pressure is when adopted in CLEX? AH: No. Some people would like this to be in the ACCESS-ESM-1.5 CMIP runs. I don’t know what the political situation is like. MC: Tilo is anxious to get control runs going ASAP. If there is a change to a stable version he will run with it. Catia and Fabio are anxious to get extra diagnostics in for their experiments, but not central to ESM effort. Tilo will start as soon as he has his carbon cycle stuff fixed. MW: Pressure point on RF? RF: I’ll look at it. Just need to throw in a couple of the hooks into WOMBAT, but think they’re there. Should be straightforward.

AH: Made a PR, link on TWG slack channel. Cherry picked out commits that seemed necessary. If you make code changes, please pull down the latest code before submitting changes. Can delete fork if necessary and start again. RF: Yes, done that a few times.

MW: Harmonisation on track? AH: Holger is working on payu version for ACCESS-ESM-1.5. MW: CLEX specific? PD: CLEX is picking up ESM as climate model. We are all working in the same direction. Lots of non-CMIP science coming out of these models. Shouldn’t dismiss payu as something we don’t care about.

COSIMA Models

NH: Running minimal 0.1 degree config. Around 2K cores. Maybe not actual minimum, but decent compromise. Good efficiency. With dt=600s, around 5KSU/month. Models well balanced. Ice model not slowing things down and only using 350 cores. MW: sectrobin? NH: yes but probably doesn’t matter.

NH: Thanks for heads up for NCAR tripolar efficiency fix for CICE. RF: Surprised it makes a difference at low core counts. NH: Not sure it does, just wanted everyone to know it is now in the code. NCAR say they have checked they get identical results, confirmed no difference. One month in 2.5 hours with dt=600s. Can’t squeeze in 2 months/run. AK: What diagnostics? NH: Just monthly. Same as AK’s, changed daily to monthly, just in ice. AK: Currently have 3D daily prognostic fields. NH: Might slow things down a bit. Because this config is small it is nicely balanced. Fitting so much work into each ICE PE, there is more chance they are balanced. Using 8 blocks per core. AK: ndtd=3? NH: no, try with ndtd=2 to begin with, and seems to be going ok.
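
As a rough cross-check of the figures quoted for the minimal configuration, the cost per model month can be estimated from core count and walltime. This is only a sketch: it assumes a charge rate of 1 SU per core-hour, takes "around 2K cores" as exactly 2000, and uses a hypothetical helper name.

```python
def service_units(cores, walltime_hours, charge_rate=1.0):
    """Estimate job cost in SU, assuming a flat charge rate per core-hour."""
    return cores * walltime_hours * charge_rate


# About 2000 cores running one model month in 2.5 hours (figures from the minutes).
cost_per_month = service_units(2000, 2.5)
print(cost_per_month)  # 5000.0
```

Under these assumptions the estimate agrees with the quoted figure of around 5KSU/month.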

NH: Currently crashing off tip of Severny Island. High velocities at tip. Crashing after 14 submits (months). Surprised it took so long to crash. Done some work smoothing bathymetry. Doesn’t seem to have helped, now trying Rayleigh damping. RF: What month? NH: October. RF: Is there ice there? NH: Don’t think so. RF: Had a look at other months. A jet of warm salty water coming up from the south along the coast. Those seamounts are there. NH: Almost completely levelled them. Still a dip. Cleared seamounts before and in the dip. Velocities are very high there. Highest velocities that far north by a long way. Wondering if it is an extreme situation. AH: I tried the truncate_velocity option north of a certain latitude. Didn’t work, had a temp or salt blow up, so don’t bother. MW: Usually a no-no. AH: Had the same issue with MOM-SIS-01 with CORE-II NYF, same crash, same time every year. RF: Interesting that same problem with a different bathymetry. AH: Severny Island pokes a long way north, any flow coming that direction gets funnelled along the coast. Could stop crashes with Rayleigh damping at depth in small area NE of seamounts. Steve not happy with it as a solution, but one small spot places ocean timestep limit on the whole global model. I think we should use Rayleigh damping if it stops this. RF, NH: Agreed.

AK: Same crashes in same location when I’ve attempted 600s timestep, so wound it back. Put Rayleigh drag in Kara Strait. NH: Yes I have those. AK: Can give some idea of scale of drag required. Also that drag might be pushing more water around Severny Island. AH: You already have Rayleigh drag in your model? NH: Yes, all of AK’s additions. Understand some of the frustration with this model. Small config, easier to run and test. Want to push timestep as far as possible. AK: Sounds like a good strategy. Though concerned by oscillations in vorticity field in shallow area south of Bering Strait. Some sort of numerical glitch. Goes away with 450s timestep. Seem to get stuff like this when timestep is pushed up. AH: Any idea where it is coming from? AK: Not sure which terms/equations involved. Dispersion gets worse as CFL gets higher. Not sure. NH: Explore some of these things, as MOM-SIS-01 was running at 600s right? AH: Yes with Rayleigh damping. AK: Fanghua was using MOM-SIS-01 with this bathymetry, couldn’t go higher than 450s. Added damping and did a lot of work to track down issues. AH: Bathymetry has changed since then? AK: Yes, problem with ocean that shouldn’t have been. NH: Didn’t realise Fanghua used same bathymetry. AK: Similar. Would have had one full of potholes.

RF: Anyone used new bathymetry I made? Couple of cells filled also, but mostly partial cells. In bathymetry directory, added about a month ago. NH: Will try it.

NH: Want to get recent CICE changes into 6K PE model using one of AK’s restarts. Crashing with ice remap transport errors. MW: Include tripole changes? NH: yes. Also sectrobin code change (also doesn’t change answers). Experimenting with sectrobin and blocks to get a more efficient setup. MW: That is what I am running and trying to understand. If I do a git pull from yours will I expect crashes? NH: Crashes not due to code, just model instability. Tested that code doesn’t change answers. MW: Will try that.

AH: Which is the correct bathymetry file? Some discussion; it turns out the new file is:

/g/data3/hh5/tmp/cosima/bathymetry/topog_05_09_2018_1m_partial.nc

AK: To overcome ice crashes like that, use ndtd=3 to give ice more time. NH: You haven’t had ice remap crash since using this? AK: Correct. CFL issue, ice moving more than one grid cell per timestep. NH: Ice is going unrealistically fast, 35 m/s. MW: How does it do this? AH: Instability? NH: Yes. AK: Is sea surface slope high? RF: Diagnoses slope, derives slope assuming geostrophic properties. Not passing slope from ocean model. If you do, get checkerboard unless smoothed.
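
The CFL argument above can be sketched numerically. The example assumes the ice dynamics are subcycled ndtd times within a 600s timestep, so each dynamics subcycle lasts dt / ndtd. The 4 km grid spacing, the 1 m/s comparison speed and the function name are hypothetical values chosen only for illustration.

```python
def ice_courant_number(ice_speed, dt, ndtd, dx):
    """Courant number for ice advection over one dynamics subcycle of length dt / ndtd."""
    dt_dynamics = dt / ndtd
    return ice_speed * dt_dynamics / dx


DT = 600.0   # timestep in seconds, as quoted in the minutes
DX = 4000.0  # hypothetical grid spacing in metres (assumption)

# Compare a moderate ice speed (hypothetical) with the 35 m/s reported by NH.
for ice_speed in (1.0, 35.0):
    for ndtd in (1, 2, 3):
        c = ice_courant_number(ice_speed, DT, ndtd, DX)
        status = "ok" if c <= 1.0 else "exceeds one cell per subcycle"
        print(ice_speed, ndtd, round(c, 3), status)
```

With these assumed numbers, a larger ndtd lowers the Courant number proportionally, but ice moving at 35 m/s would still cross more than one cell per subcycle even with ndtd=3, which fits the view that such speeds indicate an instability rather than a timestep that is merely slightly too long.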

AH: Is ratio of PEs in minimal model same as for large model? NH: In 1/10 the ratio is about 1:4 ice:ocean. In the minimal model it is 1:5.

RF: Bugfixes found in CICE6 should be back ported. Were using the wrong mask in the EVP solver for updating the halos. Stops bit reproducibility. NH: I saw that bug list. Know where they are. Will bring them across. RF: Found different types in u and t masks (one logical, one 0/1).

MW: Latest profiling shows EVP taking most of the time, and in particular EVP halos. Wonder if these have any effect. RF: Purely a masking issue. Could be the cause of the strange stuff due to tripolar join. Only 5 lines of code. MW: Huge patch? NH: No. Not messy. This is not a big change of code.

AH: With CM2 with old versions of CICE5 with UM hooks etc. How serious an issue before back port to CM2 version? MW: Not time to go into that too far.

NH: Since CICE6 is just incremental improvement of CICE5, maybe we should use that in future?

Miscellaneous

MW: Ben arranging meeting with Team Leaders in this space. Set meeting on Nov 7. NH to be contacted? NH: I think I am going. MW: Discussing infrastructure needs for next 10 years. Would be good to have a consistent view on what is required. Meeting at a high level. MW: RY and I are going.

AH: Doing another payu training for CLEX, covering mppnccombine-fast, file tracking and ACCESS-OM2 configs, how to get them and what to do. Anyone at CSIRO interested?

MW: Will go over more profiling info on slack.

MW: Will merge latest payu versions. Can run without patching python version. AH: Yes can also run in a conda environment, which maybe ticks the portability box for NH.

AH: people on payu/dev should move to payu/0.10.

PD: COSIMA meeting where harmonised code delivered. Amazing! Well done.

Actions

New:

  • Check ACCESS-ESM-1.5 PR / WOMBAT integration (RF, AH)
  • Backport CICE6 bugs into CICE5 (NH)
  • Forward training email to PD (AH)

Existing:

  • Create even 5 blocks per PE map for CICE (RF)
  • Update model name list and other configurations on OceansAus repo (AK)
  • Shared google doc on reproducibility strategy (AH)
  • Pull request for WOMBAT changes into MOM5 repo (AH, RF)
  • Compare OASIS/CICE coupling code in ACCESS-CM2 and ACCESS-OM2 (RF)
  • After FMS moved to submodule, incorporate MPI-IO changes into FMS (MW)
  • Incorporate WOMBAT into CM2.5 decadal prediction codebase and publish to Github (RF)
  • Move FMS to submodule of MOM5 github repo (MW)
  • Make a proper plan for model release — discuss at COSIMA meeting. Ask students/researchers what they need to get started with a model (MW and TWG)
  • Blog post around issues with high core count jobs and mxm mtl (NH)
  • Look into OpenDAP/THREDDS for use with MOM on raijin (AH, NH)
  • Add RF ocean bathymetry code to OceansAus repo (RF)
  • Add MPI barrier before ice halo updates timer to check if slow timing issues are just ice load imbalances that appear as longer times due to synchronisation (NH).
  • Redo SSS restoring with patch smoothing (AH)
  • Get Ben/Andy to endorse provision of MAS to CoE (no-one assigned)
  • CICE and MATM need to output namelists for metadata crawling (AK)
  • Provide 1 deg RYF ACCESS-OM-1.0 config to MC (AK)
  • Update ACCESS-OM2 model configs (AK)