{"id":756,"date":"2019-09-10T12:47:03","date_gmt":"2019-09-10T02:47:03","guid":{"rendered":"http:\/\/cosima.org.au\/?p=756"},"modified":"2019-09-10T12:47:03","modified_gmt":"2019-09-10T02:47:03","slug":"technical-working-group-meeting-august-2019","status":"publish","type":"post","link":"https:\/\/cosima.org.au\/index.php\/2019\/09\/10\/technical-working-group-meeting-august-2019\/","title":{"rendered":"Technical Working Group Meeting, August 2019"},"content":{"rendered":"<h2>Minutes<\/h2>\n<p>Date: 14th August, 2019<br \/>\nAttendees:<\/p>\n<ul>\n<li>Aidan Heerdegen (AH) CLEX ANU, Angus Gibson (AG) RSES ANU, Andrew Kiss (AK) \u00a0COSIMA ANU<\/li>\n<li>Russ Fiedler (RF), Matt Chamberlain (MC) CSIRO Hobart<\/li>\n<li>Rui Yang (RY) NCI<\/li>\n<li>Marshall Ward (MW) GFDL<\/li>\n<li>Nich Hannah (NH), Double Precision<\/li>\n<li>James Munroe (JM), COSIMA<\/li>\n<\/ul>\n<h2>PIO work with CICE<\/h2>\n<div><\/div>\n<div>NH: PIO code in CICE not as complete or thorough as netCDF code. Nothing to suggest it won\u2019t work. Relies on NCAR PIO library, and a CESM utility library. Dependencies which are not part of CICE. Built PIO dependency on raijin, ran into CESM dependency. Can either remove\u00a0dependency or remove code.<\/div>\n<div><\/div>\n<div>NH: Initially thought to use the MOM approach.\u00a0Tile and collate. Russ\u2019 comments encouraged to try PIO. Will be supported in future and will be supported in CICE6. Nothing working, but will soon test with 1 degree.<\/div>\n<div><\/div>\n<div>RF: Real bottleneck with high freq output. Worth a go. Attempt to put this into FMS by Hartnett. AH: Different to parallel netCDF? NH: PIO is wrapper around parallel netcdf. Written by NCAR to simplify parallel netcdf.\u00a0Another layer. On GitHub, continuing to be maintained. RY: Wrapper that does work to match computing to IO domain. 
Not so useful for MOM5 as it has io_layout already.<\/div>\n<div><\/div>\n<div>MW: Hartnett motivated by FE3 (forecast model) rather than ocean.\u00a0Not sure what project even involved in.<\/div>\n<div><\/div>\n<div>NH: Big test is handling interesting CICE layout, difference between cartesian grid and PE layout. MW: PIO will support explicit\u00a0decomposition and other approaches.<\/div>\n<div><\/div>\n<div>NH: Parallel netCDF version on raijin only links with OpenMPI 3.0. RY:\u00a0New\u00a0machine launched soon. OpenMPI 1.* will be dropped. No new\u00a0software depending on 1. MW: OpenMPI 2 is not good. Should use 3.<\/div>\n<div><\/div>\n<div>NH: Probably have to test this with OpenMPI 3.0. RY: 3.1.3. Switch everything to that.\u00a0Good test for new machine. AH: Working now? RY: My fault. Used unmatched OpenMPI library. Everything looks fine. OpenMPI 2\/3\/4 with Intel 19. All working. 1 deg &amp; 0.25 deg working. Tenth not working. MW: I was able to run tenth with 3.1.2\/3.1.3.<\/div>\n<div><\/div>\n<div>MW: One of the Intel compilers broke MOM. A compiler bug with types in types.<\/div>\n<div><\/div>\n<div>AH: Should start an issue for testing. RY: Will email MW directly. RY: Not a MOM bug.<\/div>\n<div><\/div>\n<div>MW: Tried MOM-SIS tenth? Good test. RY: From earlier\u00a0this year do have this\u00a0working.\u00a0This is\u00a0testing\u00a0for new machine, so ACCESS-OM2.<\/div>\n<div><\/div>\n<div><\/div>\n<h2>OMIP date restart protocol<\/h2>\n<div><\/div>\n<div>RF: Talked to Griffies. GFDL take ensemble approach. Run for N years using true dates. At finish reset back to start date with correct calendar. Storing new stuff in different directory. End up with 5\u00a0sequences of 55 years. All dates are correct. No issues with leap years going wrong. Think this is the best way to go.<\/div>\n<div><\/div>\n<div>AK: Came to conclusion that this was\u00a0right way to go, mostly due to leap year issue. 
Problem is whether we can get the model to do that; Maurice and Ryan had issues. Issue with CICE getting the correct date. CICE has a flag\u00a0\u201cuse_restart_dates\u201d. Suggested setting this to false, and setting the dates in access_restart.nml, but CICE is not picking up dates. Looks like libaccessom2 is not passing them on to CICE. Some confusion about exactly what they have done. Some instructions on Wiki for\u00a0restarting IAF from RYF at tenth, but doesn\u2019t work for other people. NH: I\u2019ll look at it. AK: Will send issue. NH: Didn\u2019t realise it was happening. CICE date handling is not great.<\/div>\n<div><\/div>\n<div>AH: Downside\u00a0with ensemble, difficult to get metrics across the whole time series. RF: Need extra\u00a0meta-data added in. Maybe which cycle you\u2019re in. An extra variable which gives the actual\u00a0number of days since the start of the run. Done\u00a0with post-processing.\u00a0Might be able to concatenate files using extra meta-data. AH: Always have issues with missing leap years if it spans a century. But only daily is an issue. AK: Cookbook do something. MC: Pretend it is no leap? JM: Data looking at\u00a0as time series? AH: Extra metadata, say offset day is a good idea. RF: Add buffer in netCDF file so don\u2019t need copies. mppnccombine can add padding. Usually done with nccreate, make sure the header has some space.\u00a0hbuf?<\/div>\n<div><\/div>\n<h2>Strategy for CICE updates for flexibly adding fields<\/h2>\n<div><\/div>\n<div>RF: Way CICE drivers work, variables you want are either hard coded, or muck\u00a0around with pre-processing to compile them in and out. Wondering if anyone looked at doing it on the fly. Using error codes coming back\u00a0when setting up variables, so have flexible number of variables passed in and out. Would like this to pass\u00a0total wind speed, to harmonise code. Also Hakase wants it for some BGC stuff. Phytoplankton through to the ice. 
So specify the variables, work out if they\u2019re there or not.<\/div>\n<div><\/div>\n<div>NH: Would want the exe to handle configuration with different sets of coupling fields. Sometimes\u00a0include total wind speed, sometimes not. RF: Would know complete set, if not there skip it. Currently have to be hard wired in,\u00a0or make another driver. NH: Way to do it,\u00a0start with superset in namcouple, and code would exclude certain variables. RF: Maybe if variable not in namcouple, return an error code, but ignore error. NH: Shouldn\u2019t be too hard to do. NH: OASIS does return error codes that\u00a0could be used. Either abort or return error code. If aborting could change that. AH: Restart fields? NH: Should do behind the\u00a0scenes.<\/div>\n<div><\/div>\n<h2>Paths for JRA55-do forcing files. Some changes to support v1.4<\/h2>\n<div><\/div>\n<div>AH: JRA55-do not part of Input4MIPs, part of CMIP6. Have to use the copy that is CMIP6. Encodes all the metadata in filename, consequently doesn&#8217;t currently work with YATM. Circumvented by creating symbolic links that worked with YATM. When I did this couldn&#8217;t reproduce. Not sure if this is actually an issue with the fields being different or not.<\/div>\n<div><\/div>\n<div>AH: Tried to use testing framework NH developed for this using Jenkins. The historical test that tests against known checksums doesn&#8217;t seem to actually compare them. Not sure if that is intentional. Would like to use framework, as NH has done a great job with it.<\/div>\n<div><\/div>\n<div>MW: MOM6 has\u00a0diag_mediator, supports CMOR name alongside internal model name. Porting to MOM5 is a big task, but idea is good and saved them a lot of work. Could create a thin wrapper to translate to CMOR name if that helps. AK: How integrate with YATM? MW: Don&#8217;t know. At FMS level, so only help with 1 model (MOM). AK: YATM access the JRA files. So libaccessom2 change.\u00a0AH: Looked at YATM code. Generates filename from date. 
Input4MIPs has current year and next year, so would require code changes. Might just be easier to create a file with date-&gt;filename mapping?\u00a0AH: Possible to do. Would need to add a token for year+1. Possible to do. Probably best to do it that way.<\/div>\n<div><\/div>\n<div>AK:\u00a0Also need code changes with v1.4. Solid and liquid runoff are separate. What to do with solid runoff? Griffies either use iceberg model, or melt them and add them to runoff. Take account of latent heat of fusion? Assuming solid\u00a0runoff is at zero, which could be a problem. Put in a request to download v1.4. Scripts they have should automatically download it, but not. MW: Think GFDL only has v1.3.<\/div>\n<div><\/div>\n<div>MW: Fields go to end of 2017, is 2018 downloaded? Looking in wrong place? Looking in ua8. AK: Should look in qv56. AK: qv56 up to Feb 2018. AH: If not\u00a0automatically downloading, we\u00a0should ask. What does the OMIP protocol say about end date? AK: JRA55 can find out about 2018. RF: It is specified, but would like latest for ongoing runs.<\/div>\n<div><\/div>\n<h2>Testing FMS merge<\/h2>\n<div><\/div>\n<div>AH: Putting FMS in as a sub-repo. Just needs testing. If it reproduces checksums for a month we&#8217;re sure it is ok? Is that sufficient?<\/div>\n<div><\/div>\n<div>NH: When Marshall upgraded FMS, went through every MOM test. Including 0.25. Can&#8217;t recall how strict we were. AH: Testing framework still there? NH: It is there.\u00a0Because it never gets used, might be rotted a bit. Can give\u00a0Jenkins URL of PR and it would do it. We should work together to get that working.<\/div>\n<div><\/div>\n<h2>New NCI HPC hardware announcement<\/h2>\n<div><\/div>\n<div>RY: System by end of the year. 2 phases, install new machine with\u00a0Cascade Lake nodes. Short period gadi and raijin run simultaneously. After that skylake and broadwell will be merged with new machine and SandyBridge nodes removed. 100 GPU installed. 16 skylake k-80 nodes. 
PBS pro again. Storage and network infiniband. 200GB\/s transfer speed. OS is CentOS 8. AH: Trying to figure out total core count for new machine. Do you know what core count will be? RY: Not clear on exact number. Can check with system guys if they know the exact number. If 32 cores\/node, 150+K processors. AH: Will runtimes be extended for new machine. Find 5 hours too low for high core count jobs. Reduces flexibility. RY: Queue time limits are per project. Quite flexible. Contact NCI help. AH: Have asked for time limit changes in past, but usually time limited. RY: Have been asked by other users, not sure about the policy. Good time to ask and get a better policy for the new machine.<\/div>\n<div><\/div>\n<div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Minutes Date: 14th August, 2019 Attendees: Aidan Heerdegen (AH) CLEX ANU, Angus Gibson (AG) RSES ANU, Andrew Kiss (AK) \u00a0COSIMA ANU Russ Fiedler (RF), Matt Chamberlain (MC) CSIRO Hobart Rui Yang (RY) NCI Marshall Ward (MW) GFDL Nich Hannah (NH), Double Precision James Munroe (JM), COSIMA PIO work with CICE NH: PIO code in 
CICE&hellip;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[4,3],"_links":{"self":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/756"}],"collection":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/comments?post=756"}],"version-history":[{"count":1,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/756\/revisions"}],"predecessor-version":[{"id":757,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/756\/revisions\/757"}],"wp:attachment":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/media?parent=756"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/categories?post=756"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/tags?post=756"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}