{"id":662,"date":"2019-02-20T14:50:00","date_gmt":"2019-02-20T03:50:00","guid":{"rendered":"http:\/\/cosima.org.au\/?p=662"},"modified":"2019-02-20T14:50:00","modified_gmt":"2019-02-20T03:50:00","slug":"technical-working-group-meeting-february-2019","status":"publish","type":"post","link":"https:\/\/cosima.org.au\/index.php\/2019\/02\/20\/technical-working-group-meeting-february-2019\/","title":{"rendered":"Technical Working Group Meeting, February 2019"},"content":{"rendered":"<div>\n<h2>Minutes<\/h2>\n<p>Date: 14th February, 2019<br \/>\nAttendees:<\/p>\n<ul>\n<li>Marshall Ward (MW) (Chair) NCI<\/li>\n<li>Aidan Heerdegen (AH) CLEX, Andrew Kiss (AK) \u00a0COSIMA, ANU<\/li>\n<li>Russ Fiedler (RF), Matt Chamberlain(MC) CSIRO Hobart<\/li>\n<li>Peter Dobrohotoff (PD), CSIRO Aspendale<\/li>\n<\/ul>\n<h3>TWG Meta Stuff<\/h3>\n<\/div>\n<div><\/div>\n<div>AH\u00a0will redo MOM5 governance doc for next meeting.<\/div>\n<div><\/div>\n<div>AH finding minutes a burden, MW suggested exploring other options.<\/div>\n<div><\/div>\n<div><\/div>\n<div>\n<div>MW: Will leave at the end of March. Will maybe try and attend. Given time.<\/div>\n<div><\/div>\n<div>Even in anarchy someone has to send out the email.<\/div>\n<\/div>\n<h3>CICE Meeting<\/h3>\n<div>MW: CICE meeting. AK going. Going to Hobart Ocean Workshop? MC &amp; RF not registered, might drop in.<\/div>\n<div><\/div>\n<div>MW: Also a VC: chat with Elizabeth Hunke. Who is going to attend? Just me? AK: Yes. MW: AH \u00a0come too? Ben Evans asked Rui to come. Not sure about NH. Assume interested.<\/div>\n<div><\/div>\n<div>MW: What to ask her about? Agenda? What motivated it? AK: Just that Elizabeth is around and could chat. AK: Any point me turning up a day before, more talking about Petra, but Petra thought it might be useful. Could show her how we set stuff up, some results?<\/div>\n<div><\/div>\n<div>RF: Anyone from Aspendale coming down? PD: Not sure. 
MC: Simon Marsland and Siobhan on the attendance list.<\/div>\n<div><\/div>\n<div>MW: Might ask about using latest GitHub branch (cice6). If we were to use it what should we do? Incorporate changes from OM2 codebase? Others more interested in physics?<\/div>\n<div><\/div>\n<div>AK: Might be interested in scaling work. Hoping to put some in my talk. MW: Fine with me.<\/div>\n<div><\/div>\n<div>MW: Not done as much as Tony Craig (?) on load balancing.<\/div>\n<div><\/div>\n<div>Monday 18th @3pm with Elizabeth (2 hours)<\/div>\n<div><\/div>\n<div>AK: Valuable networking opportunity.<\/div>\n<div><\/div>\n<div>MW: Would be great for NH to come.<\/div>\n<div><\/div>\n<div>MW: Maybe AK give a run down of some of the runs, start from there.<\/div>\n<div><\/div>\n<div><\/div>\n<h3>MOM5 Pull Requests<\/h3>\n<div><\/div>\n<div>MW: RF been busy<\/div>\n<div><\/div>\n<div>RF: Bug in one of those in GW scheme. Was testing temperature in the wrong direction. Also something odd happens to temp rebinning at the bottom of a level compared to density. Missing value is zero. Interpolates first non-zero temperature to below bottom level. Because density in the rock is zero, can\u2019t get a bounding. Problem with the way the diagnostic is originally done.<\/div>\n<div><\/div>\n<div>RF: Calculates transport in density one don\u2019t account for transport in lower half of bottom cell, but temperature remapping you do. MW: Haven\u2019t looked at the patch yet. Is this what Ryan Holmes was asking about? RF: This would speed up Ryan\u2019s remapping. His PR was different. Trying to remap onto different levels. He sort of fudged the code. Take code from remapping onto density levels, and made something spoof, pretends neutral density is temp or salt. Don\u2019t like what he\u2019s done. Probably works, but not totally sure, but my optimisations might break some of the things he does. AH: Your optimisations are field dependent? RF: Yes. 
Assume it is density, with assumption density increases as you get deeper. MW: He added a neutral density thing? RF: Trying to trick the code into something else.<\/div>\n<div><\/div>\n<div>RF: Can\u2019t do it on more than one variable.<\/div>\n<div><\/div>\n<div>AH: Might be worth telling Ryan this might break his code.<\/div>\n<div><\/div>\n<div>RF: I thought he had put the commit in there. AH: No, deleted the PR. He doesn\u2019t have commit rights.<\/div>\n<div><\/div>\n<div>MW: Has a hard-coded neutral density point that he has defined.<\/div>\n<div><\/div>\n<div>AH: RF still thought worthwhile? RF: Yeah, have a general thing, remap to level? A lot of code would be copy\/paste. Could be a lot of work. AH: Classes of rebinning?<\/div>\n<div><\/div>\n<div>MW: Not sure I understand exactly what RF\u2019s commit does. Not sure I can add value.<\/div>\n<div><\/div>\n<div>RF: Just a lot faster.<\/div>\n<div><\/div>\n<div>AH: How did you pick up the error? RF: Was worried about it. Hadn\u2019t checked rebinning to temperature. Wasn\u2019t sure I had accounted for the reversal in signs. In transport beta _ gm. Neutral physics utilities module. Checking for maximum and minimum temperatures on wrong levels. Hadn\u2019t tested that diagnostic. Missed temperature. When tested, it failed. Doesn\u2019t alter results of simulation, diagnostic slightly wrong. Other things were bit repro, all checksums were identical.<\/div>\n<div><\/div>\n<div>AH: So when code changes are made to diagnostics, make sure those diagnostics are tested. Make sure we paste in pics of diagnostics. Make sure to use double precision in `diag_table`.<\/div>\n<div><\/div>\n<h3>MOM5 Governance<\/h3>\n<div><\/div>\n<div>Last month agreed to tackle PRs. MW: Paul never answered. Other didn\u2019t answer. AH: AK didn\u2019t answer!<\/div>\n<div><\/div>\n<div>MW: A lot of weird hard constants in FMS. Data structures are weird.<\/div>\n<div><\/div>\n<div>MW: Other PRs when we got no answer? 
Ask for an update after a month without interaction? Have some policy? Paul looks more valuable. Other one is more FMS. Could call by phone.<\/div>\n<div><\/div>\n<div>General approach for non-responding PRs: Get in contact again. Warn it will be closed. Close and say they can reopen.<\/div>\n<div><\/div>\n<div>MW: Sometimes got good ideas with poor implementation, accepted and completely reimplemented. RF: Short one best to redo a different way, and reject the FMS stuff. Contact Paul and get it done?<\/div>\n<div><\/div>\n<div>MW: No answer after prolonged time, incorporate good ideas in a different branch.<\/div>\n<div><\/div>\n<div>AH: Why coding now? RF: Had these ideas for ages, but noticed low hanging fruit. Remapping and submeso scale. Knew we could make significant time savings. Knew about these ages ago. Similar with tidal mixing. AH: Uses MOM timings? RF: It was slow, and looked at it and wondered about looping. MC: With changes what improvements? RF: 20-30% in each module. I run short cases, so data writing might dominate a bit. Will depend on the size of the model. Time spent on each tile proportional to mixed layer. MW: Shallow levels will be a big improvement? RF: Yes. MW: Not iterating where there aren\u2019t values? RF: Yes. Two types of tests, check if entire tile can be stopped, other times if a latitude can be stopped. RF: Did test of 1200 cpu job on OFAM grid, took 30% off those routines.<\/div>\n<div><\/div>\n<div>AH: submeso is 10% of total ocean runtime.<\/div>\n<div><\/div>\n<div>RF: Starting a big run, good time to get it in.<\/div>\n<div><\/div>\n<div>MW: Sometimes said MOM was well balanced. Aggressively masks everything.<\/div>\n<div>RF: Imbalance comes through the parameterisation code. KPP, Tidal mixing. Found another weird thing in the barotropic routines. Takes a lot of time. eta and pbot diagnose. No reason to diagnose the pressure at bottom on a u cell. Except if you\u2019re writing the diagnostic. 
AH: Standard for the code to check if diagnostic used before calculating? RF: Required for restart file. Check at restart stage and write it out at that time. AH: Don\u2019t restarts have to be in field_table? RF: No.<\/div>\n<div><\/div>\n<div>AH: If they don\u2019t affect science can add to 0.1 at any time.<\/div>\n<div><\/div>\n<div>Ocean eta and pbot diagnose 10% of runtime.<\/div>\n<div><\/div>\n<div>AH: Should we prioritise any changes? RF: Just the ones I have put in. Others not so much. I\u2019ll fix up the PR. Just got it compiled and am testing.<\/div>\n<div><\/div>\n<h3>netCDF\u00a0Parallel MPI IO<\/h3>\n<div><\/div>\n<div>MW: Parallel IO stuff looking good and nearly done. Getting parallel IO without collation. Even restarts. A few masked cases where things look odd with completely missing values.<\/div>\n<div><\/div>\n<div>MW: Fill value versus zero over land? If I do mppnccombine it intelligently turns zero over land into missing values.<\/div>\n<div><\/div>\n<div>RF: When MOM sends diagnostics it sends a mask with the call.<\/div>\n<div><\/div>\n<div>MW: Should land be zero or fill value? RF: Should be fill. MC: What about restarts? RF: Used to have zero and then changed. Turned up in the density restarts.<\/div>\n<div><\/div>\n<div>AH: Performance?<\/div>\n<div><\/div>\n<div>MW: As fast as the number of disks. Can be subtle to configure. Have to balance the nodes with io_layout with ncpus on node. Negligible with 0.25 deg. Write speeds at about speed of Lustre (half speed x number of disks).<\/div>\n<div><\/div>\n<div>PD: Fan of missing_value stuff. Parallel IO work from Dale.<\/div>\n<div><\/div>\n<div>MW: Rui will know about timing variance. Worried GFDL will find it slow and reject. Rui looked into compressed parallel IO. Interesting results. Reasonably fast. It\u2019s half the speed of non-compressed. What is the serial (offline) compression time? No idea. AK: Is that speed in MB\/s? Or twice as slow for the total data file? 
MW: Twice as slow for the entire dataset.<\/div>\n<div><\/div>\n<div>MW: Currently uncompressed. Can then compress. RF: Needs to work for regional output. MW: Do at FMS level. AH: Should test for regional output. RF: Regional output done by geographic coordinates rather than index. If by index would make it easier. MW: If you can get that for a test.<\/div>\n<div><\/div>\n<div>\n<div>\n<h3>Actions<\/h3>\n<div>\n<p>New:<\/p>\n<ul>\n<li>Amend MOM5 governance doc\u00a0(AH)<\/li>\n<li>Feedback to RF PRs\u00a0(MW+AH)<\/li>\n<li>Check back on Paul&#8217;s PR (MW)<\/li>\n<\/ul>\n<p>Existing:<\/p>\n<ul>\n<li>Shared Google Doc on reproducibility strategy (AH)<\/li>\n<li>Pull request for WOMBAT changes into MOM5 repo (MC, MW)<\/li>\n<li>After FMS moved to submodule, incorporate MPI-IO changes into FMS\u00a0(MW)<\/li>\n<li>Incorporate WOMBAT into CM2.5 decadal prediction codebase and publish to GitHub (RF)<\/li>\n<li>Move FMS to submodule of MOM5 GitHub repo (MW)<\/li>\n<li>Make a proper plan for model release \u2014 discuss at COSIMA meeting. 
Ask students\/researchers what they need to get started with a model (MW and TWG)<\/li>\n<li>Blog post around issues with high core count jobs and mxm mtl (NH)<\/li>\n<li>Look into OpenDAP\/THREDDS for use with MOM on raijin (AH, NH)<\/li>\n<li>Add RF ocean bathymetry code to OceansAus repo (RF)<\/li>\n<li>Add MPI barrier before ice halo updates timer to check if slow timing issues are just ice load imbalances that appear as longer times due to synchronisation (NH).<\/li>\n<li>CICE and MATM need to output namelists for metadata crawling (AK)<\/li>\n<li>Provide 1 deg RYF ACCESS-OM-1.0 config to MC (AK)<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Minutes Date: 14th February, 2019 Attendees: Marshall Ward (MW) (Chair) NCI Aidan Heerdegen (AH) CLEX, Andrew Kiss (AK) \u00a0COSIMA, ANU Russ Fiedler (RF), Matt Chamberlain(MC) CSIRO Hobart Peter Dobrohotoff (PD), CSIRO Aspendale TWG Meta Stuff AH\u00a0will redo MOM5 governance doc for next meeting. AH finding minutes a burden, MW suggested exploring other options. 
MW: Will&hellip;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[4,3],"_links":{"self":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/662"}],"collection":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/comments?post=662"}],"version-history":[{"count":2,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/662\/revisions"}],"predecessor-version":[{"id":664,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/posts\/662\/revisions\/664"}],"wp:attachment":[{"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/media?parent=662"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/categories?post=662"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cosima.org.au\/index.php\/wp-json\/wp\/v2\/tags?post=662"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}