[Edserplo_users] Major update - BUDS support, tidal analysis storage, wave statistic & validation of tidal statistics

Steve Loch sgl at bodc.ac.uk
Tue Oct 23 16:26:38 BST 2007


This was intended to be the update that achieved Edserplo 2.0, the stage at which Edserplo can replace Edteva. However, the tidal statistics functionality, which I have been working on for more than a couple of months now, still leaves a lot to be desired (see below).

For people not interested in Edteva functionality there's no need to read beyond Wave Statistics. Those who use BUDS should read the BUDS section!

Small bugs
----------------- 
Compatibility of @ and %oq options has been addressed.
Tooltips - on Linux the listing omitted the first series, a bug caused by the presence of a leading slash ('/'). We have added the port number and mnemonic - which incidentally have not been available anywhere else (not in Edteva functionality) - to get round the problem.
A problem with null pointers related to mouse usage has been solved.


BUDS file locking
----------------------------
This is simple in concept but not so easy to program. In BUDS you have to supply your initials (at opening time) before you can modify the file. Anyone else attempting to access the file at the same time is blocked from doing so in Matlab. We have now extended the system to include Edserplo.

In the normal course of events, if you want to look at BUDS data you are asked for your initials through a pop-up window. If the BUDS is old-style, Edserplo allows you to continue but doesn't give you write access, and it tells you this through a pop-up. If you don't provide the initials the BUDS files are opened as read-only. If you are looking at files which you cannot alter because of permissions - they might belong to a different ID - then you won't be asked. If the BUDS item is locked you will be told who is using it via a pop-up, and the BUDS data will be viewable but not modifiable.
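As a rough illustration only (this is not the Edserplo code), the rules in the paragraph above can be written as a single decision function. Every name here (BudsAccess, Decision, decide and its parameters) is hypothetical, the pop-up wording is invented, and the order in which the checks are made is an assumption where the description does not settle it.

```java
/** Hypothetical sketch: none of these names come from Edserplo itself. */
final class BudsAccess {
    enum Mode { READ_WRITE, READ_ONLY }

    /** Outcome of opening a BUDS file: the access mode plus an optional pop-up message. */
    static final class Decision {
        final Mode mode;
        final String popup; // null when no pop-up is shown

        Decision(Mode mode, String popup) {
            this.mode = mode;
            this.popup = popup;
        }
    }

    /**
     * @param filePermitsWrite false when file permissions already forbid changes
     * @param oldStyle         true for an old-style BUDS file
     * @param initials         initials typed at the prompt, or null/blank if none were given
     * @param lockHolder       initials of whoever holds the lock, or null if unlocked
     */
    static Decision decide(boolean filePermitsWrite, boolean oldStyle,
                           String initials, String lockHolder) {
        if (!filePermitsWrite) {
            // No prompt at all: the file could not be altered anyway.
            return new Decision(Mode.READ_ONLY, null);
        }
        if (oldStyle) {
            // Viewing is allowed, but the user is told there is no write access.
            return new Decision(Mode.READ_ONLY, "Old-style BUDS: opened without write access");
        }
        if (lockHolder != null) {
            // Someone else is editing: viewable, not modifiable, and say who has it.
            return new Decision(Mode.READ_ONLY, "BUDS item is in use by " + lockHolder);
        }
        if (initials == null || initials.trim().isEmpty()) {
            // No initials supplied: silently fall back to read-only.
            return new Decision(Mode.READ_ONLY, null);
        }
        return new Decision(Mode.READ_WRITE, null);
    }
}
```

Keeping the decision in one pure function like this would make each of the cases easy to check without opening a real BUDS file.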

NB Once the channels are read - in read-only mode - you are guaranteed an unchanging snapshot of the data but you should remember that the channels are read on demand and changes to the underlying file could occur in the interim. This is unlikely to be a problem in practice but you should be aware of it. If it is a problem we can bar access to files undergoing editing (as Matlab BUDS does for precisely this reason).

If an Edserplo session ends prematurely, the write lock will not be removed (killing the Edserplo window is actually OK as it should still remove the lock - I have tested this). In this case you can use the Matlab utility resetbudslock to remove it.

A further point is that non bona fide BUDS data could previously be updated as an ordinary QXF file, which might have led to people wasting time. Now a pop-up registers the fact and the file is set read-only.

Wave Statistics
-------------------------
I have rewritten elements of this page to allow the statistics of individual series to be looked at (F9 toggle). I've also set the minimum size as 5 metres in the vertical. Also the scale doesn't change as you select different series. Different aggregations can be got by combining different series as chosen on the Series page. The support of disaggregation wasn't in the original Fortran, hence the need for extensive changes. I've also resolved B#827.

Messaging on Linux
--------------------------------
Muhammad has had to put in a lot of work to improve the messaging on Linux. Although it looked OK on Windows, on Linux it missed off the first port in the selected list when generating files from the Output page. The presentation was pretty awful as well. This is a complicated topic as it involves the difference in threading behaviour between these operating systems, and it is not fully implemented at this release.

Output page
--------------------
The Output page template has been given a browse facility.

Tides
--------  
 We can now store the results in Oracle and you can also specify which TAs to use for generating residuals when plotting. Actually this will be delayed a couple of days while I check it out.

Tidal Statistics
----------------------- 
The statistics code, which was written in early 2005, has not been properly tested, so I have set out to test it. The test dataset consists of the years of the Dataring 2000-2006 (over 300MB). This is likely to stress the code in ways which a routine monthly update is unlikely to. I compare the results with those generated by Edteva using a comparator written in Matlab. The process has been much more protracted than I could (ever) have imagined. One of the problems is that whilst Edserplo can swallow that amount of data, Edteva certainly cannot. The process of generating the Edteva side was therefore done piecemeal using a port-ordered dataset and took, in elapsed time, a matter of days.

This test has uncovered numerous problems, not all of which have been solved yet. These include tidal prediction errors for the year 2000 (a millennium bug no less!), procedural emulation problems, progress messaging inadequacies, channel aliasing problems, (seemingly) one or more memory leaks and, most problematic of all, JDBC, with crashes occurring in the Oracle-supplied code. The JDBC problems started with 'too many open cursors'. Batching the commands solved this but has led on to other problems, in particular with SQL update. The lack of Edserplo's comparison dataset stalled the whole process of validating the Java implementation of the statistics algorithms. A number of algorithmic problems have now been corrected but others remain.

We now use 'PreparedStatement' in place of the previous SQL commands and this has represented a substantial rewrite of these elements of the Statistics class. The result is closer in efficiency to the Fortran and seems to have eliminated most of the 'too many open cursors' problems.
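For readers unfamiliar with the pattern, here is a minimal sketch of a batched 'PreparedStatement' insert. It is not taken from the Statistics class: the table and column names, the ExtremeWriter and Extreme classes and the batch size of 500 are all invented for illustration. The point it is meant to show is that one statement is prepared once, reused for every row and closed when finished, rather than a new statement being opened per row.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical sketch: table, columns and class names are invented for illustration. */
final class ExtremeWriter {
    static final String INSERT_SQL =
        "INSERT INTO tide_extremes (port, yr, max_level_m) VALUES (?, ?, ?)";
    static final int BATCH_SIZE = 500; // arbitrary flush interval, an assumption

    /** One row of results to be stored. */
    static final class Extreme {
        final String port;
        final int year;
        final double maxLevelMetres;

        Extreme(String port, int year, double maxLevelMetres) {
            this.port = port;
            this.year = year;
            this.maxLevelMetres = maxLevelMetres;
        }
    }

    /** Queues every row on a single statement and returns the number of rows queued. */
    static int store(Connection conn, List<Extreme> rows) throws SQLException {
        int queued = 0;
        // try-with-resources closes the statement even if a row fails,
        // so it does not linger as an open cursor.
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Extreme e : rows) {
                ps.setString(1, e.port);
                ps.setInt(2, e.year);
                ps.setDouble(3, e.maxLevelMetres);
                ps.addBatch();
                queued++;
                if (queued % BATCH_SIZE == 0) {
                    ps.executeBatch(); // send a full batch to the database
                }
            }
            if (queued % BATCH_SIZE != 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
        return queued;
    }
}
```

Whether the caller commits after each batch or once at the end is left out here, as is any handling of the update problems mentioned above.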

The 'History' element of statistics has not been tested yet as this inevitably involves the use of SQL update. Neither Edteva nor Edserplo is engineered to deal with series which span changes in the PRIMOCH table (which designates the channel to use for statistics), so neither will produce 'true' statistics in these cases. At the time of writing most of the extremes and surges are correct, but there are noticeable problems with the Portrush, North Shields and Leith surges, and about 20% of the MSLs differ (MSL is the one that uses a spline that is different from the one used in the Fortran). Millimetres count in MSL.
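To make "millimetres count" concrete, a comparison of the two sets of MSL values could look something like the sketch below. The real comparator is written in Matlab and is not reproduced here; the class and method names are hypothetical, and treating two values as equal only when they round to the same whole millimetre is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a millimetre-level MSL comparison, not the Matlab comparator. */
final class MslComparator {

    /** Converts a level in metres to whole millimetres. */
    static long toMillimetres(double metres) {
        return Math.round(metres * 1000.0);
    }

    /**
     * Returns, in alphabetical order, the ports whose MSL in the first set is
     * missing from the second set or differs from it by a millimetre or more
     * after rounding.
     */
    static List<String> differingPorts(Map<String, Double> edteva,
                                       Map<String, Double> edserplo) {
        List<String> differing = new ArrayList<>();
        for (Map.Entry<String, Double> e : new TreeMap<>(edteva).entrySet()) {
            Double other = edserplo.get(e.getKey());
            if (other == null
                    || toMillimetres(e.getValue()) != toMillimetres(other)) {
                differing.add(e.getKey());
            }
        }
        return differing;
    }
}
```

Dividing the size of the returned list by the number of ports compared gives the proportion that differ.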

I've been hampered in doing final checks on the production code by the recent NFS mounting problems.

----------------------------

The positive note is that Edserplo should be able to digest 100+ site years and produce the desired statistics in a minute or two but we're still at release 1.5.

Steve
   



