[Edserplo_users] Update: Edserplo usage logging

Steve Loch sgl at bodc.ac.uk
Fri Apr 11 18:07:19 BST 2008



Edserplo has gone live with usage logging: the amount of time spent using Edserplo on each series is recorded in Oracle. There are certain restrictions on the scope of the operation: series data used on the Edteva side are excluded from usage logging, as are any files that you cannot modify (e.g. those belonging to another ID). The latter means that software developers will not generally be adding to the statistics.

The software is piggybacked on the NFS Monitoring code. The CRCHOLD table now has an additional column - SERPUSE - which records the number of minutes (rounded) spent eyeballing a series; this is incremented at session termination. stopmonnfs, being a Matlab command, has no effect on Edserplo. However, a centralised withdrawal of monitoring will cover Edserplo usage. We also contrive to ensure that people working outside the group do not contribute to group usage statistics (monitoring should not be active for them).

It is important to exclude inactive time. This makes for much trickier coding. In other words, if your session spans lunch, Edserplo has to register that nothing is happening for the duration and not increment the relevant series timers. Time is logged when you are on a viewing page and the relevant series is selected for viewing and is modifiable; it does not matter whether you are in single series mode or whether you modify the series. As part of this update I have synonymed the CRCHOLD table in BODC so that people are in a position to see whether the logged time accords with their usage. If they see discrepancies - like no entry, or much more or less than they reckon - then they need to tell us, probably via Bugzilla.
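To illustrate the idea of excluding inactive time, here is a minimal sketch in Python. It is not the Edserplo code: the class name, the idle threshold and the half-up rounding rule are all assumptions made for the example. It adds the gap between two user events to the series timer only when that gap is short enough to count as active use, and reports whole minutes at the end of the session.

```python
class SeriesUsageTimer:
    """Accumulates active viewing time for one series, skipping idle gaps.

    Hypothetical illustration only; names and the threshold are assumptions.
    """

    def __init__(self, idle_timeout_s=300.0):
        # Gaps longer than this (in seconds) are treated as inactivity.
        self.idle_timeout_s = idle_timeout_s
        self.active_s = 0.0
        self._last_event_s = None

    def record_activity(self, now_s):
        """Call on each user event while the series is being viewed."""
        if self._last_event_s is not None:
            gap = now_s - self._last_event_s
            if gap <= self.idle_timeout_s:
                self.active_s += gap  # short gap: counts as usage
            # long gap (e.g. lunch): nothing is added
        self._last_event_s = now_s

    def minutes_rounded(self):
        """Whole minutes to add to the stored total at session termination."""
        return int(self.active_s / 60.0 + 0.5)  # half-up rounding (assumed)


def demo_minutes():
    timer = SeriesUsageTimer(idle_timeout_s=300.0)
    # Three minutes of activity, an hour away, then one more minute.
    for t in (0, 60, 120, 3720, 3780):
        timer.record_activity(t)
    return timer.minutes_rounded()
```

In this sketch the hour-long gap between the events at 120 s and 3720 s is simply dropped, so only the three short gaps contribute to the total.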

----------------------------------

This work has sprung from a chance question during the last SMA, when we were asked by NERC's chief executive, no less, whether we used automated screening techniques, as per the Met Office. Our input stream is far more diverse than the MO's, one would think, and what works in one context may not work well in the other. Now we will have the figures and will be in a position to defend our modus operandi. Furthermore we will see whether automated flagging - due to be introduced next, as part of our ongoing development - saves us time. If you sum over the usage

select sum(serpuse) from crchold 

one week and do the same thing a week later you will know how much time the group has collectively spent on this operation during the week and therefore what proportion of staff time is tied up in this activity. 
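The arithmetic can be sketched as follows. The function name, the staff count and the hours per week are hypothetical, chosen only to show the calculation; the two inputs are the results of running the query above a week apart.

```python
def weekly_usage(sum_last_week_min, sum_this_week_min,
                 staff_count, hours_per_week=37.0):
    """Return (minutes used this week, fraction of staff time).

    staff_count and hours_per_week are assumed values, not figures
    from the source.
    """
    used_min = sum_this_week_min - sum_last_week_min
    available_min = staff_count * hours_per_week * 60.0
    return used_min, used_min / available_min


# Example with made-up numbers: two weekly sums of SERPUSE, 10 staff, 40 h week.
used, fraction = weekly_usage(1200, 1800, staff_count=10, hours_per_week=40.0)
```

Because SERPUSE is a running total, the difference between the two sums is the time logged during the week in between.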

There are other ways of carving up the data: one can see the relative expense of different cruises, different types of data, and so on. There is further work to do here in terms of preparing reporting scripts, and we need a steer from management on what is needed. A weekly cron job (cf. above) could, for example, write usage entries to another table. Note that when GPFS comes along the new file entry will be quite separate from the old one, but the total time can still be aggregated, as needs dictate, using the gswitch software. A cron job could do this consolidation.
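As a rough sketch of what such a weekly job might look like, the following Python uses an in-memory SQLite database as a stand-in for Oracle. Only the CRCHOLD table and its SERPUSE column come from the description above; the series_id column, the serpuse_weekly table and the function names are assumptions for the example.

```python
import sqlite3


def make_demo_db():
    """Build a toy database; the schema beyond crchold.serpuse is assumed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crchold (series_id INTEGER PRIMARY KEY, serpuse INTEGER)")
    conn.execute(
        "CREATE TABLE serpuse_weekly "
        "(snap_date TEXT PRIMARY KEY, total_minutes INTEGER)")
    conn.executemany("INSERT INTO crchold VALUES (?, ?)", [(1, 40), (2, 50)])
    return conn


def take_snapshot(conn, snap_date):
    """Store the current running total; intended to be run once a week."""
    total = conn.execute(
        "SELECT COALESCE(SUM(serpuse), 0) FROM crchold").fetchone()[0]
    conn.execute(
        "INSERT INTO serpuse_weekly (snap_date, total_minutes) VALUES (?, ?)",
        (snap_date, total))
    conn.commit()
    return total


def weekly_deltas(conn):
    """Minutes logged between consecutive snapshots, oldest first."""
    rows = conn.execute(
        "SELECT snap_date, total_minutes FROM serpuse_weekly "
        "ORDER BY snap_date").fetchall()
    return [(later[0], later[1] - earlier[1])
            for earlier, later in zip(rows, rows[1:])]
```

Storing the running total with an ISO-format date keeps the job trivial; the per-week figures are derived afterwards by differencing consecutive snapshots.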

Lest people are concerned, I should point out that the user ID, which is plucked from the Edserplo session and held in the CRCHOLD table, gets overwritten by the next operation; e.g. when the file is next checked on Linux with Checknqxfgui, it will be the collective ID that overwrites the individual ID.

Thanks are due to Jonathan for the code. I counted 16 classes changed, 1 class written and 3 interfaces defined.

Steve



