Commit graph

12 commits

Author SHA1 Message Date
sparky8512
f08682af30 Force line buffering when output is stdout
For some reason, I had decided that non-stdout output needed to be line buffered, but not stdout? Python will only line buffer stdout by default if it is a TTY, and there are lots of use cases where it is not. And since I started re-opening the output FD, the python executable command line option (-u) no longer works to force it to be unbuffered.

Fix #44
2022-04-13 08:54:25 -07:00
sparky8512
b11e6f4978 Catch and ignore KeyboardInterrupt exception
Per discussion in PR #40 comments. This will prevent a stack dump from littering the output when interrupting script run via Control-C. That was useful to have for a while, but probably is not anymore.

Also, fix up a few corner cases where rc could be used without it having been set.
2022-03-02 14:48:51 -08:00
sparky8512
09b8717131 Tweaks/fixes mostly related to --poll-loops
Adjust how the --poll-loops option handles the first set of polled loops for history stats, to make that set of data run against a number of samples (and therefore total time interval) that more closely matches subsequent sets of polled loops. This especially applies to the case where the stats are resuming from a prior counter, where without this logic, the first set had an unintuitively large number of samples.

Fix timestamp reporting for history stats to better reflect the actual accumulated data, for cases where later polls of the history data had failed. This introduced the potential for the status data and the history stats data to have inconsistent timestamps, and required the timestamp collection to be moved from the individual output scripts into dish_common. Rather than deal with the complication this would create for CSV output, where there is only 1 timestamp field, just disallow the combination of options that could result in different timestamps (For CSV output only).

Fix an error case where poll loop counting was not being reset correctly.

Fix explicit --samples option to be honored when resuming from counter for history stats.
2022-02-20 13:39:07 -08:00
sparky8512
32567059b8 Resume from counter for history stats CSV output
Add an option to output to a specified file instead of standard output (which is still the default), and if set, attempt to read prior end counter for use in resuming history stats computation at that point. This behavior can be disabled using the --skip-query (-k) option.

Resuming will only work for CSV files that start with a header line that matches the last line in the file, and is currently only enabled for history stats, not bulk history, because the file read operation is not at all optimized for large files. (And because I don't think anyone is really using CSV for recording bulk history data, I only implemented that because it was easy to do so and helps with testing.)

While testing this, I realized that the implementation of the --poll-loops option has an awkward interaction with resuming from prior counter value, but that impacts all the scripts that support resuming from counter, so I will address that in a subsequent change.
2022-02-19 15:51:46 -08:00
sparky8512
8655f75bab Record stats from polled data on script shutdown
If there was history data collected via the --poll-loops option, but not yet used to compute history stats, do so on script shutdown. This should reduce the amount of data lost when the script restarts, for example to reboot the system on which it runs, since --poll-loops allows for collection of history data beyond the 15-minute buffer size the dish (currently) holds.

This will only work if the script is shut down via SIGTERM or SIGINT (for example, by interrupting with Control-C).
2022-01-20 14:07:09 -08:00
sparky8512
74b0a98ffa Write empty string instead of None in text output
Not sure how I failed to notice this in testing, but for the cases where there is no data to output, the common layers pass up a None python object, and the text output was sometimes turning that into "None", whereas for CVS output, at least, just omitting any value in the field would be more appropriate.

Oddly, the bulk data path did have logic for turning None into empty string, but not the status or history stats code paths. This makes it so all text output has the same transformation logic.
2021-05-23 18:54:27 -07:00
sparky8512
55ba411db8 Go back to using message number for alert bits
SpaceX has been using inconsistent field ordering when adding alerts, so field index cannot be used to consistently identify the specific alerts. Message number is more appropriate for that, anyway, but is not guaranteed to be a low enough number to fit into a bit field. Oh well, in the unlikely event that SpaceX switches to larger message numbers, they just won't show up in the alerts bit field (but will still show up in alert_detail).

This does make the bit ordering in alerts inconsistent with prior versions of these tools, but I've never actually seen one of these alerts report true, so hopefully this doesn't impact anyone.

The alerts are still sorted by index number in the alert_detail text output, which is a problem for CSV output, but I think ordering by message number instead would be pointlessly complex. alert_detail is not a great fit for CSV output anyway, due to its variable length, so just added a warning about that in the text script module doc.
2021-03-07 09:22:52 -08:00
sparky8512
e10c9dbb7f Mostly cosmetic changes
A few things I noticed while porting this code to the JSON script. The only real change here is fixing the bulk history output to print UTC time instead of local time.
2021-02-21 13:57:48 -08:00
sparky8512
a4bf2d1625 Support for overriding dish IP and port
Probably not terribly useful unless someone needs to tunnel through a different network to get to their dish, but it makes testing the dish unreachable case a lot easier. This was complicated a bit by the fact that a channel (and therefor the dish IP and port) is needed to get the list of alert types via reflection due to prior changes.

This exposed some issues with the error message for dish unreachable, so fixed those.
2021-02-15 18:50:22 -08:00
sparky8512
94114bfd59 Add latency and usage history stat groups
Add latency and usage stat groups to the stats computed from history samples. This includes an attempt at characterizing latency under network load, too, but I don't know how useful that's going to be, so I have marked that as experimental, in case it needs algorithmic improvements.

The new groups are enabled on the command line by use of the new mode names: ping_latency, ping_loaded_latency, and usage.

Add valid_s to the obstruction details status group. This was the only missing field from everything available in the status response (other than wedge_fraction_obstructed, which seems redundant to wedge_abs_fraction_obstructed), and I only skipped it because I don't know what it means exactly. Adding it now with my best guess at a description in order to avoid a compatibility breaking change later.

Closes #5
2021-02-01 19:09:34 -08:00
sparky8512
68c1413dbd Keep grpc channel open across RPC calls
This restores the functionality that the InfluxDB status polling script had whereby instead of using a new grpc Channel for each RPC call, it would keep one open and reuse it, retrying one time if it ever fails, which can happen if the connection is lost between calls. Now all the grpc scripts have this functionality.

Also, hedge a little bit in the descriptions for what the obstruction detail fields means, given that I'm not sure my assumptions there are correct.
2021-01-30 11:24:17 -08:00
sparky8512
45b563f91a Refactor to reduce the amount of duplicate code
Combined the history and status scripts for each data backend and moved some of the shared code into a separate module. Since the existing script names were not appropriate for the new combined versions, the main entry point scripts now have new names, which better conform with Python module name conventions: dish_grpc_text.py, dish_grpc_mqtt.py, and dish_grpc_influx.py. pylint seems happier with those names, at any rate.

Switched the argument parsing from getopt to argparse, since that better facilitates sharing the common bits. The whole command line interface is now different in that the selection of data groups to process must be made as required arg(s) rather than option flags, but for the most part, the scripts support choosing an arbitrary list of groups and will process them all.

Split the monster main() functions into a more reasonable set of functions.

Added new functions to starlink_grpc to support getting the status, which returns the data in a form similar to the history data functions. Reformatted the starlink_grpc module docstring to render better with pydoc. Also changed the way sequence data field names are reported so that the consuming scripts can name them correctly without resorting to hacky special casing based on specific field names. This would subtly break the old scripts that had been expecting the old naming, but those scripts are now gone.

The code is harder to follow now, IMO, but this should allow adding of new features and/or data backends without having to make the same change in 6 places as had been the case. To that end, dish_grpc_text now supports bulk history mode, since it was trivial to add once I had it implemented in order to support that feature for dish_grpc_influx.
2021-01-29 19:25:23 -08:00