Commit graph

207 commits

Author SHA1 Message Date
sparky8512
a0366b4526
Merge pull request #30 from sparky8512/obstruction-map
Obstruction map support

Fixes #27
2021-09-08 13:49:36 -07:00
sparky8512
1a9af6ad5d Interval loop support for obstruction maps
Tracked on issue #27
2021-09-08 13:45:36 -07:00
sparky8512
e1070965f2 Initial cut of obstruction map support
Add a new command line script, dish_obstruction_map.py, that writes a PNG image based on the obstruction map data queried from the dish.

Supports color or greyscale output and either with or without alpha channel.

Does not yet support running in an interval loop, mostly because that will require templatizing the output filename in order to be useful.

Tracked on issue #27
2021-09-07 17:29:56 -07:00
sparky8512
af940a9727 Improvements to how the -o option works
Change the loop polling function (-o) to aggregate the history data each polling loop instead of just keeping the last polled history so it can be logged when reboot is detected. This allows for computing statistics across a longer period than the size of the dish's history buffer, which has been reduced to 15 minutes recently.

This change also makes it so data is not logged right away when dish reboot is detected, so the logging always happens at the specified interval whether there was a reboot or not.

Finally, change the poll loop counting so data is not emitted on the first loop when polling is configured. That made sense to do when the history buffer was large enough to have the entire period's worth of data, but now it just results in a short period in the log output every time the script is restarted.

Fixes #29
2021-09-07 12:02:14 -07:00
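The aggregation described in af940a9727 can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the class name, the `counter` argument (total samples the dish has recorded so far), and the list-based `history` are all assumptions made for the example. It keeps samples across a detected reboot and emits only after a full interval of polls, so nothing is emitted on the first loop.

```python
class HistoryAggregator:
    """Accumulate new history samples on every poll; emit once per interval.

    Hypothetical sketch: names and data shapes are illustrative only.
    """

    def __init__(self, polls_per_interval):
        self.polls_per_interval = polls_per_interval
        self.poll_count = 0
        self.last_counter = None
        self.samples = []

    def poll(self, counter, history):
        """Record one poll; return the aggregated samples or None.

        counter: running total of samples recorded by the dish.
        history: the most recent samples, oldest first.
        """
        if self.last_counter is None or counter < self.last_counter:
            # First poll, or the counter went backwards (reboot): every
            # sample recorded since the counter started is new.
            new_count = counter
        else:
            new_count = counter - self.last_counter
        # Never take more than the history buffer actually holds.
        new_count = min(new_count, len(history))
        self.samples.extend(history[len(history) - new_count:])
        self.last_counter = counter

        # Emit only at the configured interval, reboot or not.
        self.poll_count += 1
        if self.poll_count < self.polls_per_interval:
            return None
        emitted, self.samples, self.poll_count = self.samples, [], 0
        return emitted
```

With `polls_per_interval=3`, the first two calls to `poll()` return None and the third returns everything gathered so far, including samples collected before a reboot.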
sparky8512
41caa76962 Add a few more fields to the status group
Add dish direction and "prolonged" obstruction info to the status mode group.

These were added to the grpc service at some point over the last several months.

Only lightly tested, given that my dish no longer reports significant periods of obstruction.

This is related to discussion in issue #27, although it doesn't address that issue in the slightest.
2021-09-05 17:41:13 -07:00
sparky8512
0f540a4b96
Add note about reduction of history buffer data size 2021-08-19 13:38:36 -07:00
sparky8512
74b0a98ffa Write empty string instead of None in text output
Not sure how I failed to notice this in testing, but for the cases where there is no data to output, the common layers pass up a None Python object, and the text output was sometimes turning that into "None", whereas for CSV output, at least, just omitting any value in the field would be more appropriate.

Oddly, the bulk data path did have logic for turning None into empty string, but not the status or history stats code paths. This makes it so all text output has the same transformation logic.
2021-05-23 18:54:27 -07:00
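A minimal sketch of the single transformation described in 74b0a98ffa, assuming hypothetical helper names (`format_field`, `csv_line`) that are not taken from the scripts:

```python
def format_field(value):
    """Render one value for text output; None becomes an empty field."""
    if value is None:
        return ""
    return str(value)


def csv_line(values):
    """Join values into one CSV-style line using the shared formatter."""
    return ",".join(format_field(value) for value in values)
```

Routing every code path (status, history stats, bulk data) through one formatter is what keeps the output consistent.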
sparky8512
d603272d90 Add option to emit booleans as numeric values
New command line option, -N or --numeric, that will cause all boolean values, including those in sequences/arrays, to be written as 1 or 0 instead of True or False.

Per request in issue #26.

WARNING: Use or non-use of this option with the database output scripts will change the schema of the data. sqlite doesn't care about that, because it stores booleans as integers, anyway, but InfluxDB will trip an error if you try to record data points with this option to a database that has data points recorded without it, or vice versa.
2021-05-23 18:45:18 -07:00
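The effect of the -N / --numeric option from d603272d90 could look something like the sketch below. The function name is hypothetical, and returning a list for any input sequence is a simplification chosen for the example.

```python
def numeric_bools(value):
    """Convert booleans to 1/0, recursing into sequences.

    Illustrative only; non-boolean values pass through unchanged.
    """
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (list, tuple)):
        return [numeric_bools(item) for item in value]
    return value
```

The `bool` check has to come first because `bool` is a subclass of `int` in Python.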
sparky8512
23b54c1344
Merge pull request #24 from sparky8512/docker-reflection
Switch docker reflection from grpcurl to yagrc
2021-03-27 08:21:03 -07:00
sparky8512
77e4046ba9 Switch docker reflection from grpcurl to yagrc
Remove grpcurl and grpcio-tools from container configuration and add yagrc, so that the direct reflection support in the Python scripts can be used. Also, pin all Python packages, including package dependencies, to specific version numbers, since that was already the case by happenstance due to the way Docker caches its build images and an undesirable version of the protobuf package was being cached.

Addresses #23, which was directly about this change.
Expected to also address #22, as a result of pinning the protobuf package version.
Should also prevent a recurrence of #18, since yagrc will automatically get any new dependent protocol files via reflection.
2021-03-23 18:32:00 -07:00
sparky8512
07389cb0d9 Remove dependence on Python 3.8 or later
statistics.quantiles was not present in Python 3.7 or earlier, which is a problem on Windows if you want to run a binary optimized version of the protobuf package, since those are not currently being posted for Python 3.8 or later.

This change switches to use the weighted median function just with equal weights. It's a bit of overkill, but it also cuts out the mess that was working around deficiencies of the statistics.quantiles implementation.
2021-03-16 13:19:06 -07:00
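For readers unfamiliar with the approach in 07389cb0d9, here is one possible weighted median that runs on Python 3.7 and reduces to the ordinary median when all weights are equal. It is a sketch under stated assumptions (positive total weight, ties resolved by averaging the two neighbouring values), not the function from the repository.

```python
def weighted_median(values, weights):
    """Return the weighted median of values.

    Assumes non-negative weights with a positive sum. When the running
    weight lands exactly on half the total, the two neighbouring values
    are averaged, which matches the plain median for equal weights.
    """
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    if not pairs or total <= 0:
        raise ValueError("need at least one sample with positive weight")
    half = total / 2.0
    running = 0.0
    for index, (value, weight) in enumerate(pairs):
        running += weight
        if running > half:
            return value
        if running == half:
            return (value + pairs[index + 1][0]) / 2.0
    raise ValueError("unreachable for valid input")


def median_equal_weights(values):
    """Median computed by giving every sample the same weight."""
    return weighted_median(values, [1.0] * len(values))
```

Unlike `statistics.quantiles`, this needs no special handling for a single data sample.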
sparky8512
55ba411db8 Go back to using message number for alert bits
SpaceX has been using inconsistent field ordering when adding alerts, so field index cannot be used to consistently identify the specific alerts. Message number is more appropriate for that, anyway, but is not guaranteed to be a low enough number to fit into a bit field. Oh well, in the unlikely event that SpaceX switches to larger message numbers, they just won't show up in the alerts bit field (but will still show up in alert_detail).

This does make the bit ordering in alerts inconsistent with prior versions of these tools, but I've never actually seen one of these alerts report true, so hopefully this doesn't impact anyone.

The alerts are still sorted by index number in the alert_detail text output, which is a problem for CSV output, but I think ordering by message number instead would be pointlessly complex. alert_detail is not a great fit for CSV output anyway, due to its variable length, so just added a warning about that in the text script module doc.
2021-03-07 09:22:52 -08:00
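The bit field behavior described in 55ba411db8 can be illustrated as below. The mapping of number N to bit N-1 and the 32-bit width are assumptions for this sketch, as is the dict input; the commit only says that numbers too large for the bit field are left out of it.

```python
def alert_bits(alerts, width=32):
    """Pack active alerts into an integer bit field.

    alerts: dict mapping an alert's message number to True/False.
    Numbers outside 1..width are skipped here (they would still be
    reported individually elsewhere, e.g. in a detail listing).
    """
    bits = 0
    for number, active in alerts.items():
        if active and 1 <= number <= width:
            bits |= 1 << (number - 1)
    return bits
```

Because the bit position depends only on the number, reordering the fields in the protocol definition leaves the result unchanged.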
sparky8512
203efaf84d Note detail of where data is recorded per backend
It came up in the course of discussion in issue #20 that I hadn't actually documented this anywhere.
2021-02-27 16:39:28 -08:00
sparky8512
be776cde1c Resume from last counter for history stats
Currently only implemented for sqlite, since with InfluxDB, there is the complication that the InfluxDB server may not be available to query at script start time. Also, it only applies when polling all samples, which is not the default, and even then can be disabled with either --skip-query or --no-counter options.

Remove the crontab instructions from the README, since the periodic loop functionality is probably now a better approach for periodic recording of stats data.
2021-02-27 15:57:35 -08:00
sparky8512
206bbbf919
Switch two more references to new script name 2021-02-21 14:11:19 -08:00
sparky8512
c14ae847ae Fix history counter query in sqlite script
Got broken when I made a separate samples option for bulk history vs history stats. Also, it looks like I never actually implemented support for the --skip-query option.
2021-02-21 14:06:01 -08:00
sparky8512
e10c9dbb7f Mostly cosmetic changes
A few things I noticed while porting this code to the JSON script. The only real change here is fixing the bulk history output to print UTC time instead of local time.
2021-02-21 13:57:48 -08:00
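As a small illustration of the UTC fix mentioned in e10c9dbb7f, a timestamp can be formatted independently of the local time zone like this. The function name and the output format are assumptions for the example.

```python
from datetime import datetime, timezone


def format_utc(timestamp):
    """Format a Unix timestamp as a UTC date and time string."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")
```

Calling `datetime.fromtimestamp(timestamp)` without `tz` would give local time instead.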
sparky8512
258a33d62d Port grpc history features to JSON parser script
This brings most of the history-related functionality implemented in the grpc scripts to the JSON version, but only for text output. It also renames parseJsonHistory.py to dish_json_text.py, which removes the last remaining complaint from pylint about module name not conforming to style conventions.

A lot of this is just duplicated code from dish_common and dish_grpc_text, just simplified a little where some of the flexibility wasn't needed.

This removes compatibility with Python 2.7, because I didn't feel like reimplementing statistics.pstdev and didn't think such compatibility was particularly important.
2021-02-21 13:49:45 -08:00
sparky8512
38987054b9 Add option to poll history more frequently
This further complicates the code, for functionality that probably only I care about, but when computing stats for relatively long time intervals, it really hurts when the dish reboots and up to an entire time period's worth of data is lost at exactly the point where it may have been having interesting behavior.
2021-02-19 10:56:20 -08:00
sparky8512
18829bd5cb Allow force option when schema version matches
Since the alert types are determined dynamically from the protocol definition, the status schema may need to be updated even if nothing changed in the scripts, when the dish software adds a new alert type (which just happened, say hello to the "mast_not_near_vertical" alert). This allows the manual override for that case, not just schema version downgrade.
2021-02-15 19:23:37 -08:00
sparky8512
a4bf2d1625 Support for overriding dish IP and port
Probably not terribly useful unless someone needs to tunnel through a different network to get to their dish, but it makes testing the dish unreachable case a lot easier. This was complicated a bit by the fact that a channel (and therefore the dish IP and port) is needed to get the list of alert types via reflection due to prior changes.

This exposed some issues with the error message for dish unreachable, so fixed those.
2021-02-15 18:50:22 -08:00
sparky8512
1659133168 Switch reflect usage to yagrc's new lazy importer
This makes the normal imports a bit more readable.

Lazy import requires yagrc v1.1.0, so bumped requirements.txt entry for that.
2021-02-14 17:22:52 -08:00
sparky8512
30e4b27516 Fix handling of 0 history samples range
This can happen when polling the history buffer very frequently (<= 1 second).
2021-02-14 13:47:24 -08:00
sparky8512
2ac5944824 Fix statistics error
statistics.quantiles doesn't handle the case of having only one data sample.
2021-02-13 17:21:11 -08:00
sparky8512
80e752a510 Counter state tracking for non-bulk history data
with option to disable to get prior behavior of fixed number of samples per loop iteration.
2021-02-13 10:17:42 -08:00
sparky8512
67b0045ac8 Add the non-abs wedge_fraction_obstructed status
I had only added the one that the Starlink app uses to show obstructions, because it didn't seem like the other one was all that useful, but people seem to be interested in studying the difference between the 2, so might as well have it. This is in the obstruction_detail group, along with the other one. I'm kinda regretting naming the first one as I did, though, because it's now a little confusing between my naming and the naming in the grpc message.

Since this is a new field, also had to implement schema updates for the sqlite script.
2021-02-12 19:53:28 -08:00
sparky8512
ec61333710 Two more places that need the grpc imports 2021-02-12 13:48:55 -08:00
sparky8512
491227ddb4 Remove need to generate grpc modules via protoc
If the spacex grpc modules are not available in the import path, will now fall back to using reflection to get them dynamically. I'm not real happy with the mess this made of the import lines, though (and neither is pylint...), so I may hack on that a little further when I get the time.

Add a requirements.txt file to enable installation of all prerequisites so users don't have to follow the individual instructions for each dependency package. At some point, I'll really need to add proper Python packaging so the whole thing can just be installed via pip.

Rearrange the README a bit, since some of the sections have increased or decreased in relevance over time.
2021-02-12 13:11:54 -08:00
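The general shape of the fallback in 491227ddb4 is "try the normal import, otherwise obtain the module some other way". The sketch below shows only that pattern; the `fallback` callable is a hypothetical stand-in for the reflection step, which is not reproduced here.

```python
import importlib


def import_with_fallback(name, fallback):
    """Import module `name`, or call fallback(name) if it is missing.

    `fallback` is a placeholder for whatever produces the module
    dynamically (for instance, via server reflection).
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return fallback(name)
```

Keeping the fallback in one helper is one way to avoid repeating try/except blocks around each import line.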
sparky8512
381cfbee00 Hedge a little more on a technical detail
Specifically, on what wedges_fraction_obstructed means exactly, because I suspect the max values may vary based on weighting of the raw numbers.
2021-02-11 18:47:26 -08:00
sparky8512
1ed0adc55a
Add new proto file to protoc instructions 2021-02-09 21:42:14 -08:00
sparky8512
062f705ada
Merge pull request #19 from neurocis/main
Fix issue #18
2021-02-09 21:39:56 -08:00
Leigh Phillips
29ab50278f
Add transceiver.proto
Upstream issue #18
2021-02-09 21:26:46 -08:00
Leigh Phillips
8d80c6b1e1
Merge pull request #8 from sparky8512/main
Bring current.
2021-02-09 21:22:29 -08:00
sparky8512
a4adf3c383 Clarify a technical detail
... that I'm sure nobody cares about.
2021-02-04 20:05:04 -08:00
sparky8512
db83a7f042 Switch generated protobuf module imports around
The way I had them, it was hiding the fact that there was no explicit import for the spacex.api.device.dish_pb2 module.
2021-02-04 20:02:45 -08:00
sparky8512
549a46ae56 Add new dish_grpc script for sqlite output
I'm sure this isn't a particularly optimal implementation, but it's functional.

This required exporting knowledge about the types that will be returned per field from starlink_grpc and moving things around a little in dish_common.
2021-02-03 17:23:01 -08:00
sparky8512
188733e4fb Fix "usage" mode for InfluxDB
I forgot this script knows about the category labels.
2021-02-02 18:03:47 -08:00
sparky8512
ed2ef50581 Catch GrpcError in new example
Apparently, it was a little _too_ simple.

Also, update the description for seconds_to_first_nonempty_slot field to reflect some behavior this script was able to capture.
2021-02-02 09:07:42 -08:00
sparky8512
f5f1bbdb84 Clean up the simple example script and add new one
This adds an example script for the starlink_grpc module. It's a kinda silly thing, but I threw it together to better understand some of the status data, so I figured I'd upload it, since the other example is for direct grpc usage (or for starlink_json if parseJsonHistory can be considered an example).

Rename dishDumpStatus so pylint will stop complaining about the module name. The only script left with my old naming convention now is parseJsonHistory.py.
2021-02-01 20:47:18 -08:00
sparky8512
94114bfd59 Add latency and usage history stat groups
Add latency and usage stat groups to the stats computed from history samples. This includes an attempt at characterizing latency under network load, too, but I don't know how useful that's going to be, so I have marked that as experimental, in case it needs algorithmic improvements.

The new groups are enabled on the command line by use of the new mode names: ping_latency, ping_loaded_latency, and usage.

Add valid_s to the obstruction details status group. This was the only missing field from everything available in the status response (other than wedge_fraction_obstructed, which seems redundant to wedge_abs_fraction_obstructed), and I only skipped it because I don't know what it means exactly. Adding it now with my best guess at a description in order to avoid a compatibility breaking change later.

Closes #5
2021-02-01 19:09:34 -08:00
sparky8512
27a98f936b
Merge pull request #17 from sparky8512/unify-refactor
This merge refactors the grpc scripts to reduce the amount of duplicate code.

By necessity, this changes the command line interface in an incompatible way, although most of the options stayed the same and there is a straightforward mapping of old script to new script + mode arg(s).

It also changes the starlink_grpc module interface slightly in that some field names now have a suffix indicating size of sequence type data that will have to be parsed out to get the actual name by itself.
2021-01-30 14:08:56 -08:00
sparky8512
076e80c84a
Point out how to find the group/field descriptions 2021-01-30 13:29:17 -08:00
sparky8512
5c6a191660 Updates for new script naming and CLI
For now, the default docker command includes the alert detail but not the obstruction detail, because that's what the old dishStatusInflux.py script had.
2021-01-30 13:17:42 -08:00
sparky8512
68c1413dbd Keep grpc channel open across RPC calls
This restores the functionality that the InfluxDB status polling script had whereby instead of using a new grpc Channel for each RPC call, it would keep one open and reuse it, retrying one time if it ever fails, which can happen if the connection is lost between calls. Now all the grpc scripts have this functionality.

Also, hedge a little bit in the descriptions for what the obstruction detail fields mean, given that I'm not sure my assumptions there are correct.
2021-01-30 11:24:17 -08:00
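The keep-open-and-retry-once behavior from 68c1413dbd can be sketched without any grpc dependency as follows. Everything here is illustrative: `open_channel` is a hypothetical factory returning an object with a `close()` method, and `ConnectionError` stands in for whatever error the real RPC layer raises.

```python
class ReusableChannel:
    """Keep one channel open across calls, retrying once if it went stale."""

    def __init__(self, open_channel):
        self._open_channel = open_channel  # hypothetical channel factory
        self._channel = None

    def close(self):
        if self._channel is not None:
            self._channel.close()
            self._channel = None

    def call(self, rpc):
        """Run rpc(channel) and return its result."""
        while True:
            reused = self._channel is not None
            if not reused:
                self._channel = self._open_channel()
            try:
                return rpc(self._channel)
            except ConnectionError:
                self.close()
                # A failure on a freshly opened channel is a real error;
                # a failure on a reused one may just mean the connection
                # was lost between calls, so loop around exactly once.
                if not reused:
                    raise
```

The retry is limited to the reused case so that an unreachable dish still fails promptly instead of looping.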
sparky8512
45b563f91a Refactor to reduce the amount of duplicate code
Combined the history and status scripts for each data backend and moved some of the shared code into a separate module. Since the existing script names were not appropriate for the new combined versions, the main entry point scripts now have new names, which better conform with Python module name conventions: dish_grpc_text.py, dish_grpc_mqtt.py, and dish_grpc_influx.py. pylint seems happier with those names, at any rate.

Switched the argument parsing from getopt to argparse, since that better facilitates sharing the common bits. The whole command line interface is now different in that the selection of data groups to process must be made as required arg(s) rather than option flags, but for the most part, the scripts support choosing an arbitrary list of groups and will process them all.

Split the monster main() functions into a more reasonable set of functions.

Added new functions to starlink_grpc to support getting the status, which returns the data in a form similar to the history data functions. Reformatted the starlink_grpc module docstring to render better with pydoc. Also changed the way sequence data field names are reported so that the consuming scripts can name them correctly without resorting to hacky special casing based on specific field names. This would subtly break the old scripts that had been expecting the old naming, but those scripts are now gone.

The code is harder to follow now, IMO, but this should allow adding of new features and/or data backends without having to make the same change in 6 places as had been the case. To that end, dish_grpc_text now supports bulk history mode, since it was trivial to add once I had it implemented in order to support that feature for dish_grpc_influx.
2021-01-29 19:25:23 -08:00
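The command line shape described in 45b563f91a, with data groups given as required positional args rather than option flags, can be sketched with argparse like this. The mode list and the `-v` option are placeholders chosen for the example, not the actual interface of the scripts.

```python
import argparse

# Illustrative subset of group names; the real scripts define their own.
MODES = ["status", "obstruction_detail", "alert_detail", "usage"]


def build_parser():
    """Build a parser that requires one or more mode arguments."""
    parser = argparse.ArgumentParser(description="Example group selection")
    parser.add_argument(
        "mode",
        nargs="+",
        choices=MODES,
        metavar="mode",
        help="data group(s) to process: " + ", ".join(MODES),
    )
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="hypothetical option shared by all scripts")
    return parser
```

For example, `build_parser().parse_args(["status", "usage"]).mode` yields `["status", "usage"]`, while an empty argument list makes argparse exit with a usage error.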
sparky8512
a692f8930a
Add another potential TODO 2021-01-27 16:26:23 -08:00
Leigh Phillips
2b8c2991f1
Merge pull request #7 from sparky8512/main
Bring Current.
2021-01-22 20:05:05 -08:00
sparky8512
36f433aebd
Merge pull request #15 from sparky8512/working
Bulk history mode for InfluxDB script
2021-01-22 18:51:40 -08:00
sparky8512
e16649fbf1 Change name of "current" to "end_counter"
Since "current" got added to the global data group returned from getting the history stats in non-bulk mode, it was being output by all 3 of the history scripts, and the name "current" was a little confusing when looking at prior output, since old values would no longer be current. The description of it in the start param of history_bulk_data was confusing, too.
2021-01-22 18:43:51 -08:00
sparky8512
2e045ade16 Add tracking of counter across script invocations
Write the sample counter value corresponding with the last recorded data point into the database along with the rest of the sample data so that it can be read out on next invocation of the script and data collection resumed where it left off.

Switch default sample count to all samples when in bulk mode, which now really means all samples since the last one recorded already.

Switch the time precision to be 1 second. Data points are only written one per second, anyway, and this way if there is any overlap due to counter tracking failure, the existing data will just get overwritten instead of creating duplicates.

Add a maximum queue length, so the script doesn't just keep using more memory if it persistently (>10 days) fails writing to the InfluxDB server.

Hack around some issues I ran into with the influxdb-python client library, especially with respect to running queries against InfluxDB 2.0 servers.

This concludes the functionality related to bulk collection of history data discussed on issue #5
2021-01-21 20:39:37 -08:00
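Two of the ideas in 2e045ade16, the bounded queue and the 1 second time precision, can be illustrated together. This is a sketch only: the class, the `write` callable, and the limit of 864000 points (10 days at one point per second, derived from the ">10 days" remark) are assumptions, and `OSError` stands in for whatever the database client raises.

```python
from collections import deque

MAX_QUEUE_LENGTH = 10 * 24 * 60 * 60  # assumed: 10 days of 1-per-second points


class PointQueue:
    """Hold unwritten data points, discarding the oldest when full."""

    def __init__(self, maxlen=MAX_QUEUE_LENGTH):
        self._points = deque(maxlen=maxlen)

    def __len__(self):
        return len(self._points)

    def add(self, timestamp, fields):
        # Truncate to whole seconds so that a re-sent sample lands on the
        # same timestamp as the original instead of beside it.
        self._points.append((int(timestamp), fields))

    def flush(self, write):
        """Try to write everything queued; keep the points on failure."""
        if not self._points:
            return True
        try:
            write(list(self._points))
        except OSError:
            return False
        self._points.clear()
        return True
```

A `deque` with `maxlen` drops items from the opposite end on append, so memory use stays bounded however long writes keep failing.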