Commit graph

158 commits

Author SHA1 Message Date
sparky8512
2e045ade16 Add tracking of counter across script invocations
Write the sample counter value corresponding to the last recorded data point into the database along with the rest of the sample data, so that it can be read back on the next invocation of the script and data collection can resume where it left off.

Switch default sample count to all samples when in bulk mode, which now really means all samples since the last one already recorded.

Switch the time precision to be 1 second. Data points are only written one per second, anyway, and this way if there is any overlap due to counter tracking failure, the existing data will just get overwritten instead of creating duplicates.

Add a maximum queue length, so the script doesn't just keep using more memory if it persistently (>10 days) fails writing to the InfluxDB server.

Hack around some issues I ran into with the influxdb-python client library, especially with respect to running queries against InfluxDB 2.0 servers.

This concludes the functionality related to bulk collection of history data discussed in issue #5.
2021-01-21 20:39:37 -08:00
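A minimal sketch of the bounded queue idea from the commit above, not the script's actual code. The limit of exactly 10 days of one-per-second samples and the choice to drop the oldest entries first are both assumptions; the commit only says the queue has a maximum length and mentions failures persisting for more than 10 days.

```python
from collections import deque

# Assumed limit: 10 days of one-per-second samples (hypothetical value).
MAX_QUEUE_LENGTH = 10 * 24 * 60 * 60


def make_queue(max_length=MAX_QUEUE_LENGTH):
    """Return a queue that discards its oldest entry once it is full."""
    return deque(maxlen=max_length)


# Small demonstration with a tiny limit so the effect is visible.
pending = make_queue(max_length=3)
for sample in range(5):
    pending.append(sample)
print(list(pending))  # only the three newest samples remain
```

With a deque, memory use is capped without any explicit length checks in the write loop. A real script might prefer to drop the newest points instead, or log when data is discarded.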
sparky8512
ab335e9227 Fix time base setting when not verbose 2021-01-19 19:05:41 -08:00
sparky8512
edcf2a2ee4 Detect and correct dish time getting out of sync
If the number of samples reported in the history differs from the number of seconds elapsed on the script host's system clock by more than 2 seconds in either direction, forcibly correct the time base back to current system time. This doesn't seem to trigger on hosts with NTP synced system clocks (possibly because the dish's clock is, too), so no attempt is made to make this graceful: there will just be a discontinuity in the timestamps assigned to samples if this correction is made.

Also, enforce a maximum batch size for data point writes to the InfluxDB server. It's somewhat arbitrarily set at 5000 data points. A write of the full 12-hour history buffer would be 43200 data points, so this will break that up a bit.

Related to issue #5
2021-01-19 18:49:46 -08:00
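The batch size arithmetic in the commit above can be sketched as follows. This is an illustration only: split_batches is a hypothetical helper, and the placeholder integers stand in for real data points.

```python
MAX_BATCH = 5000  # batch size named in the commit message


def split_batches(points, max_batch=MAX_BATCH):
    """Yield consecutive slices of points, each at most max_batch long."""
    for start in range(0, len(points), max_batch):
        yield points[start:start + max_batch]


# 12 hours of one-per-second samples, using integers as placeholder points.
history = list(range(12 * 60 * 60))
sizes = [len(batch) for batch in split_batches(history)]
print(sizes)
```

A full 43200-point history splits into eight batches of 5000 and a final batch of 3200.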
sparky8512
9b96c5dcc6 Implement sample counter tracking in bulk mode
Add tracking of exactly which samples have already been sent off to InfluxDB so that samples are neither missed nor repeated due to minor time deltas in OS task scheduling. For now, this is only being applied to bulk mode.

Make the -s option only apply to the first loop iteration for bulk mode, since subsequent loops will want to pick up all samples since the prior iteration.

Also, omit the latency field from the data point sent to InfluxDB for samples where the ping drop is 100%. The raw history data apparently just repeats the prior value in this case, probably because it cannot just leave a hole in the data array and there is no good way to indicate an invalid value.

Related to issue #5
2021-01-18 13:30:34 -08:00
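A rough sketch of the two ideas in the commit above, under stated assumptions. The function names, the field names ping_drop and latency, and the fallback used when no prior counter is known are all hypothetical; the actual script may handle these cases differently.

```python
def samples_to_send(current_counter, last_counter, buffer_size):
    """Return how many of the newest buffered samples have not been sent."""
    if last_counter is None or last_counter > current_counter:
        # Assumption: with no usable record of prior progress, take
        # everything the history buffer currently holds.
        return min(current_counter, buffer_size)
    return min(current_counter - last_counter, buffer_size)


def build_fields(ping_drop, latency_ms):
    """Build the fields for one sample, omitting latency on total loss."""
    fields = {"ping_drop": ping_drop}
    if ping_drop < 1.0:
        # Latency is only meaningful when at least some pings came back.
        fields["latency"] = latency_ms
    return fields


print(samples_to_send(100, 97, 43200))
print(build_fields(1.0, 40.0))
```

Tracking by counter rather than by elapsed time means a loop iteration that runs slightly early or late still picks up exactly the samples that arrived since the previous one.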
sparky8512
0663008be7 Add first pass of bulk data mode
Adds a new option, -b, that will write the individual sample data to InfluxDB server, instead of summary data.
2021-01-17 16:29:56 -08:00
Leigh Phillips
c516fcc1e7
Merge pull request #6 from sparky8512/main
Bring current.
2021-01-16 23:55:55 -08:00
sparky8512
e684085c5a
Merge pull request #14 from sparky8512/working
Merge some small Docker-related changes
2021-01-16 20:15:11 -08:00
sparky8512
9a57c93d73 Back out changing the default command options
Per review feedback, this could have interfered with the ability to set this option via environment variable. It was a bit messy, anyway.
2021-01-16 20:12:22 -08:00
sparky8512
9b04e8387c Minor changes based on thorough proofread 2021-01-16 10:33:22 -08:00
sparky8512
2e71acbbdb Changes to work better with Docker containers
Handle SIGTERM to enable graceful script shutdown when a container is stopped. This currently only matters for the InfluxDB scripts, and only when they run in a loop, since if the script is hard-terminated, it won't flush out any queued data points to the InfluxDB server. This also required changing the entrypoint script to exec python instead of running it as a child process of the shell running entrypoint.sh, since Docker will only deliver SIGTERM to the parent process it started directly.

Also, add -t 30 to the default Docker command to match the script default behavior prior to the changes in 46f65a6214
2021-01-16 10:17:32 -08:00
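One way to get the graceful shutdown described in the commit above is to turn SIGTERM into an exception and flush in a finally block. This is a sketch assuming a POSIX host, not the script's actual code; Terminated, main_loop, work and flush are hypothetical names.

```python
import os
import signal
import time


class Terminated(Exception):
    """Raised in the main thread when SIGTERM arrives."""


def handle_sigterm(signum, frame):
    raise Terminated


def main_loop(work, flush):
    """Run work() until SIGTERM, then flush queued data before returning."""
    signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        while True:
            work()
    except Terminated:
        pass
    finally:
        flush()


# Demonstration: the process signals itself, standing in for Docker
# stopping the container.
flushed = []


def work():
    os.kill(os.getpid(), signal.SIGTERM)
    time.sleep(0.01)


main_loop(work, lambda: flushed.append(True))
print(flushed)
```

The handler only takes effect if the signal reaches the Python process, which is why the commit also has the entrypoint script exec python rather than run it as a child of the shell.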
sparky8512
51f1193bd2
Merge pull request #13 from sparky8512/main
Make current with main branch
2021-01-16 07:34:24 -08:00
sparky8512
3fafcea882 Fix remaining pylint and yapf nits 2021-01-15 19:27:10 -08:00
sparky8512
46f65a6214 Implement periodic loop option
Add an interval timing loop for all the grpc scripts that did not already have one. Organized some of the code into functions in order to facilitate this, which caused some problems with local variables vs global ones, so moved the script code into a proper main() function. That didn't really solve the access to globals issue, so also moved the mutable state into a class instance.

The interval timer should be relatively robust against time drift due to the loop function running time and/or OS scheduler delay, but is far from perfect.

Retry logic is now in place for both InfluxDB scripts. Retry for dishStatusInflux.py is slightly changed in that a failed write to InfluxDB server will be retried on every interval, rather than waiting for another batch full of data points to write, but this only happens once there is at least a full batch (currently 6) of data points pending. This new behavior matches how the autocommit functionality on SeriesHelper works.

Changed the default behavior of dishStatusInflux.py to not loop, in order to match the other scripts. To get the old behavior, add a '-t 30' option to the command line.

Closes #9
2021-01-15 18:39:33 -08:00
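A sketch of one common way to keep an interval loop from drifting: sleep until the next point on a fixed grid instead of sleeping a fixed amount after each run. This is an assumption about the general technique, not a copy of what the scripts do; next_sleep and run_periodically are hypothetical names.

```python
import time


def next_sleep(start, interval, now):
    """Seconds to sleep so wakeups stay on the start + n * interval grid."""
    elapsed = now - start
    return interval - (elapsed % interval)


def run_periodically(task, interval, iterations,
                     clock=time.monotonic, sleep=time.sleep):
    """Call task() repeatedly, absorbing its run time into the sleep."""
    start = clock()
    for _ in range(iterations):
        task()
        sleep(next_sleep(start, interval, clock()))


# If the task took 2.5 seconds of a 30 second interval, sleep the rest.
print(next_sleep(0.0, 30.0, 2.5))
```

Because each sleep is computed from the original start time, a slow iteration or a late wakeup shortens the next sleep rather than pushing every later iteration back.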
Leigh Phillips
e999fbf637
Merge pull request #5 from sparky8512/main
Bring Current
2021-01-12 21:30:05 -08:00
sparky8512
a589a75ce5 Revamp error printing
Closes #8
2021-01-12 19:51:38 -08:00
sparky8512
9ccfeb8181 Correct one more int/float mixage
Also, pull the change into the JSON parser for consistency.

Related to issue #12
2021-01-12 11:23:37 -08:00
Leigh Phillips
f9f0da9acb
Merge pull request #4 from sparky8512/main
Pull in latest.
2021-01-11 22:09:43 -08:00
sparky8512
fcbcbf4ef7 Make sure ints stay ints and floats stay floats
Specific history data patterns would sometimes lead to some of the stats switching between int and float type even if they were always whole numbers. This should ensure that doesn't happen.

I think this will fix #12, but will likely require deleting all the spacex.starlink.user_terminal.ping_stats data points from the database before the type conflict failure will go away.
2021-01-11 22:04:39 -08:00
sparky8512
3528542410
Merge pull request #11 from neurocis/main
Add a Grafana dashboard, update README.
2021-01-11 19:15:49 -08:00
Leigh Phillips
0ed3e074db
Update README.md 2021-01-11 17:24:10 -08:00
Leigh Phillips
37988e0f4c
Add Grafana dashboard for Starlink Stats. 2021-01-11 17:15:09 -08:00
Leigh Phillips
8f8bbbd353
Merge pull request #3 from sparky8512/main
Bump up-to-date
2021-01-11 17:13:35 -08:00
sparky8512
b06a5973c1 Change default database name to starlinkstats
The README instructions @neurocis added for the Docker container recommend this name, and I like that better than dishstats, so now it's the default.

"dish" can be useful to differentiate between the Starlink user terminal (dish) and the Starlink router, both of which expose gRPC services for polling status information, but that's more applicable to the measurement name (AKA series_name) and a hypothetical database that contained both would be more appropriately labelled "Starlink".
2021-01-11 13:03:19 -08:00
sparky8512
4af576fbae
Merge pull request #7 from neurocis/main
Migrate to in-script scheduling.
2021-01-11 11:10:47 -08:00
Leigh Phillips
eecc65a5ba
Update README.md 2021-01-11 00:01:50 -08:00
Leigh Phillips
d21b196f78
Update README.md 2021-01-11 00:00:49 -08:00
Leigh Phillips
d0fd5a0a2b
Update entrypoint.sh
Change to passthrough all args.
2021-01-10 23:57:06 -08:00
Leigh Phillips
7fb595bbda
Update Dockerfile 2021-01-10 23:41:44 -08:00
Leigh Phillips
21e9c010e2
Create entrypoint.sh 2021-01-10 23:41:05 -08:00
Leigh Phillips
ee1d19ce35
Delete dishStatusInflux_cron.py 2021-01-10 23:40:02 -08:00
Leigh Phillips
ff2d0eacb1
Merge pull request #2 from sparky8512/main
SSL/TLS support for InfluxDB and MQTT scripts
2021-01-10 21:53:32 -08:00
sparky8512
ce44f3c021 SSL/TLS support for InfluxDB and MQTT scripts
Copy the command line option handling into the status scripts to facilitate this. Also copy the setting from env from dishStatusInflux_cron.py.

Better error handling for failures while writing to the data backend. Error printing verbosity is now a bit inconsistent, but I'll address that separately.

Still to be done is dishStatusInflux_cron.py, pending a decision on what to do with that script, given that dishStatusInflux.py can now be run in one-shot mode.

This is related to issue #2.
2021-01-10 21:36:44 -08:00
Leigh Phillips
27ce46cb3c
Update README.md
Name & run docker in daemon mode.
2021-01-09 13:22:55 -08:00
Leigh Phillips
9e09b64881
Update Dockerfile
Name & run in daemon mode
2021-01-09 13:22:22 -08:00
Leigh Phillips
1996ad26d2
Merge pull request #1 from sparky8512/main
Bring up to date.
2021-01-09 13:21:25 -08:00
sparky8512
f067f08952 Add InfluxDB and MQTT history stats scripts
Unlike the status info scripts, these include support for setting host and database parameters via command line options. Still to be added is support for HTTPS/SSL.

Add a get_id function to the grpc parser module, so it can be used for tagging purposes.

Minor cleanups in some of the other scripts to make them consistent with the newly added scripts.
2021-01-09 12:03:37 -08:00
sparky8512
253d6e9250
Fix spelling error 2021-01-09 11:43:25 -08:00
sparky8512
66a5c05d95
Merge pull request #6 from neurocis/main
Dockerize ready for InfluxDB

Related to issue #1
2021-01-09 11:42:00 -08:00
Leigh Phillips
c28e025893
Update README.md 2021-01-08 22:41:57 -08:00
Leigh Phillips
fe3cf90612
Create dishStatusInflux_cron.py 2021-01-08 22:27:54 -08:00
Leigh Phillips
49cdcaa18c
Create Dockerfile 2021-01-08 22:26:52 -08:00
sparky8512
0ee39f61fd Small change to the interface between parser and calling scripts
Move the "samples" stat out of the ping drop group of stats and into a new general stats group.

This way, it will make more sense if/when additional stat groups are added, since that stat will apply to all of them.
2021-01-08 19:17:34 -08:00
sparky8512
96b7634a3d
Merge pull request #4 from sparky8512/working
Mostly cosmetic cleanups and module reorganization
2021-01-06 11:59:24 -08:00
sparky8512
170dd2daae Reorganize history parsing logic into a separate module
Moves the parsing logic that will be shared by some upcoming scripts out into a separate module.

There's still just as much duplication between the JSON parser and the grpc parser as there was before, but this should at least prevent further duplication of this logic.

This also adds some proper documentation for what each of the stats means.
2021-01-06 11:46:50 -08:00
sparky8512
d165791559 Readability improvements (or so PEP 8 style guide claims...) 2021-01-06 10:12:56 -08:00
sparky8512
a5036db9e0 Don't allow a sample to be both unscheduled and obstructed
Doesn't ever seem to happen, but in case it does in the future, treat that case as just unscheduled. This way, the unclassified ping loss (AKA "Beta downtime") can be computed from the totals.
2020-12-30 13:09:24 -08:00
sparky8512
4ff6cfb5fa Fix line endings to be consistent with the other scripts 2020-12-30 13:01:41 -08:00
sparky8512
5c762e754b
Merge pull request #3 from sparky8512/working
Add scripts for status output to CSV, InfluxDB, and MQTT
2020-12-30 11:57:05 -08:00
sparky8512
f206a3ad91 Add a short blurb for the recently added scripts 2020-12-30 11:47:03 -08:00
sparky8512
e1a4c473c8 Handle errors on the gRPC connection
Also, actually do the thing I said I was doing in the prior checkin by writing state as a string instead of integer.

And a bit more cleanup.
2020-12-30 10:17:02 -08:00