A few tweaks to the Prometheus exporter script

Move the global state onto the http server object so it doesn't have to be accessed as module globals.
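As a rough illustration of this first change (a sketch, not the script's actual code), shared state can be attached to the server instance and read back through `self.server` inside the request handler. `DemoHandler`, `greeting`, and `fetch_once` are hypothetical names used only for this example.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import http.client
import threading


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read shared state from the server object instead of module globals.
        body = self.server.greeting.encode()  # hypothetical attribute
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    # Port 0 asks the OS for any free port on the loopback interface.
    httpd = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    httpd.greeting = "hello from server state"
    worker = threading.Thread(target=httpd.serve_forever, daemon=True)
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", httpd.server_address[1], timeout=5)
        try:
            conn.request("GET", "/")
            return conn.getresponse().read().decode()
        finally:
            conn.close()
    finally:
        httpd.shutdown()
        httpd.server_close()
```

Every handler instance gets a reference to its server, so anything stored there is reachable without `global` statements.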

Limit the mode groups that can be selected via command line args to the ones that are actually parsed. There are a few other options added in dish_common that don't really apply to this script, but they are mostly harmless, whereas some of the other mode groups will cause this script to throw an exception.
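A minimal sketch of the idea, assuming plain `argparse` rather than the real `dish_common` helpers: passing a reduced list as `choices` makes the parser itself reject unsupported mode groups before any data is requested. `build_parser` and `try_parse` are hypothetical helper names.

```python
import argparse

# The three modes this commit allows for the Prometheus exporter.
ALLOWED_MODES = ["status", "alert_detail", "usage"]


def build_parser(modes):
    parser = argparse.ArgumentParser(description="mode restriction sketch")
    parser.add_argument("mode",
                        nargs="+",
                        choices=modes,
                        help="The data group to record, one or more of: " + ", ".join(modes),
                        metavar="mode")
    return parser


def try_parse(argv, modes=ALLOWED_MODES):
    """Return the parsed mode list, or None if argparse rejected the input."""
    try:
        return build_parser(modes).parse_args(argv).mode
    except SystemExit:
        # argparse prints a usage error to stderr and exits on a bad choice.
        return None
```

With this in place, a mode outside the list fails at argument parsing with a usage message instead of raising an exception later in the export path.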

Reject access to "/favicon.ico" path, so testing from a web browser does not result in running the dish queries twice, and thus confusing the global state a little.
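The path check can be sketched as a small pure function. Browsers typically request `/favicon.ico` on their own alongside the page, which is why it is singled out. `is_favicon_request` is a hypothetical helper name; the commit does this check inline in `do_GET`.

```python
def is_favicon_request(raw_path):
    """Return True if the request path (ignoring any query string) is the favicon."""
    # str.partition splits at the first "?" and returns (before, sep, after).
    path = raw_path.partition("?")[0]
    return path.lower() == "/favicon.ico"
```

A handler would answer such requests with a 404 and return before running any dish queries.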

Add a lock to serialize calls to dish_common.get_data. That function is not thread-safe, even with CPython's Global Interpreter Lock, because the starlink_grpc functions it calls block. This script is really not meant for concurrent HTTP access, given that the usage stats are reported as usage since last access (by default), but since it's technically supported, might as well have it work properly.
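The effect of the lock can be sketched without the real gRPC calls. In the example below, `slow_query` is a hypothetical stand-in for `dish_common.get_data`, and `SharedState` is a stand-in for the global state object. The counter tracks how many threads are inside the query at once.

```python
import threading
import time


class SharedState:
    def __init__(self):
        self.lock = threading.Lock()
        self.active = 0      # threads currently inside slow_query
        self.max_active = 0  # highest concurrency observed


def slow_query(state):
    # Stand-in for a blocking, non-thread-safe call.
    state.active += 1
    state.max_active = max(state.max_active, state.active)
    time.sleep(0.01)
    state.active -= 1


def handle_request(state):
    # Only one request at a time may run the query.
    with state.lock:
        slow_query(state)


def run_concurrent(count=8):
    state = SharedState()
    threads = [threading.Thread(target=handle_request, args=(state,)) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state.max_active
```

Because every call is made while holding `state.lock`, the observed concurrency inside `slow_query` never exceeds one, even though the requests arrive on separate threads.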

Add the same handling of keyboard interrupt (Ctrl-C) and SIGTERM signal as the other grpc scripts, along with proper shutdown.
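The shutdown pattern can be sketched as follows. `Terminated` and `handle_sigterm` mirror the names in the diff, while `install_handler` and `serve_with_cleanup` are hypothetical helpers added for illustration. Turning SIGTERM into an exception lets the same `try`/`finally` path handle both Ctrl-C and a termination signal.

```python
import signal


class Terminated(Exception):
    pass


def handle_sigterm(signum, frame):
    # Turn SIGTERM into an exception so the main loop can clean up.
    raise Terminated


def install_handler():
    # Must be called from the main thread; not invoked automatically here.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve_with_cleanup(serve, cleanup):
    """Run serve() until interrupted, then always run cleanup()."""
    try:
        serve()
    except (KeyboardInterrupt, Terminated):
        pass
    finally:
        cleanup()
```

In the exporter, `serve` corresponds to `httpd.serve_forever` and `cleanup` to closing the server and shutting down the global state.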
sparky8512 2022-12-21 10:25:57 -08:00
parent ab9c26e517
commit dc4ff85dbe
3 changed files with 59 additions and 30 deletions


@@ -43,7 +43,7 @@ return data in a format Prometheus can scrape.
 All these scripts support processing status data and/or history data in various modes. The status data is mostly what appears related to the dish in the Debug Data section of the Starlink app, whereas most of the data displayed in the Statistics page of the Starlink app comes from the history data. Specific status or history data groups can be selected by including their mode names on the command line. Run the scripts with `-h` command line option to get a list of available modes. See the documentation at the top of `starlink_grpc.py` for detail on what each of the fields means within each mode group.
-`dish_grpc_prometheus.py` has only been tested with the modes `status`, `usage`, and `alert_detail`.
+`dish_grpc_prometheus.py` only allows the modes `status`, `usage`, and `alert_detail`.
 For example, data from all the currently available status groups can be output by doing:
 ```shell script
@@ -52,12 +52,14 @@ python3 dish_grpc_text.py status obstruction_detail alert_detail
 By default, `dish_grpc_text.py` will output in CSV format. You can use the `-v` option to instead output in a (slightly) more human-readable format.
-By default, all of these scripts will pull data once, send it off to the specified data backend, and then exit. They can instead be made to run in a periodic loop by passing a `-t` option to specify loop interval, in seconds. For example, to capture status information to a InfluxDB server every 30 seconds, you could do something like this:
+By default, most of these scripts will pull data once, send it off to the specified data backend, and then exit. They can instead be made to run in a periodic loop by passing a `-t` option to specify loop interval, in seconds. For example, to capture status information to a InfluxDB server every 30 seconds, you could do something like this:
 ```shell script
 python3 dish_grpc_influx.py -t 30 [... probably other args to specify server options ...] status
 ```
-Some of the scripts (currently only the InfluxDB ones) also support specifying options through environment variables. See details in the scripts for the environment variables that map to options.
+The exception to this is `dish_grpc_prometheus.py`, for which the timing interval is determined by whatever is polling the HTTP page it exports.
+Some of the scripts (currently only the InfluxDB and MQTT ones) also support specifying options through environment variables. See details in the scripts for the environment variables that map to options.
 #### Bulk history data collection


@@ -94,7 +94,7 @@ def create_arg_parser(output_description, bulk_history=True):
     return parser
-def run_arg_parser(parser, need_id=False, no_stdout_errors=False):
+def run_arg_parser(parser, need_id=False, no_stdout_errors=False, modes=None):
     """Run parse_args on a parser previously created with create_arg_parser
     Args:
@@ -104,17 +104,20 @@ def run_arg_parser(parser, need_id=False, no_stdout_errors=False):
         no_stdout_errors (bool): A flag set in options to protect stdout from
             error messages, in case that's where the data output is going, so
             may be being redirected to a file.
+        modes (list[str]): Optionally provide the subset of data group modes
+            to allow.
     Returns:
         An argparse Namespace object with the parsed options set as attributes.
     """
-    all_modes = STATUS_MODES + HISTORY_STATS_MODES + UNGROUPED_MODES
-    if parser.bulk_history:
-        all_modes.append("bulk_history")
+    if modes is None:
+        modes = STATUS_MODES + HISTORY_STATS_MODES + UNGROUPED_MODES
+        if parser.bulk_history:
+            modes.append("bulk_history")
     parser.add_argument("mode",
                         nargs="+",
-                        choices=all_modes,
-                        help="The data group to record, one or more of: " + ", ".join(all_modes),
+                        choices=modes,
+                        help="The data group to record, one or more of: " + ", ".join(modes),
                         metavar="mode")
     opts = parser.parse_args()


@@ -2,17 +2,28 @@
 """Prometheus exporter for Starlink user terminal data info.
 This script pulls the current status info and/or metrics computed from the
-history data and makes it available via HTTP in the format Prometeus expects.
+history data and makes it available via HTTP in the format Prometheus expects.
 """
-import logging
-import sys
 from http import HTTPStatus
 from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
+import logging
+import signal
+import sys
+import threading
 import dish_common
+class Terminated(Exception):
+    pass
+def handle_sigterm(signum, frame):
+    # Turn SIGTERM into an exception so main loop can clean up
+    raise Terminated
 class MetricInfo:
     unit = ""
     kind = "gauge"
@@ -125,13 +136,7 @@ class MetricValue:
         return f"{label_str} {self.value}"
-opts = None
-gstate = None
 def parse_args():
-    global opts
     parser = dish_common.create_arg_parser(
         output_description="Prometheus exporter", bulk_history=False
     )
@@ -140,12 +145,10 @@ def parse_args():
     group.add_argument("--address", default="0.0.0.0", help="IP address to listen on")
     group.add_argument("--port", default=8080, type=int, help="Port to listen on")
-    opts = dish_common.run_arg_parser(parser)
+    return dish_common.run_arg_parser(parser, modes=["status", "alert_detail", "usage"])
-def prometheus_export():
-    global opts, gstate
+def prometheus_export(opts, gstate):
     raw_data = {}
     def data_add_item(name, value, category):
@@ -155,9 +158,10 @@ def prometheus_export():
     def data_add_sequencem(name, value, category, start):
         raise NotImplementedError("Did not expect sequence data")
-    rc, status_ts, hist_ts = dish_common.get_data(
-        opts, gstate, data_add_item, data_add_sequencem
-    )
+    with gstate.lock:
+        rc, status_ts, hist_ts = dish_common.get_data(
+            opts, gstate, data_add_item, data_add_sequencem
+        )
     metrics = []
@@ -243,7 +247,15 @@ def prometheus_export():
 class MetricsRequestHandler(BaseHTTPRequestHandler):
     def do_GET(self):
-        content = prometheus_export()
+        path = self.path.partition("?")[0]
+        if path.lower() == "/favicon.ico":
+            self.send_error(HTTPStatus.NOT_FOUND)
+            return
+        opts = self.server.opts
+        gstate = self.server.gstate
+        content = prometheus_export(opts, gstate)
         self.send_response(HTTPStatus.OK)
         self.send_header("Content-type", "text/plain")
         self.send_header("Content-Length", len(content))
@@ -252,16 +264,28 @@ class MetricsRequestHandler(BaseHTTPRequestHandler):
 def main():
-    global opts, gstate
-    parse_args()
+    opts = parse_args()
     logging.basicConfig(format="%(levelname)s: %(message)s", stream=sys.stderr)
     gstate = dish_common.GlobalState(target=opts.target)
+    gstate.lock = threading.Lock()
     httpd = ThreadingHTTPServer((opts.address, opts.port), MetricsRequestHandler)
-    httpd.serve_forever()
+    httpd.daemon_threads = False
+    httpd.opts = opts
+    httpd.gstate = gstate
+    signal.signal(signal.SIGTERM, handle_sigterm)
+    print("HTTP listening on port", opts.port)
+    try:
+        httpd.serve_forever()
+    except (KeyboardInterrupt, Terminated):
+        pass
+    finally:
+        httpd.server_close()
+        httpd.gstate.shutdown()
     sys.exit()