statsd

Network daemon for the collection and aggregation of realtime application metrics

# TCP Stats Interface

A simple TCP management interface is available by default on port `8126`; the port can be overridden in the configuration file. Inspired by the memcache stats approach, it can be used to monitor a live StatsD server. You can interact with the management server by telnetting to port `8126`. The commands listed below are available, depending on the server that is running.

## Common commands

* health [up|down] - a way to get/set the health status of StatsD. Used alone, it returns the current health status. Passing a second argument sets the status to the new value. Accepted values are _up_ and _down_.
* config - a dump of the current configuration
* quit - close the connection from the server side

## StatsD specific commands

* stats - some stats about the running server
* counters - a dump of all the current counters
* gauges - a dump of all the current gauges
* timers - a dump of the current timers
* delcounters - delete a counter or folder of counters
* delgauges - delete a gauge or folder of gauges
* deltimers - delete a timer or folder of timers

The stats output currently gives you:

* uptime: the number of seconds elapsed since StatsD started
* messages.last_msg_seen: the number of seconds elapsed since StatsD last received a message
* messages.bad_lines_seen: the number of bad lines seen since startup

You can use the del commands to delete an individual metric like this:

    # to delete counter sandbox.test.temporary
    echo "delcounters sandbox.test.temporary" | nc 127.0.0.1 8126

Or you can use the del commands to delete a folder of metrics like this:

    # to delete counters sandbox.test.*
    echo "delcounters sandbox.test.*" | nc 127.0.0.1 8126

Each backend will also publish a set of statistics, prefixed by its module name.

Graphite:

* graphite.last_flush: Unix timestamp of the last successful flush to Graphite
* graphite.last_exception: Unix timestamp of the last exception thrown whilst flushing to Graphite
* graphite.flush_length: the length of the string sent to Graphite
* graphite.flush_time: the time it took to send the data to Graphite

Those statistics will also be sent to Graphite under the namespaces `stats.statsd.graphiteStats.last_exception` and `stats.statsd.graphiteStats.last_flush`.

A simple Nagios check can be found in the `utils/` directory that can be used to check metric thresholds, for example the number of seconds since the last successful flush to Graphite.

The health output:

* the health command alone allows you to see the current health status.
* using health up or health down, you can change the current health status.
* the healthStatus configuration option allows you to set the default health status at start.

## StatsD Proxy specific commands

* status - the status of the current server

The __status__ output currently gives you:

* uptime: the number of seconds elapsed since StatsD proxy started
* nodes: a space separated list of host:port for each active node in the ring
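
As a rough illustration of how the `stats` output could be consumed from a script rather than by hand over telnet, here is a minimal Python sketch. It is not part of StatsD. It assumes that the reply consists of `name: value` lines and is terminated by a line reading `END`; that wire format is an assumption, not something specified above. The helper names `query`, `parse_stats`, and `flush_age` are hypothetical.

```python
import socket
import time


def parse_stats(text):
    """Turn 'name: value' lines into a dict.

    Assumption: the reply is one 'name: value' pair per line and ends
    with a line reading END. Lines that do not match are skipped.
    """
    stats = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "END":
            break
        name, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or unexpected content
        value = value.strip()
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            stats[name.strip()] = value  # keep non-integer values as text
    return stats


def flush_age(stats, now=None):
    """Seconds since the last successful Graphite flush, or None if unknown."""
    last = stats.get("graphite.last_flush")
    if not isinstance(last, int):
        return None
    if now is None:
        now = int(time.time())
    return now - last


def query(command, host="127.0.0.1", port=8126, timeout=2.0):
    """Send one command to the management port and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((command + "\n").encode("ascii"))
        reply = b""
        while b"END" not in reply:  # assumed terminator, see parse_stats
            try:
                chunk = sock.recv(4096)
            except socket.timeout:
                break  # no terminator arrived; return what we have
            if not chunk:
                break  # server closed the connection
            reply += chunk
    return reply.decode("utf-8", errors="replace")


# Example use against a running server (not executed here):
#   stats = parse_stats(query("stats"))
#   print(stats.get("uptime"), flush_age(stats))
```

Keeping the parsing separate from the socket code means the threshold logic can be exercised on captured output without a live server. A monitoring check could, for example, alert when `flush_age` exceeds a chosen number of seconds.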