A metric is a single, quantifiable measurement that lets a business track its operations in order to grow and optimize performance. Businesses collect raw data, then organize and query it to form metrics that support their goals. For instance, an e-commerce platform collects customer data to create a metric that represents user clicks on an ad campaign.
A metric captures a value that exists in a system at a specific point in time, such as the number of
users logged in to a mobile application. Metrics are therefore collected once per second, once per
minute, or at another regular interval so that a system can be monitored over time. There are two
categories of metrics: work metrics and resource metrics. Both categories are useful, depending on
the software infrastructure being monitored.
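As a minimal sketch of interval-based collection, the snippet below samples a value at a fixed interval and stores each reading with its timestamp. The names `Sample` and `collect`, the simulated clock, and the logged-in user counts are all illustrative assumptions, not part of any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    timestamp: float  # seconds since the start of monitoring (simulated clock)
    value: float      # the value read from the system at that moment


def collect(read_value: Callable[[], float], start: float,
            interval: float, count: int) -> List[Sample]:
    """Take `count` readings spaced `interval` seconds apart, beginning at `start`."""
    return [Sample(start + i * interval, read_value()) for i in range(count)]


# Hypothetical readings of "users logged in", sampled once per minute.
readings = iter([120, 125, 131])
samples = collect(lambda: next(readings), start=0.0, interval=60.0, count=3)
```

A real collector would read the clock and the live system instead of the fixed values used here. The shape of the result stays the same: a time series of (timestamp, value) pairs.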
Work Metrics
Work metrics indicate the top-level health of a system by measuring its useful output. They are
important for observability because they are high-level measures that let users respond to
issues with a system’s internal health and performance. Work metrics are further divided
into four subcategories:
Throughput – the amount of work done by a system per unit of time, usually recorded as an
absolute number.
Success – the percentage of work that was completed successfully.
Error – the number of erroneous results, usually expressed as a rate of errors per unit of
time. When there are several potential sources of error, error metrics are captured
separately from success metrics.
Performance – how efficiently a component does its work. Latency, the time required to
finish a unit of work, is the most common performance metric. It is expressed either as an
average or as a percentile.
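The four work metrics above can be sketched as a calculation over a window of request records. This is an illustration under stated assumptions: the `Request` record, the `work_metrics` function, the nearest-rank percentile method, and the sample data are hypothetical choices, not a prescribed implementation.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    ok: bool           # True if the request succeeded
    latency_ms: float  # time taken to finish this unit of work


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def work_metrics(requests: List[Request], window_seconds: float) -> Dict[str, float]:
    """Summarize throughput, success, error, and performance for one time window."""
    if not requests:
        raise ValueError("at least one request is needed to compute rates")
    total = len(requests)
    successes = sum(1 for r in requests if r.ok)
    latencies = [r.latency_ms for r in requests]
    return {
        "throughput_per_s": total / window_seconds,
        "success_pct": 100 * successes / total,
        "errors_per_s": (total - successes) / window_seconds,
        "avg_latency_ms": sum(latencies) / total,
        "p90_latency_ms": percentile(latencies, 90),
    }


# Hypothetical window: 10 requests in 5 seconds, latencies 10..100 ms, 2 failures.
window = [Request(ok=(i not in (3, 7)), latency_ms=10.0 * (i + 1)) for i in range(10)]
summary = work_metrics(window, window_seconds=5.0)
```

Reporting a percentile next to the average is a deliberate choice here, because an average alone can hide a slow tail of requests.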
Resource Metrics
In a software infrastructure, most components serve as a resource to other systems. Some
resources are low-level, such as CPU, disks, memory, and network interfaces. A higher-level
component, such as a geolocation microservice or a database, can also be considered a resource if another system requires it to perform a task. Users should collect metrics from the following key areas:
Utilization – the percentage of time that a resource is busy, or the percentage of the
resource's capacity that is in use.
Saturation – the amount of requested work that the resource cannot yet service, usually
queued.
Errors – internal errors that may not be observable in the work the resource produces.
Availability – the percentage of time that a resource responded to requests. This metric
is only well-defined for resources that can be actively and regularly checked.
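A rough sketch of how the four resource metrics could be derived from periodic checks of a single resource follows. The `ResourceSample` fields, the `resource_metrics` function, and the sample values are assumptions made for illustration. Real systems usually read these figures from the operating system or from the resource's own instrumentation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ResourceSample:
    busy: bool            # was the resource doing work at this check?
    queue_length: int     # requests waiting because the resource could not serve them yet
    internal_errors: int  # errors reported by the resource since the previous check
    responded: bool       # did the resource answer the health check?


def resource_metrics(samples: List[ResourceSample]) -> Dict[str, float]:
    """Summarize utilization, saturation, errors, and availability over the samples."""
    if not samples:
        raise ValueError("at least one sample is needed")
    n = len(samples)
    return {
        "utilization_pct": 100 * sum(s.busy for s in samples) / n,
        "saturation_avg_queue": sum(s.queue_length for s in samples) / n,
        "internal_errors": sum(s.internal_errors for s in samples),
        "availability_pct": 100 * sum(s.responded for s in samples) / n,
    }


# Hypothetical checks of one resource at five regular intervals.
checks = [
    ResourceSample(busy=True, queue_length=0, internal_errors=0, responded=True),
    ResourceSample(busy=True, queue_length=2, internal_errors=1, responded=True),
    ResourceSample(busy=True, queue_length=4, internal_errors=0, responded=True),
    ResourceSample(busy=False, queue_length=0, internal_errors=0, responded=False),
    ResourceSample(busy=False, queue_length=4, internal_errors=0, responded=True),
]
report = resource_metrics(checks)
```

Availability is computed only from regular health checks, which matches the point that the metric is well-defined only for resources that are actively and regularly checked.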
Other metrics, such as cache hits or database locks, can also help make a complex system observable. Collecting metrics in real time helps enterprises act promptly on critical issues in their software infrastructure.