fix: Reduce CPU usage on Scraper agent scraping large set of Azure targets (#2050)

* Revise Promitor scraper task scheduling to organize Cron jobs by schedule instead of by resource parameters, so a single job handles multiple resources and/or resource discovery groups. Optionally enforce a maximum degree of parallelism across all Cron jobs using a mutex shared between them; each operation that requires network access to the cluster runs on the thread pool and counts as 1 against the degree of parallelism.

# Conflicts:
#	src/Promitor.Agents.Scraper/Promitor.Agents.Scraper.csproj
#	src/Promitor.Agents.Scraper/Scheduling/SchedulingExtensions.cs
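
The shared-mutex throttling described above can be sketched with a `SemaphoreSlim` (a hypothetical simplification with invented names, not the actual Promitor scheduling code):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch: cap the number of concurrent network-bound scrape operations
// across all scheduled jobs that share one instance of this class.
public class ScrapeThrottler
{
    private readonly SemaphoreSlim _semaphore;

    public ScrapeThrottler(int maxDegreeOfParallelism)
    {
        // 0 or a negative value is interpreted as unlimited,
        // mirroring the ServerConfiguration doc comment in this commit.
        _semaphore = maxDegreeOfParallelism > 0
            ? new SemaphoreSlim(maxDegreeOfParallelism, maxDegreeOfParallelism)
            : null;
    }

    // Each operation counts as 1 against the degree of parallelism
    // for as long as it is in flight.
    public async Task RunThrottledAsync(Func<Task> scrapeOperation)
    {
        if (_semaphore is null)
        {
            await scrapeOperation();
            return;
        }

        await _semaphore.WaitAsync();
        try
        {
            await scrapeOperation();
        }
        finally
        {
            _semaphore.Release();
        }
    }
}
```

Because the semaphore is shared across all Cron jobs rather than per job, a burst of overlapping schedules cannot fan out more concurrent Azure API calls than the configured limit, which is what keeps CPU usage bounded with large target sets.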

* Begin addressing outstanding code scan issues.

* Additional modifications per PR feedback to incorporate more unit tests and match styling. Additionally removed duplicate constants declarations for consistency.

* Additional code analysis warnings remediated.

* Update src/Promitor.Core.Scraping/Configuration/Model/MetricDimension.cs

Co-authored-by: Tom Kerkhove <kerkhove.tom@gmail.com>

* Further PR feedback

* Updated changelog.

* Why is a markdown file limited to a maximum line length? (one smaller than that used by some IDEs for that matter)

* Someone is going to have to explain to me the intent behind linting a documentation file for coding style. This just seems counterproductive.

* Updating unit test organization per PR feedback

Co-authored-by: Tom Kerkhove <kerkhove.tom@gmail.com>
jasonmowry and tomkerkhove authored Jun 23, 2022
1 parent 5064b55 commit ac1154f
Showing 23 changed files with 1,036 additions and 362 deletions.
2 changes: 2 additions & 0 deletions changelog/content/experimental/unreleased.md
@@ -9,6 +9,8 @@ version:
 - {{% tag added %}} Provide scraper for Azure Database for MySQL Servers ([docs](https://docs.promitor.io/v2.x/scraping/providers/mysql/)
   | [#1880](https://github.com/tomkerkhove/promitor/issues/324))
 - {{% tag fixed %}} Honor flag not to include timestamps in system metrics for Prometheus ([#1915](https://github.com/tomkerkhove/promitor/pull/1915))
+- {{% tag fixed %}} Performance degradation caused by high CPU usage when Promitor-agent-scraper has to scrape large
+  set of Azure targets ([#1834](https://github.com/tomkerkhove/promitor/pull/2050))
 
 #### Resource Discovery
2 changes: 1 addition & 1 deletion src/Promitor.Agents.Core/AgentProgram.cs
@@ -46,7 +46,7 @@ protected static int DetermineHttpPort(ServerConfiguration serverConfiguration)
 {
     Guard.NotNull(serverConfiguration, nameof(serverConfiguration));
 
-    return serverConfiguration?.HttpPort ?? 80;
+    return serverConfiguration?.HttpPort ?? Configuration.Defaults.Server.HttpPort;
 }
 
 /// <summary>
10 changes: 9 additions & 1 deletion src/Promitor.Agents.Core/Configuration/Defaults.cs
@@ -1,4 +1,5 @@
-using Microsoft.Extensions.Logging;
+using System;
+using Microsoft.Extensions.Logging;
 
 namespace Promitor.Agents.Core.Configuration
 {
@@ -7,6 +8,13 @@ public static class Defaults
     public static class Server
     {
         public static int HttpPort { get; } = 80;
+
+        /// <summary>
+        /// Default upper limit on the number of concurrent threads between all possible scheduled concurrent scraping jobs,
+        /// set to a reasonable load per CPU so as not to choke the system with processing overhead while attempting to
+        /// communicate with cluster hosts and awaiting multiple outstanding API calls.
+        /// </summary>
+        public static int MaxDegreeOfParallelism { get; } = Environment.ProcessorCount * 8;
     }
 
     public class Telemetry
@@ -3,5 +3,11 @@
 public class ServerConfiguration
 {
     public int HttpPort { get; set; } = Defaults.Server.HttpPort;
+
+    /// <summary>
+    /// Upper limit on the number of concurrent threads between all possible scheduled scraping jobs,
+    /// where 0 or negative is interpreted as unlimited.
+    /// </summary>
+    public int MaxDegreeOfParallelism { get; set; } = Defaults.Server.MaxDegreeOfParallelism;
 }
 }
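
Assuming `ServerConfiguration` binds from the scraper's runtime YAML the same way the existing `httpPort` setting does (an assumption, not something this diff confirms), setting the new knob might look like the following sketch:

```yaml
server:
  httpPort: 8888
  # 0 or a negative value removes the cap entirely;
  # the default is Environment.ProcessorCount * 8
  maxDegreeOfParallelism: 16
```
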
10 changes: 5 additions & 5 deletions src/Promitor.Agents.Scraper/AzureMonitorClientFactory.cs
@@ -24,27 +24,27 @@ public class AzureMonitorClientFactory
     /// <param name="subscriptionId">Id of the Azure subscription</param>
     /// <param name="metricSinkWriter">Writer to send metrics to all configured sinks</param>
     /// <param name="azureScrapingPrometheusMetricsCollector">Metrics collector to write metrics to Prometheus</param>
-    /// <param name="memoryCache">Memory cache to store items in</param>
+    /// <param name="resourceMetricDefinitionMemoryCache">Memory cache to store items in</param>
     /// <param name="configuration">Configuration of Promitor</param>
     /// <param name="azureMonitorLoggingConfiguration">Options for Azure Monitor logging</param>
     /// <param name="loggerFactory">Factory to create loggers with</param>
-    public AzureMonitorClient CreateIfNotExists(AzureEnvironment cloud, string tenantId, string subscriptionId, MetricSinkWriter metricSinkWriter, IAzureScrapingPrometheusMetricsCollector azureScrapingPrometheusMetricsCollector, IMemoryCache memoryCache, IConfiguration configuration, IOptions<AzureMonitorLoggingConfiguration> azureMonitorLoggingConfiguration, ILoggerFactory loggerFactory)
+    public AzureMonitorClient CreateIfNotExists(AzureEnvironment cloud, string tenantId, string subscriptionId, MetricSinkWriter metricSinkWriter, IAzureScrapingPrometheusMetricsCollector azureScrapingPrometheusMetricsCollector, IMemoryCache resourceMetricDefinitionMemoryCache, IConfiguration configuration, IOptions<AzureMonitorLoggingConfiguration> azureMonitorLoggingConfiguration, ILoggerFactory loggerFactory)
     {
         if (_azureMonitorClients.ContainsKey(subscriptionId))
         {
             return _azureMonitorClients[subscriptionId];
         }
 
-        var azureMonitorClient = CreateNewAzureMonitorClient(cloud, tenantId, subscriptionId, metricSinkWriter, azureScrapingPrometheusMetricsCollector, memoryCache, configuration, azureMonitorLoggingConfiguration, loggerFactory);
+        var azureMonitorClient = CreateNewAzureMonitorClient(cloud, tenantId, subscriptionId, metricSinkWriter, azureScrapingPrometheusMetricsCollector, resourceMetricDefinitionMemoryCache, configuration, azureMonitorLoggingConfiguration, loggerFactory);
         _azureMonitorClients.TryAdd(subscriptionId, azureMonitorClient);
 
         return azureMonitorClient;
     }
 
-    private static AzureMonitorClient CreateNewAzureMonitorClient(AzureEnvironment cloud, string tenantId, string subscriptionId, MetricSinkWriter metricSinkWriter, IAzureScrapingPrometheusMetricsCollector azureScrapingPrometheusMetricsCollector, IMemoryCache memoryCache, IConfiguration configuration, IOptions<AzureMonitorLoggingConfiguration> azureMonitorLoggingConfiguration, ILoggerFactory loggerFactory)
+    private static AzureMonitorClient CreateNewAzureMonitorClient(AzureEnvironment cloud, string tenantId, string subscriptionId, MetricSinkWriter metricSinkWriter, IAzureScrapingPrometheusMetricsCollector azureScrapingPrometheusMetricsCollector, IMemoryCache resourceMetricDefinitionMemoryCache, IConfiguration configuration, IOptions<AzureMonitorLoggingConfiguration> azureMonitorLoggingConfiguration, ILoggerFactory loggerFactory)
     {
         var azureCredentials = AzureAuthenticationFactory.GetConfiguredAzureAuthentication(configuration);
-        var azureMonitorClient = new AzureMonitorClient(cloud, tenantId, subscriptionId, azureCredentials, metricSinkWriter, azureScrapingPrometheusMetricsCollector, memoryCache, loggerFactory, azureMonitorLoggingConfiguration);
+        var azureMonitorClient = new AzureMonitorClient(cloud, tenantId, subscriptionId, azureCredentials, metricSinkWriter, azureScrapingPrometheusMetricsCollector, resourceMetricDefinitionMemoryCache, loggerFactory, azureMonitorLoggingConfiguration);
         return azureMonitorClient;
     }
 }
26 changes: 1 addition & 25 deletions src/Promitor.Agents.Scraper/Configuration/Defaults.cs
@@ -1,14 +1,7 @@
-using Microsoft.Extensions.Logging;
-
-namespace Promitor.Agents.Scraper.Configuration
+namespace Promitor.Agents.Scraper.Configuration
 {
     public static class Defaults
     {
-        public static class Server
-        {
-            public static int HttpPort { get; } = 80;
-        }
-
         public static class Prometheus
         {
             public static bool EnableMetricTimestamps { get; set; } = false;
@@ -20,22 +13,5 @@ public static class MetricsConfiguration
         {
             public static string AbsolutePath { get; } = "/config/metrics-declaration.yaml";
         }
-
-        public class Telemetry
-        {
-            public static LogLevel? DefaultVerbosity { get; set; } = LogLevel.Error;
-
-            public class ContainerLogs
-            {
-                public static LogLevel? Verbosity { get; set; } = null;
-                public static bool IsEnabled { get; set; } = true;
-            }
-
-            public class ApplicationInsights
-            {
-                public static LogLevel? Verbosity { get; set; } = null;
-                public static bool IsEnabled { get; set; } = false;
-            }
-        }
     }
 }
