remove xDS support #347

Open: wants to merge 9 commits into base: master
Changes from 7 commits
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,11 @@
Lists all changes with user impact.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## [0.20.0]

### Changed
- Remove xDS support

## [0.19.29]

### Changed
4 changes: 1 addition & 3 deletions docs/configuration.md
@@ -79,9 +79,7 @@ Property
**envoy-control.envoy.snapshot.routes.status.endpoints** | List of endpoints with path or prefix of status routes | /status
**envoy-control.envoy.snapshot.routes.status.create-virtual-cluster** | Create virtual cluster for status route | false
**envoy-control.envoy.snapshot.state-sample-duration** | Duration of state sampling (used to prevent surges of Consul events from overloading the control plane) | 1s
**envoy-control.envoy.snapshot.xds-cluster-name** | Name of cluster for xDS operations | envoy-control-xds
**envoy-control.envoy.snapshot.enabled-communication-modes.ads** | Enable or disable support for ADS communication mode | true
**envoy-control.envoy.snapshot.enabled-communication-modes.xds** | Enable or disable support for XDS communication mode | true
**envoy-control.envoy.snapshot.xds-cluster-name** | Name of cluster for xDS operations | envoy-control-xds
**envoy-control.envoy.snapshot.should-send-missing-endpoints** | Enable sending missing Endpoints: when Envoy requests a cluster that does not exist in the snapshot, the control plane responds with an empty Endpoint definition | false
**envoy-control.envoy.snapshot.cluster-name** | Dynamic forward proxy cluster name | dynamic_forward_proxy_cluster
**envoy-control.envoy.snapshot.dns-lookup-family** | DNS lookup address family | V4_ONLY
1 change: 0 additions & 1 deletion docs/features/permissions.md
@@ -16,7 +16,6 @@ An example configuration:

```yaml
metadata:
ads: true
proxy_settings:
outgoing:
dependencies:
10 changes: 0 additions & 10 deletions docs/integrations/envoy.md
@@ -24,16 +24,6 @@ and the IP can be put in `x-envoy-original-dst-host` header (`x-envoy-original-d
By default, Envoy will respond with `404` status code when it receives a request for a cluster that does not exist.
The behavior is changed so that the `503` status code is returned.

## ADS Support

By default, xDS is used instead of
[Aggregated Discovery Service](https://www.envoyproxy.io/docs/envoy/latest/configuration/overview/xds_api#aggregated-discovery-service)
(ADS). To use ADS for a given node, put
```
ads: true
```
in the Envoy metadata config. Envoy Control will pick it up and use ADS for this node.

## Outlier detection

You can configure global
9 changes: 0 additions & 9 deletions docs/performance.md
@@ -20,15 +20,6 @@ Additionally, the network usage is significantly higher. With 1,000 clusters, we
Assuming that a new snapshot is generated every second, 1,000 Envoys with 1,000 clusters can generate a load of
300 MB/s. When following only a few services, the snapshot is about 5 KB and it's sent much less frequently.

### Use ADS

With xDS, Envoy sets up a gRPC stream to Envoy Control per cluster. Let's say there are 1,000 Envoys and 1,000 clusters.
Envoy Control will have to handle 1,000,000 open gRPC streams. This puts pressure on memory, which translates into more
frequent GC runs and higher CPU usage.

With ADS, each Envoy sets up a single gRPC stream for all clusters. With 1,000 Envoys, there are 1,000 streams, which
reduces memory usage dramatically.
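
The stream-count arithmetic in this removed section can be restated as a small sketch. The function names below are hypothetical and not part of Envoy Control; the only assumption taken from the text is one gRPC stream per Envoy per cluster for xDS and one stream per Envoy for ADS.

```kotlin
// Hypothetical helpers for illustration only; they restate the arithmetic
// from the documentation text and are not Envoy Control APIs.

// Non-aggregated xDS, as described in the text: one gRPC stream per Envoy per cluster.
fun xdsStreamCount(envoys: Long, clusters: Long): Long = envoys * clusters

// ADS: a single gRPC stream per Envoy, regardless of the number of clusters.
fun adsStreamCount(envoys: Long): Long = envoys

fun main() {
    val envoys = 1_000L
    val clusters = 1_000L
    println("xDS streams: ${xdsStreamCount(envoys, clusters)}") // 1000000
    println("ADS streams: ${adsStreamCount(envoys)}")           // 1000
}
```

With the numbers from the text, the control plane would hold a thousand times fewer open streams under ADS, which is the motivation given for keeping only that mode.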

### Sampling

By default, Envoy Control follows changes from the discovery service, batches them, and sends them to Envoys at most once every second.
@@ -1,23 +1,20 @@
package pl.allegro.tech.servicemesh.envoycontrol.groups

sealed class Group {
abstract val communicationMode: CommunicationMode
abstract val serviceName: String
abstract val discoveryServiceName: String?
abstract val proxySettings: ProxySettings
abstract val listenersConfig: ListenersConfig?
}

data class ServicesGroup(
override val communicationMode: CommunicationMode,
override val serviceName: String = "",
override val discoveryServiceName: String? = null,
override val proxySettings: ProxySettings = ProxySettings(),
override val listenersConfig: ListenersConfig? = null
) : Group()

data class AllServicesGroup(
override val communicationMode: CommunicationMode,
override val serviceName: String = "",
override val discoveryServiceName: String? = null,
override val proxySettings: ProxySettings = ProxySettings(),
@@ -138,15 +138,13 @@ class MetadataNodeGroup(
return when {
hasAllServicesDependencies(nodeMetadata) ->
AllServicesGroup(
nodeMetadata.communicationMode,
serviceName,
discoveryServiceName,
proxySettings,
listenersConfig
)
else ->
ServicesGroup(
nodeMetadata.communicationMode,
serviceName,
discoveryServiceName,
proxySettings,
@@ -31,8 +31,6 @@ class NodeMetadata(metadata: Struct, properties: SnapshotProperties) {
.fieldsMap["discovery_service_name"]
?.stringValue

val communicationMode = getCommunicationMode(metadata.fieldsMap["ads"])

val proxySettings: ProxySettings = ProxySettings(metadata.fieldsMap["proxy_settings"], properties)
}

@@ -68,17 +66,6 @@ data class ProxySettings(
)
}

private fun getCommunicationMode(proto: Value?): CommunicationMode {
val ads = proto
?.boolValue
?: false

return when (ads) {
true -> CommunicationMode.ADS
else -> CommunicationMode.XDS
}
}

fun Value?.toComparisonFilter(default: String? = null): ComparisonFilterSettings? {
return (this?.stringValue ?: default)?.let {
AccessLogFilterParser.parseComparisonFilter(it.uppercase())
@@ -753,10 +740,6 @@ enum class PathMatchingType {
PATH, PATH_PREFIX, PATH_REGEX
}

enum class CommunicationMode {
ADS, XDS
}

data class OAuth(
val provider: String = "",
val verification: Verification = Verification.OFFLINE,
@@ -1,8 +1,6 @@
package pl.allegro.tech.servicemesh.envoycontrol.groups

import io.envoyproxy.controlplane.server.DiscoveryServerCallbacks
import pl.allegro.tech.servicemesh.envoycontrol.groups.CommunicationMode.ADS
import pl.allegro.tech.servicemesh.envoycontrol.groups.CommunicationMode.XDS
import pl.allegro.tech.servicemesh.envoycontrol.logger
import pl.allegro.tech.servicemesh.envoycontrol.protocol.HttpMethod
import pl.allegro.tech.servicemesh.envoycontrol.snapshot.SnapshotProperties
@@ -84,7 +82,6 @@ class NodeMetadataValidator(
validateDependencies(metadata)
validateIncomingEndpoints(metadata)
validateIncomingRateLimitEndpoints(metadata)
validateConfigurationMode(metadata)
}

private fun validateServiceName(metadata: NodeMetadata) {
@@ -170,13 +167,4 @@ class NodeMetadataValidator(

private fun isAllowedToHaveAllServiceDependencies(metadata: NodeMetadata) = properties
.outgoingPermissions.servicesAllowedToUseWildcard.contains(metadata.serviceName)

private fun validateConfigurationMode(metadata: NodeMetadata) {
if (metadata.communicationMode == ADS && !properties.enabledCommunicationModes.ads) {
throw ConfigurationModeNotSupportedException(metadata.serviceName, "ADS")
}
if (metadata.communicationMode == XDS && !properties.enabledCommunicationModes.xds) {
throw ConfigurationModeNotSupportedException(metadata.serviceName, "XDS")
}
}
}
@@ -36,7 +36,7 @@ class MetricsDiscoveryServerCallbacks(private val meterRegistry: MeterRegistry)

meterRegistry.gauge("grpc.all-connections", connections)
connectionsByType.forEach { (type, typeConnections) ->
meterRegistry.gauge("grpc.connections.${type.name.toLowerCase()}", typeConnections)
meterRegistry.gauge("grpc.connections.${type.name.lowercase()}", typeConnections)
}
}

@@ -51,15 +51,15 @@ class MetricsDiscoveryServerCallbacks(private val meterRegistry: MeterRegistry)
}

override fun onV3StreamRequest(streamId: Long, request: V3DiscoveryRequest) {
meterRegistry.counter("grpc.requests.${StreamType.fromTypeUrl(request.typeUrl).name.toLowerCase()}")
meterRegistry.counter("grpc.requests.${StreamType.fromTypeUrl(request.typeUrl).name.lowercase()}")
.increment()
}

override fun onV3StreamDeltaRequest(
streamId: Long,
request: V3DeltaDiscoveryRequest
) {
meterRegistry.counter("grpc.requests.${StreamType.fromTypeUrl(request.typeUrl).name.toLowerCase()}.delta")
meterRegistry.counter("grpc.requests.${StreamType.fromTypeUrl(request.typeUrl).name.lowercase()}.delta")
.increment()
}

@@ -9,7 +9,6 @@ import io.envoyproxy.envoy.extensions.transport_sockets.tls.v3.Secret
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.Timer
import pl.allegro.tech.servicemesh.envoycontrol.groups.AllServicesGroup
import pl.allegro.tech.servicemesh.envoycontrol.groups.CommunicationMode
import pl.allegro.tech.servicemesh.envoycontrol.groups.DependencySettings
import pl.allegro.tech.servicemesh.envoycontrol.groups.Group
import pl.allegro.tech.servicemesh.envoycontrol.groups.IncomingRateLimitEndpoint
@@ -41,14 +40,12 @@ class EnvoySnapshotFactory(

fun newSnapshot(
servicesStates: MultiClusterState,
clusterConfigurations: Map<String, ClusterConfiguration>,
communicationMode: CommunicationMode
clusterConfigurations: Map<String, ClusterConfiguration>
): GlobalSnapshot {
val sample = Timer.start(meterRegistry)

val clusters = clustersFactory.getClustersForServices(
clusterConfigurations.values,
communicationMode
clusterConfigurations.values
)
val securedClusters = clustersFactory.getSecuredClusters(clusters)

@@ -328,7 +325,7 @@ class EnvoySnapshotFactory(
routes.add(
egressRoutesFactory.createEgressDomainRoutes(
it.value,
it.key.port.toString().toLowerCase()
it.key.port.toString().lowercase()
)
)
}
@@ -26,7 +26,6 @@ class SnapshotProperties {
var staticClusterConnectionTimeout: Duration = Duration.ofSeconds(2)
var trustedCaFile = "/etc/ssl/certs/ca-certificates.crt"
var dynamicListeners = ListenersFactoryProperties()
var enabledCommunicationModes = EnabledCommunicationModes()
var shouldSendMissingEndpoints = false
var metrics: MetricsProperties = MetricsProperties()
var dynamicForwardProxy = DynamicForwardProxyProperties()
@@ -302,11 +301,6 @@ class Http2Properties {
var tagName = "envoy"
}

class EnabledCommunicationModes {
var ads = true
var xds = true
}

class HostHeaderRewritingProperties {
var enabled = false
var customHostHeader = "x-envoy-original-host"
@@ -4,8 +4,6 @@ import io.envoyproxy.controlplane.cache.SnapshotCache
import io.envoyproxy.controlplane.cache.v3.Snapshot
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.Timer
import pl.allegro.tech.servicemesh.envoycontrol.groups.CommunicationMode.ADS
import pl.allegro.tech.servicemesh.envoycontrol.groups.CommunicationMode.XDS
import pl.allegro.tech.servicemesh.envoycontrol.groups.Group
import pl.allegro.tech.servicemesh.envoycontrol.logger
import pl.allegro.tech.servicemesh.envoycontrol.services.MultiClusterState
@@ -59,8 +57,7 @@ class SnapshotUpdater(
UpdateResult(
action = newUpdate.action,
groups = newUpdate.groups,
adsSnapshot = newUpdate.adsSnapshot ?: previous.adsSnapshot,
xdsSnapshot = newUpdate.xdsSnapshot ?: previous.xdsSnapshot
snapshot = newUpdate.snapshot ?: previous.snapshot
)
}
// concat map guarantees sequential processing (unlike flatMap)
@@ -74,7 +71,7 @@ class SnapshotUpdater(
// step 4: update the snapshot for either all groups (if services changed)
// or specific groups (groups changed).
// TODO(dj): on what occasion can this be false?
if (result.adsSnapshot != null || result.xdsSnapshot != null) {
if (result.snapshot != null) {
// Stateful operation! This is the meat of this processing.
updateSnapshotForGroups(groups, result)
} else {
@@ -111,19 +108,9 @@ class SnapshotUpdater(
.name("snapshot-updater-services-published").metrics()
.createClusterConfigurations()
.map { (states, clusters) ->
var lastXdsSnapshot: GlobalSnapshot? = null
var lastAdsSnapshot: GlobalSnapshot? = null

if (properties.enabledCommunicationModes.xds) {
lastXdsSnapshot = snapshotFactory.newSnapshot(states, clusters, XDS)
}
if (properties.enabledCommunicationModes.ads) {
lastAdsSnapshot = snapshotFactory.newSnapshot(states, clusters, ADS)
}
val updateResult = UpdateResult(
action = Action.ALL_SERVICES_GROUP_ADDED,
adsSnapshot = lastAdsSnapshot,
xdsSnapshot = lastXdsSnapshot
snapshot = snapshotFactory.newSnapshot(states, clusters),
)
globalSnapshot = updateResult
updateResult
@@ -169,13 +156,11 @@ class SnapshotUpdater(
versions.retainGroups(cache.groups())
val results = Flux.fromIterable(groups)
.doOnNextScheduledOn(groupSnapshotScheduler) { group ->
if (result.adsSnapshot != null && group.communicationMode == ADS) {
updateSnapshotForGroup(group, result.adsSnapshot)
} else if (result.xdsSnapshot != null && group.communicationMode == XDS) {
updateSnapshotForGroup(group, result.xdsSnapshot)
if (result.snapshot != null) {
updateSnapshotForGroup(group, result.snapshot)
} else {
meterRegistry.counter("snapshot-updater.communication-mode.errors").increment()
logger.error("Requested snapshot for ${group.communicationMode.name} mode, but it is not here. " +
logger.error("Requested snapshot, but it is not here. " +
"Handling Envoy with not supported communication mode should have been rejected before." +
" Please report this to EC developers.")
}
@@ -212,6 +197,5 @@ enum class Action {
data class UpdateResult(
val action: Action,
val groups: List<Group> = listOf(),
val adsSnapshot: GlobalSnapshot? = null,
val xdsSnapshot: GlobalSnapshot? = null
val snapshot: GlobalSnapshot? = null
)
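
For reviewers, here is a minimal, self-contained sketch of how the simplified merge step in `SnapshotUpdater` behaves once there is a single `snapshot` field. It is an illustration under assumptions, not the project's code: `GlobalSnapshot` is a stand-in type, groups are modelled as plain strings, `merge` is a hypothetical helper name, and `SERVICES_GROUP_ADDED` is an assumed enum value (only `ALL_SERVICES_GROUP_ADDED` appears in this diff).

```kotlin
// Stand-in types for illustration only; the real GlobalSnapshot and Group
// in Envoy Control carry far more data.
data class GlobalSnapshot(val version: String)

// SERVICES_GROUP_ADDED is an assumed value; ALL_SERVICES_GROUP_ADDED is taken from the diff.
enum class Action { SERVICES_GROUP_ADDED, ALL_SERVICES_GROUP_ADDED }

data class UpdateResult(
    val action: Action,
    val groups: List<String> = listOf(),
    val snapshot: GlobalSnapshot? = null
)

// Mirrors the merge expression in the diff: keep the previous snapshot
// when the new update does not carry one.
fun merge(previous: UpdateResult, newUpdate: UpdateResult): UpdateResult =
    UpdateResult(
        action = newUpdate.action,
        groups = newUpdate.groups,
        snapshot = newUpdate.snapshot ?: previous.snapshot
    )

fun main() {
    val servicesUpdate = UpdateResult(
        action = Action.ALL_SERVICES_GROUP_ADDED,
        snapshot = GlobalSnapshot("v1")
    )
    // A group-only update has no snapshot of its own.
    val groupsUpdate = UpdateResult(
        action = Action.SERVICES_GROUP_ADDED,
        groups = listOf("service-a")
    )
    val merged = merge(servicesUpdate, groupsUpdate)
    println(merged.snapshot) // GlobalSnapshot(version=v1)
}
```

With one snapshot field, the "is there anything to send" check reduces to a single null check, which is what the `result.snapshot != null` condition in the hunks above expresses.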