Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Adaptive sampling #532

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from
Open

WIP: Adaptive sampling #532

wants to merge 3 commits into from

Conversation

mfalt
Copy link
Member

@mfalt mfalt commented May 31, 2021

Implements functionality for adaptive sampling of functions based on the expected use. For example sampling of frequency response, knowing that it will be used in a loglog plot for magnitude and lin-log plot for phase.

All the code in src/freqresp.jl is temporary for now, and only showrthand for bode has been implemented as proof of concept.
I amm happy for any suggestions on the code in src/sampler.jl

@mfalt mfalt changed the base branch from master to dev May 31, 2021 21:08
@codecov
Copy link

codecov bot commented May 31, 2021

Codecov Report

❗ No coverage uploaded for pull request base (dev@b75faea). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head a6113be differs from pull request most recent head 8ec7060. Consider uploading reports for the commit 8ec7060 to get more accurate results
Impacted file tree graph

@@          Coverage Diff           @@
##             dev     #532   +/-   ##
======================================
  Coverage       ?   82.72%           
======================================
  Files          ?       32           
  Lines          ?     3271           
  Branches       ?        0           
======================================
  Hits           ?     2706           
  Misses         ?      565           
  Partials       ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b75faea...8ec7060. Read the comment docs.

@mfalt
Copy link
Member Author

mfalt commented May 31, 2021

There is a significant improvement in finding peaks and notches with the approach, and it should be possible to improve further to better sample when computing margins. (Plots are not updated to use the new code)
The current code computes scalings quite a few more times than nessesary, but is not significantly slower

# Tested on 1x2 StateSpace
# Old behavior
@btime freqresp(sys, exp.(LinRange(-2, 1, 425)))
  1.089 ms (10063 allocations: 1.62 MiB)

# New adaptive with no regard to mag and phase
@btime freqresp(sys, (0.01, 10)) # 425 long grid
  4.908 ms (24063 allocations: 2.96 MiB)

# New adaptive with regard to loglog and linlog of mag and phase
@btime freqresp(sys, (0.01, 10), style=:bode) # 419 long grid
  15.405 ms (79414 allocations: 5.52 MiB)

I think a final version can be expected to be somewhere between the two later alternatives, closer to the first for larger systems. I defininetely think we should try to replace default_freq_vec with default limits and this code.

src/freqresp.jl Outdated Show resolved Hide resolved
src/sampler.jl Outdated Show resolved Hide resolved
src/sampler.jl Outdated
Comment on lines 94 to 95
push_multiple!(values, midvalue)
push_multiple!(values, rightvalue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a specific method for this and not just use the one line implementation in the method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used to be a weird implementation with 4 different method. Could definetely be inlined now

Copy link
Contributor

@olof3 olof3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, everything would be an improvement.

I defininetely think we should try to replace default_freq_vec with default limits and this code.

Yes, it should definitely be replaced. If it is done properly, I'm not sure that we need all the scalings.

Have you seen that there are some adpative functionalities already in PlotUtils, it is perhaps similar to this PR (?), but I haven't looked to closely. I still think that it could be justified with a version that works well with freqresp. We could consider if we should use the same kwargs in that case.

julia> g = w -> abs.(freqresp(P, [w])[1])
julia> plot(adapted_grid(g, (1e4, 1e8), max_recursions=12, max_curvature=0.005), xscale=:log10, yscale=:log10, legend=false, mark=:dot)

src/freqresp.jl Outdated
# Create imaginary freq vector s
func, xscale, yscales = if iscontinuous(sys)
if style === :none
f = (w) -> (evalfr(sys, w*im),)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code duplication is an excellent case for that we should define freqresp for Real (possibly Number) inputs. I don't see any conflict with that we may return a struct when the input is a vector. This would be an incredible QOL improvement in a lot of other situations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an eval_frequency(sys::LTISystem, w::Real) for now. But I agree that we should do something here.

src/sampler.jl Outdated
end

# Given range (xleft, xright) potentially add more gridpoints. xleft is assumed to be already added, xright will be added
function refine_grid!(values, grid, func, xleft, xright, leftvalue, rightvalue, xscale, yscales, maxgridpoints, mineps, num_gridpoints; reltol_dd=0.05, abstol_dd=1e-2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that , x1, x2, xm/xmid, y2, y2, etc., alternatively xa, xb, ya, yb would make this a lot easier to parse.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You partly already do this in is_almost_linear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right

src/ControlSystems.jl Outdated Show resolved Hide resolved
src/freqresp.jl Outdated
@@ -19,6 +19,66 @@ of system `sys` over the frequency vector `w`."""
[evalfr(sys[i,j], s)[] for s in s_vec, i in 1:ny, j in 1:nu]
end

# TODO Most of this logic should be moved to the respective options, e.g. bode
# TODO Less copying of code
@autovec () function freqresp(sys::LTISystem, lims::Tuple; style=:none)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all the different scalings really necessary? It seems quite natural to only focus on just the complex-valued frequency response. It would definitely simplify the code, and also how one thinks about it conceptually. If done properly, I am not sure that there would be much to gain from using different scalings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure that it makes a difference in reducing the number of points in loglog scale for example.

@JuliaControl JuliaControl deleted a comment from JuliaControlBot Jun 1, 2021
@JuliaControl JuliaControl deleted a comment from JuliaControlBot Jun 1, 2021
@mfalt
Copy link
Member Author

mfalt commented Jun 1, 2021

I updated so that the grid is used for bode, nyquist and sigma now. It is currently not used in plots since we get some issues because we (currently) allow multiple systems to be submitted. Here are some illistractions. Still using the same defaults. I think we need to improve nyquist slightly. Otherwise I think it looks good (markers included to show grid). Notice that although the systems are mostly the same, the grids differ slightly.

## Test bode
sys1 = (s+1)/(s^2+ 0.1s + 1)
sys2 = 0.1*(s+1)/(s^2+ 0.1s + 100)
sys = [sys1 sys2]
mag, phase, w = bode(sys)
mag, phase = dropdims.([mag,phase],dims=1)
plot(w, mag', layout=(2,1), m=(:o,2), yscale=:log10, xscale=:log10, subplot=1, lab="", size=(800,600))
plot!(w, phase', yscale=:identity, xscale=:log10, subplot=2, m=:o, lab="")

bode

## Test Nyquist
sys1 = (s+1)/(s^2+ 0.1s + 1)
sys2 = 0.8*(s+1)/(s^2+ 0.1s + 100)
sys = [sys1 sys2]
res, ims, w = nyquist(sys)
res, ims = dropdims.([res,ims],dims=1)
plot(res', ims', m=(:o,2), lab="", size=(600,600))

nyquist

## Test Sigma
sys1 = (s+1)/(s^2+ 0.1s + 1)
sys2 = 0.1*(s+1)/(s^2+ 0.1s + 100)
sys = [sys1 sys2]
sig, w = sigma(sys)
sig = dropdims(sig,dims=1)
plot(w, sig, m=(:o,2), yscale=:log10, xscale=:log10, lab="", size=(600,600))

sigma

@mfalt
Copy link
Member Author

mfalt commented Jun 1, 2021

Some bodeplots of benchmarks provided by @olof3 . I think every single one failed for one reason or another with the old code. I compared to manually sending in 1000 logspaced gridpoints of the old code, with the default of the new code. (Manually selecting y-ranges for problem 3 and 4).

It is clear that we need some improvements, but at least the new code works (compared to the old code), and it is almost always better than manually supplying a grid. Note that the default generates quite few points (about 300).

# Single notch 
Ω = [10]
G = zpk(im*Ω, [0; im*Ω .+ 1], complex(prod(Ω)))

# Single notch, low gain
Ω = [10]
G = 1e-6*zpk(im*Ω, [0; im*Ω .+ 1], complex(prod(Ω)))

# 3 notches
Ω = [10, 100, 1000]
G = zpk(im*Ω, im*Ω .+ 1, complex(prod(Ω)))

# 1/s + 3 notches
Ω = [10, 100, 1000]
G = zpk(im*Ω, [0; im*Ω .+ 1], complex(prod(Ω)))

# FOTD
G = DemoSystems.fotd=100)

Old 1:
bode1_old
New 1:
bode1
Old 2:
bode2_old
New 2:
bode2
Old 3:
bode3_old
New 3: (Note, only 300 sample points by default)
bode3
Old 4:
bode4_old
New 4:
bode4
New 4 with mineps=1e-9, i.e. about 1000 points:
bode4_extra
Old 5:
bode5_old
New 5:
bode5

@@ -1,22 +1,45 @@
function eval_frequency(sys::LTISystem, w::Real)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this inline? If not, we should probably encourage that for performance reasons.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any good reason for not just calling this freqresp? To me, that would make a lot more sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, it's exactly freqresp for a scalar frequency

(abs.(fr), angle.(fr))
end
ys, grid = auto_grid(f, lims, (log10, exp10), (log10, identity); kwargs...)
angles = cat(ys[2]...,dims=3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This compiles a unique function for each length of the vector. Does it handle 10k points?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it actually? I would assume that it stops at some length. In any case, I'll compare performance to
reduce((x,y) -> cat(x,y,dims=3), ys[2]) or some left associative equivalent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be hard to beat the simple

function cat3(T::AbstractVector{<:AbstractMatrix})
    N = length(T)
    traj = similar(T[1], size(T[1])..., N)
    for i = 1:N
        traj[:,:,i] = T[i]
    end
    traj
end

@autovec (1) sigma(sys::LTISystem) = sigma(sys, _default_freq_vector(sys, Val{:sigma}()))
# TODO: Not tested, probably broadcast problem on svdvals in auto_grid
@autovec (1) function sigma(sys::LTISystem, lims::Tuple; kwargs...)
f = (w) -> begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f = (w) -> begin
f = function (w)

This is a much nicer syntax to create anonymous function. Same goes in many places.

#if isa(sys, StateSpace)
# sys = _preprocess_for_freqresp(sys)
#end
ny,nu = noutputs(sys), ninputs(sys)
[evalfr(sys[i,j], s)[] for s in s_vec, i in 1:ny, j in 1:nu]
mapfoldl(w -> eval_frequency(sys, w), (x,y) -> cat(x,y,dims=3), w_vec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling that this line is horribly slow. Madreduce and friends have a bad rep due to their naive implementation. The reduce with cat will allocate O(N2) memory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll test how it behaves in practice, i was hoping that it was smart enough to reuse the memory. It is nice to avoid allocating matrices wherever possible, but not if it's significantly more expensive.
I'll also try with a mapcat3 version based on your code above.

@olof3
Copy link
Contributor

olof3 commented Jun 1, 2021

Perhaps using max_recursions (as in adapted_grid in PlotUtils) would be more convenient and handle tricky cases more gracefully, than max_num_gridpoints and keeping track of num_gridpoints?

@mfalt
Copy link
Member Author

mfalt commented Jun 1, 2021

Perhaps using max_recursions (as in adapted_grid in PlotUtils) would be more convenient and handle tricky cases more gracefully, than `` and keeping track of num_gridpoints?

Max recursions would more or less correspond to mineps. The only difference is that mineps can be selected independently on start_gridpoints, and max recursions could be selected independently on xlims. Not sure which is most intuitive.

max_num_gridpoints essentially avoids that the code gets stuck with crazy allocations if the algorithm starts to go down to max_recursions (N) at every point (which would result in 2^N points). The typical case only goes down N steps at very few points.

Edit: the current default on mineps roughly correspond to max_recursions=log2(10000)=13. The typical number of points is significantly lower (200-1000) and the default max_num_gridpoints is 2000.

@olof3
Copy link
Contributor

olof3 commented Jun 2, 2021

The only difference is that mineps can be selected independently on start_gridpoints, and max recursions could be selected independently on xlims. Not sure which is most intuitive.

Not sure, it does not seem trivial to specify it in a good way, e.g., mineps = (xscale[1](lims[2])-xscale[1](lims[1]))/10000. I wouldn't mind an unscaled version though dw_min, again, depends on how the resolution of features should be emphasized.

max_num_gridpoints essentially avoids that the code gets stuck with crazy allocations if the algorithm starts to go down to max_recursions (N)

Unless the sampling is broken, that would have to be a very pathological function. To me, 2000 vs 10_000 seems like a rather small difference. Would also seem annoying if the first part of the response is accurately sampled, and latter part is not. Regardless, the user would have to adjust mineps to get an acceptable plot in acceptable time.

@olof3
Copy link
Contributor

olof3 commented Jun 2, 2021

I am pretty sure that it makes a difference in reducing the number of points in loglog scale for example.

Yes, that is most probably the case, but I don't think it would be a big difference. I think the question is what behavior that one really wants, if the resolution of features in the frequency response should be independent of frequency or not.

I am wondering if it would be feasible to only look at variations of the complex-frequency response, without really considering the frequency as is done in this PR, but otherwise using the same approach. E.g., should bubbles in the Nyquist curve be sampled in the same way regardless of their "width" (which varies the depending on the scaling? Should the three notches in the test cases be resolved similarly? If there are only a finite number of features, I would say that is desirable. However if there is a lot of crazy stuff going on at high frequencies, then perhaps not.

Regardless, we want to decide on some default strategy for just freqresp(sys). What choice are you thinking about for this?
My feeling is that a sensible choice for this would work very well for all the plots that we are interested, but it it is better to test and see.

I guess all of these decisions really depend on what features in the frequency response that people are interested in. Personally I think I would like to have more equal resolution of features. The test systems that I provided are possibly too simplistic, but capture some annoying issues at least.

@mfalt
Copy link
Member Author

mfalt commented Jun 2, 2021

If the goal is to produce a reasonable plot in loglog scale, then that is the natural scale to measure features in. If the goal is something else, then I think that the scale should be adjustable.

For a general plot, yes. For, e.g., a Bode plot of an open loop system, I would like the features that are important for closed loop performance to have good resolution. I.e., the magnitude of sharp, high-frequency resonance peaks should be resolved sufficiently well. Of course any adaption is of course a great improvement, and probably acceptable in most cases.

For bodemag plots of closed-loop transfer functions, scaling as in this PR seems good.

What would be a reasonable default dw_min? I think it's impossible to pick if you want to allow systems with arbitrary time-scales.

It would of course depend on the specified limits on w as well, but perhaps it is better to have a more straightforward meaning of mineps. Although I guess its meaning is close enough to what one would expect.

I agree that the difference could be larger by default. But i would rather get a warning and a result that looks bad than accidentally have my frequency response get stuck for several minutes without knowing why. For other use cases, for example when optimizing in a loop, you would probably pick a fix grid anyway. And if you don't, you still don't want the code to get stuck. Pathological cases will always exist, the question is what we do when we encounter them.

Right, there is a warning. Yeah, it was just the small difference between 2000 and 10_000 that seemed a bit weird.

In the example 3 above, there is "only" a factor 100 between the frequency of the left and right notches. With your approach, that would still require an extra depth of 6 recursions to get the same accuracy, and that is assuming the left notch isn't already hitting the absolute limit on stepsizes.

I guess one would like to get a good idea of the magnitude of sharp high-frequency resonance peaks. I am not sure what you are talking about here.

And for large frequencies we would probably start hitting limits on the floating point numbers quite quickly if we don't consider the log scale.

I am not sure that I follow. We are not talking about notches/peaks that have a relative width close to 1e-16.

As a side note, I am considering making this code a separate package since the code is so general. I would make sure that all defaults could be set easily so that changes there wouldn't affect the CS package.

PlotUtils seems like a good place.

d2 = (y3-y2)
# Essentially low second derivative compared to derivative, so that linear approximation is good.
# Second argument to avoid too small steps when derivatives are small
norm(d1-d2) < max(reltol_dd*max(norm(d1), norm(d2)), abstol_dd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this hold component-wise? Would probably be good with some mimo test cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants