
Implement the sampler interface? #143

Open · sethaxen opened this issue Aug 17, 2021 · 8 comments

@sethaxen (Collaborator)

If we implement the Random.Sampler interface, then we can draw IID samples from our distributions: https://docs.julialang.org/en/v1/stdlib/Random/#Generating-random-values-of-custom-types. For multiple draws, Distributions tends to make scalar and array distributions return a single Array; should we instead return a vector of draws, e.g. a vector of arrays?
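
For reference, a minimal sketch of what hooking into that interface could look like, using a hypothetical scalar measure (this loosely follows the Die example in the Julia docs; none of these names exist in MeasureTheory):

using Random

# Hypothetical scalar measure, for illustration only.
struct MyExponential
    rate::Float64
end

# A sampler type that precomputes whatever is reusable across draws.
struct MyExponentialSampler <: Random.Sampler{Float64}
    invrate::Float64
end

# Random.jl calls this once per rand/rand! request; the Repetition argument
# distinguishes a single draw (Val(1)) from many draws (Val(Inf)).
Random.Sampler(::Type{<:Random.AbstractRNG}, d::MyExponential, ::Random.Repetition) =
    MyExponentialSampler(inv(d.rate))

# One draw using the precomputed sampler; rand(d, n) and rand!(A, d) then
# come for free from Random.jl's generic fallbacks, reusing the sampler.
Random.rand(rng::Random.AbstractRNG, sp::MyExponentialSampler) =
    -log(rand(rng)) * sp.invrate

With this in place, rand(MyExponential(2.0), 5) returns a Vector{Float64} of five IID draws.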

@mschauer (Member)

Yes, for multivariate distributions we should default to a vector of vectors (or of SVectors), but with the option for rand! to write into arrays or reinterpreted arrays instead of a vector of (S)Vectors.
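
For instance, the reinterpreted-array idea in isolation (just an illustration; assumes Julia ≥ 1.6 for the reshape form of reinterpret):

using StaticArrays

A = randn(3, 10)                                # draws stored column-wise
v = reinterpret(reshape, SVector{3,Float64}, A) # 10-element vector of SVectors
v[1]                                            # an SVector{3,Float64} sharing A's memory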

@cscherrer (Collaborator)

Offhand I see a few considerations worth... considering.

For scalar distributions, there are big advantages in MeasureTheory over Distributions for log-density computations. But for random sampling, this is less clear. For now, a lot of this goes through distproxy and just reuses code from Distributions. Would this require rebuilding or copying all of that?

As for the multivariate case, I think we can have the contiguous memory benefits of the Distributions approach without the weird semantics. In TupleVectors, you can do

julia> x = TupleVectors.chainvec(randn(3), 10)
10-element ArraysOfArrays.ArrayOfSimilarArrays{Float64, 1, 1, 2, ElasticArrays.ElasticMatrix{Float64, Vector{Float64}}}:
 [0.2543955996319798, 0.5657592060457689, -1.4845945938531642]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]

julia> x.data
3×10 ElasticArrays.ElasticMatrix{Float64, Vector{Float64}}:
  0.254396  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.565759  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 -1.48459   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

So in terms of the memory layout it's the same as Distributions, but we can still do e.g.

julia> x[3]
3-element view(::ElasticArrays.ElasticMatrix{Float64, Vector{Float64}}, :, 3) with eltype Float64:
 0.0
 0.0
 0.0

This also allows for preallocation, meaning we need low-level functionality for rand!

@mschauer by svector do you mean StaticArrays.SVector? I think we should allow for these, but default to SizedVector when the size is known statically. As I understand it, SVectors are internally represented as tuples, so you get the compile-time benefits of that but also the compile-time cost. This isn't settled yet, so of course we can discuss more.
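
Concretely, the distinction between the two (both from StaticArrays.jl):

using StaticArrays

sv = SVector(1.0, 2.0, 3.0)           # immutable, backed by an NTuple{3,Float64}
zv = SizedVector{3}([1.0, 2.0, 3.0])  # wraps an ordinary Vector, adding a static size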

@sethaxen (Collaborator, Author)

> For scalar distributions, there are big advantages in MeasureTheory over Distributions for log-density computations. But for random sampling, this is less clear. For now, a lot of this goes through distproxy and just reuses code from Distributions. Would this require rebuilding or copying all of that?

Probably not. Distributions.jl implements samplers for at least some distributions that can benefit from it, so for those distributions at least, we could just call their samplers. See https://github.com/JuliaStats/Distributions.jl/tree/master/src/samplers
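
For example, something like this would reuse their machinery (a sketch using Distributions.jl's exported sampler function):

using Distributions, Random

d = Gamma(2.5, 1.0)
spl = sampler(d)                 # Distributions picks an efficient sampler for d
rand(Random.default_rng(), spl)  # draw using the precomputed sampler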

> As for the multivariate case, I think we can have the contiguous memory benefits of the Distributions approach without the weird semantics. In TupleVectors, you can do

How can we do something like this without making TupleVectors a dependency?

> This also allows for preallocation, meaning we need low-level functionality for rand!

Great! We were discussing having that in #129 anyway.

The challenge we have relative to Distributions is the potential for combinatorial explosion. For example, we might have 3 parameterizations of a distribution, each with its own most efficient way to draw samples. If each of those then needs 3 rand! methods, to draw a single sample, to draw a vector of samples, or to populate an array whose rows are samples, this quickly becomes unwieldy. So it would be good to have a robust set of default methods that call each other, so that ideally one only needs to implement a single method (but could implement more for greater performance).
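
Something like the following layering, for instance (hypothetical method names, just to sketch the fallback chain):

using Random

# The one method each measure must implement (hypothetical name):
# myrand(rng::Random.AbstractRNG, d::MyMeasure) = ...

# Generic fallback: fill a preallocated container, one draw per element.
function myrand!(rng::Random.AbstractRNG, d, xs::AbstractArray)
    for i in eachindex(xs)
        xs[i] = myrand(rng, d)
    end
    return xs
end

# Generic fallback: allocate, then delegate to the in-place method
# (eltype hardcoded to Float64 for brevity).
myrand(rng::Random.AbstractRNG, d, n::Int) =
    myrand!(rng, d, Vector{Float64}(undef, n))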

sethaxen referenced this issue in JuliaManifolds/ManifoldMeasures.jl Aug 17, 2021
@cscherrer (Collaborator)

> Probably not. Distributions.jl implements samplers for at least some distributions that can benefit from it, so for those distributions at least, we could just call their samplers. See https://github.com/JuliaStats/Distributions.jl/tree/master/src/samplers

👍

> How can we do something like this without making TupleVectors a dependency?

The chainvec code is here:
https://github.com/cscherrer/TupleVectors.jl/blob/main/src/chainvec.jl

It's no longer specific to chains, so the method should be renamed anyway. TupleVectors is needed if we want vectors of tuples. But if we just want arrays of arrays, we'd only need ElasticArrays and ArraysOfArrays.
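
Roughly like this, for example (a sketch using only those two packages):

using ElasticArrays, ArraysOfArrays

data = ElasticArray{Float64}(undef, 3, 0)  # grows along its last dimension
append!(data, randn(3))                    # append one length-3 draw
append!(data, randn(3))                    # and another
x = nestedview(data)                       # 2-element vector of length-3 views
flatview(x)                                # the underlying 3×2 contiguous matrix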

> The challenge we have relative to Distributions is the potential for combinatorial explosion. For example, we might have 3 parameterizations of a distribution, each with its own most efficient way to draw samples. If each of those then needs 3 rand! methods, to draw a single sample, to draw a vector of samples, or to populate an array whose rows are samples, this quickly becomes unwieldy. So it would be good to have a robust set of default methods that call each other, so that ideally one only needs to implement a single method (but could implement more for greater performance).

Right, there might be one parameterization that's most efficient for random sampling, and we could have the Sampler convert to that. Also, affine transformations will cover lots of cases, just as you only need randn without the mean and std, or rand without the upper and lower bounds. More generally anything defined using transformations can be sampled by applying the transformation to each sample of the untransformed version.
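
That is, the familiar scalar pattern (plain Base Julia, for illustration):

using Random

rng = Random.default_rng()
μ, σ = 1.5, 0.5
z = randn(rng)  # draw from the untransformed (standard normal) measure
x = μ + σ * z   # a Normal(μ, σ) draw via the affine transformation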

@mschauer (Member)

I really dislike the current state of this: sampling should just provide an iterator, and collecting the samples or writing them into a container should be the problem of the caller, not the callee. But we have no nice idiom for iterating over vectors and writing them into a matrix or a vector of SVectors efficiently, even though that has nothing to do with sampling, only with collection.
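
The closest Base idiom today would be something like the following (illustrative only; note that every randn(rng, 3) allocates a fresh Vector, which is exactly the missing efficiency):

using Random

rng = Random.default_rng()
draws = (randn(rng, 3) for _ in 1:10)  # lazy: nothing is sampled yet
M = reduce(hcat, draws)                # the caller picks the container: a 3×10 Matrix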

@cscherrer (Collaborator)

I think the core of the problem here is that although

  1. it's easy to get composability by creating and destroying arrays as you go, and
  2. with some more work you can usually get zero allocations,

there are no common idioms for getting both at once.

The idea of iterating over vectors and having the caller aggregate them sounds good at first, but without doing this carefully each Vector created would allocate.

The ideal solution to this would probably be a purely functional interface for representing array computations. LazyArrays seems close to this, but we'll need nondeterminism with data sharing, and I'm not sure whether LazyArrays can handle that. We'll also need to be able to hand-tune things in some cases, e.g. with LoopVectorization, and I'm not sure about doing that with LazyArrays. Anyway, there's the usual risk of unknowns, and also that it could become a bigger project than MeasureTheory itself.

So to be less ambitious, but to have easier control over what we're doing, the best approach I can see is to do all array sampling in terms of rand!. The caller (which might be rand) creates an array and passes it to rand!, which fills it. Along the way it might require more calls to rand! using views of the original array.
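
Roughly like this (the _rand! name is hypothetical, and randn! stands in for measure-specific sampling):

using Random

# Fill one draw of a vector-valued measure into a preallocated (view of a) vector.
_rand!(rng::Random.AbstractRNG, x::AbstractVector) = randn!(rng, x)

# The caller allocates once; the callee fills column views in place.
function sample_columns!(rng::Random.AbstractRNG, X::AbstractMatrix)
    for j in axes(X, 2)
        _rand!(rng, view(X, :, j))
    end
    return X
end

X = sample_columns!(Random.default_rng(), Matrix{Float64}(undef, 3, 10))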

All of this gets weird if you have, say, a vector of named tuples, and one of the entries in the named tuple is an array. By default those "inner" arrays would all be separately allocated. TupleVectors helps with this using the array-of-arrays representation. So if you have a TupleVector where one element is (a = 1, b = [2, 3], c = [4 5; 6 7]), internally it will just be a vector (for the a values across the vector), a matrix, and a 3-array. Then you can again allocate just once and fill with rand!

@cscherrer (Collaborator)

@mschauer does this address your concerns? It's not clear to me what "I really dislike the current state of this" refers to, since I think Distributions etc. has the same problem you describe.

@mschauer (Member)

Yeah, that was general frustration, even more directed at Base than at Distributions.jl.
