Wednesday, April 26, 2023

probabilistic programming and protocols

for a long time we've known that a number of processes in the internet are heavy tailed - the distribution of file sizes (web pages/video) is typically Zipfian, so transfer times are heavy tailed, and protocols like TCP induce burstiness from a number of different root causes (timers, AIMD, interface packet scheduling, statistical multiplexing of many flows with different durations and round-trip times, etc.)
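
as a toy illustration of the first point, here's a numpy sketch (the Zipf exponent and the 10 Mb/s bottleneck rate are made-up numbers, purely for illustration): power-law file sizes give transfer times whose upper percentiles dwarf the mean in a way an exponential never would.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy illustration: Zipf (power-law) file sizes -> heavy-tailed transfer times.
# the exponent and link rate below are made-up numbers for illustration only.
sizes_kb = rng.zipf(a=2.2, size=100_000).astype(float)   # heavy-tailed file sizes (KB)
transfer_s = sizes_kb * 8_000 / 10e6                     # seconds over a 10 Mb/s bottleneck

mean = transfer_s.mean()
p999 = np.percentile(transfer_s, 99.9)
# for an exponential with the same mean, the 99.9th percentile would be
# mean * ln(1000) ~= 6.9x the mean; the Zipf tail is far heavier than that
print(f"mean={mean:.4f}s  p99.9={p999:.2f}s  ratio={p999 / mean:.0f}x")
```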

so people have written analysis tools that capture these statistics (e.g. to do measurement-based admission control for traffic that can't simply be described by a mean or peak rate) - which can be quite parsimonious... they've also written generator tools so that synthetic packet traces can be built for simulation or testing - typically some sort of fractional Brownian thing. One debate not to get drawn into is whether "TCP is fractal" - it doesn't really matter - you just want a time series that has the requisite self-similarity (e.g. the right Hurst parameter) so you can dimension buffers for some delay statistics for traffic flows...
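
for concreteness, here is a minimal numpy sketch of that kind of generator - circulant-embedding (Davies-Harte style) synthesis of fractional Gaussian noise with a chosen Hurst parameter, whose cumulative sum gives a fractional-Brownian-motion-like trace. It's an illustration of the technique, not any of the actual tools mentioned above.

```python
import numpy as np

def fgn(n, hurst, rng=None):
    """Sample n steps of fractional Gaussian noise with the given Hurst
    parameter via circulant embedding (Davies-Harte style)."""
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(n + 1)
    # autocovariance of fGn at lags 0..n
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst) - 2 * k ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst))
    # first row of the circulant embedding (length 2n) and its eigenvalues
    row = np.concatenate([gamma, gamma[-2:0:-1]])
    eig = np.clip(np.fft.fft(row).real, 0.0, None)   # clip tiny negative rounding errors
    m = len(row)
    # shape complex white noise by the eigenvalue spectrum; the real part of the
    # first n entries has the desired fGn covariance
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    x = np.sqrt(m) * np.fft.ifft(np.sqrt(eig) * z)
    return x.real[:n]

# cumulative sum of fGn ~ fractional Brownian motion: a self-similar "load"
# trace whose burstiness across timescales is set by the Hurst parameter
trace = np.cumsum(fgn(4096, hurst=0.8))
```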

some very clever maths was done to reduce all that to some simple tools that do 99% of what you want - e.g. see the papers from the Measure project.

I was reminded of all this yesterday by a talk about stochastic processes (and cheaper ways to do Gaussian processes) for probabilistic programming. The way the speakers tackled it was with a variational inference approach that splits the Gaussian process into components (the standard autoencoder hack - simplify, then synthesize) - the problem with this is that, while it works well (see \pi-VAE for details), it is hard to explain, or more importantly, to interpret which component stochastic processes are actually being put together to get the fit!
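
to make the "autoencoder hack" concrete, here is a toy numpy sketch of the two moving parts such a VAE-style model optimises - the reparameterisation trick and the two ELBO terms - with untrained linear maps standing in for the real encoder/decoder (so this is illustrative only, not \pi-VAE itself).

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "encoder"/"decoder": single untrained linear maps, just to show the
# reparameterisation trick and the two ELBO terms a VAE-style model optimises
# when it compresses an observed sample into latent components
D, K = 32, 4                                    # observed dim, latent dim
W_enc = rng.standard_normal((D, 2 * K)) * 0.1   # x -> (mu, log_var) of q(z|x)
W_dec = rng.standard_normal((K, D)) * 0.1       # z -> reconstruction mean

x = rng.standard_normal(D)                      # one observed sample (e.g. a trace window)

h = x @ W_enc
mu, log_var = h[:K], h[K:]
eps = rng.standard_normal(K)
z = mu + np.exp(0.5 * log_var) * eps            # reparameterisation trick

x_hat = z @ W_dec
recon = -0.5 * np.sum((x - x_hat) ** 2)         # Gaussian log-likelihood (unit variance, up to a constant)
kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)  # KL(q(z|x) || N(0, I))
elbo = recon - kl
```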

two ways to tackle this (in my view) - one would be a \phi-ML (physics ML) approach, as above, where you have an explanation that is hard to solve exactly, but which you can approximate directly.

The other idea came up in a question from an attendee: what about using neural processes, which are, in some sense, a basis set of functions (think Fourier or Laplace transforms)? This speaks directly to what one is doing (trying to build a particular Gaussian process), but unfortunately has high costs in the general case, although it works well in a lot of other classes of physics problems.
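
for what it's worth, the basic shape of a (conditional) neural process is small enough to sketch: encode each context (x, y) pair, average the encodings into one permutation-invariant representation, then decode any target x together with that representation into a predictive mean and variance. Below is an untrained numpy skeleton of that dataflow (random weights, 1-D toy data) - just to show the architecture, not a working model.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random (untrained) MLP weights -- purely to show the dataflow."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# conditional-neural-process-style forward pass (1-D inputs/outputs):
# encode each (x, y) context pair, average into a single representation r,
# then decode any target x* together with r into a predictive mean/std
enc = mlp([2, 64, 64, 32])      # (x_i, y_i) -> r_i
dec = mlp([1 + 32, 64, 64, 2])  # (x*, r)    -> (mu, log_sigma)

x_ctx = np.linspace(-2, 2, 10)[:, None]
y_ctx = np.sin(x_ctx) + 0.1 * rng.standard_normal(x_ctx.shape)

r = forward(enc, np.hstack([x_ctx, y_ctx])).mean(axis=0)   # permutation-invariant aggregate
x_tgt = np.linspace(-3, 3, 50)[:, None]
out = forward(dec, np.hstack([x_tgt, np.tile(r, (len(x_tgt), 1))]))
mu, sigma = out[:, 0], np.exp(out[:, 1])                   # predictive mean and std per target
```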

I don't know enough about the stats to fix the neural process approach (although using an XAI technique on the VAE might yield an affordable interpretation of that one), but I offer the observation about heavy-tailed network traffic, which looked pretty intractable until some folks thought of some simple tricks... and maybe mixing the physics ML neural process with those tricks might yield an idea for how to make neural processes for stochastic behaviours efficient for some (very) common cases?

So what might make a suitable basis set of functions for spatial data? Well, how about a Poisson point process? A superposition of a set of Poisson point processes with different \lambdas would likely yield almost anything you want - the main challenge is to decide how many (different) ones suffice to get a good match.
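
a minimal sketch of that last idea (the number of components and their rates below are picked arbitrarily): simulate each component as a homogeneous Poisson point process on the unit square and pool the points. One caveat: a superposition of independent homogeneous Poisson processes is itself just a Poisson process with the summed rate, so the expressive power really comes from letting each component's intensity vary over space - and from the model-selection step of deciding how many components, with which intensities, match the data.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_points(lam):
    """Homogeneous Poisson point process on the unit square with rate lam."""
    n = rng.poisson(lam)                       # point count ~ Poisson(lam * area), area = 1
    return rng.uniform(0.0, 1.0, size=(n, 2))  # locations uniform given the count

# superpose a handful of components with different (arbitrarily chosen) rates;
# deciding how many components suffice is the challenge mentioned above
lambdas = [5, 20, 80]
points = np.vstack([poisson_points(lam) for lam in lambdas])
```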
