more litter from my trail of tears

Feel free to correct me or reply if there’s any obvious mistake I’m making. I might be telling on myself here.

So during my master's, I was assigned an adviser who then assigned me a project. This was a research-based master's with graduate-level mathematics course requirements and a minimum 70-page thesis, in English, with the abstract in simplified Chinese characters. There was a committee of radar and signal processing engineers, electrical engineers, and a mathematician or two who grilled me during my defense.

I was on my 4th continent since the pandemic started, and I was asked to solve a radar waveform design problem using neural networks for my thesis. I have a math/Bayesian/dev background, so this wasn't my strongest area, but it was a good way to start working with these models. After a few months of fitting different architectures on this waveform problem, I thought: is this even possible? So I went back and did some theory. This is something I wrote up during the research portion of my master's. I think I put parts of it in my thesis, but Cui had me take it out.

But yeah, there wasn't much literature on it, though perhaps I didn't find the right literature. Cui was a convex optimization guy, and this was a new topic; some of the TensorFlow code I inherited from one of his colleagues. When I was in China we had a lot of back-and-forth, but I couldn't ask his colleague during the pandemic. So maybe things would have been better had the 5th largest pandemic in human history not happened, had my funding not been cut, and had I not been moving constantly for 3 years, but hey…

“…I did enough to get the degree.”

But yeah, technically speaking, I know that neural networks, at least MLPs, are function approximators. So given my background in simulation-based inference, we know that one way to computationally verify that something is working (to a certain extent; it's really a diagnostic that should be used when fitting computational probabilistic models) is to simulate data and make sure you can recover the parameters. Visually too, in any way possible. So given some background in functional data analysis, I simulated some different non-linear functions, and I was able to recover them with some MLPs. But then I inherited some other code.
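For concreteness, here's a minimal sketch of that recovery diagnostic. This is not my thesis code: the ground-truth function, the architecture, and the hyperparameters below are all made up for illustration. The idea is just: simulate a known non-linear function, fit a small MLP, and check the fit tracks the noiseless truth.

```python
import numpy as np
import tensorflow as tf

# Simulate data from a known (hypothetical) non-linear function.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=(2000, 1)).astype("float32")
truth = lambda t: np.sin(3.0 * t) + 0.5 * t**2          # the "true" function
y = truth(x) + rng.normal(0.0, 0.05, x.shape).astype("float32")  # add noise

# Small MLP: enough capacity for a smooth 1-D function.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=200, batch_size=64, verbose=0)

# Recovery check: predictions on a grid should track the noiseless truth.
grid = np.linspace(-2.0, 2.0, 200, dtype="float32").reshape(-1, 1)
recovered = model.predict(grid, verbose=0)
err = np.max(np.abs(recovered - truth(grid)))
print(f"max absolute recovery error on the grid: {err:.3f}")
```

If the predictions don't track the simulated truth, something upstream (model, optimizer, data pipeline) is broken before you ever touch real data.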

But in this case, I had a model with some good runs, and then after that, even with version control, I couldn't get the same estimates, and I was like, f***, the parameter space is multimodal, so I'm just hitting random modes on a high-dimensional solution surface.
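You can see the same failure on a toy problem. Here's a hedged sketch (a made-up one-dimensional multimodal loss, nothing to do with the actual waveform objective): plain gradient descent from different random initializations lands in different modes, so repeated runs of identical code give different "estimates".

```python
import numpy as np

# Toy multimodal loss: many local minima, like a non-convex training surface.
f      = lambda t: np.sin(5.0 * t) + 0.5 * t**2
grad_f = lambda t: 5.0 * np.cos(5.0 * t) + t

def descend(t0, lr=0.01, steps=2000):
    t = t0
    for _ in range(steps):
        t -= lr * grad_f(t)
    return t

rng = np.random.default_rng(42)
for trial in range(5):
    t0 = rng.uniform(-3.0, 3.0)           # random initialization
    t_star = descend(t0)
    print(f"init {t0:+.2f} -> mode {t_star:+.3f}, loss {f(t_star):.3f}")
# Different inits land in different modes with different losses: the same
# code, rerun with a different seed, returns different parameters.
```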

But in the case of the document below, I was thinking: OK, the objective function is not smooth, so it can't be solved with the tool I'm using… and I wanted to show this mathematically. Moreover, the constraint of interest for radar engineers is actually a unit hypersphere, so I think the constraints made the optimization surface non-identifiable; thus, my optimizer was just converging to random parameters when I randomly initialized it. If I had initialized the parameters in such a way that the optimizer was guaranteed to converge to the same local mode… anyway. Maybe I sound stupid? In Bayesian modeling, we make sure to define a probabilistic model mathematically/probabilistically such that the posterior is identifiable, and if it isn't, as in mixture models, you can set priors or initialize parameters so that the optimizer or MCMC sampler converges to the same mode. But anyway…
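To make the identifiability point concrete, here's a toy sketch under an assumed objective: the spectral-shaping criterion below is hypothetical, not the one from my thesis, but it has the property I'm describing. If the objective depends on the waveform only through its power spectrum, then any global phase rotation of a solution is an equally good solution, still on the unit hypersphere, so the constrained problem has a continuum of optima and the "recovered" parameters depend on where the optimizer happens to stop.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical waveform objective: match the power spectrum to a flat target.
# It only sees |FFT(x)|^2, never the phase of x itself.
target = np.ones(64)
def objective(x):
    return np.sum((np.abs(np.fft.fft(x))**2 - target)**2)

# A point on the complex unit hypersphere.
x = rng.normal(size=64) + 1j * rng.normal(size=64)
x /= np.linalg.norm(x)

# Rotate by an arbitrary global phase: still on the sphere, same objective,
# because FFT(e^{j*phi} x) = e^{j*phi} FFT(x) and the magnitude is unchanged.
x_rot = np.exp(1j * 0.73) * x
print(np.linalg.norm(x_rot))            # 1.0 -- constraint still satisfied
print(objective(x), objective(x_rot))   # same value up to float round-off
# Every point e^{j*phi} x is an equally good "solution", so the optimizer
# can stop anywhere on that orbit and report different parameters each run.
```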

Anyone more familiar with these models, please comment. I didn't really have anyone to ask who could answer these questions, and I work independently, so I just took the tools I had and ran with them.

I also did this across multiple countries and continents while accidentally starting a consulting business and learning a third language (Spanish).

But hey, sink or swim.

Anyway, the work you see below is some garbage I wrote up during my master's. It's not publishable. It's dated incorrectly; the work was done in 2021.
