Kamile Lukosiute
## What does BCEWithLogitsLoss actually do?

April 14, 2022

Binary classification typically involves taking a sigmoid of your final neural network layer outputs and computing the binary cross entropy (BCE) loss. Sigmoid has exponentials, and BCE has logarithms, so some clever people who write PyTorch decided it would be wise to combine those two operations into one class, and now we have BCEWithLogitsLoss. This blog post aims to explain exactly what this class does and why it is better to use it than to compute the sigmoid and BCE separately.
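To make the comparison concrete, here is a minimal sketch of the two paths (the toy logits and targets are my own, not from the post):

```python
import torch
import torch.nn as nn

# Raw scores ("logits") straight from the final layer, plus binary targets
logits = torch.tensor([1.5, -0.3, 2.0])
targets = torch.tensor([1.0, 0.0, 1.0])

# Two-step version: squash to probabilities, then take the BCE
loss_separate = nn.BCELoss()(torch.sigmoid(logits), targets)

# Fused version: operates directly on the logits
loss_fused = nn.BCEWithLogitsLoss()(logits, targets)

# The two agree here, but the fused version stays numerically
# stable even for large-magnitude logits
print(loss_separate.item(), loss_fused.item())
```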

## Frustration-free Matplotlib

April 4, 2022

This post is written for my Machine Learning for Physics and Astronomy students, Spring 2022.

It’s a shame that no one ever explains how to use matplotlib to students because it’s “just a tool” and therefore less important than the content of classes. But I find that students usually grasp math, physics, and statistics concepts really quickly yet struggle with implementation (and visualization) because the tools were never properly explained to them.

We’re going to do a lot of plotting; physicists always do a lot of plotting. So here I’m offering a small explanation of how “making a figure” with matplotlib works. Once you understand how a tool works, you’ll be able to debug when things go wrong, leading to a much more pleasant user experience.
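For a flavor of the Figure/Axes mental model, here is roughly the smallest complete example (the sine plot is a toy of my own, not from the post):

```python
import numpy as np
import matplotlib.pyplot as plt

# A Figure is the whole canvas; each Axes is one plot living on it
fig, ax = plt.subplots()

x = np.linspace(0, 2 * np.pi, 100)
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

fig.savefig("sine.png")  # or plt.show() in an interactive session
```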

## When can a tensor be view()ed?

March 16, 2022

A common operation in PyTorch is giving a tensor a new shape while keeping the same underlying data. The usual method is to call torch.Tensor.view(new_shape). This operation is nice because the returned tensor shares the underlying data with the original tensor, which avoids a data copy and makes the reshaping memory-efficient. This of course introduces the usual quirk: if you change a value in the original tensor, the corresponding value changes in the .view()ed tensor.

.view() cannot always be used. At minimum, new_shape must have the same total number of elements as the original tensor. There are also additional requirements for the compatibility of the old and new shapes. However, the documentation about this is kinda opaque, so the purpose of this blog post is to try to understand those requirements in more detail.
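Here is a small sketch of both behaviors, with toy shapes of my own choosing: the shared storage, and a case where .view() refuses to work:

```python
import torch

t = torch.arange(6)      # shape (6,)
v = t.view(2, 3)         # same storage, viewed as shape (2, 3)

t[0] = 100
print(v[0, 0])           # tensor(100): the data really is shared

# The layout requirement in action: a transpose makes the tensor
# non-contiguous, and .view() can no longer reinterpret it in place
u = v.t()                # shape (3, 2), non-contiguous
try:
    u.view(6)
except RuntimeError:
    print("view failed; .reshape() or .contiguous().view() copy instead")
```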

## Neutron star mergers and fast surrogate modeling

December 11, 2021

As I was transitioning out of my Bachelor’s and into my Master’s, I picked up a project on the side. It boils down to “I trained a neural network to predict some things that are useful for astrophysicists,” or more precisely, “I trained conditional variational autoencoders to serve as surrogate models for the spectra of kilonovae.”

As far as machine learning goes, it’s not new, really. Even as far as astrophysics goes, it just applies a method that hadn’t been used in this field before. The thing is, though, it’s useful. It speeds up a really common analysis method something like 400 times*. This is important, first, for the discovery of kilonovae, for which latency is critical, and second, for increasing the amount of knowledge that can be extracted from any given data set.

I published it as a NeurIPS Machine Learning for the Physical Sciences Workshop paper with the help of some collaborators I picked up along the way. We are also working on getting the work out as a much longer paper in an astronomy/astrophysics journal so that it reaches the intended audience. Additionally, we are making it easy to use by distributing it as a package.

If you’re looking for a fast surrogate model for the spectra of kilonovae, I recommend reading the aforementioned paper. This post is intended for a non-astrophysics audience. Here I want to share, first, the interesting science that motivated me to pick up this project, second, the one interesting machine learning detail that I came across while working on this, and third, some thoughts on working on things like this in general.