June 26-30, 2023

Alright, I know what you’re thinking: “Another condensed blog entry? Logan, what the HECK!?” First off, we’re all out here on this floating ball of rock doing our best, okay? Everyone needs some slack now and again. Secondly, I’m doing this for YOU. I don’t think anyone wants to read four days’ worth of “I tried this, and it failed, so I tried this, and it broke something else.” I’m sure we all have enough PTSD from the process that we don’t need reminders of our emotional scars. The reason I’m posting TODAY is:

it works.

Despite all evidence to the contrary, my testing frameworks are working as intended (with a few light warning messages interspersed, but eh. It works.)

SO, let me break down my metaphorical setup for you:

  1. Models: I’ll be using the synth files from Because for now. Why? Because they’re premade, integrated, and come with some great documentation. I’m thinking of renaming them (at least for myself) so that we can call up exactly the combination of linearity, topology, noise, etc. that we need within the framework. Whatever we use needs to include a ground-truth definition for each dataset so that the metrics have something to compare predictions against.
  2. Algorithm choice: I’m going to need to define some things here for calling the specific algorithms we want to test: either some skeleton code for new algorithms, or simple calls to the premade stuff. Maybe both. Right now I’m testing with cdt’s PC algorithm, and it’s working as intended. The future of this part will likely be up for discussion with the entire team and our new algorithms.
  3. Metrics: I am officially getting SHD, SID, precision-recall, and ANM scores. cdt has a couple more that seem to be algorithm-specific, but I’m on the fence about using them. This framework is meant to be generalized; we should be able to throw any discovery algorithm at this thing and get some sort of usable output. Then we can put on our thinking caps and compare how well it did against a different algorithm until we say “AhA! This algorithm is HORRIBLE with this data compared to that one; someone write that down!” (A minimal sketch of the whole pipeline follows this list.)

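To make that concrete, here’s a minimal sketch of the pipeline as it stands: generate a dataset with a known ground-truth graph, run PC, and score the result. I’m using cdt’s built-in `AcyclicGraphGenerator` here as a stand-in for the Because synth files (the point is just the shape of the pipeline, not the exact data source), and note that PC and SID lean on cdt’s R dependencies (pcalg and the SID package), so those need to be installed for this to actually run.

```python
# Minimal sketch: synthetic data with a known DAG -> PC -> metrics.
# Assumes cdt is installed along with its R dependencies (pcalg for PC,
# the SID R package for SID). The generator settings are illustrative.
from cdt.data import AcyclicGraphGenerator
from cdt.causality.graph import PC
from cdt.metrics import SHD, SID, precision_recall

# Generate a dataset together with its ground-truth graph.
generator = AcyclicGraphGenerator("linear", noise="gaussian", nodes=10, npoints=500)
data, true_graph = generator.generate()  # pandas DataFrame, networkx DiGraph

# Run PC and compare the predicted graph against the ground truth.
predicted_graph = PC().predict(data)

print("SHD:", SHD(true_graph, predicted_graph))
print("SID:", SID(true_graph, predicted_graph))
aupr, _curve = precision_recall(true_graph, predicted_graph)
print("AUPR:", aupr)
```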
The name of the game here is ease of use. What’s the most streamlined way to run these tests with the fewest inputs? One of my favorite workflows in keras is k-fold cross-validation, simply because you hand it some inputs, press a button, and voila, the code spits out “Hey, these are all bad, but this one is the least bad.” Then you know where to go from there to improve things, or at least which parameters actually work best for your model. Ideally, I’d like to develop a similar system: define an algorithm to test, run it against all the model datasets you have, and spit out which ones it worked on best. Then you can pull scores, performance metrics, graphs, etc. from the whole thing and move on with your life. A rough sketch of what I’m picturing is below. Fingers crossed.
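Here’s that “press one button” idea as a rough sketch: a registry of descriptively named synthetic datasets (each paired with its ground truth) and one function that runs an algorithm over all of them and ranks the results. The dataset names, generator settings, and the `run_benchmark` helper are all hypothetical, my own invention for illustration, not anything that exists in cdt or in the framework yet.

```python
# Hypothetical benchmark harness: run one discovery algorithm across a
# registry of named synthetic datasets and rank the results.
from cdt.data import AcyclicGraphGenerator
from cdt.causality.graph import PC
from cdt.metrics import SHD, precision_recall

# Descriptive names encode linearity / noise / size, so it's obvious which
# combination you're asking for. (Names and settings are placeholders.)
DATASETS = {
    "linear_gaussian_10node": AcyclicGraphGenerator("linear", noise="gaussian", nodes=10),
    "poly_gaussian_10node": AcyclicGraphGenerator("polynomial", noise="gaussian", nodes=10),
    "sigmoid_uniform_20node": AcyclicGraphGenerator("sigmoid_add", noise="uniform", nodes=20),
}

def run_benchmark(algorithm):
    """Run one discovery algorithm over every registered dataset and score it."""
    results = {}
    for name, generator in DATASETS.items():
        data, true_graph = generator.generate()
        predicted = algorithm.predict(data)
        aupr, _ = precision_recall(true_graph, predicted)
        results[name] = {"SHD": SHD(true_graph, predicted), "AUPR": aupr}
    return results

scores = run_benchmark(PC())
# "These are all bad, but this one is the least bad" -- lower SHD first.
for name, metrics in sorted(scores.items(), key=lambda kv: kv[1]["SHD"]):
    print(f"{name}: SHD={metrics['SHD']}, AUPR={metrics['AUPR']:.3f}")
```

The nice part of keeping the registry as plain named generators is that swapping in the Because synth files later should only mean changing what the names map to, not the harness itself.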
