George Hosu
Mar 27, 2020

This sounds incorrect to me.

Either the model is not “state of the art” or a developer can’t fine-tune it. I see no way of having an actual state-of-the-art model be used by anyone but a machine learning practitioner, almost by definition… otherwise developers who just tweaked a few hyperparameters would be all over Kaggle and Papers with Code, topping benchmarks.

Even in a space as old as compilers, I still have to know how to compile a compiler and how to tweak a bunch of flags, which might make no sense to someone who doesn’t understand compilers, in order to get “state of the art” performance on a given metric. That’s because the “state of the art” version of the compiler only gets a public release a few weeks, months or even years after its inception (by which time a newer and better version is already in the works).

You can get “close to state of the art”, but that’s not quite the same thing. Indeed, getting “state of the art” without knowing anything about ML is basically impossible, and getting “close to state of the art” is what I’d wager will be the norm in a few years’ time.

I say this as someone who works on one of those open-source, two-lines-of-code machine learning libraries (MindsDB): it’s literally impossible for a developer working with an ML library to fine-tune a model 100% as well as someone who knows what they’re doing.

But that in itself is NOT a problem, because guess what? For most of the problems out there you don’t need state of the art, you need “what state of the art was 1 year ago”, and that will probably be good enough. The crux of the problem isn’t going to be that 0.01% accuracy boost, it’s going to be getting the thing to actually work.

You can find my more recent thoughts at https://www.epistem.ink | I cross-post some of the articles to medium.
