João Santiago

João Santiago is a data scientist with an interest in simplifying how machine learning is integrated into production systems. He likes R and Clojure for their LISP roots and functional features, and wishes he could ride his bicycle more often.



Just-in-time features in machine learning models: why not Clojure?

By João Santiago

It is common for real-time machine learning models to use transformed data instead of raw inputs from a user or another system. Currently, this critical step is either embedded in frameworks such as scikit-learn or tidymodels, added as extra code in the APIs that wrap the models, or rewritten entirely in another language such as Scala and served via Spark. Such practices lead to duplicated code, reduce reusability, and introduce new points of friction. In this talk I want to explore this problem further, as it is so common among data science teams, and present Bulgogi, my idea for a Clojure system to fix it. Because if it's data, why not Clojure?