About my current machine learning activities

A (currently spare-time) topic I have returned to again and again over the years is artificial intelligence, aka machine learning. You'll find some previous stuff referenced in the "Machine learning" section of my software page; this page is mostly about my current experiments with neural networks using Tensorflow. It is not an introduction to anything, but summarizes, blog-style, what I learned from my personal perspective - that of an experienced programmer who knows about the internals of much AI stuff, including neural networks, but is new to Tensorflow and Python. As such, it is ongoing work.

I keep a growing collection of Colaboratory Notebooks about some topics I explored, such as how to predict variances, a deliberately simple LSTM example, and an attempt to use the Pearson correlation coefficient as a custom loss function.
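To give an idea of the last one: a Pearson-correlation loss rewards predictions that are linearly correlated with the targets, and since Keras minimizes losses, one minimizes 1 - r. The following is just a minimal sketch of the idea, not the notebook's actual code; the small epsilon is there to guard against division by zero:

```python
import tensorflow as tf

def pearson_loss(y_true, y_pred):
    # Center both tensors around their means.
    xm = y_true - tf.reduce_mean(y_true)
    ym = y_pred - tf.reduce_mean(y_pred)
    # Pearson r = covariance / (std_x * std_y); the common 1/n factors cancel.
    r = tf.reduce_sum(xm * ym) / (tf.sqrt(
        tf.reduce_sum(tf.square(xm)) * tf.reduce_sum(tf.square(ym))) + 1e-12)
    # r = 1 means perfect linear correlation, so minimize 1 - r.
    return 1.0 - r
```

Such a function can be passed as loss=pearson_loss to model.compile. Note that, unlike mean squared error, it ignores the scale and offset of the predictions - which is exactly why using it as a loss is an experiment.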

Choice of framework for running / learning with neural networks

There are rather a lot of frameworks for computing with neural nets. After some googling, I selected Tensorflow, because it's in very active development, runs on various platforms from mobile phones to huge clouds, and also because there is the amazing Google Colaboratory, which provides a notebook environment that can mix programs, documentation and results, and also gives you free computing resources: within some constraints, you can store your programs and data on Google Drive and run your programs on their servers, even using GPUs and TPUs!

Choice of programming language for Tensorflow

Tensorflow does offer rather a lot of interfaces for various languages. I had a look at the Scala interface, but quickly gave up - as far as I understood, it's mostly an automated translation of the Python interface, and thus feels foreign and lacks the good documentation I have grown to love in the Scala standard library, so it would save you little.

Tensorflow's more or less native interface language is Python, which also has a wealth of other libraries for data scientists. So, Python 3 it is, though I miss the clean structure and good documentation of Scala, and Python's dynamic typing is a big pain for me. I'd reconsider that choice if I were fortunate enough to do some serious number crunching with Tensorflow professionally.

One thing about Python, originally conceived as an easy educational / scripting language, is that most of the tutorials are targeted at programming novices, and thus contain lots of noise if you are an experienced programmer. Most helpful for me was the book Python for Programmers: with Big Data and Artificial Intelligence Case Studies..., which also gives many hints on where to look for data-sciencey libraries and documentation.

Coping with Python's dynamic typing

I have yet to see the advantages of dynamic typing, but there is a hard disadvantage: the IDE has a hard time supporting you with documentation and code completion, since it usually can't guess what type a variable has. I tried IntelliJ and PyCharm with rather limited success. For me, the best support was provided by Google Colaboratory.

Python provides some features as workarounds for the missing static typing: the functions help, type and dir print out the features of an object.
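For example, in a worksheet or the Python REPL (the variable here is made up):

```python
# A variable whose type the IDE might not be able to infer:
config = {"epochs": 10, "learning_rate": 0.01}

# type tells you which class you are dealing with:
print(type(config))  # -> <class 'dict'>

# dir lists the attributes and methods; filter out the dunder noise:
print([name for name in dir(config) if not name.startswith("_")])

# help prints the docstring, e.g. of a method you just found via dir:
help(config.get)
```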

Still, you have to make sure that Colaboratory knows the type of the variable where you need documentation. That gap can be filled with interactive IPython worksheets (e.g. run on Colaboratory), where you can call up help while stepping interactively through the program. That works nicely when I observe the following constraints:

  • While developing a program, write it as an IPython worksheet. When it works, move the code into functions / classes and put those into separate modules loaded from the worksheet.
  • In a worksheet, make all cells idempotent, so that you can run the cell you are currently editing again and again until it does the right thing. When you are working on a cell, you want "Run before" to be runnable at any time to reset everything, and "Run after" to check a cell's effects after you change it. When you do some data processing in various steps, this comes rather naturally if you assign the results of each step to new variables.
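The last point, sketched with made-up toy data: each cell reads only the variables of earlier cells and writes only its own, so any cell can be re-run without corrupting the state.

```python
# Cell 1: load the raw data (here just toy numbers).
raw = list(range(10))

# Cell 2: transform. Reads only 'raw', writes only 'squares' -
# running this cell twice gives the same result as running it once.
squares = [x * x for x in raw]

# Cell 3: aggregate into yet another fresh variable.
total = sum(squares)
print(total)  # -> 285

# Not idempotent, by contrast, would be something like
#   raw = [x * x for x in raw]
# which changes 'raw' on every run of the cell.
```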